Accurate Detection of Very Sparse Sequence Motifs

Abstract

Protein sequence alignments are more reliable the shorter the evolutionary distance. Here, we align distantly related proteins using many closely spaced intermediate sequences as stepping stones. Such transitive alignments can be generated between any two proteins in a connected set, whether they are direct or indirect sequence neighbours in the underlying library of pairwise alignments. We have implemented a greedy algorithm, MaxFlow, using a novel consistency score to estimate the relative likelihood of alternative paths of transitive alignment. In contrast to traditional profile models of amino acid preferences, MaxFlow models the probability that two positions are structurally equivalent and retains high information content across large distances in sequence space. Thus, MaxFlow is able to identify sparse and narrow active-site sequence signatures that are embedded in high-entropy sequence segments in the structure-based multiple alignment of large, diverse enzyme superfamilies. In a challenging benchmark, MaxFlow yields better reliability and twice the coverage of available sequence alignment software. This promises to increase information returns from functional and structural genomics, where reliable sequence alignment is a bottleneck to transferring the functional or structural characterization of model proteins to entire protein families and superfamilies.
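
The idea of transitive alignment through intermediate sequences can be illustrated with a small Python sketch. This is not the MaxFlow algorithm: the abstract does not define its consistency score, so the vote count below is only an assumed stand-in, and the names `compose`, `align_along_path`, `count_votes` and the toy `library` are hypothetical. Each pairwise alignment is represented as a mapping from residue positions in one sequence to positions in the next.

```python
from collections import Counter


def compose(ab, bc):
    """Chain two pairwise alignments: A->B and B->C give A->C.

    Positions of A whose partner in B is unaligned in C are dropped.
    """
    return {i: bc[j] for i, j in ab.items() if j in bc}


def align_along_path(path, library):
    """Compose pairwise alignments along a path of sequence names."""
    mapping = library[(path[0], path[1])]
    for a, b in zip(path[1:], path[2:]):
        mapping = compose(mapping, library[(a, b)])
    return mapping


def count_votes(paths, library):
    """Count how many paths support each aligned residue pair.

    Assumption: a simple vote count, used here only as a stand-in
    for a consistency score.
    """
    votes = Counter()
    for path in paths:
        votes.update(align_along_path(path, library).items())
    return votes


# Toy library of pairwise alignments (hypothetical data).
library = {
    ("A", "B"): {0: 0, 1: 1, 2: 3},
    ("B", "C"): {0: 1, 1: 2, 3: 4},
    ("A", "D"): {0: 0, 1: 1, 2: 2},
    ("D", "C"): {0: 1, 1: 2, 2: 5},
}

# Two alternative routes from A to C, via B and via D.
votes = count_votes([["A", "B", "C"], ["A", "D", "C"]], library)
for pair, n in sorted(votes.items()):
    print(pair, n)
```

In this toy case both routes agree that positions 0 and 1 of A align to positions 1 and 2 of C, while they disagree on position 2, so the first two pairs receive more support than either candidate for the third.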