Source: Bioinformatics (U.S. National Institutes of Health literature)

A unified statistical model to support local sequence order independent similarity searching for ligand-binding sites and its application to genome-based drug discovery

Abstract

Functional relationships between proteins that do not share global structural similarity can be established by detecting similarity between their ligand-binding sites. For a large-scale comparison, it is critical to assess the statistical significance of this similarity accurately and efficiently. Here, we report an efficient statistical model that supports local, sequence-order-independent ligand-binding-site similarity searching. Most existing statistical models only take into account the matching vertices between two sites that are defined by a fixed number of points. In reality, the boundary of the binding site is either unknown or dependent on the bound ligand, which limits these approaches. To address these shortcomings and to perform binding-site mapping on a genome-wide scale, we developed a sequence-order independent profile–profile alignment (SOIPPA) algorithm that is able to detect local similarity between unknown binding sites a priori. The SOIPPA scoring integrates geometric, evolutionary and physical information into a unified framework. However, this poses a significant challenge in assessing the statistical significance of the similarity, because the conventional probability model based on fixed-point matching cannot be applied. Here we find that scores for binding-site matching by SOIPPA follow an extreme value distribution (EVD). Benchmark studies show that the EVD model is at least two orders of magnitude faster and more accurate than the non-parametric statistical method used in the previous SOIPPA version. Efficient statistical analysis makes it possible to apply SOIPPA to genome-based drug discovery. Consequently, we have applied the approach to the structural genome of Mycobacterium tuberculosis to construct a protein–ligand interaction network. The network reveals highly connected proteins, which represent suitable targets for promiscuous drugs.
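To illustrate how an EVD can turn a raw similarity score into a significance estimate, the following is a minimal sketch and not the SOIPPA implementation. The abstract does not state which EVD form or fitting procedure is used, so the sketch assumes a type I (Gumbel) distribution fitted by the method of moments. The function names, the synthetic background scores, and the parameter values are all hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores):
    """Estimate Gumbel location (mu) and scale (beta) by the method of moments.

    Assumption: background scores follow a type I EVD. For a Gumbel
    distribution, mean = mu + gamma * beta and std = beta * pi / sqrt(6).
    """
    mean = statistics.fmean(scores)
    std = statistics.stdev(scores)
    beta = std * math.sqrt(6.0) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def gumbel_p_value(score, mu, beta):
    """Return P(S >= score) under the fitted Gumbel distribution."""
    z = (score - mu) / beta
    if z < -700.0:
        # exp(-z) would overflow; the survival probability is effectively 1.
        return 1.0
    # 1 - exp(-exp(-z)), written with expm1 to stay accurate for tiny p-values.
    return -math.expm1(-math.exp(-z))


def sample_gumbel(rng, mu, beta):
    """Draw one Gumbel variate by inverse-transform sampling."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return mu - beta * math.log(-math.log(u))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical background: scores from unrelated binding-site pairs.
    background = [sample_gumbel(rng, 10.0, 2.0) for _ in range(5000)]
    mu, beta = fit_gumbel_moments(background)
    print(f"fitted mu={mu:.3f}, beta={beta:.3f}")
    for s in (10.0, 20.0, 30.0):
        print(f"score={s:.1f}  p-value={gumbel_p_value(s, mu, beta):.3e}")
```

Once the two parameters are fitted, each p-value is a closed-form evaluation, which is the general reason a parametric model can be much cheaper than re-estimating significance non-parametrically for every comparison.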