Conference: Asia-Pacific Bioinformatics Conference

Integrating multiple protein-protein interaction networks to prioritize disease genes: a Bayesian regression approach

Abstract

Background: Identifying the genes responsible for human inherited diseases is one of the most challenging tasks in human genetics. Recent studies based on phenotype similarity and gene proximity have demonstrated great success in prioritizing candidate genes for human diseases. However, most of these methods rely on a single protein-protein interaction (PPI) network to calculate similarities between genes, which greatly restricts their scope of application. Meanwhile, independently constructed and maintained PPI networks usually differ considerably in coverage and quality, making the selection of a suitable PPI network unavoidable but difficult.

Methods: We adopt a linear model to explain similarities between disease phenotypes using gene proximities that are quantified by diffusion kernels of one or more PPI networks. We solve this model via a Bayesian approach and derive an analytic form for the Bayes factor, which naturally measures the strength of association between a query disease and a candidate gene and can therefore be used as a score to prioritize candidate genes. This method is intrinsically capable of integrating multiple PPI networks.

Results: We show that gene proximities calculated from PPI networks imply phenotype similarities. We demonstrate the effectiveness of the Bayesian regression approach on five PPI networks via large-scale leave-one-out cross-validation experiments, and we summarize the results in terms of the mean rank ratio of known disease genes and the area under the receiver operating characteristic curve (AUC). We further show the capability of our approach in integrating multiple PPI networks.

Conclusions: The Bayesian regression approach can achieve much higher performance than the existing CIPHER approach and the ordinary linear regression method. Integrating multiple PPI networks can greatly extend the scope of application of the proposed method in the inference of disease genes.
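To make the gene-proximity measure and the rank-ratio metric concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes the common diffusion-kernel definition K = exp(-beta * L), with L the graph Laplacian of a PPI network, and it assumes the rank ratio is the rank of the held-out gene divided by the number of candidates. The toy network, the value of beta, and all function names are hypothetical. The Bayesian regression and Bayes-factor scoring are not reproduced; candidates are simply scored by kernel proximity to one known disease gene.

```python
def graph_laplacian(n, edges):
    """Return L = D - A for an undirected, unweighted graph on n nodes.

    Assumes each edge (i, j) is listed once and there are no self-loops.
    """
    lap = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
        lap[i][i] += 1.0
        lap[j][j] += 1.0
    return lap


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def diffusion_kernel(lap, beta=0.5, squarings=10, terms=12):
    """Approximate K = exp(-beta * L) by scaling and squaring.

    The matrix is divided by 2**squarings so a short Taylor series is
    accurate, then the result is squared `squarings` times.
    """
    n = len(lap)
    scale = 2 ** squarings
    m = [[-beta * lap[i][j] / scale for j in range(n)] for i in range(n)]
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]
    for k in range(1, terms + 1):
        # After this step, term holds M^k / k!.
        term = [[x / k for x in row] for row in matmul(term, m)]
        kernel = [[kernel[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        kernel = matmul(kernel, kernel)
    return kernel


def rank_ratio(scores, true_gene):
    """Rank of the held-out gene (1 = best) over the number of candidates."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return (ordered.index(true_gene) + 1) / len(ordered)


# Toy path network g0 - g1 - g2 - g3 (hypothetical genes, not real data).
lap = graph_laplacian(4, [(0, 1), (1, 2), (2, 3)])
K = diffusion_kernel(lap, beta=0.5)

# Score candidates g1..g3 by kernel proximity to a known disease gene g0,
# then compute the rank ratio of a held-out gene (here g1).
scores = {g: K[0][g] for g in (1, 2, 3)}
print(rank_ratio(scores, true_gene=1))
```

In a leave-one-out setting, a rank ratio would be computed once per held-out disease gene and the values averaged to give the mean rank ratio; smaller values mean the true gene tends to rank nearer the top. For networks of realistic size, a numerical library routine for the matrix exponential would be the practical choice over this pure-Python approximation.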