
Gene prioritization via weighted Kendall rank aggregation


Abstract

Gene prioritization is a class of methods for discovering genes implicated in the onset and progression of a disease. Because candidate genes are ranked by similarity to known disease genes according to different sets of criteria, aggregating these ranked datasets is a vital step of the prioritization procedure. Aggregation of different lists of ordered genes is accomplished either via classical order statistics analysis or via combinatorial ordinal data fusion. We propose a novel approach to combinatorial gene prioritization via Linear Programming (LP) optimization and use the recently introduced weighted Kendall τ distance to assess similarities between rankings. The weighted Kendall τ distance allows for constructing aggregates that have higher accuracy at the top of the ranking, which is usually the part tested experimentally, and it can also accommodate ties in rankings and handle negative outliers. In addition, the Kendall distance does not rely on quantitative data, which in many instances may be unreliable. We illustrate the performance of the prioritization method on a set of test genes pertaining to Bardet-Biedl syndrome, schizophrenia, and HIV, and show that the combinatorial method matches or outperforms state-of-the-art algorithms such as ToppGene.
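
To illustrate why a position-weighted Kendall distance favors agreement at the top of a ranking, here is a minimal Python sketch. It is not the definition or the LP formulation used in the paper: the penalty function (1 / (position + 1), applied to the highest position a swapped pair occupies in either ranking), the function names, and the brute-force aggregation over all permutations are assumptions made for illustration, and ties are not handled.

```python
from itertools import combinations, permutations


def weighted_kendall_distance(ranking_a, ranking_b, weight=None):
    """Sum a position-dependent penalty over every discordant pair.

    Illustrative variant only (an assumption, not the paper's definition):
    a pair ordered differently in the two rankings is charged
    weight(top), where top is the best (smallest) 0-based position the
    pair occupies in either ranking.
    """
    if len(set(ranking_a)) != len(ranking_a) or set(ranking_a) != set(ranking_b):
        raise ValueError("rankings must be permutations of the same items")
    if weight is None:
        # Hypothetical default: swaps near the top cost more.
        weight = lambda pos: 1.0 / (pos + 1)

    pos_a = {gene: i for i, gene in enumerate(ranking_a)}
    pos_b = {gene: i for i, gene in enumerate(ranking_b)}

    total = 0.0
    # combinations() yields (g, h) with g ranked above h in ranking_a.
    for g, h in combinations(ranking_a, 2):
        if pos_b[g] > pos_b[h]:  # the pair is reversed in ranking_b
            top = min(pos_a[g], pos_a[h], pos_b[g], pos_b[h])
            total += weight(top)
    return total


def aggregate_brute_force(rankings, weight=None):
    """Return the permutation minimizing the summed distance to all inputs.

    Exhaustive search stands in for the LP optimization here and is only
    feasible for a handful of items.
    """
    items = rankings[0]
    best = min(
        permutations(items),
        key=lambda cand: sum(
            weighted_kendall_distance(list(cand), r, weight) for r in rankings
        ),
    )
    return list(best)


if __name__ == "__main__":
    reference = ["A", "B", "C", "D"]
    print(weighted_kendall_distance(reference, ["B", "A", "C", "D"]))  # swap at the top
    print(weighted_kendall_distance(reference, ["A", "B", "D", "C"]))  # swap at the bottom
    print(aggregate_brute_force([["A", "B", "C"], ["A", "B", "C"], ["B", "A", "C"]]))
```

Under these assumed weights, swapping the two top-ranked genes costs more than swapping the two bottom-ranked ones, so an aggregate that minimizes the total distance is pushed toward agreeing with the input lists at the top. Passing `weight=lambda pos: 1.0` reduces the function to the ordinary Kendall τ distance (the count of discordant pairs).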
