FASTCHI: AN EFFICIENT ALGORITHM FOR ANALYZING GENE-GENE INTERACTIONS


Abstract

Recent advances in high-throughput genotyping have spurred growing research interest in genome-wide association studies of disease. To understand the underlying biological mechanisms of many diseases, we need to consider the genetic effects across multiple loci simultaneously. The large number of SNPs often makes multilocus association studies computationally challenging, because all possible SNP combinations must be enumerated explicitly at the genome-wide scale. Moreover, because many SNPs are correlated, a permutation procedure is often needed to properly control the family-wise error rate. This makes the problem even more computationally demanding, since the test procedure must be repeated for each permuted dataset. In this paper, we present FastChi, an exhaustive yet efficient algorithm for the genome-wide two-locus chi-square test. FastChi uses an upper bound on the two-locus chi-square statistic that can be expressed as the sum of two terms, both efficient to compute: the first term is based on the single-locus chi-square test for the given phenotype, and the second term depends only on the genotypes and is independent of the phenotype. This upper bound lets the algorithm perform the two-locus chi-square test on only a small number of candidate SNP pairs without the risk of missing any significant ones. Since the second term of the upper bound needs to be computed only once and can be stored for subsequent use, the advantage is more prominent in large permutation tests. Extensive experimental results demonstrate that our method is an order of magnitude faster than the brute-force alternative.
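To make the setting concrete, the following is a minimal Python sketch of the brute-force two-locus chi-square scan, a max-statistic permutation threshold, and a generic bound-based filter. It is not the FastChi implementation: the abstract does not give the form of the upper bound, so `upper_bound` is a hypothetical caller-supplied function, and the sketch assumes binary genotypes (0/1) and a binary phenotype. All function names are illustrative.

```python
from itertools import combinations
import random


def two_locus_chi2(snp_a, snp_b, phenotype):
    """Chi-square statistic of the 2 x 4 table: phenotype vs. joint genotype.

    Assumes binary genotypes and a binary phenotype (an assumption of this sketch).
    """
    n = len(phenotype)
    counts = [[0] * 4 for _ in range(2)]
    for a, b, y in zip(snp_a, snp_b, phenotype):
        counts[y][2 * a + b] += 1
    row_totals = [sum(row) for row in counts]
    col_totals = [counts[0][j] + counts[1][j] for j in range(4)]
    chi2 = 0.0
    for i in range(2):
        for j in range(4):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty genotype combinations
                chi2 += (counts[i][j] - expected) ** 2 / expected
    return chi2


def brute_force_max(snps, phenotype):
    """Largest two-locus statistic over all SNP pairs (the exhaustive baseline)."""
    return max(
        two_locus_chi2(snps[i], snps[j], phenotype)
        for i, j in combinations(range(len(snps)), 2)
    )


def permutation_threshold(snps, phenotype, n_perm=100, alpha=0.05, seed=0):
    """Family-wise threshold from the max statistic over phenotype permutations."""
    rng = random.Random(seed)
    shuffled = list(phenotype)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # genotypes stay fixed; only the phenotype is permuted
        maxima.append(brute_force_max(snps, shuffled))
    maxima.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return maxima[index]


def pruned_search(snps, phenotype, threshold, upper_bound):
    """Test only pairs whose bound reaches the threshold.

    `upper_bound(i, j)` is hypothetical: any function guaranteed to be
    >= the true two-locus statistic for pair (i, j) keeps the search exact.
    Returns the significant pairs and the number of pairs actually tested.
    """
    hits, tested = [], 0
    for i, j in combinations(range(len(snps)), 2):
        if upper_bound(i, j) < threshold:
            continue  # cannot be significant, so skip the full test
        tested += 1
        stat = two_locus_chi2(snps[i], snps[j], phenotype)
        if stat >= threshold:
            hits.append((i, j, stat))
    return hits, tested
```

The permutation loop reruns the full pairwise scan for every shuffled phenotype, which is where the cost described in the abstract comes from. A bound whose genotype-only part can be cached across permutations, as the abstract describes, would slot in as `upper_bound` here.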