
A Computationally Efficient Hypothesis Testing Method for Epistasis Analysis using Multifactor Dimensionality Reduction



Abstract

Multifactor dimensionality reduction (MDR) was developed as a nonparametric and model-free data mining method for detecting, characterizing, and interpreting epistasis in the absence of significant main effects in genetic and epidemiologic studies of complex traits such as disease susceptibility. The goal of MDR is to change the representation of the data using a constructive induction algorithm so that nonadditive interactions are easier to detect with any classification method, such as naïve Bayes or logistic regression. Traditionally, MDR-constructed variables have been evaluated with a naïve Bayes classifier combined with 10-fold cross-validation to estimate the predictive accuracy, or generalizability, of epistasis models. We have likewise relied on permutation testing to evaluate the statistical significance of models obtained through MDR. The advantage of permutation testing is that it controls for false positives due to multiple testing. The disadvantage is that it is computationally expensive, which is an important issue when detecting epistasis on a genome-wide scale. The goal of the present study was to develop and evaluate several alternatives to large-scale permutation testing for assessing the statistical significance of MDR models. Using data simulated from 70 different epistasis models, we compared the power and type I error rate of MDR using a 1000-fold permutation test with those of hypothesis testing using an extreme value distribution (EVD). We find that this new hypothesis testing method provides a reasonable alternative to the computationally expensive 1000-fold permutation test and is 50 times faster. We then demonstrate this new method by applying it to a genetic epidemiology study of bladder cancer susceptibility that was previously analyzed using MDR and assessed using a 1000-fold permutation test.
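
The contrast between a full permutation test and an EVD-based test can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, since the abstract gives no implementation details: the EVD is taken to be a Gumbel distribution fitted by the method of moments, the number of permutations (20) is chosen only because 1000/20 matches the quoted 50-fold speedup, and `best_balanced_accuracy` is a hypothetical single-locus stand-in for the real MDR model search. All function names are illustrative and do not come from any MDR software.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(null_stats):
    """Method-of-moments fit of a Gumbel (type I EVD) for maxima (assumed family)."""
    scale = statistics.stdev(null_stats) * math.sqrt(6) / math.pi
    if scale == 0:
        raise ValueError("null statistics have zero variance; cannot fit EVD")
    loc = statistics.fmean(null_stats) - EULER_GAMMA * scale
    return loc, scale


def evd_p_value(observed, null_stats):
    """Upper-tail probability of `observed` under the fitted Gumbel."""
    loc, scale = fit_gumbel_moments(null_stats)
    z = (observed - loc) / scale
    # 1 - exp(-exp(-z)), written with expm1 for accuracy at small p-values.
    return -math.expm1(-math.exp(min(-z, 700.0)))


def empirical_p_value(observed, null_stats):
    """Classic permutation p-value, counting the observed statistic once."""
    hits = sum(s >= observed for s in null_stats)
    return (hits + 1) / (len(null_stats) + 1)


def permutation_null(stat_fn, genotypes, labels, n_perm, seed=0):
    """Recompute the statistic on n_perm shuffles of the case/control labels."""
    rng = random.Random(seed)
    shuffled = list(labels)  # copy, so the caller's labels stay untouched
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_stats.append(stat_fn(genotypes, shuffled))
    return null_stats


def best_balanced_accuracy(genotypes, labels):
    """Hypothetical stand-in for the MDR search: best single-locus balanced accuracy."""
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case
    best = 0.0
    for locus in range(len(genotypes[0])):
        cases, ctrls = {}, {}
        for row, y in zip(genotypes, labels):
            target = cases if y == 1 else ctrls
            target[row[locus]] = target.get(row[locus], 0) + 1
        tp = tn = 0
        for g in set(cases) | set(ctrls):
            c, k = cases.get(g, 0), ctrls.get(g, 0)
            # "High risk" when the genotype's case:control ratio exceeds the overall ratio.
            if c * n_ctrl > k * n_case:
                tp += c
            else:
                tn += k
        best = max(best, 0.5 * (tp / n_case + tn / n_ctrl))
    return best


if __name__ == "__main__":
    rng = random.Random(42)
    genotypes = [[rng.randint(0, 2) for _ in range(5)] for _ in range(200)]
    labels = [1] * 100 + [0] * 100
    observed = best_balanced_accuracy(genotypes, labels)
    null = permutation_null(best_balanced_accuracy, genotypes, labels, n_perm=20)
    print("EVD p-value from 20 permutations:", evd_p_value(observed, null))
```

Because the statistic on each permuted data set is the maximum over all candidate models, an extreme value distribution is a natural parametric form for the null. A fitted tail can also return p-values smaller than 1/(n_perm + 1), which the empirical permutation estimate cannot.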
