IGESS: a statistical approach to integrating individual-level genotype data and summary statistics in genome-wide association studies

Abstract

Motivation: Results from genome-wide association studies (GWAS) suggest that a complex phenotype is often affected by many variants with small effects, a phenomenon known as ‘polygenicity’. Tens of thousands of samples are often required to ensure sufficient statistical power to identify these small-effect variants. However, a research group can often obtain approval to access individual-level genotype data only for a limited sample size (e.g. a few hundred or a few thousand samples). Meanwhile, summary statistics generated by single-variant analysis are becoming publicly available, and the sample sizes behind these summary statistics datasets are usually quite large. How to make the most efficient use of these abundant existing data resources remains largely an open question.

Results: In this study, we propose a statistical approach, IGESS, to increase the statistical power of identifying risk variants and to improve the accuracy of risk prediction by integrating individual-level genotype data and summary statistics. An efficient algorithm based on variational inference is developed to handle genome-wide analysis. Through comprehensive simulation studies, we demonstrated the advantages of IGESS over methods that take either individual-level data or summary statistics alone as input. We applied IGESS to perform an integrative analysis of Crohn's disease data from WTCCC and summary statistics from other studies. IGESS was able to significantly increase the statistical power of identifying risk variants and to improve the risk prediction accuracy from 63.2% (±0.4%) to 69.4% (±0.1%) using about 240 000 variants.

Availability and implementation: The IGESS software is available at .

Contact: or

Supplementary information: available at Bioinformatics online.