A Novel and Fast Approach for Population Structure Inference Using Kernel-PCA and Optimization

Abstract

Population structure is a confounding factor in genome-wide association studies, increasing the rate of false-positive associations. To correct for it, several model-based algorithms such as ADMIXTURE and STRUCTURE have been proposed. These carry a considerable computational burden, which limits their applicability to large datasets such as those produced by next-generation sequencing techniques. To address this, nonmodel-based approaches such as sparse nonnegative matrix factorization (sNMF) and EIGENSTRAT have been proposed, which scale better with larger data. Here we present a novel nonmodel-based approach, population structure inference using kernel-PCA and optimization (PSIKO), which is based on a unique combination of linear kernel-PCA and least-squares optimization and allows for the inference of admixture coefficients, principal components, and the number of founder populations of a dataset. PSIKO has been compared against existing leading methods on a variety of simulation scenarios, as well as on real biological data. We found that in addition to producing results of the same quality as other tested methods, PSIKO scales extremely well with dataset size, being considerably (up to 30 times) faster for longer sequences than even state-of-the-art methods such as sNMF. PSIKO and an accompanying manual are freely available at .
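
The two ingredients named in the abstract, linear kernel-PCA and least-squares optimization, can be illustrated with a minimal NumPy sketch. This is not the PSIKO implementation: the function names `linear_kernel_pca` and `admixture_coefficients` are hypothetical, the founder-population positions (`vertices`) are assumed to be already known in principal-component space, and clipping followed by renormalization is a simplification chosen here for illustration.

```python
import numpy as np


def linear_kernel_pca(genotypes, n_components):
    """Sample scores from a linear-kernel PCA of an (n_samples, n_snps) matrix."""
    X = np.asarray(genotypes, dtype=float)
    n = X.shape[0]
    K = X @ X.T                                # linear kernel (Gram matrix)
    J = np.eye(n) - np.ones((n, n)) / n        # centring matrix
    Kc = J @ K @ J                             # double-centred kernel
    eigvals, eigvecs = np.linalg.eigh(Kc)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[order], 0.0, None)   # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(lam)    # (n_samples, n_components)


def admixture_coefficients(scores, vertices):
    """Least-squares barycentric coordinates of each sample w.r.t. K vertices.

    ASSUMPTION: `vertices` (K, d) are given founder positions in PC space.
    """
    V = np.asarray(vertices, dtype=float)      # (K, d)
    Y = np.asarray(scores, dtype=float)        # (n, d)
    # Append a row of ones so the coefficients are pushed to sum to one.
    A = np.vstack([V.T, np.ones(V.shape[0])])  # (d + 1, K)
    B = np.vstack([Y.T, np.ones(Y.shape[0])])  # (d + 1, n)
    Q, *_ = np.linalg.lstsq(A, B, rcond=None)  # (K, n)
    Q = np.clip(Q.T, 0.0, None)                # drop negative proportions
    return Q / Q.sum(axis=1, keepdims=True)    # rows sum to one
```

Working with the n-by-n kernel rather than the SNP-by-SNP covariance keeps the eigendecomposition cost tied to the number of samples, not the number of markers. Each row of the returned coefficient matrix is one sample's estimated mixing proportions across the assumed founder populations.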

Bibliographic record

Journal: Genetics
Authors: Andrei-Alin Popescu; Andrea L. Harper; Martin Trick; Ian Bancroft; Katharina T. Huber
Year (volume), issue: 2014 (198), 4
Pages: 1421–1431
Total pages: 17
Original format: PDF
Chinese Library Classification: Genetics
Keywords: admixture inference; kernel-PCA; population structure; genome-wide association studies; Q-matrix

