Journal: American Journal of Human Genetics

Estimation of Haplotype Frequencies, Linkage-Disequilibrium Measures, and Combination of Haplotype Copies in Each Pool by Use of Pooled DNA Data

Abstract

Inference of haplotypes is important for many genetic approaches, including the process of assigning a phenotype to a genetic region. Usually, the population frequencies of haplotypes, as well as the diplotype configuration of each subject, are estimated from a set of genotypes of the subjects in a sample from the population. We have developed an algorithm to infer haplotype frequencies and the combination of haplotype copies in each pool by using pooled DNA data. The input data are the genotypes in pooled DNA samples, each of which contains the quantitative genotype data from one to six subjects. The algorithm infers by the maximum-likelihood method both frequencies of the haplotypes in the population and the combination of haplotype copies in each pool by an expectation-maximization algorithm. The algorithm was implemented in the computer program LDPooled. We also used the bootstrap method to calculate the standard errors of the estimated haplotype frequencies. Using this program, we analyzed the published genotype data for the SAA (n=156), MTHFR (n=80), and NAT2 (n=116) genes, as well as the smoothelin gene (n=102). Our study has shown that the frequencies of major (frequency >0.1 in a population) haplotypes can be inferred rather accurately from the pooled DNA data by the maximum-likelihood method, although with some limitations. The estimated D and D′ values had large variations except when |D| values were >0.1. The estimated linkage-disequilibrium measure ρ² for 36 linked loci of the smoothelin gene when one- and two-subject pool protocols were used suggested that the gross pattern of the distribution of the measure can be reproduced using the two-subject pool data.
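
The expectation-maximization idea described in the abstract can be illustrated with a minimal sketch. This is not the LDPooled implementation; it is a simplified illustration that assumes only two biallelic loci, error-free allele counts per pool, and a multinomial model for the haplotype copies in each pool. The function names (`compatible_configs`, `em_pooled`, `ld_measures`) and the example pool data are hypothetical.

```python
from math import factorial

# Haplotype order throughout: (00, 01, 10, 11), where the first digit is the
# allele at locus A and the second digit is the allele at locus B.

def multinomial_prob(counts, freqs):
    """Multinomial probability of haplotype counts given frequencies."""
    coef = factorial(sum(counts))
    prob = 1.0
    for n, f in zip(counts, freqs):
        coef //= factorial(n)
        prob *= f ** n
    return coef * prob

def compatible_configs(a, b, copies):
    """All (n00, n01, n10, n11) consistent with a pool's allele counts.

    a, b: number of '1' alleles at locus A and locus B in the pool.
    copies: number of haplotype copies in the pool (2 per subject).
    """
    configs = []
    for n11 in range(max(0, a + b - copies), min(a, b) + 1):
        configs.append((copies - a - b + n11, b - n11, a - n11, n11))
    return configs

def em_pooled(pools, n_iter=200, tol=1e-10):
    """Estimate the four haplotype frequencies from pools of (a, b, copies)."""
    freqs = [0.25] * 4
    total = sum(copies for _, _, copies in pools)
    for _ in range(n_iter):
        expected = [0.0] * 4
        for a, b, copies in pools:
            # E-step: posterior weight of each compatible combination.
            configs = compatible_configs(a, b, copies)
            weights = [multinomial_prob(c, freqs) for c in configs]
            norm = sum(weights)
            for c, w in zip(configs, weights):
                for h in range(4):
                    expected[h] += c[h] * w / norm
        # M-step: frequencies are expected counts over total copies.
        new_freqs = [e / total for e in expected]
        converged = max(abs(n - f) for n, f in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs

def ld_measures(freqs):
    """Return (D, D') for the two loci from the haplotype frequencies."""
    f00, f01, f10, f11 = freqs
    p_a, p_b = f10 + f11, f01 + f11
    d = f11 - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return d, (d / d_max if d_max > 0 else 0.0)

# Hypothetical two-subject pools: (allele-1 count at A, at B, haplotype copies).
example_pools = [(2, 2, 4), (1, 1, 4), (3, 2, 4), (0, 1, 4), (4, 4, 4)]
estimated = em_pooled(example_pools)
print(estimated, ld_measures(estimated))
```

In this sketch, the posterior weights computed in the E-step also indicate the most probable combination of haplotype copies in each pool. The number of compatible combinations grows with pool size and with the number of loci, which is one intuition for why larger pools carry less haplotype information per subject.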
