Journal: BMC Proceedings

Regularized group regression methods for genomic prediction: Bridge, MCP, SCAD, group bridge, group lasso, sparse group lasso, group MCP and group SCAD

Abstract

Background: Genomic prediction is now widely recognized as an efficient, cost-effective and theoretically well-founded method for estimating breeding values using molecular markers spread over the whole genome. The prediction problem entails estimating the effects of all genes or chromosomal segments simultaneously and aggregating them to yield the predicted total genomic breeding value. Many potential methods for genomic prediction exist, but they differ widely in relative computational cost, complexity and ease of implementation, with significant repercussions for predictive accuracy. We empirically evaluate the predictive performance of several contending regularization methods, designed to accommodate grouping of markers, using three synthetic traits of known accuracy.

Methods: Each of the competing methods was used to estimate predictive accuracy for each of the three quantitative traits. The traits and an associated genome comprising five chromosomes with 10,000 biallelic single nucleotide polymorphism (SNP) marker loci were simulated for the QTL-MAS 2012 workshop. The models were trained on 3000 phenotyped and genotyped individuals and used to predict genomic breeding values for 1020 unphenotyped individuals. Accuracy was expressed as the Pearson correlation between the simulated true and the estimated breeding values.

Results: All the methods produced accurate estimates of genomic breeding values. Contrary to expectation, grouping of markers did not clearly improve accuracy. Selecting the penalty parameter with replicated 10-fold cross validation often gave better accuracy than using information theoretic criteria.

Conclusions: All the regularization methods considered produced satisfactory predictive accuracies for most practical purposes and thus deserve serious consideration in genomic prediction research and practice. Grouping markers did not enhance predictive accuracy for the synthetic data set considered, but other, more sophisticated grouping schemes could potentially enhance accuracy. Using cross validation to select the penalty parameters for the methods often yielded more accurate estimates of predictive accuracy than using information theoretic criteria.
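The two procedures named in the abstract, accuracy measured as a Pearson correlation and penalty selection by replicated k-fold cross validation, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and not the authors' code: the function names (`pearson`, `replicated_kfold`, `select_penalty`, `fit_lasso`, `predict_linear`) are hypothetical, mean squared prediction error is assumed as the cross-validation criterion, and a plain lasso fitted by coordinate descent stands in for the penalized methods the paper actually evaluates.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def replicated_kfold(n, k=10, reps=3, seed=0):
    """Yield (train, test) index lists for `reps` reshuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(reps):
        idx = list(range(n))
        rng.shuffle(idx)
        for f in range(k):
            test = idx[f::k]  # every k-th shuffled index forms one fold
            held = set(test)
            yield [i for i in idx if i not in held], test


def select_penalty(X, y, grid, fit, predict, k=10, reps=3, seed=0):
    """Return the penalty in `grid` with the lowest mean CV squared error.

    Assumption: mean squared error is the selection criterion.
    The same seed is reused so every penalty sees identical folds.
    """
    best, best_err = None, float("inf")
    for lam in grid:
        errs = []
        for train, test in replicated_kfold(len(y), k, reps, seed):
            model = fit([X[i] for i in train], [y[i] for i in train], lam)
            pred = predict(model, [X[i] for i in test])
            errs.append(sum((p - y[i]) ** 2 for p, i in zip(pred, test)) / len(test))
        mean_err = sum(errs) / len(errs)
        if mean_err < best_err:
            best, best_err = lam, mean_err
    return best


def fit_lasso(X, y, lam, iters=50):
    """Stand-in model: lasso without intercept, fitted by coordinate descent.

    Minimizes (1/2n) * ||y - X b||^2 + lam * ||b||_1.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals for beta = 0
    for _ in range(iters):
        for j in range(p):
            col = [row[j] for row in X]
            zj = sum(c * c for c in col) / n
            if zj == 0.0:
                continue  # a constant-zero marker carries no information
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            # Soft-thresholding update for coordinate j.
            new = math.copysign(max(abs(rho) - lam, 0.0), rho) / zj
            delta = new - beta[j]
            if delta:
                resid = [r - c * delta for r, c in zip(resid, col)]
                beta[j] = new
    return beta


def predict_linear(beta, X):
    """Predicted values as the sum of marker effects for each individual."""
    return [sum(b * v for b, v in zip(beta, row)) for row in X]
```

Under these assumptions, `select_penalty(X, y, grid, fit_lasso, predict_linear)` would choose a penalty from a candidate grid, and `pearson(predict_linear(beta, X_new), true_values)` would give the accuracy measure described in the Methods paragraph.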
