Journal: Genetics, selection, evolution

Predicting complex traits using a diffusion kernel on genetic markers with an application to dairy cattle and wheat data

Abstract

Background: Arguably, genotypes and phenotypes may be linked in functional forms that are not well addressed by the linear additive models that are standard in quantitative genetics. Therefore, developing statistical learning models for predicting phenotypic values from all available molecular information that are capable of capturing complex genetic network architectures is of great importance. Bayesian kernel ridge regression is a non-parametric prediction model proposed for this purpose. Its essence is to create a spatial distance-based relationship matrix called a kernel. Although the set of all single nucleotide polymorphism genotype configurations on which a model is built is finite, past research has mainly used a Gaussian kernel.

Results: We sought to investigate the performance of a diffusion kernel, which was specifically developed to model discrete marker inputs, using Holstein cattle and wheat data. This kernel can be viewed as a discretization of the Gaussian kernel. The predictive ability of the diffusion kernel was similar to that of non-spatial distance-based additive genomic relationship kernels in the Holstein data, but outperformed the latter in the wheat data. However, the difference in performance between the diffusion and Gaussian kernels was negligible.

Conclusions: It is concluded that the ability of a diffusion kernel to capture the total genetic variance is not better than that of a Gaussian kernel, at least for these data. Although the diffusion kernel as a choice of basis function may have potential for use in whole-genome prediction, our results imply that embedding genetic markers into a non-Euclidean metric space has very small impact on prediction. Our results suggest that use of the black box Gaussian kernel is justified, given its connection to the diffusion kernel and its similar predictive performance.
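To make the comparison in the abstract concrete, the following is a minimal sketch of the two kernel types on markers coded 0/1/2, plugged into a plain (non-Bayesian) kernel ridge predictor. The abstract does not give the kernel formulas, so several things here are assumptions: the diffusion kernel uses the standard closed form for a complete graph on three genotype states per marker, driven by the Hamming distance between genotype vectors; the Gaussian kernel uses squared Euclidean distance; and the function names, `beta`, `bandwidth`, and the fixed ridge penalty `lam` are illustrative choices, not the authors' implementation.

```python
import numpy as np


def gaussian_kernel(X1, X2, bandwidth):
    """Gaussian kernel exp(-bandwidth * squared Euclidean distance)."""
    sq1 = np.sum(X1 ** 2, axis=1)
    sq2 = np.sum(X2 ** 2, axis=1)
    d2 = sq1[:, None] + sq2[None, :] - 2.0 * (X1 @ X2.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative round-off
    return np.exp(-bandwidth * d2)


def diffusion_kernel(X1, X2, beta, n_states=3):
    """Diffusion kernel on discrete genotypes (assumed form).

    Each marker is treated as a node on a complete graph with `n_states`
    states; the kernel over all markers is the product of per-marker
    kernels, which depends only on the Hamming distance.
    """
    hamming = (X1[:, None, :] != X2[None, :, :]).sum(axis=2)
    decay = np.exp(-n_states * beta)
    ratio = (1.0 - decay) / (1.0 + (n_states - 1) * decay)
    return ratio ** hamming


def kernel_ridge_predict(K_train, y_train, K_test_train, lam):
    """Solve (K + lam * I) alpha = y, then predict K_test_train @ alpha."""
    n = K_train.shape[0]
    alpha = np.linalg.solve(K_train + lam * np.eye(n), y_train)
    return K_test_train @ alpha


# Simulated toy data (hypothetical; not the Holstein or wheat data).
rng = np.random.default_rng(0)
X_train = rng.integers(0, 3, size=(40, 100))
X_test = rng.integers(0, 3, size=(10, 100))
y_train = rng.normal(size=40)

K_diff = diffusion_kernel(X_train, X_train, beta=0.05)
K_diff_test = diffusion_kernel(X_test, X_train, beta=0.05)
pred_diff = kernel_ridge_predict(K_diff, y_train, K_diff_test, lam=1.0)

K_gauss = gaussian_kernel(X_train, X_train, bandwidth=0.005)
K_gauss_test = gaussian_kernel(X_test, X_train, bandwidth=0.005)
pred_gauss = kernel_ridge_predict(K_gauss, y_train, K_gauss_test, lam=1.0)
```

In this sketch both kernels decay with distance between genotype vectors and differ only in the distance used (Hamming versus squared Euclidean), which is one way to read the abstract's remark that the two are closely connected. In practice the bandwidth parameters and the ridge penalty would be tuned or given priors rather than fixed as above.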
