BMC Bioinformatics

Reliable genomic strategies for species classification of plant genetic resources



Abstract

To address the need for easy and reliable species classification in plant genetic resources collections, we assessed the potential of five classifiers (Random Forest, Neighbour-Joining, 1-Nearest Neighbour, a conservative variety of 3-Nearest Neighbours and Naive Bayes). We investigated the effects of the number of accessions per species and of the misclassification rate on classification success, and validated the generic value of the results with three complete datasets. We found the conservative variety of 3-Nearest Neighbours to be the most reliable classifier when varying species representation and misclassification rate. The analysis of the three complete datasets showed that this finding has generic value. Additionally, we present various options for marker selection for classification tasks such as these. Large-scale genomic data are increasingly being produced for genetic resources collections. These data are useful for addressing species classification issues regarding crop wild relatives and for improving genebank documentation. Implementing a classification method that can improve the quality of bad datasets without gold standard training data is considered an innovative and efficient way to improve genebank documentation.
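The abstract does not define what makes the 3-Nearest Neighbours variant "conservative". As a minimal sketch, the code below assumes one plausible reading: an accession receives a species label only when all three nearest reference accessions agree, and is otherwise left unclassified. The distance measure (fraction of differing markers), the function names and the toy 0/1 marker vectors are all hypothetical and are not taken from the paper.

```python
def marker_distance(a, b):
    """Fraction of marker positions at which two genotype vectors differ.

    Assumption: markers are coded as comparable values (e.g. 0/1); the
    paper's actual distance measure may differ.
    """
    if len(a) != len(b):
        raise ValueError("genotype vectors must have the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def conservative_knn(query, references, k=3):
    """Return a species label only if the k nearest references all agree.

    references: list of (genotype_vector, species_label) pairs.
    Returns None (unclassified) when the k neighbours disagree.
    This unanimity rule is an assumed reading of "conservative".
    """
    if len(references) < k:
        raise ValueError("need at least k reference accessions")
    nearest = sorted(references, key=lambda ref: marker_distance(query, ref[0]))[:k]
    labels = {label for _, label in nearest}
    return labels.pop() if len(labels) == 1 else None


# Hypothetical reference accessions: (marker vector, recorded species).
REFERENCES = [
    ([0, 0, 0, 0, 1, 1], "sp_A"),
    ([0, 0, 0, 1, 1, 1], "sp_A"),
    ([0, 0, 0, 0, 0, 1], "sp_A"),
    ([1, 1, 1, 1, 0, 0], "sp_B"),
    ([1, 1, 1, 0, 0, 0], "sp_B"),
    ([1, 1, 0, 1, 0, 0], "sp_B"),
]

if __name__ == "__main__":
    # All three nearest neighbours are sp_A, so a label is assigned.
    print(conservative_knn([0, 0, 0, 0, 1, 1], REFERENCES))
    # Nearest neighbours are a mix of sp_A and sp_B, so no label is assigned.
    print(conservative_knn([1, 1, 0, 0, 0, 1], REFERENCES))
```

Under this reading, withholding a label when the neighbours disagree trades coverage for reliability, which would matter when some reference labels are themselves wrong.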
