Journal: PLoS One

Figures of merit and statistics for detecting faulty species identification with DNA barcodes: A case study in Ramaria and related fungal genera



Abstract

DNA barcoding can identify biological species and is an important tool in diverse applications, such as species conservation and pathogen identification. Combined with statistical tests, DNA barcoding can focus taxonomic scrutiny on anomalous species identifications based on morphological features. Accordingly, we put nonparametric tests into a taxonomic context to answer questions about our sequence dataset of the formal fungal barcode, the nuclear ribosomal internal transcribed spacer (ITS). For example, does DNA barcoding concur with annotated species identifications significantly better if expert taxonomists produced the annotations? Does species assignment improve significantly if sequences are restricted to lengths greater than 500 bp? Both questions require a figure of merit to measure the accuracy of species identification, typically provided by the probability of correct identification (PCI). Many articles on DNA barcoding use variants of PCI to measure the accuracy of species identification but do not name the variants, and the absence of explicit names obscures the fact that the different variants are not comparable from study to study. We name four PCI variants and show that, for fixed data, they obey systematic inequalities. Despite common practice, therefore, comparing them is problematic at best. Some popular PCI variants are particularly vulnerable to errors in species annotation, insensitive to improvements in a barcoding pipeline, and unable to predict identification accuracy as a database grows, making them unsuitable for many purposes. Generally, the Fractional PCI has the best properties as a figure of merit for species identification.
The fungal genus Ramaria presents unusual taxonomic difficulties. As a case study, it shows that a good taxonomic background can be combined with the pertinent summary statistics of molecular results to improve the identification of doubtful samples, linking both disciplines synergistically.
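
The kind of question posed in the abstract (does barcoding agree with annotations significantly better for expert-annotated sequences?) can be illustrated with a generic nonparametric test. The sketch below is a one-sided permutation test on the difference in the fraction of correct identifications between two groups. It is only an illustration under stated assumptions: the abstract does not say which nonparametric tests the authors used, the function name `permutation_test` is hypothetical, and the 0/1 outcome lists are invented toy data, not results from the study.

```python
import random

def permutation_test(group_a, group_b, n_perm=10000, seed=0):
    """One-sided permutation test for mean(group_a) > mean(group_b).

    Each element is 1 if the barcode-based identification matched the
    annotation and 0 otherwise (hypothetical encoding for illustration).
    Returns the observed difference in means and an estimated p-value.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_b = len(pooled) - n_a
    at_least_as_extreme = 0
    for _ in range(n_perm):
        # Reassign group labels at random, keeping the group sizes fixed.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed - 1e-12:  # small tolerance for float ties
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, p_value

# Invented toy data: 20 expert-annotated and 20 other sequences.
expert = [1] * 18 + [0] * 2
other = [1] * 12 + [0] * 8
observed_diff, p_value = permutation_test(expert, other)
print(f"difference in accuracy: {observed_diff:.2f}, p-value: {p_value:.4f}")
```

A permutation test is used here because it makes no distributional assumptions beyond exchangeability of the labels under the null hypothesis; the same pattern would apply to the sequence-length question by splitting sequences at 500 bp instead of by annotator.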
