Ecology and Evolution

Toward accurate molecular identification of species in complex environmental samples: testing the performance of sequence filtering and clustering methods


Abstract

Metabarcoding has the potential to become a rapid, sensitive, and effective approach for identifying species in complex environmental samples. Accurate molecular identification of species depends on the ability to generate operational taxonomic units (OTUs) that correspond to biological species. Due to the sometimes enormous estimates of biodiversity using this method, there is a great need to test the efficacy of data analysis methods used to derive OTUs. Here, we evaluate the performance of various methods for clustering length variable 18S amplicons from complex samples into OTUs using a mock community and a natural community of zooplankton species. We compare analytic procedures consisting of a combination of (1) stringent and relaxed data filtering, (2) singleton sequences included and removed, (3) three commonly used clustering algorithms (mothur, UCLUST, and UPARSE), and (4) three methods of treating alignment gaps when calculating sequence divergence. Depending on the combination of methods used, the number of OTUs varied by nearly two orders of magnitude for the mock community (60–5068 OTUs) and three orders of magnitude for the natural community (22–22191 OTUs). The use of relaxed filtering and the inclusion of singletons greatly inflated OTU numbers without increasing the ability to recover species. Our results also suggest that the method used to treat gaps when calculating sequence divergence can have a great impact on the number of OTUs. Our findings are particularly relevant to studies that cover taxonomically diverse species and employ markers such as rRNA genes in which length variation is extensive.
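To illustrate why gap treatment can change OTU counts, the following is a minimal sketch of three common conventions for scoring alignment gaps when computing pairwise divergence. The abstract does not name the three methods tested, so the conventions shown here (ignoring gap columns, counting every gap column, and counting each contiguous gap run once), along with the function name `pairwise_divergence` and its `gap_mode` values, are assumptions made for illustration and not the authors' implementation.

```python
def pairwise_divergence(a: str, b: str, gap_mode: str = "ignore") -> float:
    """Fraction of differing positions between two aligned sequences.

    gap_mode values are hypothetical labels, not taken from the paper:
      "ignore"    - skip columns where either sequence has a gap
      "each"      - count every gap column as one difference
      "one_event" - count each contiguous run of gap columns as one difference
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    if gap_mode not in ("ignore", "each", "one_event"):
        raise ValueError(f"unknown gap_mode: {gap_mode}")

    diffs = 0          # number of scored differences
    compared = 0       # number of scored positions (the denominator)
    in_gap_run = False

    for x, y in zip(a, b):
        x_gap, y_gap = x == "-", y == "-"
        if x_gap and y_gap:
            # A gap in both sequences carries no pairwise information.
            continue
        if x_gap or y_gap:
            if gap_mode == "each":
                diffs += 1
                compared += 1
            elif gap_mode == "one_event" and not in_gap_run:
                # Score only the first column of a contiguous gap run.
                diffs += 1
                compared += 1
            in_gap_run = True
            continue
        # Ordinary column: both sequences have a base here.
        in_gap_run = False
        compared += 1
        if x != y:
            diffs += 1

    return diffs / compared if compared else 0.0


if __name__ == "__main__":
    seq_a = "ACGT--ACGT"
    seq_b = "ACGTTTACGA"
    for mode in ("ignore", "each", "one_event"):
        print(mode, round(pairwise_divergence(seq_a, seq_b, mode), 4))
```

For the same aligned pair, the three conventions give different divergence values, so a pair of reads can fall on either side of a fixed clustering threshold depending on the convention. This effect is likely to be stronger for length-variable markers such as 18S, where gaps are frequent.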
