Resolving Ambiguity of Species Limits and Concatenation in Multilocus Sequence Data for the Construction of Phylogenetic Supermatrices

Douglas Chesters; Alfried P. Vogler

Abstract

Public DNA databases are becoming too large and too complex for manual methods to generate phylogenetic supermatrices from multiple gene sequences. Delineating the terminals based on taxonomic labels is no longer practical because species identifications are frequently incomplete and gene trees are incongruent with Linnaean binomials, which results in uncertainty about how to combine species units among unlinked loci. We developed a procedure that minimizes the problem of forming multilocus species units in a large phylogenetic data set using algorithms from graph theory. An initial step established sequence clusters for each locus that broadly correspond to the species level. These clusters frequently include sequences labeled with various binomials and specimen identifiers, which creates multiple alternatives for concatenation. To choose among these possibilities, we minimize taxonomic conflict among the species units globally in the data set using a multipartite heuristic algorithm. The procedure was applied to all available GenBank data for Coleoptera (beetles), including > 10 500 taxon labels and > 23 500 sequences of 4 loci, which were grouped into 11 241 clusters or divergent singletons by the BlastClust software. Within each cluster, unidentified sequences could be assigned to a species name through their association with fully identified sequences, resulting in 510 new identifications (13.9% of total unidentified sequences), of which nearly half were "trans-locus" identifications made by clustering of sequences at a secondary locus. The limits of DNA-based clusters were inconsistent with the Linnaean binomials for 1518 clusters (13.5%) that contained more than one binomial or split a single binomial among multiple clusters. By applying a scoring scheme for full and partial name matches in pairs of clusters, a maximum weight set of 7366 global species units was produced. Varying the match weights for partial matches had little effect on the number of units, although if partial matches were disallowed, the number increased greatly. The resulting supermatrices generally produced tree topologies in good agreement with the higher taxonomy of Coleoptera, with fewer terminals than trees generated according to standard filtering of sequences using species labels. The study illustrates a strategy for assembling the tree-of-life from an ever more complex primary database.
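The scoring and matching idea in the abstract can be illustrated with a small sketch. It is not the authors' multipartite heuristic: it handles only two loci, uses a simple greedy pairing in place of a maximum weight set, and every name in it is hypothetical (`FULL_MATCH`, `PARTIAL_MATCH`, `name_score`, `propagate_name`, `greedy_pairing`, and the cluster identifiers). The weights 1.0 and 0.5 are arbitrary, and a "partial match" is assumed here to mean a shared genus where at least one label lacks a species epithet; the abstract does not define it.

```python
from itertools import product

FULL_MATCH = 1.0       # hypothetical weight for identical binomials
PARTIAL_MATCH = 0.5    # hypothetical weight for a genus-only match
UNIDENTIFIED = {"", "sp."}  # assumed markers for a missing species epithet


def is_identified(label):
    """True if the label carries a species epithet (assumed 'Genus species')."""
    _, _, species = label.partition(" ")
    return species not in UNIDENTIFIED


def name_score(a, b):
    """Score two taxon labels as a full match, a partial match, or a conflict."""
    if a.partition(" ")[0] != b.partition(" ")[0]:
        return 0.0  # different genera
    if is_identified(a) and is_identified(b):
        return FULL_MATCH if a == b else 0.0  # two different species conflict
    return PARTIAL_MATCH  # same genus, at least one label unidentified


def propagate_name(labels):
    """Return the binomial to assign to unidentified sequences in a cluster,
    or None when the cluster holds zero or several identified binomials."""
    identified = {label for label in labels if is_identified(label)}
    return next(iter(identified)) if len(identified) == 1 else None


def cluster_score(labels_a, labels_b):
    """Best label-to-label score between two clusters."""
    return max((name_score(a, b) for a, b in product(labels_a, labels_b)),
               default=0.0)


def greedy_pairing(locus1, locus2):
    """Pair clusters of two loci by descending score; each cluster is used at
    most once and zero-score pairs are never joined. Greedy, so the total
    weight is not guaranteed to be maximal."""
    candidates = sorted(
        ((cluster_score(la, lb), ia, ib)
         for (ia, la), (ib, lb) in product(locus1.items(), locus2.items())),
        key=lambda t: (-t[0], t[1], t[2]),
    )
    used1, used2, pairs = set(), set(), []
    for score, ia, ib in candidates:
        if score > 0 and ia not in used1 and ib not in used2:
            pairs.append((ia, ib, score))
            used1.add(ia)
            used2.add(ib)
    return pairs


if __name__ == "__main__":
    # Toy clusters for two loci; identifiers and labels are made up.
    locus1 = {"cox1_c1": {"Carabus granulatus"}, "cox1_c2": {"Carabus sp."}}
    locus2 = {"rrnL_c1": {"Carabus granulatus", "Carabus sp."},
              "rrnL_c2": {"Carabus nemoralis"}}
    print(greedy_pairing(locus1, locus2))
```

In this toy setting, setting `PARTIAL_MATCH` to 0 leaves the cluster labeled only "Carabus sp." unpaired, which mirrors the abstract's observation that disallowing partial matches increases the number of units.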

Bibliographic record

Source: Systematic Biology, 2013, Issue 3, 11 pages
Authors: Douglas Chesters; Alfried P. Vogler
Original format: PDF
Language: English
Chinese Library Classification: Biological taxonomy
Keywords: BlastClust; data mining; graph theory; incongruence; multipartite matching; species delimitation; supermatrix

