Systematic Biology

Resolving arthropod phylogeny: exploring phylogenetic signal within 41 kb of protein-coding nuclear gene sequence

Abstract

This study attempts to resolve relationships among and within the four basal arthropod lineages (Pancrustacea, Myriapoda, Euchelicerata, Pycnogonida) and to assess the widespread expectation that remaining phylogenetic problems will yield to increasing amounts of sequence data. Sixty-eight regions of 62 protein-coding nuclear genes (approximately 41 kilobases (kb)/taxon) were sequenced for 12 taxonomically diverse arthropod taxa and a tardigrade outgroup. Parsimony, likelihood, and Bayesian analyses of total nucleotide data generally strongly supported the monophyly of each of the basal lineages represented by more than one species. Other relationships within the Arthropoda were also supported, with support levels depending on method of analysis and inclusion/exclusion of synonymous changes. Removing third codon positions, where the assumption of base compositional homogeneity was rejected, altered the results. Removing the final class of synonymous mutations--first codon positions encoding leucine and arginine, which were also compositionally heterogeneous--yielded a data set that was consistent with a hypothesis of base compositional homogeneity. Furthermore, under such a data-exclusion regime, all 68 gene regions individually were consistent with base compositional homogeneity. Restricting likelihood analyses to nonsynonymous change recovered trees with strong support for the basal lineages but not for other groups that were variably supported with more inclusive data sets. In a further effort to increase phylogenetic signal, three types of data exploration were undertaken. (1) Individual genes were ranked by their average rate of nonsynonymous change, and three rate categories were assigned--fast, intermediate, and slow. Then, bootstrap analysis of each gene was performed separately to see which taxonomic groups received strong support. 
Five taxonomic groups were strongly supported independently by two or more genes, and these genes mostly belonged to the slow or intermediate categories, whereas groups supported only by a single gene region tended to be from genes of the fast category, arguing that fast genes provide a less consistent signal. (2) A sensitivity analysis was performed in which increasing numbers of genes were excluded, beginning with the fastest. The number of strongly supported nodes increased up to a point and then decreased slightly. Recovery of Hexapoda required removal of fast genes. Support for Mandibulata (Pancrustacea + Myriapoda) also increased, at times to "strong" levels, with removal of the fastest genes. (3) Concordance selection was evaluated by clustering genes according to their ability to recover Pancrustacea, Euchelicerata, or Myriapoda and analyzing the three clusters separately. All clusters of genes recovered the three concordance clades but were at times inconsistent in the relationships recovered among and within these clades, a result that indicates that the a priori concordance criteria may bias phylogenetic signal in unexpected ways. In a further attempt to increase support of taxonomic relationships, sequence data from 49 additional taxa for three slow genes (i.e., EF-1 alpha, EF-2, and Pol II) were combined with the various 13-taxon data sets. The 62-taxon analyses supported the results of the 13-taxon analyses and provided increased support for additional pancrustacean clades found in an earlier analysis including only EF-1 alpha, EF-2, and Pol II.
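
The rate ranking in (1) and the stepwise exclusion in (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say how the boundaries between the fast, intermediate, and slow categories were chosen, so equal-sized thirds are assumed here, and the function names, gene names, and rate values are all hypothetical.

```python
from typing import Dict, List


def assign_rate_categories(rates: Dict[str, float]) -> Dict[str, str]:
    """Rank genes by nonsynonymous rate and label them slow/intermediate/fast.

    Assumption: three near-equal bins by rank. The abstract does not state
    how the category boundaries were actually drawn.
    """
    ranked = sorted(rates, key=rates.get)  # slowest gene first
    n = len(ranked)
    labels = {}
    for i, gene in enumerate(ranked):
        if i < n / 3:
            labels[gene] = "slow"
        elif i < 2 * n / 3:
            labels[gene] = "intermediate"
        else:
            labels[gene] = "fast"
    return labels


def exclusion_series(rates: Dict[str, float], step: int = 1) -> List[List[str]]:
    """Build gene subsets for a sensitivity analysis.

    Each successive subset drops `step` more genes, starting with the
    fastest, so the last subset keeps only the slowest genes.
    """
    ranked = sorted(rates, key=rates.get)  # slowest gene first
    subsets = []
    for n_removed in range(0, len(ranked), step):
        subsets.append(ranked[: len(ranked) - n_removed])
    return subsets


if __name__ == "__main__":
    # Hypothetical rates for six placeholder genes (not data from the study).
    example_rates = {
        "gene_a": 0.02, "gene_b": 0.10, "gene_c": 0.05,
        "gene_d": 0.30, "gene_e": 0.01, "gene_f": 0.20,
    }
    print(assign_rate_categories(example_rates))
    for subset in exclusion_series(example_rates, step=2):
        print(len(subset), subset)
```

Each subset returned by `exclusion_series` would then be concatenated and analyzed separately (for example by bootstrap), and the number of strongly supported nodes compared across subsets.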