Systematic Biology

The Devil in the Details: Interactions between the Branch-Length Prior and Likelihood Model Affect Node Support and Branch Lengths in the Phylogeny of the Psoraceae

Abstract

In popular use of Bayesian phylogenetics, a default branch-length prior is almost universally applied without knowing how a different prior would have affected the outcome. We performed Bayesian and maximum likelihood (ML) inference of phylogeny based on empirical nucleotide sequence data from a family of lichenized ascomycetes, the Psoraceae, the morphological delimitation of which has been controversial. We specifically assessed the influence of the combination of Bayesian branch-length prior and likelihood model on the properties of the Markov chain Monte Carlo tree sample, including node support, branch lengths, and taxon stability. Data included two regions of the mitochondrial ribosomal RNA gene, the internal transcribed spacer region of the nuclear ribosomal RNA gene, and the protein-coding largest subunit of RNA polymerase II. Data partitioning was performed using Bayes' factors, whereas the best-fitting model of each partition was selected using the Bayesian information criterion (BIC). Given the data and model, short Bayesian branch-length priors generate higher numbers of strongly supported nodes as well as short and topologically similar trees sampled from parts of tree space that are largely unexplored by the ML bootstrap. Long branch-length priors generate fewer strongly supported nodes and longer and more dissimilar trees that are sampled mostly from inside the range of tree space sampled by the ML bootstrap. Priors near the ML distribution of branch lengths generate the best marginal likelihood and the highest frequency of "rogue" (unstable) taxa. The branch-length prior was shown to interact with the likelihood model. Trees inferred under complex partitioned models are more affected by the stretching effect of the branch-length prior. Fewer nodes are strongly supported under a complex model given the same branch-length prior. 
Irrespective of model, internal branches make up a larger proportion of total tree length under the shortest branch-length priors compared with longer priors. Relative effects on branch lengths caused by the branch-length prior can be problematic to downstream phylogenetic comparative methods making use of the branch lengths. Furthermore, given the same branch-length prior, trees are on average more dissimilar under a simple unpartitioned model compared with a more complex partitioned model. The distribution of ML branch lengths was shown to better fit a gamma or Pareto distribution than an exponential one. Model adequacy tests indicate that the best-fitting model selected by the BIC is insufficient for describing data patterns in 5 of 8 partitions. More general substitution models are required to explain the data in three of these partitions, one of which also requires nonstationarity. The two mitochondrial ribosomal RNA gene partitions need heterotachous models. We found no significant correlations between, on the one hand, the amount of ambiguous data or the smallest branch-length distance to another taxon and, on the other hand, the topological stability of individual taxa. Integrating over several exponentially distributed means under the best-fitting model, node support for the family Psoraceae, including Psora, Protoblastenia, and the Micarea sylvicola group, is approximately 0.96. Support for the genus Psora is distinctly lower, but we found no evidence to contradict the current classification.
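
As a rough illustration of why the mean of the branch-length prior matters, the sketch below evaluates the log prior density of one set of branch lengths under independent exponential priors with different means. It is only a sketch under stated assumptions: the function name `exp_prior_log_density` and the branch lengths are hypothetical and are not taken from the study, and the code covers the prior alone, not the likelihood or the MCMC sampling.

```python
import math


def exp_prior_log_density(branch_lengths, mean):
    """Log density of the branch lengths under independent exponential
    priors that all share the same mean (rate = 1 / mean)."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    rate = 1.0 / mean
    # log of rate * exp(-rate * b), summed over all branches
    return sum(math.log(rate) - rate * b for b in branch_lengths)


# Hypothetical branch lengths in substitutions per site (illustrative only).
branch_lengths = [0.01, 0.02, 0.05, 0.03, 0.10, 0.04]

for mean in (0.001, 0.01, 0.1, 1.0):
    log_p = exp_prior_log_density(branch_lengths, mean)
    print(f"prior mean {mean}: log prior density = {log_p:.2f}")
```

For a fixed set of branch lengths, this log density is highest when the prior mean equals the average branch length, and it falls off as the mean moves far below or far above that value. A prior mean that is much too short or much too long therefore penalizes the same tree more heavily.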