BMC Evolutionary Biology

BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments

Abstract

Background: The quality of multiple sequence alignments plays an important role in the accuracy of phylogenetic inference. Removing ambiguously aligned regions, as well as other sources of bias such as highly variable (saturated) characters, has been shown to improve the overall performance of many phylogenetic reconstruction methods. A current trend is to build phylogenetic trees from large numbers of sequence datasets extracted (semi-)automatically from many complete genomes. Because these approaches do not allow precise manual curation of each dataset, there is a real need for efficient bioinformatic tools dedicated to this alignment character trimming step.

Results: This paper presents BMGE (Block Mapping and Gathering with Entropy), a new software tool designed to select the regions of a multiple sequence alignment that are suited to phylogenetic inference. For each character, BMGE computes a score closely related to an entropy value. These entropy-like scores are weighted with BLOSUM or PAM similarity matrices in order to distinguish between biologically expected and unexpected variability at each aligned character. Sets of contiguous characters with a score above a given threshold are considered unsuited to phylogenetic inference and are removed. Simulation analyses show that the character trimming performed by BMGE produces datasets that lead to accurate trees, especially for alignments that include distantly related sequences. BMGE also implements trimming and recoding methods aimed at minimizing phylogeny reconstruction artefacts due to compositional heterogeneity.

Conclusions: BMGE is able to perform biologically relevant trimming on a multiple alignment of DNA, codon or amino acid sequences. Java source code and executable are freely available at ftp://ftp.pasteur.fr/pub/GenSoft/projects/BMGE/.
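The per-character scoring and threshold-based trimming described in the abstract can be illustrated with a simplified sketch. This is not BMGE's implementation: it uses plain Shannon entropy rather than the BLOSUM- or PAM-weighted entropy-like score, and the function names, the `max_entropy` cutoff, the `min_block` parameter and the handling of all-gap columns are assumptions made for illustration only.

```python
import math
from collections import Counter


def column_entropy(column, gap_chars="-?"):
    """Shannon entropy (base 2) of the non-gap characters in one column.

    Assumption: plain Shannon entropy stands in for BMGE's
    similarity-matrix-weighted score, which is not reproduced here.
    """
    residues = [c for c in column if c not in gap_chars]
    if not residues:
        # Assumption: an all-gap column carries no signal, so it is
        # given an infinite score and will always be trimmed.
        return math.inf
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def trim_alignment(sequences, max_entropy=1.0, min_block=1):
    """Drop columns scoring above max_entropy, then keep only runs of at
    least min_block contiguous retained columns (hypothetical parameters)."""
    n_cols = len(sequences[0])
    if any(len(s) != n_cols for s in sequences):
        raise ValueError("all aligned sequences must have the same length")

    # Flag each column as retained (True) or trimmed (False).
    flags = [
        column_entropy([s[i] for s in sequences]) <= max_entropy
        for i in range(n_cols)
    ]

    kept = []
    start = None
    # A trailing False sentinel closes the final run of retained columns.
    for i, ok in enumerate(flags + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_block:
                kept.extend(range(start, i))
            start = None

    return ["".join(s[i] for i in kept) for s in sequences]


# Toy alignment: columns 3 and 5 (1-based) are the variable ones.
sequences = ["ACGTA", "ACGTC", "ACTTG", "ACATT"]
print(trim_alignment(sequences, max_entropy=1.0))               # ['ACT', 'ACT', 'ACT', 'ACT']
print(trim_alignment(sequences, max_entropy=1.0, min_block=2))  # ['AC', 'AC', 'AC', 'AC']
```

In this toy alignment the third column has entropy 1.5 bits and the fifth has 2.0 bits, so both exceed the cutoff of 1.0 and are dropped. Requiring a minimum block length of 2 additionally discards the isolated conserved fourth column.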
