Conference: 2013 Fifth International Conference on Computational and Information Sciences

Comparisons among the Novel Measurements Based on Chi Square Criterion for Sequence Dissimilarity and Their Applications to Phylogeny



Abstract

In this paper, several new dissimilarity measures based on the chi-square test are presented. Protein sequences are characterized by the frequencies of occurrence of the 20 amino acids. The amino acid frequencies of each protein sequence are treated as a sample from a population. The chi-square statistic from the chi-square test is used to measure the dissimilarity between each pair of protein sequences. In addition, several transformations of the chi-square value are used as dissimilarity measures. For example, to account for the unequal lengths of protein sequences, we standardize the length of each protein sequence to 1000 while keeping its amino acid frequencies unchanged. Other transformations include the P value under the chi-square distribution and the normal-distribution quantile corresponding to that P value. Using concatenated H-stranded amino acid sequences for the Eutherian orders, we compare the phylogenetic trees obtained with these dissimilarity measures. According to the results, some of the phylogenetic trees agree with the commonly accepted one for the Eutherians.
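The chi-square dissimilarity and the length standardization described in the abstract can be illustrated with a minimal sketch. It assumes the statistic is the Pearson chi-square for a test of homogeneity on the 2 x 20 table of amino acid counts; the abstract does not spell out the exact formula, so this is one plausible reading rather than the authors' implementation. The names `composition`, `chi_square`, and `dissimilarity` are hypothetical, and the P value and normal-quantile transformations are left out.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def composition(seq, scale=None):
    """Return the counts of the 20 amino acids in seq.

    If scale is given (e.g. 1000), the counts are rescaled so that they
    sum to scale while the relative frequencies stay unchanged.
    """
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    factor = 1.0 if scale is None else scale / total
    return [counts[a] * factor for a in AMINO_ACIDS]


def chi_square(counts_a, counts_b):
    """Pearson chi-square statistic for a 2 x k table of counts.

    Assumption: a test of homogeneity between the two count vectors.
    """
    n_a, n_b = sum(counts_a), sum(counts_b)
    n = n_a + n_b
    stat = 0.0
    for a, b in zip(counts_a, counts_b):
        col = a + b
        if col == 0:
            continue  # amino acid absent from both sequences
        exp_a = n_a * col / n  # expected count in sequence A
        exp_b = n_b * col / n  # expected count in sequence B
        stat += (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    return stat


def dissimilarity(seq_a, seq_b, scale=None):
    """Chi-square dissimilarity of two protein sequences.

    Pass scale=1000 to mimic the length standardization in the abstract.
    """
    return chi_square(composition(seq_a, scale), composition(seq_b, scale))
```

Calling `dissimilarity(seq_a, seq_b, scale=1000)` for every pair of sequences would yield a symmetric matrix that a distance-based tree-building method could take as input; which tree-building method the paper uses is not stated in the abstract.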
