
MINING SPECIES-SPECIFIC SUBSEQUENCES IN BACTERIA TRANSCRIPTION TERMINATORS


Abstract

Transcription Terminators (TT) play an important role in bacterial RNA transcription. Some bacteria are known to have Species-Specific Subsequences (SSS) in their TTs, which might contain useful clues to bacterial evolution. Given the DNA sequences of known TTs, how to find the SSS effectively is an interesting yet not well-studied computational problem. In this paper, we present a three-step method for identifying the SSS using Support Vector Machines (SVM). First, we find all frequent subsequences using generalized suffix trees. Second, we use these subsequences as features to vectorize the DNA sequences and train an SVM classifier. Finally, we extract the SSS from the trained SVM classifier based on a defined measure of subsequence specificity. Our experiments show that the SSS found by our method are very close to the known SSS. We conclude that our method is a novel application of classification to biology and is applicable to similar problems.
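The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration only: frequent subsequences are enumerated by brute force rather than with a generalized suffix tree, the classifier is a small linear SVM trained by batch subgradient descent on the regularized hinge loss, and the specificity measure (which the abstract does not define) is taken to be the learned SVM weight of each subsequence. All function names and parameters (`frequent_substrings`, `vectorize`, `train_linear_svm`, `mine_sss`, `min_support`, `lam`, `lr`) are hypothetical.

```python
from collections import Counter


def frequent_substrings(seqs, min_len=3, max_len=5, min_support=2):
    """Step 1 (brute force, not a suffix tree): substrings of length
    min_len..max_len that occur in at least min_support sequences."""
    support = Counter()
    for s in seqs:
        seen = set()
        for k in range(min_len, max_len + 1):
            for i in range(len(s) - k + 1):
                seen.add(s[i:i + k])
        support.update(seen)  # count each substring once per sequence
    return sorted(sub for sub, n in support.items() if n >= min_support)


def vectorize(seq, features):
    """Step 2a: binary presence/absence vector over the feature substrings."""
    return [1.0 if f in seq else 0.0 for f in features]


def train_linear_svm(X, y, lam=0.01, lr=0.1, steps=500):
    """Step 2b: batch subgradient descent on the L2-regularized hinge loss.
    Labels must be +1.0 or -1.0. Returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # only margin violators contribute
                for j, xj in enumerate(xi):
                    grad_w[j] -= yi * xj / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def mine_sss(seqs_a, seqs_b, top_k=3):
    """Step 3 (assumed measure): rank candidate subsequences for species A
    by their SVM weight, highest first."""
    seqs = seqs_a + seqs_b
    y = [1.0] * len(seqs_a) + [-1.0] * len(seqs_b)
    features = frequent_substrings(seqs)
    X = [vectorize(s, features) for s in seqs]
    w, _ = train_linear_svm(X, y)
    ranked = sorted(zip(features, w), key=lambda fw: fw[1], reverse=True)
    return ranked[:top_k]


if __name__ == "__main__":
    # Synthetic toy sequences, not real terminators.
    species_a = ["AAGCGGCTTT", "CAGCGGCTAT", "TTGCGGCAAC"]
    species_b = ["AACATTAGTT", "CTCATTAGAT", "GTCATTAGCC"]
    for sub, weight in mine_sss(species_a, species_b):
        print(sub, round(weight, 3))
```

With a linear classifier, each weight is tied to exactly one subsequence, so ranking by weight is one simple way to read specificity off the trained model. A nonlinear kernel would need a different measure.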
