Journal: Frontiers in Genetics

Improved Pre-miRNAs Identification Through Mutual Information of Pre-miRNA Sequences and Structures

Abstract

MicroRNAs (miRNAs) are a family of short non-coding RNAs that play critical roles as post-transcriptional regulators and are derived from longer transcripts called precursor miRNAs (pre-miRNAs). Experimental methods for identifying pre-miRNAs are expensive and time-consuming, which creates a need for computational alternatives. In recent years, the accuracy of computational methods for predicting pre-miRNAs has increased significantly. However, several drawbacks remain. First, these methods usually consider only base frequencies or sequence information and ignore the information between bases. Second, feature extraction methods based on secondary structures usually consider only global characteristics and ignore the mutual influence of local structures. Third, methods that integrate high-dimensional feature information are computationally inefficient. In this study, we propose a novel mutual information-based feature representation algorithm for pre-miRNA sequences and secondary structures that captures the interactions between sequence bases and the local features of the RNA secondary structure. In addition, the feature space is smaller than that of most popular methods, which makes our method computationally more efficient than its competitors. Finally, we used these features to train a support vector machine model to predict pre-miRNAs and compared the results with those of other popular predictors. Our method outperforms the others under both 5-fold cross-validation and the jackknife test.
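
The abstract does not spell out the feature formulas, so the following is only a minimal sketch of what a mutual-information feature over adjacent bases could look like; it is not the authors' algorithm. It assumes each of the 16 dinucleotides contributes the term p(x,y) * log2(p(x,y) / (p(x) * p(y))), with probabilities estimated from the adjacent-base pairs of a single sequence. The names `dinucleotide_mi_features` and `total_mi` are hypothetical, and the structure-based features and the support vector machine step are omitted.

```python
from collections import Counter
from itertools import product
from math import log2

BASES = "ACGU"


def dinucleotide_mi_features(seq):
    """Return 16 mutual-information terms, one per adjacent base pair.

    Illustrative assumption: each feature is
    p(x,y) * log2(p(x,y) / (p(x) * p(y))), estimated from this sequence only.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    # Skip pairs that contain ambiguous symbols such as N.
    pairs = [p for p in pairs if p[0] in BASES and p[1] in BASES]
    keys = [a + b for a, b in product(BASES, repeat=2)]
    if not pairs:
        return {k: 0.0 for k in keys}

    n = len(pairs)
    joint = Counter(pairs)                  # counts of (x, y) pairs
    first = Counter(p[0] for p in pairs)    # counts of x in the first slot
    second = Counter(p[1] for p in pairs)   # counts of y in the second slot

    features = {}
    for k in keys:
        p_xy = joint[k] / n
        if p_xy == 0.0:
            features[k] = 0.0  # unseen pairs contribute nothing
        else:
            p_x = first[k[0]] / n
            p_y = second[k[1]] / n
            features[k] = p_xy * log2(p_xy / (p_x * p_y))
    return features


def total_mi(seq):
    """Sum of the 16 terms: the mutual information between adjacent bases."""
    return sum(dinucleotide_mi_features(seq).values())


if __name__ == "__main__":
    example = "UGAGGUAGUAGGUUGUAUAGUU"  # arbitrary RNA string for illustration
    feats = dinucleotide_mi_features(example)
    print(len(feats), round(total_mi(example), 4))
```

A fixed-length vector of this kind (16 values per sequence here) is what would be passed to a classifier; the small dimensionality is in the spirit of the compact feature space the abstract describes, although the actual feature set in the paper may differ.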