首页> 中文期刊> 《计算机科学》 >一种改进的microRNA预测模型集成方法

一种改进的microRNA预测模型集成方法

         

摘要

现有的microRNA预测方法往往存在数据集类不平衡和适用物种单一的问题.针对以上问题,所做主要工作如下:1)提出基于序列熵的分层采样算法,该算法可在保持样本总体分布的基础上,采样生成正样本和负样本数量平衡的训练集;2)提出基于信噪比和相关性的特征选择,用于缩小训练集规模,以达到提高训练速度的目的;3)提出DS-GA算法,用于缩短SVM分类器参数的优化时间,达到减少过拟合的目的;4)结合集成学习的思想,经采样、特征选择、分类器参数优化3个步骤,建立了一种物种间通用的microRNA预测模型.实验表明,该模型有效解决了类不平衡问题,且不局限于单一物种,对混合物种的测试集预测取得了较好效果.%The existing microRNA prediction methods often present the problems of imbalance data set class and single applicable species.In order to solve the above problems,the main work is as follows.Firstly,a hierarchical sampling algorithm based on sequence entropy was proposed,which can generate a training set enhancing balance positive and negative samples based on the overall distribution of the samples.Secondly,a feature selection algorithm based on signal-tonoise ratio and correlation was designed to reduce the scale of training set and achieve the purpose of improving training speed.Thirdly,the DS-GA was proposed to shorten the optimization time of SVM classifier parameters and avoid the over-fitting problem.At last,based on the idea of ensemble learning,a common microRNA prediction model was established by sampling,feature selection and classifier parameter optimization.Experiments show that the model solves the problem of imbalance effectively,it is not limited to a single species and achieves better results for the hybrid species test set prediction.

著录项

相似文献

  • 中文文献
  • 外文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号