
Protein subnuclear localization based on a new effective representation and intelligent kernel linear discriminant analysis by dichotomous greedy genetic algorithm


Abstract

Many methods have been proposed to improve prediction accuracy in protein subnuclear localization. A prominent trend among them is to fuse multiple feature representations into a single fusion representation, but the fusion process is time-consuming. In view of this, this paper proposes a method that combines a new single feature representation with a new algorithm to obtain a good recognition rate. Specifically, based on the position-specific scoring matrix (PSSM), we propose a new representation, the correlation position-specific scoring matrix (CoPSSM), as the protein feature representation. Based on the classic nonlinear dimension reduction algorithm, kernel linear discriminant analysis (KLDA), we add a new discriminant criterion and propose a dichotomous greedy genetic algorithm (DGGA) to intelligently select its kernel bandwidth parameter. Numerical experiments were run on two public datasets using the Jackknife test and a KNN classifier. The results show that the overall success rate (OSR) with the single representation CoPSSM is higher than that with many related representations. The OSR of the proposed method reaches 87.444% and 90.3361% on these two datasets, respectively, outperforming many current methods. To show the generalization ability of the proposed algorithm, two additional standard protein subcellular localization datasets were used in an extended experiment, and the prediction accuracy under both the Jackknife test and the independent test remains considerable.
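The evaluation protocol named in the abstract (Jackknife test with a KNN classifier, scored by overall success rate) can be illustrated with a minimal sketch. This is not the authors' code: the function names `knn_predict` and `jackknife_osr`, the Euclidean distance, the value of `k`, and the toy two-dimensional features with their class labels are all assumptions made for illustration. In the paper the inputs would be CoPSSM features after KLDA dimension reduction, which are not reproduced here.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=1):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # Rank training samples by Euclidean distance to the query (assumed metric).
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def jackknife_osr(features, labels, k=1):
    """Jackknife (leave-one-out) overall success rate: correct predictions / all samples."""
    correct = 0
    for i, (x, y) in enumerate(zip(features, labels)):
        # Hold out sample i and classify it using every other sample.
        rest_x = features[:i] + features[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        if knn_predict(rest_x, rest_y, x, k) == y:
            correct += 1
    return correct / len(features)


# Toy data: two well-separated clusters with hypothetical class labels.
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels = ["class_a", "class_a", "class_a", "class_b", "class_b", "class_b"]

osr = jackknife_osr(features, labels, k=1)
print(f"OSR = {osr:.3f}")
```

Each sample is held out once and classified from all the others, so the OSR is simply the fraction of held-out samples whose predicted label matches the true one.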

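Bibliographic record

  • Journal: other
  • Authors

    Shunfang Wang; Yaoting Yue;

  • Author affiliations
  • Year (volume), issue: -1(13),4
  • Year: -1
  • Pages: e0195636
  • Total pages: 20
  • Original format: PDF
  • Language of text
  • Chinese Library Classification
  • Keywords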
