Conference: International Work-Conference on Bioinformatics and Biomedical Engineering

Prediction of Calmodulin-Binding Proteins Using Short-Linear Motifs

Abstract

Predicting calmodulin-binding (CaM-binding) proteins is important in biology and biochemistry because calmodulin binds and regulates a multitude of protein targets affecting different cellular processes. Short linear motifs (SLiMs) have been used effectively as features for analyzing protein-protein interactions, but their properties have not previously been used to predict CaM-binding proteins. In this study, we propose a new method for predicting CaM-binding proteins that uses, as features, both the total and the average scores of SLiMs in protein sequences, computed with a new scoring method that we call Sliding Window Scoring (SWS). A dataset of 194 manually curated human CaM-binding proteins and 193 mitochondrial proteins was obtained and used to test the proposed model. Multiple EM for Motif Elicitation (MEME) was used to obtain new motifs from the positive and negative datasets individually (the SM approach) and from the combined positive and negative datasets (the CM approach). Feature selection (FS) was then applied using the wrapper criterion with Random Forest, followed by classification with several algorithms, including k-nearest neighbor (k-NN), support vector machine (SVM), and Random Forest (RF), in a 3-fold cross-validation setup. The proposed method shows promising prediction results and demonstrates that the information contained in SLiMs is highly relevant for predicting CaM-binding proteins.
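The abstract does not define how Sliding Window Scoring is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each motif is a position weight matrix (here a hypothetical `Motif` type: one residue-to-weight dictionary per position), that a window the width of the motif slides along the sequence one residue at a time, and that the "total" and "average" features are the sum and mean of the window scores. All function names and the toy weights are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical motif representation (an assumption, not from the paper):
# one dict per motif position, mapping a residue letter to its weight.
Motif = List[Dict[str, float]]


def window_scores(sequence: str, motif: Motif, default: float = 0.0) -> List[float]:
    """Score every window of len(motif) residues along the sequence."""
    width = len(motif)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        # Residues missing from a column fall back to the default weight.
        scores.append(sum(col.get(res, default) for col, res in zip(motif, window)))
    return scores


def total_and_average(sequence: str, motif: Motif) -> Tuple[float, float]:
    """Return (total, average) of the window scores for one motif."""
    scores = window_scores(sequence, motif)
    if not scores:  # sequence shorter than the motif: no windows to score
        return 0.0, 0.0
    total = sum(scores)
    return total, total / len(scores)


def feature_vector(sequence: str, motifs: List[Motif]) -> List[float]:
    """Concatenate (total, average) pairs over all motifs, in order."""
    features: List[float] = []
    for motif in motifs:
        features.extend(total_and_average(sequence, motif))
    return features


# Toy two-position motif with made-up weights, for illustration only.
toy_motif: Motif = [{"A": 1.0}, {"K": 2.0}]
toy_features = feature_vector("AKAK", [toy_motif])
```

Under these assumptions, each protein yields two features per motif, and the resulting vectors could then be passed to a feature-selection step and to classifiers such as k-NN, SVM, or RF.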