Conference: IEEE International Conference on Bioinformatics and Biomedicine

DeepCrystal: A Deep Learning Framework for Sequence-based Protein Crystallization Prediction

Abstract

Protein structure determination has primarily been performed using X-ray crystallography. To overcome its high cost, high attrition rate and series of trial-and-error settings, many in-silico methods have been developed to predict the crystallization propensity of proteins from their sequences. However, the majority of these methods build predictors by extracting features from protein sequences, which is computationally expensive and can potentially explode the feature space. We propose DeepCrystal, a deep learning framework for sequence-based protein crystallization prediction. It uses deep learning to identify proteins that can produce diffraction-quality crystals without the need to manually engineer additional biochemical and structural features from the sequence. Our model is based on Convolutional Neural Networks (CNNs), which can exploit frequently occurring k-mers and sets of k-mers in protein sequences to discriminate proteins that yield diffraction-quality crystals from non-crystallizable ones. Our model outperforms previous sequence-based protein crystallization predictors in terms of recall, F-score, accuracy and Matthews correlation coefficient (MCC) on three independent test sets. Compared with its closest competitors, Crysalis II and Crysf, DeepCrystal achieves average improvements in recall of 1.4% and 12.1%, respectively. In addition, DeepCrystal attains average improvements of 2.1% and 6.0% in F-score, 1.9% and 3.9% in accuracy, and 3.8% and 7.0% in MCC over Crysalis II and Crysf, respectively, on the independent test sets. The standalone source code and models are available at https://github.com/elbasir/DeepCrystal, and a web server is available at https://deeplearning-protein.qcri.org.
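
The abstract's claim that convolutional filters can pick up frequently occurring k-mers can be illustrated with a small sketch. It is not the DeepCrystal architecture: the toy sequence, the hand-set "KVL" filter and all function names below are hypothetical, and a real model would learn many filters of several widths from data. The sketch only shows that a width-k filter slid over a one-hot encoded sequence scores highest where its k-mer occurs, and that global max pooling reports whether the k-mer appears anywhere.

```python
# Illustrative only: not the DeepCrystal architecture or its trained weights.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence):
    """Encode a protein sequence as a list of 20-dimensional one-hot rows."""
    rows = []
    for residue in sequence:
        row = [0.0] * len(AMINO_ACIDS)
        if residue in AA_INDEX:  # unknown residues stay all-zero
            row[AA_INDEX[residue]] = 1.0
        rows.append(row)
    return rows


def kmer_filter(kmer):
    """Build a width-k filter whose weights are the one-hot encoded k-mer."""
    return one_hot(kmer)


def conv1d_valid(encoded, filt):
    """Slide the filter along the sequence (no padding, stride 1)."""
    k = len(filt)
    scores = []
    for start in range(len(encoded) - k + 1):
        score = 0.0
        for offset in range(k):
            row = encoded[start + offset]
            score += sum(x * w for x, w in zip(row, filt[offset]))
        scores.append(score)
    return scores


def max_pool(scores):
    """Global max pooling: the strongest match anywhere in the sequence."""
    return max(scores) if scores else 0.0


sequence = "MKVLAAGKVLA"   # hypothetical toy sequence
filt = kmer_filter("KVL")  # hypothetical hand-set filter; real filters are learned
scores = conv1d_valid(one_hot(sequence), filt)
pooled = max_pool(scores)
print(scores, pooled)
```

Each score counts how many positions in a window match the filter's k-mer, so a full match of a 3-mer scores 3. In a trained network the filter weights are real-valued and learned, which lets a single filter respond to a family of similar k-mers rather than one exact string.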
