IAPR International Conference on Pattern Recognition in Bioinformatics

Ensemble Neural Networks Scoring Functions for Accurate Binding Affinity Prediction of Protein-Ligand Complexes

Abstract

Accurately predicting the binding affinities (BAs) of large sets of protein-ligand complexes (PLCs) is a key challenge in computational biomolecular science, with applications in drug discovery, chemical biology, and structural biology. Because a scoring function (SF) is used to score, rank, and identify drug leads, the fidelity with which it predicts the affinity of a ligand candidate for a protein's binding site has a significant bearing on the accuracy of virtual screening. Despite intense efforts to develop conventional SFs, which are force-field based, knowledge-based, or empirical, their limited predictive power has been a major roadblock to cost-effective drug discovery. In this work, we therefore present two novel SFs that employ a large ensemble of neural networks (NNs) in conjunction with a diverse set of physicochemical and geometrical features characterizing PLCs to predict BA. We build the first ensemble SF, BgN-Score, using the bootstrap aggregation (bagging) technique, in which hundreds of neural networks are fitted independently to sets of PLCs sampled randomly with replacement from the training data. BgN-Score calculates the final BA score as the average of the predictions of its constituent NNs. The second ensemble SF, BsN-Score, is constructed by boosting, which is based on an additive stagewise fitting of NNs to random sets of PLCs sampled from the training data. The BA calculated by BsN-Score is a weighted sum of the predictions of the individual NNs in the ensemble. Both ensemble SFs are trained on 1105 high-quality PLCs retrieved from the 2007 version of PDBbind. Each PLC in this set is characterized by physicochemical features employed by the empirical SFs X-Score (6 features) and AffiScore (30 features) and calculated by GOLD (14 features), together with geometrical features used in the machine-learning SF RF-Score (36 features).
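
The two ensemble strategies described in the abstract can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: `TinyNN`, `fit_bagged`, `fit_boosted`, and every hyperparameter are hypothetical names and values chosen for this example. Bagging is shown in its conventional form, with each network fitted to a bootstrap sample drawn with replacement and the predictions averaged. Boosting is shown as stagewise fitting to squared-error residuals with a constant weight (`shrinkage`) on each network, which is only one possible weighting scheme, since the abstract does not specify the one used by BsN-Score.

```python
import math
import random


class TinyNN:
    """One-hidden-layer regression net trained by plain SGD (illustrative only)."""

    def __init__(self, n_in, n_hidden=4, lr=0.05, epochs=100, seed=0):
        self.rng = random.Random(seed)
        u = lambda: self.rng.uniform(-0.5, 0.5)
        # Each weight row stores its bias as the last element.
        self.w1 = [[u() for _ in range(n_in + 1)] for _ in range(n_hidden)]
        self.w2 = [u() for _ in range(n_hidden + 1)]
        self.lr, self.epochs = lr, epochs

    def _hidden(self, x):
        return [math.tanh(w[-1] + sum(wi * xi for wi, xi in zip(w, x)))
                for w in self.w1]

    def _out(self, h):
        return self.w2[-1] + sum(wi * hi for wi, hi in zip(self.w2, h))

    def predict(self, x):
        return self._out(self._hidden(x))

    def fit(self, X, y):
        order = list(range(len(X)))
        for _ in range(self.epochs):
            self.rng.shuffle(order)
            for i in order:
                h = self._hidden(X[i])
                err = self._out(h) - y[i]  # gradient of 0.5 * squared error
                for j, hj in enumerate(h):
                    g = err * self.w2[j] * (1.0 - hj * hj)  # backprop through tanh
                    self.w2[j] -= self.lr * err * hj
                    for k, xk in enumerate(X[i]):
                        self.w1[j][k] -= self.lr * g * xk
                    self.w1[j][-1] -= self.lr * g
                self.w2[-1] -= self.lr * err
        return self


def fit_bagged(X, y, n_models=10, seed=0):
    """Bagging: fit each net independently on a bootstrap sample."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # drawn WITH replacement
        net = TinyNN(len(X[0]), seed=rng.randrange(2 ** 32))
        models.append(net.fit([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagged(models, x):
    """Final score is the plain average of the member predictions."""
    return sum(m.predict(x) for m in models) / len(models)


def fit_boosted(X, y, n_models=10, shrinkage=0.5, subsample=0.8, seed=0):
    """Boosting (assumed squared-error variant): each net fits the residuals
    of the current ensemble on a random subset of the training data."""
    rng = random.Random(seed)
    n = len(X)
    base = sum(y) / n  # stage 0: constant prediction
    pred = [base] * n
    models = []
    for _ in range(n_models):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        idx = rng.sample(range(n), max(1, int(subsample * n)))
        net = TinyNN(len(X[0]), seed=rng.randrange(2 ** 32))
        net.fit([X[i] for i in idx], [resid[i] for i in idx])
        models.append(net)
        pred = [p + shrinkage * net.predict(x) for p, x in zip(pred, X)]
    return base, shrinkage, models


def predict_boosted(ensemble, x):
    """Final score is a weighted sum of the member predictions."""
    base, shrinkage, models = ensemble
    return base + shrinkage * sum(m.predict(x) for m in models)
```

In the setting of the abstract, each row of `X` would hold the 86 features of one PLC (6 + 30 + 14 + 36) and `y` its measured binding affinity; calling `predict_bagged(fit_bagged(X, y), x)` or `predict_boosted(fit_boosted(X, y), x)` then returns a predicted affinity for a new feature vector `x`. The pure-Python network is kept deliberately small for readability and would be far too slow for hundreds of networks on real data.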