Source: Research in Computational Molecular Biology; Lecture Notes in Bioinformatics; 4453

Minimizing and Learning Energy Functions for Side-Chain Prediction

Abstract

Side-chain prediction is an important subproblem of the general protein folding problem. Despite much progress in side-chain prediction, performance is far from satisfactory. As an example, the ROSETTA program, which uses simulated annealing to select the minimum-energy conformations, correctly predicts the first two side-chain angles for approximately 72% of the buried residues in a standard data set. Is further improvement more likely to come from better search methods, or from better energy functions? Given that exact minimization of the energy is NP-hard, it is difficult to get a systematic answer to this question.

In this paper, we present a novel search method and a novel method for learning energy functions from training data, both based on Tree Reweighted Belief Propagation (TRBP). We find that TRBP can find the global optimum of the ROSETTA energy function in a few minutes of computation for approximately 85% of the proteins in a standard benchmark set. TRBP can also effectively bound the partition function, which enables using the Conditional Random Fields (CRF) framework for learning.

Interestingly, finding the global minimum does not significantly improve side-chain prediction for an energy function based on ROSETTA's default energy terms (less than 0.1%), while learning new weights gives a significant boost from 72% to 78%. Using a recently modified ROSETTA energy function with a softer Lennard-Jones repulsive term, the global optimum does improve prediction accuracy from 77% to 78%. Here again, learning new weights improves side-chain modeling even further to 80%. Finally, the highest accuracy (82.6%) is obtained using an extended rotamer library and CRF-learned weights. Our results suggest that combining machine learning with approximate inference can improve the state of the art in side-chain prediction.
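To make the search problem in the abstract concrete, the following is a minimal sketch of side-chain placement as discrete energy minimization: each residue picks one rotamer, and the total energy is a sum of single-residue and pairwise terms. It is not TRBP and not the ROSETTA energy function. All numbers are invented toy values, the function names are hypothetical, and the residues are assumed to interact only with their chain neighbours, so that exact min-sum message passing applies. Real interaction graphs contain cycles, which is where exact minimization becomes NP-hard and approximate methods such as TRBP come in.

```python
import itertools

# Toy data (invented for illustration): 3 residues with 2, 3 and 2 rotamers.
# UNARY[i][r] is the energy of residue i in rotamer r.
UNARY = [[1, 2], [4, 1, 3], [0, 2]]
# PAIRWISE[i][q][r] is the interaction energy between residue i in
# rotamer q and residue i + 1 in rotamer r (chain-structured assumption).
PAIRWISE = [
    [[2, 5, 0], [1, 0, 6]],
    [[0, 2], [4, 1], [1, 1]],
]


def total_energy(assignment, unary, pairwise):
    """Energy of one rotamer assignment on a chain of residues."""
    energy = sum(unary[i][r] for i, r in enumerate(assignment))
    energy += sum(
        pairwise[i][assignment[i]][assignment[i + 1]]
        for i in range(len(assignment) - 1)
    )
    return energy


def brute_force_min(unary, pairwise):
    """Enumerate every assignment; exponential in the number of residues."""
    choices = (range(len(u)) for u in unary)
    best = min(
        itertools.product(*choices),
        key=lambda a: total_energy(a, unary, pairwise),
    )
    return list(best), total_energy(best, unary, pairwise)


def min_sum_chain(unary, pairwise):
    """Exact min-sum message passing (dynamic programming) on a chain."""
    # cost[i][r]: lowest energy of residues 0..i with residue i in rotamer r
    cost = [list(unary[0])]
    back = []  # back[i - 1][r]: best rotamer of residue i - 1 given r at i
    for i in range(1, len(unary)):
        row, ptr = [], []
        for r in range(len(unary[i])):
            cands = [
                cost[i - 1][q] + pairwise[i - 1][q][r]
                for q in range(len(unary[i - 1]))
            ]
            q_best = min(range(len(cands)), key=cands.__getitem__)
            row.append(cands[q_best] + unary[i][r])
            ptr.append(q_best)
        cost.append(row)
        back.append(ptr)
    # Pick the best final rotamer, then follow back-pointers to the start.
    last = min(range(len(cost[-1])), key=cost[-1].__getitem__)
    assignment = [last]
    for ptr in reversed(back):
        assignment.append(ptr[assignment[-1]])
    assignment.reverse()
    return assignment, cost[-1][last]


if __name__ == "__main__":
    print(brute_force_min(UNARY, PAIRWISE))  # ([0, 2, 0], 5)
    print(min_sum_chain(UNARY, PAIRWISE))    # ([0, 2, 0], 5)
```

On this toy chain both routines agree, but the message-passing version scales linearly in the number of residues while enumeration grows exponentially. That gap is the reason message-passing methods are attractive for the search half of the question the abstract poses.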
