Journal: Proteins: Structure, Function, and Genetics

DNABind: A hybrid algorithm for structure-based prediction of DNA-binding residues by combining machine learning- and template-based approaches

Abstract

Accurate prediction of DNA-binding residues has become a problem of increasing importance in structural bioinformatics. Here, we present DNABind, a novel hybrid algorithm for identifying these crucial residues by exploiting the complementarity between machine learning- and template-based methods. Our machine learning-based method was a probabilistic combination of a structure-based and a sequence-based predictor, both implemented using support vector machines. The former used structural features, such as solvent accessibility, local geometry, topological features, and relative positions, which effectively quantify the difference between DNA-binding and nonbinding residues. The latter combined evolutionary conservation features with three other sequence attributes. Our template-based method relied on structural alignment and used template structures from known protein-DNA complexes to infer DNA-binding residues. We showed that the template method performed very well when reliable templates were found for the query proteins, but tended to be strongly influenced by template quality as well as by conformational changes upon DNA binding. In contrast, the machine learning approach yielded better performance when high-quality templates were not available (about one third of the cases in our dataset) or when the query protein underwent substantial conformational changes upon DNA binding. Our extensive experiments indicated that the hybrid approach can distinctly improve on the performance of the individual methods for both bound and unbound structures. DNABind also significantly outperformed state-of-the-art algorithms by around 10% in terms of the Matthews correlation coefficient. The proposed methodology could also have wide application in the annotation of various protein functional sites. DNABind is freely available at http://mleg.cse.sc.edu/DNABind/.
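
The abstract reports its comparison in terms of the Matthews correlation coefficient (MCC). As a reminder of what that metric measures, the following is a minimal sketch of the standard binary MCC computed over per-residue labels. The function name and the toy labels are illustrative assumptions, not taken from DNABind or its datasets.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """Standard binary MCC; label 1 = DNA-binding residue, 0 = nonbinding.

    Illustrative helper only; not part of the DNABind software.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is taken as 0 when a row or column
        # of the confusion matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical per-residue labels for one short protein chain.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 0, 0]
mcc = matthews_corrcoef(y_true, y_pred)
print(f"MCC = {mcc:.3f}")
```

MCC ranges from -1 to 1 and stays informative when binding residues are a small minority of all residues, which is why it is often preferred over plain accuracy for this kind of task.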