Remote protein homology detection using a modularity-based approach

Abstract

Clustering homologous proteins is an important task in functional genomics. Because homologous proteins often share common functions, an efficient way to predict the function of an unannotated protein is to transfer annotations from its homologues of known function. We use a modularity-based method called CD to group homologous proteins. The method employs a global heuristic search strategy to find the partition of a weighted adjacency graph that has the largest modularity. The weighted adjacency graph is constructed by applying a sigmoidal transformation to the pairwise sequence similarities between all protein sequences in a given dataset. The method has been tested extensively on several subsets drawn from the superfamily level of the SCOP (Structural Classification of Proteins) database, where some homologous proteins have very low sequence similarity. Compared with the widely used method MCL, we observe that the number of clusters obtained by CD is closer to the number of superfamilies in the dataset, that the F-measure of CD is on average 10% higher than that of MCL, and that CD is more tolerant of noise in the sequence similarities. Our results indicate that CD is well suited to clustering homologous proteins when sequence similarity is low.
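The two ingredients named in the abstract, a sigmoid-weighted adjacency graph and the modularity of a partition, can be illustrated with a minimal sketch. This is not the authors' CD implementation: the sigmoid parameters `midpoint` and `steepness`, the toy similarity scores, and the function names are all assumptions made for illustration, and the heuristic search over partitions is not shown. The modularity used is the standard weighted form Q = (1/2m) Σ_ij [w_ij − k_i k_j / 2m] δ(c_i, c_j).

```python
import math


def sigmoid_weights(similarity, midpoint=50.0, steepness=0.1):
    """Map raw pairwise similarity scores to edge weights in (0, 1).

    `midpoint` and `steepness` are hypothetical parameters, not values
    taken from the paper. Self-loops are left at weight 0.
    """
    n = len(similarity)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                z = steepness * (similarity[i][j] - midpoint)
                weights[i][j] = 1.0 / (1.0 + math.exp(-z))
    return weights


def modularity(weights, labels):
    """Weighted modularity Q of a partition of an undirected graph.

    `labels[i]` is the cluster id assigned to node i.
    """
    n = len(weights)
    strength = [sum(row) for row in weights]  # weighted degree k_i
    two_m = sum(strength)                     # total weight, counted twice
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += weights[i][j] - strength[i] * strength[j] / two_m
    return q / two_m


# Toy symmetric similarity scores for four sequences (made-up numbers):
# sequences 0 and 1 are similar to each other, as are sequences 2 and 3.
similarity = [
    [0, 90, 10, 5],
    [90, 0, 8, 12],
    [10, 8, 0, 85],
    [5, 12, 85, 0],
]

weights = sigmoid_weights(similarity)
q_good = modularity(weights, [0, 0, 1, 1])    # groups the similar pairs
q_bad = modularity(weights, [0, 1, 0, 1])     # splits the similar pairs
q_single = modularity(weights, [0, 0, 0, 0])  # one cluster for everything
```

In this toy case the partition that keeps the similar pairs together scores higher than the one that splits them, and placing every sequence in a single cluster gives a modularity of zero. A modularity-maximizing method searches the space of partitions for the highest-scoring one.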

Bibliographic details

Source
2011 International Conference on Information Science and Technology | 2011 | pp. 1287-1291 | 5 pages
Authors
Mei Juan; Ji Zhao; Xiaojian Yang; Weican Zhou
Original format: PDF
Chinese Library Classification: Computer applications

