Journal: BMC Bioinformatics

Validation and functional annotation of expression-based clusters based on gene ontology


Abstract

Background: The biological interpretation of large-scale gene expression data is one of the paramount challenges in current bioinformatics. In particular, placing the results in the context of other available functional genomics data, such as existing bio-ontologies, has already provided substantial improvement for detecting and categorizing genes of interest. One common approach is to look for functional annotations that are significantly enriched within a group or cluster of genes, as compared to a reference group.

Results: In this work, we suggest using the information-theoretic concept of mutual information to investigate the relationship between groups of genes, as given by data-driven clustering, and their respective functional categories. Drawing upon related approaches (Gibbons and Roth, Genome Research 12:1574-1581, 2002), we seek to quantify to what extent individual attributes are sufficient to characterize a given group or cluster of genes.

Conclusion: We show that the mutual information provides a systematic framework to assess the relationship between groups or clusters of genes and their functional annotations in a quantitative way. Within this framework, the mutual information allows us to address and incorporate several important issues, such as the interdependence of functional annotations and combinations of attributes. It thus supplements and extends the conventional search for overrepresented attributes within a group or cluster of genes. In particular, when combinations of attributes are taken into account, the mutual information opens the way to uncover specific functional descriptions of a group of genes or a clustering result. All datasets and functional annotations used in this study are publicly available. All scripts used in the analysis are provided as additional files.
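
To make the central quantity concrete, the following is a minimal sketch of how the mutual information between a clustering and a single binary annotation attribute could be estimated from counts. It is not the authors' script (those are provided as additional files with the paper). The function name `mutual_information`, the toy gene lists, and the use of a plain plug-in estimator without any finite-sample correction are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(clusters, annotated):
    """Plug-in estimate of I(C; A) in bits.

    clusters  -- cluster label of each gene (hypothetical input format)
    annotated -- True/False per gene: does it carry the attribute?
    """
    n = len(clusters)
    joint = Counter(zip(clusters, annotated))  # counts n(c, a)
    n_c = Counter(clusters)                    # counts n(c)
    n_a = Counter(annotated)                   # counts n(a)
    mi = 0.0
    for (c, a), n_ca in joint.items():
        # p(c,a) * log2( p(c,a) / (p(c) * p(a)) ), written with counts
        mi += (n_ca / n) * log2(n_ca * n / (n_c[c] * n_a[a]))
    return mi


# Toy data (assumed, not from the study): four genes in two clusters.
clusters = [0, 0, 1, 1]
matches_clusters = [True, True, False, False]   # attribute mirrors the clustering
independent = [True, False, True, False]        # attribute unrelated to it

mi_matching = mutual_information(clusters, matches_clusters)
mi_independent = mutual_information(clusters, independent)
```

In this toy case an attribute that exactly mirrors a balanced two-cluster partition carries one bit about cluster membership, while an attribute distributed identically across clusters carries none. A combination of attributes could be handled in the same sketch by passing a tuple of attribute values per gene instead of a single boolean.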
