Journal: The Plant Cell

The potential of text mining in data integration and network biology for plant research: a case study on Arabidopsis.

Abstract

Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular, using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present an extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for use in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, thereby stimulating the application of text mining data in future plant biology studies.