Conference: International Symposium on Knowledge Exploration in Life Science Informatics (KELSI 2004)

Analysis of Protein/Protein Interactions Through Biomedical Literature: Text Mining of Abstracts vs. Text Mining of Full Text Articles


Abstract

The challenge of knowledge management in the pharmaceutical industry is twofold. First, it has to address the integration of sequence data, the vast and growing body of data from functional analysis of genes, and the information in huge historical archival databases. Second, as the number of biomedical publications increases exponentially (Medline now contains more than 13 million records), researchers require assistance in order to broaden their vision and comprehension of scientific domains. Just as data mining uncovers relationships in information, text mining uncovers relationships in a text collection and leverages the creativity of the knowledge worker in the exploration of these relationships and in the discovery of new knowledge. We describe herein a text mining method to automatically detect protein interactions that are described across a large number of scientific publications. This method relies on natural language processing to identify protein names, their synonyms and the various interactions they can have with other proteins. We then compared text mining analysis of abstracts with the same kind of analysis of full text articles to assess how much information is lost when only abstracts are processed. Our results show that: 1) LexiQuest Mine is a very versatile and accurate tool for mining biomedical literature to analyze interactions between proteins. 2) Mining only abstracts can be sufficient and time-saving for large-scale applications that do not require a high level of detail, whereas mining full text articles is preferable for more exhaustive applications designed to address a specific issue. Availability: LexiQuest Mine is available for commercial licensing from SPSS, Inc.
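
To make the idea in the abstract concrete, here is a minimal Python sketch of sentence-level interaction extraction and of an abstract-versus-full-text comparison. It is not LexiQuest Mine and does not reproduce the authors' method: the synonym dictionary, the list of interaction cue stems, and every function name are hypothetical, chosen only for illustration, and simple co-occurrence stands in for real natural language processing.

```python
import re
from itertools import combinations

# Hypothetical synonym dictionary: lowercase surface form -> canonical name.
SYNONYMS = {
    "p53": "TP53",
    "tp53": "TP53",
    "mdm2": "MDM2",
    "hdm2": "MDM2",
    "brca1": "BRCA1",
}

# Hypothetical interaction cue stems, matched as substrings of the sentence.
INTERACTION_CUES = ("bind", "interact", "phosphorylat", "inhibit", "activat")


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_proteins(sentence):
    """Return the sorted canonical names of known proteins in the sentence."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence.lower())
    return sorted({SYNONYMS[t] for t in tokens if t in SYNONYMS})


def find_cue(sentence):
    """Return the first interaction cue stem found in the sentence, or None."""
    lowered = sentence.lower()
    for cue in INTERACTION_CUES:
        if cue in lowered:
            return cue
    return None


def extract_interactions(text):
    """Collect (protein_a, protein_b, cue) triples that co-occur in a sentence."""
    results = set()
    for sentence in split_sentences(text):
        cue = find_cue(sentence)
        proteins = find_proteins(sentence)
        if cue is None or len(proteins) < 2:
            continue
        for a, b in combinations(proteins, 2):
            results.add((a, b, cue))
    return results


def abstract_coverage(abstract, full_text):
    """Fraction of full-text interactions also found in the abstract."""
    full = extract_interactions(full_text)
    if not full:
        return None
    return len(extract_interactions(abstract) & full) / len(full)
```

Calling `abstract_coverage(abstract, full_text)` on a paired abstract and article gives a rough measure of how much is lost when only the abstract is mined, under the strong assumption that co-occurrence with a cue word indicates an interaction.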
