Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup

Abstract

Motivation: The biological literature is a major repository of knowledge. Many biological databases draw much of their content from careful curation of this literature. However, as the volume of literature increases, so does the burden of curation. Text mining may provide useful tools to assist in the curation process. To date, the lack of standards has made it impossible to determine whether text mining techniques are sufficiently mature to be useful.

Results: We report on a Challenge Evaluation task that we created for the Knowledge Discovery and Data Mining (KDD) Challenge Cup. We provided a training corpus of 862 journal articles curated in FlyBase, along with the associated lists of genes and gene products, as well as the relevant data fields from FlyBase. For the test, we provided a corpus of 213 new ('blind') articles; the 18 participating groups provided systems that flagged articles for curation, based on whether the article contained experimental evidence for gene expression products. We report on the evaluation results and describe the techniques used by the top-performing groups.


Bibliographic details

Source
International Conference on Intelligent Systems for Molecular Biology | 2003 | 9 pages
Authors
Alexander S. Yeh; Lynette Hirschman; Alexander A. Morgan
Original format: PDF
Chinese Library Classification: Q811.4-532
Keywords
text mining; evaluation; curation; genomics; data management


Similar literature


1. Reply: The evaluation of data mining methods for the simultaneous and systematic detection of safety signals in large databases: lessons to be learned [J]. Levine JG, Tonning JM, Szarfman A. British Journal of Clinical Pharmacology. 2007, No. 1
2. Reply: The evaluation of data mining methods for the simultaneous and systematic detection of safety signals in large databases: lessons to be learned [J]. Jonathan G. British Journal of Clinical Pharmacology. 2006, No. 1
3. Lessons learned from the ECML/PKDD Discovery Challenge on the atherosclerosis risk factors data [J]. Petr Berka, Jan Rauch, Marie Tomeckova. Computing and Informatics. 2007, No. 3
4. Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup [C]. Alexander S. Yeh, Lynette Hirschman, Alexander A. Morgan. International Conference on Intelligent Systems for Molecular Biology. 2003
5. Using Text Mining to Accelerate Automatic Curation of Biomedical Databases [D]. Jain, Suvir. 2015
6. Reply: The evaluation of data mining methods for the simultaneous and systematic detection of safety signals in large databases: lessons to be learned [O]. Jonathan G Levine, Joseph M Tonning, Ana Szarfman. 2006
7. Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup [O]. Yeh, Alexander S., Hirschman, Lynette, Morgan, Alexander A. 2003
