Conference: Workshop on Biomedical Natural Language Processing

Not all links are equal: Exploiting Dependency Types for the Extraction of Protein-Protein Interactions from Text

Abstract

The extraction of protein-protein interactions (PPIs) reported in scientific publications is one of the most studied topics in text mining for the life sciences, as such algorithms can substantially decrease the effort required of database curators. The currently best methods for this task are based on analyzing the dependency tree (DT) representation of sentences. Many approaches use only topological features and thus do not fully exploit the information contained in DTs. Using a pattern-matching approach, we show that incorporating the grammatical information encoded in the dependency types of DTs noticeably improves extraction performance. We automatically infer a large set of linguistic patterns using only information about interacting proteins. The patterns are then refined based on shallow linguistic features and the semantics of dependency types. Together, these refinements lead to a total improvement of 17.2 percentage points in F_1, as evaluated on five publicly available PPI corpora. More than half of that improvement is gained by properly handling dependency types. Our method provides a general framework for building task-specific relationship extraction methods that do not require annotated training data. Furthermore, our observations suggest ways to improve existing relation extraction approaches.
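
To make the contrast between topological and typed features concrete, the following is a minimal sketch and not the authors' implementation. It assumes a dependency tree given as hand-written (head, dependency type, dependent) triples with Stanford-style labels; the helper names `dependency_path` and `topology`, the placeholder tokens `PROT_A` and `PROT_B`, and both toy parses are hypothetical and exist only for illustration.

```python
from collections import deque


def dependency_path(edges, start, goal):
    """Return the path between two tokens as a list of (direction, type) steps.

    edges: iterable of (head, dep_type, dependent) triples (assumed format).
    "up" means moving from a dependent to its head, "down" the reverse.
    """
    graph = {}
    for head, dep_type, child in edges:
        graph.setdefault(head, []).append((child, "down", dep_type))
        graph.setdefault(child, []).append((head, "up", dep_type))

    # Breadth-first search; in a tree the path found is the unique one.
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for neighbor, direction, dep_type in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [(direction, dep_type)]))
    return None


def topology(path):
    """Drop the dependency types and keep only the shape of the path."""
    return [direction for direction, _ in path]


# Toy parse of "PROT_A activates PROT_B" (hand-written, not parser output).
interaction = [
    ("activates", "nsubj", "PROT_A"),
    ("activates", "dobj", "PROT_B"),
]

# Toy parse of "Researchers compared PROT_A with PROT_B" (hand-written).
comparison = [
    ("compared", "nsubj", "Researchers"),
    ("compared", "dobj", "PROT_A"),
    ("compared", "prep_with", "PROT_B"),
]

typed_interaction = dependency_path(interaction, "PROT_A", "PROT_B")
typed_comparison = dependency_path(comparison, "PROT_A", "PROT_B")

# Same shape, different grammatical relations.
print(topology(typed_interaction) == topology(typed_comparison))
print(typed_interaction == typed_comparison)
```

In this toy case both sentences yield the same up-then-down path shape, so a purely topological feature cannot separate them, whereas the typed paths differ. How the paper infers and refines its actual patterns is not specified in the abstract and is not reproduced here.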
