Dependency Parsing with Noisy Multi-annotation Data

Abstract

In the past few years, the performance of dependency parsing has improved by a large margin on closed-domain benchmark datasets. However, parsing performance degrades dramatically on real-life texts. Besides domain adaptation techniques, which have made slow progress due to their intrinsic difficulty, one straightforward approach is to annotate a certain amount of syntactic data for each new source of texts. However, it is well known that annotating data is time-consuming and labor-intensive, especially for complex syntactic annotation. Inspired by the progress in crowdsourcing, this paper proposes to build noisy multi-annotation syntactic data with non-expert annotators. Each sentence is independently annotated by multiple annotators, and the inconsistencies among them are retained. Because many ordinary annotators can be recruited, data can be annotated very rapidly. We then construct and release three multi-annotation datasets from different sources.
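
To make the idea of "multi-annotation data with retained inconsistencies" concrete, here is a minimal Python sketch. The abstract does not describe the released data format, so the class names (`Annotation`, `MultiAnnotatedSentence`), the head-index convention, the relation labels, and the example sentence are all assumptions chosen for illustration, not the authors' actual schema.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    """One annotator's dependency tree for a sentence (hypothetical layout)."""
    annotator_id: str
    heads: List[int]   # heads[i] is the 1-based head of token i+1; 0 means root
    labels: List[str]  # labels[i] is the relation label of token i+1


@dataclass
class MultiAnnotatedSentence:
    """A sentence plus every annotation it received; disagreements are kept."""
    tokens: List[str]
    annotations: List[Annotation] = field(default_factory=list)

    def add(self, annotation: Annotation) -> None:
        n = len(self.tokens)
        if len(annotation.heads) != n or len(annotation.labels) != n:
            raise ValueError("annotation length must match token count")
        self.annotations.append(annotation)

    def head_votes(self) -> List[Counter]:
        """Count, per token, how many annotators chose each head."""
        return [
            Counter(a.heads[i] for a in self.annotations)
            for i in range(len(self.tokens))
        ]

    def disputed_tokens(self) -> List[int]:
        """Return 0-based indices of tokens whose head is not unanimous."""
        return [i for i, votes in enumerate(self.head_votes()) if len(votes) > 1]


# Two hypothetical annotators read "her duck" differently; both trees are stored.
sentence = MultiAnnotatedSentence(tokens=["I", "saw", "her", "duck"])
sentence.add(Annotation("a1", heads=[2, 0, 4, 2], labels=["subj", "root", "poss", "obj"]))
sentence.add(Annotation("a2", heads=[2, 0, 2, 2], labels=["subj", "root", "obj", "comp"]))

print(sentence.disputed_tokens())  # [2]
```

Keeping every annotation, rather than collapsing them to a single majority tree at collection time, leaves the choice of how to handle disagreement to whoever trains on the data.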
