Conference: Natural Language Processing Pacific Rim Symposium

A Korean Part-of-Speech Tagging System for Minimizing Human Intervention and Maintaining Tagging Consistency



Abstract

A large part-of-speech (POS) tagged corpus plays an important role in many areas of natural language processing, so such a corpus should be built with high accuracy and consistency. Constructing a highly accurate tagged corpus requires human intervention. However, human intervention is costly, and it makes consistency between tagging results hard to maintain. In this paper, we propose an efficient system that tags a corpus using lexical rules and a stochastic POS tagger. Lexical rules are acquired manually to minimize human effort and improve the tagging accuracy of the automatic tagger. Moreover, by using lexical rules along with a stochastic tagger, we can maintain consistency between the tagging results. Experimental results show that human intervention can be reduced and that the accuracy of the automatic tagger improves continuously as more lexical rules are acquired.
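The abstract does not specify the rule format or the statistical model, so the following is only a minimal sketch of the general idea of letting lexical rules take precedence over a stochastic tagger. Everything in it is an assumption made for illustration: the rule format (a word plus the word that follows it), the function names `train_unigram_tagger` and `tag`, the toy training data, and the tag labels. A most-frequent-tag unigram model stands in for the paper's stochastic tagger, which is likely more elaborate.

```python
from collections import Counter, defaultdict


def train_unigram_tagger(tagged_sentences):
    """Return (word -> most frequent tag, overall most frequent tag).

    This unigram model is a stand-in for a real stochastic tagger.
    """
    counts = defaultdict(Counter)
    tag_totals = Counter()
    for sentence in tagged_sentences:
        for word, tag_label in sentence:
            counts[word][tag_label] += 1
            tag_totals[tag_label] += 1
    best = {word: c.most_common(1)[0][0] for word, c in counts.items()}
    default_tag = tag_totals.most_common(1)[0][0]
    return best, default_tag


def tag(words, lexical_rules, best, default_tag):
    """Tag each word; a matching lexical rule overrides the model's guess.

    lexical_rules maps (word, next_word) to a tag (hypothetical format).
    Each result records whether it came from a "rule" or the "model".
    """
    result = []
    for i, word in enumerate(words):
        next_word = words[i + 1] if i + 1 < len(words) else None
        rule_tag = lexical_rules.get((word, next_word))
        if rule_tag is not None:
            result.append((word, rule_tag, "rule"))
        else:
            result.append((word, best.get(word, default_tag), "model"))
    return result


# Toy training data with illustrative tags (not the paper's corpus or tag set).
# "나는" is ambiguous: pronoun + particle ("I") or verb + adnominal ending ("flying").
training = [
    [("나는", "NP+JX"), ("학생이다", "NNG+VCP+EF")],
    [("나는", "NP+JX"), ("간다", "VV+EF")],
    [("나는", "VV+ETM"), ("새", "NNG")],
]
best, default_tag = train_unigram_tagger(training)

# One manually written rule: before "새" (bird), read "나는" as "flying".
rules = {("나는", "새"): "VV+ETM"}

print(tag(["나는", "새"], {}, best, default_tag))     # model only
print(tag(["나는", "새"], rules, best, default_tag))  # rule overrides the model
```

Because a rule always yields the same tag in the same context, every occurrence it covers is tagged identically, which is one way to read the consistency claim in the abstract. Each rule added also removes a case that would otherwise need manual correction.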
