Conference: Chinese Lexical Semantics Workshop

Named Entity Recognition for Chinese Novels in the Ming-Qing Dynasties

Abstract

This paper presents a Named Entity Recognition (NER) system for classic Chinese novels of the Ming and Qing dynasties, built with Conditional Random Fields (CRFs). An annotated corpus of four influential vernacular novels from this period serves as both training and testing data: in the experiment, three novels are used for training and one for testing. Three sets of features are proposed for the CRF model: (1) a baseline feature set, that is, word/POS and bigram features for different window sizes, (2) dependency head and dependency relation, and (3) Wikipedia categories. The F-measures for the four books range from 67% to 80%. Experiments show that both the dependency features and the Wikipedia categories improve the performance of the NER system, and that the third feature set yields a greater improvement than the second.
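The three feature sets and the three-to-one novel split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the token fields (`word`, `pos`, `head`, `deprel`), the feature names, the `wiki_categories` lookup, and the rotation of the held-out novel are all assumptions made for illustration. The feature dictionaries follow the general shape that dictionary-based CRF toolkits accept, but no particular toolkit is assumed here.

```python
from typing import Any, Dict, Iterator, List, Optional, Tuple

# Hypothetical token record: {"word": str, "pos": str,
#                             "head": index of the head token or None for root,
#                             "deprel": str}
Token = Dict[str, Any]
Sentence = List[Token]


def token_features(
    sent: Sentence,
    i: int,
    window: int = 1,
    wiki_categories: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, Any]:
    """Build one feature dict for token i, covering the three feature sets."""
    tok = sent[i]
    feats: Dict[str, Any] = {"bias": 1.0, "word": tok["word"], "pos": tok["pos"]}

    # (1) Baseline: word/POS of neighbours inside the window, plus word bigrams.
    for off in range(-window, window + 1):
        if off == 0:
            continue
        j = i + off
        if 0 <= j < len(sent):
            feats[f"word[{off:+d}]"] = sent[j]["word"]
            feats[f"pos[{off:+d}]"] = sent[j]["pos"]
        else:
            feats[f"pad[{off:+d}]"] = 1.0  # position falls outside the sentence
    if i > 0:
        feats["bigram[-1,0]"] = sent[i - 1]["word"] + "|" + tok["word"]
    if i + 1 < len(sent):
        feats["bigram[0,+1]"] = tok["word"] + "|" + sent[i + 1]["word"]

    # (2) Dependency head word and dependency relation.
    head = tok.get("head")
    feats["dep_head"] = sent[head]["word"] if head is not None else "ROOT"
    feats["dep_rel"] = tok["deprel"]

    # (3) Wikipedia categories, looked up in a hypothetical word -> categories map.
    for cat in (wiki_categories or {}).get(tok["word"], []):
        feats[f"wiki:{cat}"] = 1.0
    return feats


def leave_one_novel_out(
    corpus: Dict[str, List[Sentence]],
) -> Iterator[Tuple[str, List[Sentence], List[Sentence]]]:
    """Yield (held_out_name, train_sentences, test_sentences) for each novel.

    Assumption: each novel is held out in turn while the others are used
    for training; the abstract only states a three-to-one split.
    """
    for held_out in corpus:
        train = [s for name, sents in corpus.items() if name != held_out for s in sents]
        yield held_out, train, corpus[held_out]


# Toy sentence with made-up annotations, for illustration only.
toy_sentence: Sentence = [
    {"word": "宋江", "pos": "NR", "head": 1, "deprel": "nsubj"},
    {"word": "上", "pos": "VV", "head": None, "deprel": "root"},
    {"word": "梁山", "pos": "NR", "head": 1, "deprel": "dobj"},
]
toy_wiki = {"宋江": ["person"]}  # hypothetical category lookup
toy_features = token_features(toy_sentence, 0, window=1, wiki_categories=toy_wiki)
```

Keeping each feature set in its own block makes it straightforward to switch sets (2) and (3) on or off when comparing their contributions, as the abstract does.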
