Conference proceedings: Advanced intelligent computing theories and applications: With aspects of artificial intelligence

Chinese Named Entity Recognition with a Sequence Labeling Approach: Based on Characters, or Based on Words?

Abstract

Named Entity Recognition (NER), an important problem in Natural Language Processing, is the basis for other applications such as Data Mining and Relation Extraction. Using a sequence labeling approach, this paper asks which kind of token should serve as the unit of granularity in the NER task: characters or words. We use not only local context features within a sentence, but also global knowledge features extracted from other occurrences of each word in the whole corpus. The results show that, without the global features, person names and location names are recognized well with character-based labeling, whereas organization names are better suited to word-based labeling. When global features are added, the performance of the word-based approach improves significantly.
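
To make the two granularities concrete, here is a minimal sketch of how one sentence could be labeled at the character level and at the word level, followed by one possible form of a global feature. The abstract does not state the tagging scheme, the segmenter, or the exact feature definitions, so the BIO scheme, the example sentence, the hand-written segmentation, and the helper names `bio_tags`, `global_tag_counts` and `global_feature` are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter, defaultdict


def bio_tags(tokens, entities):
    """Assign BIO tags to tokens (assumed scheme, not stated in the abstract).

    tokens:   list of strings that concatenate to the sentence.
    entities: list of (start, end, type) character spans, end exclusive.
    A token is tagged only if it lies entirely inside an entity span.
    """
    tags = []
    pos = 0
    for tok in tokens:
        start, end = pos, pos + len(tok)
        tag = "O"
        for e_start, e_end, e_type in entities:
            if start >= e_start and end <= e_end:
                tag = ("B-" if start == e_start else "I-") + e_type
                break
        tags.append(tag)
        pos = end
    return tags


def global_tag_counts(tagged_sentences):
    """Count how often each token receives each tag across the corpus."""
    counts = defaultdict(Counter)
    for tokens, tags in tagged_sentences:
        for tok, tag in zip(tokens, tags):
            counts[tok][tag] += 1
    return counts


def global_feature(token, own_tag, counts):
    """Hypothetical global feature: the most frequent tag of `token` in its
    other occurrences in the corpus, or "NONE" if there are none."""
    others = counts[token].copy()
    others[own_tag] -= 1      # exclude the current occurrence
    others = +others          # drop tags whose count fell to zero
    return others.most_common(1)[0][0] if others else "NONE"


# Illustrative sentence with hand-made annotations (assumed data).
sentence = "王小明在北京大学读书"
entities = [(0, 3, "PER"), (4, 8, "ORG")]

char_tokens = list(sentence)                        # character granularity
word_tokens = ["王小明", "在", "北京大学", "读书"]    # hypothetical segmenter output

char_tags = bio_tags(char_tokens, entities)
word_tags = bio_tags(word_tokens, entities)

# A tiny two-sentence "corpus" at word granularity.
corpus = [
    (word_tokens, word_tags),
    (["北京大学", "很", "大"], ["B-ORG", "O", "O"]),
]
counts = global_tag_counts(corpus)
org_hint = global_feature("北京大学", "B-ORG", counts)
no_hint = global_feature("读书", "O", counts)
```

In this sketch the character-level sequence has ten labels and the word-level sequence has four, so a word-level tagger depends on the segmenter having placed boundaries that agree with the entity spans. In a real system the tags used to build the global counts would have to come from training data or a first-pass tagger rather than from gold labels on the sentence being tagged.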