Conference: IJCNLP 2011

Named Entity Recognition in Chinese News Comments on the Web


Abstract

News comments are a new text genre of the Web 2.0 era. After reading news articles, many people write comments to express their opinions about recent news events or topics. Because news comments are written freely and without editorial checking, they differ greatly from formal news text. In particular, named entities in news comments often contain miswritten words, informal abbreviations, or aliases, which makes them difficult for machines to detect and understand. This paper addresses the task of named entity recognition in Chinese news comments on the Web. We propose leveraging the entity information in the referred news article to improve named entity recognition in the news comments. Three different schemes are investigated for finding useful entities in the news article, which are then used to generate new features for the CRF model. Finally, a dictionary-based correction step is employed to further improve the results. We manually labelled a benchmark dataset of 60 news articles and 6000 comments downloaded from a popular Chinese news portal, www.sina.com.cn. Experimental results on the dataset show that our method is effective for this task.
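The abstract does not spell out the three schemes, so the following is only a minimal sketch of the general idea, under stated assumptions: entities already extracted from the referred news article are matched against the comment, and the match is exposed as an extra per-character feature for a CRF tagger. The function names, the B/I/O flag, and the exact-substring matching rule are hypothetical illustrations, not the authors' method.

```python
from typing import Dict, List, Set


def article_entity_flags(comment: str, article_entities: Set[str]) -> List[str]:
    """Mark each character of `comment` as B/I when it lies inside a
    substring that exactly matches an article entity, otherwise O.
    (Assumed matching rule: exact substring, longest entity first.)"""
    flags = ["O"] * len(comment)
    # Longer entities are tried first so they take precedence on overlap.
    for entity in sorted(article_entities, key=len, reverse=True):
        if not entity:
            continue
        start = comment.find(entity)
        while start != -1:
            end = start + len(entity)
            # Only mark the span if no earlier (longer) match claimed it.
            if all(flag == "O" for flag in flags[start:end]):
                flags[start] = "B"
                for i in range(start + 1, end):
                    flags[i] = "I"
            start = comment.find(entity, start + 1)
    return flags


def char_features(comment: str, article_entities: Set[str]) -> List[Dict[str, str]]:
    """Build one feature dictionary per character, combining simple
    context features with the hypothetical article-entity flag."""
    flags = article_entity_flags(comment, article_entities)
    features = []
    for i, ch in enumerate(comment):
        features.append({
            "char": ch,
            "prev_char": comment[i - 1] if i > 0 else "<BOS>",
            "next_char": comment[i + 1] if i < len(comment) - 1 else "<EOS>",
            "in_article_entity": flags[i],
        })
    return features
```

For example, `char_features("姚明退役了", {"姚明"})` yields five dictionaries in which the first two characters carry the flags `B` and `I`. Exact matching would miss the miswritten forms and aliases the abstract highlights, which is presumably where the different schemes and the dictionary-based correction step come in.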
