Venue: Conference on Computational Natural Language Learning
Instance Selection Improves Cross-Lingual Model Training for Fine-Grained Sentiment Analysis

Abstract

Scarcity of annotated corpora for many languages is a bottleneck for training fine-grained sentiment analysis models that can tag aspects and subjective phrases. We propose to exploit statistical machine translation to alleviate the need for training data: annotated data in a source language is projected to a target language so that a supervised fine-grained sentiment analysis system can be trained. To avoid the negative influence of poor-quality translations, we propose a filtering approach based on machine translation quality estimation measures that selects only high-quality sentence pairs for projection. We evaluate on the language pair German/English, using a corpus of product reviews annotated for both languages, and compare to in-target-language training. Projection without any filtering leads to 23% F_1 in the task of detecting aspect phrases, compared to 41% F_1 for in-target-language training. Our approach obtains up to 47% F_1. Further, we show that the detection of subjective phrases is competitive with in-target-language training without filtering.
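The two steps described in the abstract, selecting high-quality sentence pairs and then projecting annotations onto the translation, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the `SentencePair` structure, the single `quality` score, the threshold value, and the word-alignment format are all assumptions made for the example, and the abstract does not specify which quality estimation measures or projection rules the paper uses. The translation direction in the toy data is also chosen arbitrarily.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (start token index, end token index exclusive, label)
Span = Tuple[int, int, str]


@dataclass
class SentencePair:
    """Hypothetical container for one translated, annotated sentence."""
    source_tokens: List[str]
    target_tokens: List[str]
    alignment: Dict[int, List[int]]  # source token index -> target token indices
    source_spans: List[Span]         # annotations on the source side
    quality: float                   # assumed quality-estimation score, higher is better


def select_instances(pairs: List[SentencePair], threshold: float) -> List[SentencePair]:
    """Keep only pairs whose estimated translation quality reaches the threshold."""
    return [p for p in pairs if p.quality >= threshold]


def project_spans(pair: SentencePair) -> List[Span]:
    """Map each source span onto the smallest target span covering its aligned tokens."""
    projected = []
    for start, end, label in pair.source_spans:
        targets = [t for s in range(start, end) for t in pair.alignment.get(s, [])]
        if targets:  # spans without any aligned target token are dropped
            projected.append((min(targets), max(targets) + 1, label))
    return projected


# Toy data: one well-translated pair and one poorly translated pair.
pairs = [
    SentencePair(
        source_tokens=["Der", "Akku", "ist", "großartig"],
        target_tokens=["The", "battery", "is", "great"],
        alignment={0: [0], 1: [1], 2: [2], 3: [3]},
        source_spans=[(1, 2, "aspect"), (3, 4, "subjective")],
        quality=0.9,
    ),
    SentencePair(
        source_tokens=["Der", "Akku", "hält", "ewig"],
        target_tokens=["Battery", "the", "holds"],
        alignment={1: [0], 2: [2]},
        source_spans=[(1, 2, "aspect")],
        quality=0.2,
    ),
]

# Filter first, then project; the result is target-language training data.
selected = select_instances(pairs, threshold=0.5)
training_data = [(p.target_tokens, project_spans(p)) for p in selected]
```

Here only the first pair survives the filter, so the target-language training set contains a single sentence with its projected aspect and subjective-phrase spans. In practice the threshold would be tuned, and several quality measures might be combined rather than using one score.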
