Conference: Pacific Asia Conference on Language, Information and Computation

Automatic Evaluation of English-to-Korean and Korean-to-English Neural Machine Translation Systems by Linguistic Test Points


Abstract

BLEU is the best-known automatic metric for assessing the performance of machine translation systems. However, BLEU does not indicate which parts of a neural machine translation (NMT) output are good or bad. This paper describes an approach to automatically evaluating NMT systems by linguistic test points. The approach automatically evaluates each linguistic test point, which BLEU does not expose, and provides intuitive insight into the strengths and flaws of NMT systems in handling various important linguistic test points. The automatic evaluation used 58 linguistic test points comprising 630 sentences. We evaluated two bidirectional English/Korean NMT systems. The BLEU scores of the English-to-Korean NMT systems were 0.0898 and 0.2081, and their scores under automatic evaluation by linguistic test points were 58.35% and 77.31%, respectively. The BLEU scores of the Korean-to-English NMT systems were 0.3939 and 0.4512, and their scores under automatic evaluation by linguistic test points were 33.10% and 40.47%, respectively. The automatic evaluation by linguistic test points thus yields results consistent with the BLEU assessment: in each translation direction, the system with the higher BLEU score also scores higher on the linguistic test points. According to the automatic evaluation by linguistic test points, both the English-to-Korean and the Korean-to-English NMT systems have strengths in polysemy translation but have flaws in style translation and in translating sentences with complex syntactic structures.
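To make the idea of a per-test-point score concrete, the following is a minimal sketch of how pass/fail outcomes on test sentences might be aggregated. It is not the authors' implementation. The function name `score_by_test_point`, the sample labels, and the sample outcomes are all hypothetical, and the abstract does not say how a sentence is judged as passed or whether the overall percentage is averaged over sentences or over test points. The sketch assumes a pass/fail flag is already available for each sentence and averages over sentences.

```python
from collections import defaultdict


def score_by_test_point(results):
    """Aggregate pass/fail outcomes into per-test-point and overall pass rates.

    `results` is an iterable of (test_point, passed) pairs, one per test
    sentence. How `passed` is decided is outside this sketch (assumption:
    some automatic check has already produced it).
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for test_point, ok in results:
        total[test_point] += 1
        passed[test_point] += int(ok)

    # Pass rate for each linguistic test point.
    per_point = {tp: passed[tp] / total[tp] for tp in total}
    # Overall rate averaged over sentences (an assumption, see lead-in).
    overall = sum(passed.values()) / sum(total.values())
    return per_point, overall


# Made-up outcomes for illustration only; these are NOT data from the paper.
sample = [
    ("polysemy", True), ("polysemy", True), ("polysemy", False), ("polysemy", True),
    ("style", False), ("style", True),
    ("complex_syntax", False), ("complex_syntax", False),
    ("complex_syntax", True), ("complex_syntax", False),
]

per_point, overall = score_by_test_point(sample)
for name, rate in sorted(per_point.items()):
    print(f"{name}: {rate:.2%}")
print(f"overall: {overall:.2%}")
```

Unlike a single corpus-level BLEU score, a breakdown of this kind shows directly which categories a system handles well and which it does not.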
