Journal: Machine Translation

A product and process analysis of post-editor corrections on neural, statistical and rule-based machine translation output

Abstract

This paper presents a comparison of post-editing (PE) changes performed on English-to-Finnish neural (NMT), rule-based (RBMT) and statistical machine translation (SMT) output, combining a product-based and a process-based approach. A total of 33 translation students acted as participants in a PE experiment, providing both post-edited texts and edit process data. Our product-based analysis of the post-edited texts shows statistically significant differences in the distribution of edit types between machine translation systems. Deletions were the most common edit type for the RBMT system, insertions for the SMT system, and word form changes as well as word substitutions for the NMT system. The results also show significant differences in the correctness and necessity of the edits, particularly in the form of a large number of unnecessary edits in the RBMT output. Problems related to certain verb forms and ambiguity were observed for NMT and SMT, while RBMT was more likely to handle them correctly. Process-based comparison of effort indicators shows a slight increase in keystrokes per word for NMT output, and a slight decrease in average pause length for NMT compared to RBMT and SMT in specific text blocks. A statistically significant difference was observed in the number of visits per sub-segment, which is lower for NMT than for RBMT and SMT. The results suggest that although the outputs from the NMT, RBMT and SMT systems needed different types of edits, the difference is not necessarily reflected in process-based effort indicators.
