Journal: ACM Transactions on Asian Language Information Processing

Distortion Model Based on Word Sequence Labeling for Statistical Machine Translation



Abstract

This article proposes a new distortion model for phrase-based statistical machine translation. In decoding, a distortion model estimates the source word position to be translated next (subsequent position; SP) given the last translated source word position (current position; CP). We propose a distortion model that can simultaneously consider the word at the CP, the word at an SP candidate, the context of the CP and an SP candidate, relative word order among the SP candidates, and the words between the CP and an SP candidate. These considered elements are called rich context. Our model considers rich context by discriminating label sequences that specify spans from the CP to each SP candidate. This enables our model to learn the effect of relative word order among SP candidates as well as the effect of distances from the training data. In contrast to the learning strategy of existing methods, our learning strategy is that the model learns preference relations among SP candidates in each sentence of the training data. This learning strategy enables consideration of all of the rich context simultaneously. In our experiments, our model had higher BLEU and RIBES scores for Japanese-English, Chinese-English, and German-English translation compared to the lexical reordering models.
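The idea of label sequences that specify spans from the CP to each SP candidate can be illustrated with a small sketch. The abstract does not give the label inventory, so the label names used here ("C" for the CP, "S" for the SP candidate, "I" for words in between, "O" for words outside the span) and the function names are assumptions made only for illustration, not the authors' implementation. The sketch also treats every position other than the CP as a candidate, whereas a real decoder would restrict candidates to untranslated positions.

```python
from typing import Dict, List


def span_labels(sentence_length: int, cp: int, sp: int) -> List[str]:
    """Label every source position for one (CP, SP candidate) pair.

    Hypothetical label set (an assumption, not taken from the abstract):
    "C" = current position, "S" = subsequent-position candidate,
    "I" = word between CP and SP, "O" = word outside the span.
    """
    if not (0 <= cp < sentence_length and 0 <= sp < sentence_length):
        raise ValueError("cp and sp must be valid source positions")
    if cp == sp:
        raise ValueError("the SP candidate must differ from the CP")

    lo, hi = min(cp, sp), max(cp, sp)
    labels = ["O"] * sentence_length
    for i in range(lo + 1, hi):  # words strictly between CP and SP
        labels[i] = "I"
    labels[cp] = "C"
    labels[sp] = "S"
    return labels


def candidate_label_sequences(sentence_length: int, cp: int) -> Dict[int, List[str]]:
    """Build one label sequence per SP candidate for a fixed CP."""
    return {
        sp: span_labels(sentence_length, cp, sp)
        for sp in range(sentence_length)
        if sp != cp
    }


if __name__ == "__main__":
    # Five source words, CP at index 1: one label sequence per candidate.
    for sp, labels in candidate_label_sequences(5, 1).items():
        print(sp, " ".join(labels))
```

A model in this style would score each of these sequences and prefer the candidate whose sequence scores highest. Because each sequence covers every word between the CP and the candidate, distance and intervening words are visible to the scorer at once.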

Bibliographic details

  • Author affiliations

    National Institute of Information and Communications Technology and Kyoto University, NHK Science & Technology Research Laboratories, 1-10-11 Kinuta, Setagaya-ku, Tokyo 157-8510, Japan;

    Multilingual Translation Laboratory, National Institute of Information and Communications Technology, 3-5 Hikaridai, Keihanna Science City, Kyoto 619-0289, Japan;

    Multilingual Translation Laboratory, National Institute of Information and Communications Technology, 3-5 Hikaridai, Keihanna Science City, Kyoto 619-0289, Japan;

    Multilingual Translation Laboratory, National Institute of Information and Communications Technology, 3-5 Hikaridai, Keihanna Science City, Kyoto 619-0289, Japan;

    Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501, Japan;

  • Original format: PDF
  • Language: English
  • Keywords

    Distortion model; machine translation; reordering;

