
On Using Very Large Target Vocabulary for Neural Machine Translation


Abstract

Neural machine translation, a recently proposed approach to machine translation based purely on neural networks, has shown promising results compared to the existing approaches such as phrase-based statistical machine translation. Despite its recent success, neural machine translation has a limitation in handling larger vocabularies, as training complexity as well as decoding complexity increase proportionally to the number of target words. In this paper, we propose a method based on importance sampling that allows us to use a very large target vocabulary without increasing training complexity. We show that decoding can be efficiently done even with the model having a very large target vocabulary by selecting only a small subset of the whole target vocabulary. The models trained by the proposed approach are empirically found to match, and in some cases outperform, the baseline models with a small vocabulary as well as the LSTM-based neural machine translation models. Furthermore, when we use an ensemble of a few models with very large target vocabularies, we achieve performance comparable to the state of the art (measured by BLEU) on both the English→German and English→French translation tasks of WMT'14.
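The core idea — normalizing the softmax over only a small sampled subset of the target vocabulary rather than all of it — can be sketched as follows. This is a minimal NumPy illustration under simplifying assumptions, not the authors' implementation: the paper partitions the training corpus and applies an importance-sampling correction under a proposal distribution, whereas this sketch draws a uniform sample for clarity. All function and variable names are hypothetical.

```python
import numpy as np

def sampled_softmax_nll(h, W, b, target, num_sampled, rng):
    """Negative log-likelihood of `target` with the softmax partition
    function computed over a small sampled subset of the vocabulary.

    h          : decoder hidden state, shape (d,)
    W, b       : output embedding matrix (V, d) and bias (V,)
    target     : index of the correct next word (always kept in the subset)
    num_sampled: size of the uniform negative sample (sketch only; the
                 paper uses an importance-sampling proposal instead)
    """
    V = W.shape[0]
    # Draw a subset of candidate words and force-include the true target.
    neg = rng.choice(V, size=num_sampled, replace=False)
    subset = np.unique(np.concatenate(([target], neg)))
    # Score only the subset: cost scales with |subset|, not with V.
    logits = W[subset] @ h + b[subset]
    logits = logits - logits.max()  # numerical stability
    log_z = np.log(np.exp(logits).sum())  # partition over the subset only
    target_pos = int(np.where(subset == target)[0][0])
    return log_z - logits[target_pos]
```

When `num_sampled` covers the whole vocabulary the subset softmax reduces to the exact full softmax, which is a useful sanity check; at decoding time the same trick applies by restricting the output layer to a candidate list built for the source sentence.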

