
Online Distilling from Checkpoints for Neural Machine Translation

Abstract

Current predominant neural machine translation (NMT) models often have a deep structure with large numbers of parameters, making these models hard to train and prone to over-fitting. A common practice is to utilize a validation set to evaluate the training process and select the best checkpoint. Average and ensemble techniques on checkpoints can lead to further performance improvement. However, as these methods do not affect the training process, the system performance is restricted to the checkpoints generated in the original training procedure. In contrast, we propose an online knowledge distillation method. Our method on-the-fly generates a teacher model from checkpoints, guiding the training process to obtain better performance. Experiments on several datasets and language pairs show steady improvement over a strong self-attention-based baseline system. We also provide an analysis of over-fitting in a data-limited setting. Furthermore, our method leads to an improvement in a machine reading experiment as well.
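The core idea of online distilling from checkpoints can be illustrated as a training loop in which the best checkpoint so far is frozen and reused as a teacher for the subsequent training steps. The PyTorch-style sketch below is an illustrative assumption, not the authors' released implementation: the model, data iterators, mixing weight `alpha`, and temperature are placeholders, and the per-token sequence dimensions of a real NMT model are flattened for brevity.

```python
# Minimal sketch of online knowledge distillation from checkpoints (assumed
# implementation): whenever validation loss improves, a frozen copy of the
# current weights becomes the "teacher", and later updates mix the usual
# cross-entropy loss with a KL term toward the teacher's output distribution.
import copy
import torch
import torch.nn.functional as F


def distill_step(student, teacher, batch, targets, alpha=0.5, temperature=1.0):
    """One training step, optionally guided by a checkpoint-based teacher."""
    logits = student(batch)                              # (N, vocab), flattened
    loss = F.cross_entropy(logits, targets)              # standard NMT loss
    if teacher is not None:
        with torch.no_grad():
            t_logits = teacher(batch)
        # Soft-label KL divergence toward the teacher's distribution.
        kd = F.kl_div(
            F.log_softmax(logits / temperature, dim=-1),
            F.softmax(t_logits / temperature, dim=-1),
            reduction="batchmean",
        )
        loss = (1 - alpha) * loss + alpha * kd
    return loss


def train(student, train_batches, val_batches, epochs=10, lr=1e-3):
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    teacher, best_val = None, float("inf")
    for _ in range(epochs):
        student.train()
        for batch, targets in train_batches:
            opt.zero_grad()
            distill_step(student, teacher, batch, targets).backward()
            opt.step()
        # Evaluate the new checkpoint; if it is the best so far,
        # promote a frozen copy of it to be the next teacher.
        student.eval()
        with torch.no_grad():
            val = sum(F.cross_entropy(student(b), t).item()
                      for b, t in val_batches) / len(val_batches)
        if val < best_val:
            best_val = val
            teacher = copy.deepcopy(student).eval()
            for p in teacher.parameters():
                p.requires_grad_(False)
    return student
```

Because the teacher is refreshed whenever validation improves, the distillation signal is not limited to the checkpoints of a fixed, already-completed run, which is the property the abstract contrasts against checkpoint averaging and ensembling.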
