Conference: Machine Translation Summit

Character-Aware Decoder for Translation into Morphologically Rich Languages



Abstract

Neural machine translation (NMT) systems operate primarily on words (or sub-words), ignoring lower-level patterns of morphology. We present a character-aware decoder designed to capture such patterns when translating into morphologically rich languages. We achieve character-awareness by augmenting both the softmax and embedding layers of an attention-based encoder-decoder model with convolutional neural networks that operate on the spelling of a word. To investigate performance on a wide variety of morphological phenomena, we translate English into 14 typologically diverse target languages using the TED multi-target dataset. In this low-resource setting, the character-aware decoder provides consistent improvements with BLEU score gains of up to +3.05. In addition, we analyze the relationship between the gains obtained and properties of the target language and find evidence that our model does indeed exploit morphological patterns.
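
The core idea, building word representations from spelling with a convolutional network and reusing them at the output layer, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract says only that CNNs operate on the spelling of a word and that they augment the embedding and softmax layers. The filter width, the tanh nonlinearity, max-over-time pooling, the dot-product scoring, and every name below (`word_vector`, `output_distribution`, `char_emb`, `filters`) are assumptions made for illustration. The weights are random and untrained, and the sketch uses the character-composed vectors alone rather than combining them with ordinary word embeddings.

```python
import math
import random

random.seed(0)

CHARS = "abcdefghijklmnopqrstuvwxyz"
CHAR_DIM = 4      # size of each character embedding (illustrative value)
NUM_FILTERS = 6   # size of the composed word vector (illustrative value)
WIDTH = 3         # convolution width in characters (illustrative value)


def rand_vec(n):
    """Return n small random weights; a stand-in for trained parameters."""
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


# Hypothetical parameters: one embedding per character, plus "<" and ">"
# as word-boundary symbols so prefixes and suffixes are distinguishable.
char_emb = {c: rand_vec(CHAR_DIM) for c in CHARS + "<>"}
# Each filter spans WIDTH consecutive character embeddings.
filters = [rand_vec(WIDTH * CHAR_DIM) for _ in range(NUM_FILTERS)]


def word_vector(word):
    """Compose a word vector from spelling: convolution, tanh, max over time."""
    chars = "<" + word + ">"
    while len(chars) < WIDTH:  # pad so at least one window exists
        chars += ">"
    windows = []
    for i in range(len(chars) - WIDTH + 1):
        window = []
        for c in chars[i:i + WIDTH]:
            window.extend(char_emb[c])
        windows.append(window)
    # One output per filter: the strongest response over all positions.
    return [
        max(math.tanh(sum(w * x for w, x in zip(f, win))) for win in windows)
        for f in filters
    ]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def output_distribution(hidden, vocab):
    """Score each vocabulary word against the decoder state (assumed dot product)."""
    logits = [
        sum(h * v for h, v in zip(hidden, word_vector(w)))
        for w in vocab
    ]
    return softmax(logits)


vocab = ["walk", "walked", "walking", "talked"]
hidden = rand_vec(NUM_FILTERS)  # stand-in for a decoder hidden state
probs = output_distribution(hidden, vocab)
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")
```

Because `walked` and `talked` share the character windows covering `-ed>`, any filter that responds to that suffix contributes to both word vectors. Parameter sharing of this kind is one plausible route by which a character-aware output layer could pick up morphological regularities when training data is scarce.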
