Conference: Machine Translation Summit; Workshop on Technologies for MT of Low Resource Languages

Multilingual Multimodal Machine Translation for Dravidian Languages utilizing Phonetic Transcription

Abstract

Multimodal machine translation is the task of translating a source text into the target language using information from other modalities. Existing multimodal datasets have been restricted to highly resourced languages. In addition, these datasets were collected by manually translating English descriptions from the Flickr30K dataset. In this work, we introduce MMDravi, a Multilingual Multimodal dataset for under-resourced Dravidian languages. It comprises 30,000 sentences, which were created using several machine translation outputs. Using data from MMDravi and a phonetic transcription of the corpus, we build a Multilingual Multimodal Neural Machine Translation system (MMNMT) for closely related Dravidian languages to take advantage of the multilingual corpus and other modalities. We evaluate the translations generated by the proposed approach against a human-annotated evaluation dataset in terms of the BLEU, METEOR, and TER metrics. Relying on multilingual corpora, phonetic transcription, and image features, our approach improves translation quality for the under-resourced languages.