Procedia Computer Science

Employing Differentiable Neural Computers for Image Captioning and Neural Machine Translation



Abstract

In the history of artificial neural networks, LSTMs have proved to be a high-performance architecture for sequential data learning. Although LSTMs are remarkable at learning sequential data, they are limited in their ability to learn long-term dependencies and to represent certain data structures because they lack external memory. In this paper, we tackled two main tasks: language translation and image captioning. We approached the language translation problem by leveraging the capabilities of the recently developed DNC architecture. We modified the DNC architecture to include dual neural controllers instead of one, together with an external memory module. Inside our controller we employed a memory-augmented neural network that differs from the original differentiable neural computer: we implemented a dual-controller system in which one controller encodes the query sequence while the other decodes the translated sequence. During the encoding cycle, new inputs are read and the memory is updated accordingly. During the decoding cycle, the memory is protected from any writes by the decoding controller, and the decoder generates one token of the translated sequence at each time step. The proposed dual-controller memory-augmented neural network was then trained and tested on the Europarl dataset. For the image captioning task, our architecture is inspired by an end-to-end image captioning model in which the CNN's output is passed to the RNN as input only once, and the RNN generates words conditioned on that input. We trained our DNC captioning model on the 2015 MSCOCO dataset. Finally, we compared our architecture against the conventionally used LSTM and NTM architectures and showed its superiority.
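The core mechanism described above (an encoder controller that writes to external memory during encoding, and a write-protected decoder controller that only reads from it while emitting one token per time step) can be illustrated with a minimal sketch. The sketch below is not the authors' implementation: it uses untrained random weights, a plain tanh cell in place of the paper's LSTM controllers, and assumed sizes (`H`, `M_SLOTS`, `M_WIDTH`, `VOCAB`); only the content-based (cosine-similarity) memory read is borrowed from the DNC family.

```python
# Minimal sketch of a dual-controller memory-augmented encoder-decoder.
# Assumptions: random untrained weights, tanh cells instead of LSTMs,
# illustrative sizes. Not the paper's actual code.
import numpy as np

rng = np.random.default_rng(0)

H, M_SLOTS, M_WIDTH, VOCAB = 64, 16, 32, 100   # assumed dimensions

W_enc = rng.normal(0.0, 0.1, (H + M_WIDTH, H))  # encoder controller
W_dec = rng.normal(0.0, 0.1, (H + M_WIDTH, H))  # decoder controller
W_write = rng.normal(0.0, 0.1, (H, M_WIDTH))    # write-vector head
W_key = rng.normal(0.0, 0.1, (H, M_WIDTH))      # read-key head
W_out = rng.normal(0.0, 0.1, (H, VOCAB))        # decoder output head

def attend(memory, key):
    """Content-based read: softmax over cosine similarity, as in the DNC."""
    sims = memory @ key / (np.linalg.norm(memory, axis=1)
                           * np.linalg.norm(key) + 1e-8)
    w = np.exp(sims - sims.max())
    w /= w.sum()
    return w @ memory                           # weighted read vector

def step(W, h, read_vec):
    """One controller step: tanh cell over [hidden state; read vector]."""
    return np.tanh(np.concatenate([h, read_vec]) @ W)

def encode(src_embeddings):
    """Encoding cycle: the encoder controller reads inputs and WRITES memory."""
    memory = np.zeros((M_SLOTS, M_WIDTH))
    h = np.zeros(H)
    for t, x in enumerate(src_embeddings):
        h = step(W_enc, h, x)                   # consume one source token
        memory[t % M_SLOTS] = h @ W_write       # update one memory slot
    return memory, h

def decode(memory, h, max_len=10):
    """Decoding cycle: memory is read-only; the decoder never writes to it."""
    out = []
    for _ in range(max_len):
        read = attend(memory, h @ W_key)        # content-based read only
        h = step(W_dec, h, read)
        out.append(int(np.argmax(h @ W_out)))   # emit one token per step
    return out

src = rng.normal(size=(7, M_WIDTH))             # a fake 7-token source sentence
mem, h_enc = encode(src)
print(decode(mem, h_enc))
```

Under the same sketch, the paper's image-captioning setup would be analogous: the CNN feature vector is presented only once (a single encoding step), after which the decoder generates caption words one per time step.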
