Please use this identifier to cite or link to this item:
https://idr.l3.nitk.ac.in/jspui/handle/123456789/14772
Title: | Employing Differentiable Neural Computers for Image Captioning and Neural Machine Translation |
Authors: | Sharma R.; Kumar A.; Meena D.; Pushp S. |
Issue Date: | 2020 |
Citation: | Procedia Computer Science, Vol. 173, p. 234-244 |
Abstract: | In the history of artificial neural networks, LSTMs have proved to be a high-performance architecture for learning sequential data. Although LSTMs are remarkable at learning sequential data, they are limited in their ability to learn long-term dependencies and to represent certain data structures because they lack external memory. In this paper, we tackle two main tasks: language translation and image captioning. We approach language translation by leveraging the capabilities of the recently developed differentiable neural computer (DNC) architecture. We modify the DNC architecture by using two neural controllers instead of one, together with an external memory module. Unlike the original differentiable neural computer, our memory-augmented network uses a dual-controller system in which one controller encodes the query sequence and the other decodes the translated sequence. During the encoding cycle, new inputs are read and the memory is updated accordingly. During the decoding cycle, the memory is protected from any writes by the decoding controller. Thus, the decoder generates the translated sequence one time step at a time. The proposed memory-augmented dual-controller network is trained and tested on the Europarl dataset. For the image captioning task, our architecture is inspired by an end-to-end image captioning model in which the CNN's output is passed to the RNN as input only once and the RNN then generates words conditioned on that input. We trained our DNC captioning model on the 2015 MSCOCO dataset. Finally, we compare our architecture with the conventionally used LSTM and NTM architectures and show its superiority. © 2020 The Authors. Published by Elsevier B.V. |
URI: | https://doi.org/10.1016/j.procs.2020.06.028 http://idr.nitk.ac.in/jspui/handle/123456789/14772 |
Appears in Collections: | 2. Conference Papers |
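The encode/decode memory discipline described in the abstract (memory is updated while encoding and write-protected while decoding) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `ExternalMemory`, `encode`, and `decode` are hypothetical, the learned neural controllers are replaced by plain functions, and the DNC's dynamic allocation and temporal-link mechanisms are omitted. Only a simple content-based read (softmax over cosine similarity) is shown.

```python
import math


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class ExternalMemory:
    """Toy slot memory with a write-protect flag (hypothetical, for illustration)."""

    def __init__(self, slots, width):
        self.rows = [[0.0] * width for _ in range(slots)]
        self.next_slot = 0
        self.write_protected = False

    def write(self, vector):
        """Store a vector in the next slot; refuse while write-protected."""
        if self.write_protected:
            return False
        self.rows[self.next_slot % len(self.rows)] = list(vector)
        self.next_slot += 1
        return True

    def read(self, key):
        """Content-based read: softmax over cosine similarity to each row."""
        sims = [cosine(key, row) for row in self.rows]
        exps = [math.exp(s) for s in sims]
        total = sum(exps)
        weights = [e / total for e in exps]
        return [
            sum(w * row[i] for w, row in zip(weights, self.rows))
            for i in range(len(key))
        ]


def encode(memory, inputs):
    """Encoding cycle: each new input is written to memory."""
    memory.write_protected = False
    for vec in inputs:
        memory.write(vec)


def decode(memory, queries):
    """Decoding cycle: memory is read-only; one output per time step."""
    memory.write_protected = True
    outputs = []
    for query in queries:
        read_vec = memory.read(query)
        memory.write(read_vec)  # attempted write is rejected during decoding
        outputs.append(read_vec)
    return outputs


if __name__ == "__main__":
    mem = ExternalMemory(slots=4, width=2)
    encode(mem, [[1.0, 0.0], [0.0, 1.0]])
    print(decode(mem, [[1.0, 0.0]]))
```

In a trained model, the write vectors and read keys would be emitted by the encoder and decoder controllers rather than supplied directly as they are here.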