
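New dynamic memory network for visual question answering
Cite this article: Wang Yongqi. New dynamic memory network for visual question answering[J]. Application Research of Computers (计算机应用研究), 2020, 37(10).
Author: Wang Yongqi (王永琦)
Author affiliation: Shanghai University of Engineering Science
Funding: National Natural Science Foundation of China (1801286, 61701295)
Abstract (translated from Chinese): In the visual question answering task, a machine is given an image and a related question and is expected to answer accurately. For this task, neural network architectures with memory and attention mechanisms are studied in depth; such networks show some of the reasoning abilities needed for question answering. Based on an analysis of the dynamic memory network (DMN), a new dynamic memory network is proposed that improves the memory and input modules of the original DMN. Combining these changes, a new image input module is introduced into the visual question answering system. The method is validated on the DAQUAR-ALL, COCO-QA, and VQA datasets. The experimental results show that the proposed dynamic memory model achieves good results and outperforms several classic deep learning methods.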

Keywords: dynamic memory network; deep learning; visual question answering
Received: 2019/5/19
Revised: 2020/9/7

New dynamic memory network for visual question answering
Affiliation:School of Electronic & Electrical Engineering
Abstract: Neural network architectures with memory and attention mechanisms show some of the reasoning abilities needed for question answering. The dynamic memory network (DMN) is one such architecture, and it achieves high accuracy on a variety of language tasks. However, it is unclear whether the architecture achieves strong results for question answering when supporting facts are not labeled during training, or whether it can be applied to other modalities such as images. Based on an analysis of the DMN, this paper proposes several improvements to its memory and input modules. Combining these changes, it introduces a new image input module for answering visual questions. The new dynamic memory model improves on the state of the art on visual question answering data sets. The experimental results show that the proposed method achieves good results on visual question answering tasks.
Keywords: dynamic memory network (DMN); deep learning; visual question answering (VQA)