
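Image question answering based on multi-view semantic graph network
Cite this article: Qiao You Tian, Zhang Hai Jun, Lu Ming. Image question answering based on multi-view semantic graph network[J]. Application Research of Computers, 2023, 40(2).
Authors: Qiao You Tian, Zhang Hai Jun, Lu Ming
Affiliations: Yangzhou Vocational University; Beijing Wuzi University; Beihang University
Funding: Beijing Natural Science Foundation (4182037); Beijing Social Science Foundation (21XCB005); Science and Technology Plan Project of the Beijing Municipal Education Commission (KM201810037001)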
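Abstract: Image question answering based on the fusion of visual and textual features has become one of the most active research topics in automatic question answering. Most existing models use attention mechanisms to mine the associations between an image and a question sentence, but they ignore the associations among image regions and question words within the same modality and across different views. To address this problem, an image question answering model based on a multi-view semantic graph network (MSGN) is proposed, which mines the semantic associations between images and questions from multiple views. MSGN uses a graph neural network to mine fine-grained intra-modal and inter-modal associations between image regions and question words, thereby improving the accuracy of answer prediction. Experimental results on public image question answering datasets show that mining the semantic associations between images and questions from multiple views improves answer prediction performance.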

Keywords: image question answering; multi-head attention; automatic question answering; feature fusion; cross-modal analysis
Received: 2022/6/23
Revised: 2022/8/25

Image question answering based on multi-view semantic graph network
Qiao You Tian, Zhang Hai Jun and Lu Ming. Image question answering based on multi-view semantic graph network[J]. Application Research of Computers, 2023, 40(2).
Authors: Qiao You Tian, Zhang Hai Jun and Lu Ming
Affiliation: Yangzhou Vocational University; Beijing Wuzi University; Beihang University
Abstract: Image question answering based on the fusion of visual and textual features has recently become one of the most active research topics in automatic question answering. Most existing models rely on attention mechanisms to explore the relationship between the image and the question sentence, and ignore the correlations among image regions and question words within the same modality and across different views. To address this problem, this paper proposes an image question answering model (MSGN) based on a multi-view semantic graph network, which mines the semantic correlation between images and questions from multiple views. MSGN uses a graph neural network to mine the fine-grained intra-modal and inter-modal correlations between image regions and question words. Experiments on public datasets show that mining the semantic correlation between images and questions from multiple views improves the performance of image question answering.
Keywords: image question answering; multi-head attention model; automatic question answering; feature fusion; cross-modal analysis
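
The abstract describes mining intra-modal and inter-modal correlations between image regions and question words with a graph neural network. The following is a minimal illustrative sketch of that general idea, not the MSGN architecture itself, whose details are not given on this page. It assumes that region and word features share one vector space, that each "view" is a simple attention-weighted aggregation over either same-modality or other-modality neighbours, and that views are merged by averaging. All function names, feature values, and the fusion rule are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def message_pass(nodes, modality, view):
    """One round of attention-weighted neighbour aggregation.

    view="intra": a node attends only to other nodes of its own modality.
    view="inter": a node attends only to nodes of the other modality.
    (Hypothetical views chosen for illustration.)
    """
    updated = []
    for i, h_i in enumerate(nodes):
        nbrs = [j for j in range(len(nodes))
                if j != i and (modality[j] == modality[i]) == (view == "intra")]
        if not nbrs:
            updated.append(list(h_i))
            continue
        scale = math.sqrt(len(h_i))
        weights = softmax([dot(h_i, nodes[j]) / scale for j in nbrs])
        agg = [sum(w * nodes[j][d] for w, j in zip(weights, nbrs))
               for d in range(len(h_i))]
        # Residual update: keep the node's own feature and add the message.
        updated.append([a + b for a, b in zip(h_i, agg)])
    return updated

def multi_view_update(nodes, modality):
    # Assumed fusion rule: average the intra-modal and inter-modal views.
    intra = message_pass(nodes, modality, "intra")
    inter = message_pass(nodes, modality, "inter")
    return [[(a + b) / 2 for a, b in zip(u, v)] for u, v in zip(intra, inter)]

# Toy features: two image regions and two question words (made-up values).
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[1.0, 1.0], [0.5, -0.5]]
nodes = regions + words
modality = ["image", "image", "text", "text"]

fused = multi_view_update(nodes, modality)
for label, vec in zip(modality, fused):
    print(label, [round(v, 3) for v in vec])
```

In a real model the node features would come from a visual encoder and a text encoder, the attention would use learned projections, and the fused node features would feed an answer classifier; none of those components is sketched here.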