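Chinese description of image content based on fusion of image feature attention and adaptive attention
Cite this article:ZHAO Hong, KONG Dongyi. Chinese description of image content based on fusion of image feature attention and adaptive attention[J]. Journal of Computer Applications, 2021, 41(9): 2496-2503. DOI: 10.11772/j.issn.1001-9081.2020111829
Authors:ZHAO Hong  KONG Dongyi
Affiliation:School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050
Funding:National Natural Science Foundation of China (51668043, 61262016).
Abstract:Existing attention-based models for Chinese description of image content cannot strengthen attention on key content without weakening or losing the attended information. To address this problem, a Chinese image-content description model that fuses image feature attention with adaptive attention is proposed. The model uses an encoder-decoder structure. First, image features are extracted in the encoder network, and image feature attention extracts attention information for all feature regions of the image; then, using the decoder network, the attention-weighted...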

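Keywords:Chinese description of image content  attention mechanism  deep learning  convolutional neural network  recurrent neural network
Received:2020-11-23
Revised:2021-03-11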

Chinese description of image content based on fusion of image feature attention and adaptive attention
ZHAO Hong, KONG Dongyi. Chinese description of image content based on fusion of image feature attention and adaptive attention[J]. Journal of Computer Applications, 2021, 41(9): 2496-2503. DOI: 10.11772/j.issn.1001-9081.2020111829
Authors:ZHAO Hong  KONG Dongyi
Affiliation:School of Computer and Communication, Lanzhou University of Technology, Lanzhou, Gansu 730050, China
Abstract:Existing attention-based models for Chinese description of image content cannot strengthen attention on key content without weakening or losing attention information. To address this problem, a Chinese image-content description model that fuses image feature attention with adaptive attention was proposed. The model uses an encoder-decoder structure. First, image features are extracted in the encoder network, and image feature attention extracts attention information for all feature regions of the image. Then, the decoder network decodes the attention-weighted image features to generate hidden information, which ensures that attention information is neither weakened nor lost. Finally, the visual sentinel module of adaptive attention focuses again on the key content in the image features, so that the main content of the image is extracted more accurately. The models were evaluated with BLEU, METEOR, ROUGE-L and CIDEr. Compared with image description models based only on adaptive attention or only on image feature attention, the proposed model improved the CIDEr score by 10.1% and 7.8% respectively. Compared with the baseline model Neural Image Caption (NIC) and the Bottom-Up and Top-Down (BUTD) attention based image description model, it improved the CIDEr score by 10.9% and 12.1% respectively. Experimental results show that the image understanding ability of the proposed model is effectively improved and that it scores higher than the comparison models on every evaluation index.
Keywords:Chinese description of image content  attention mechanism  deep learning  Convolutional Neural Network (CNN)  Recurrent Neural Network (RNN)  
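
The abstract describes a visual sentinel that lets the decoder re-focus on key content, but it gives no equations. The following is only a minimal NumPy sketch of sentinel-gated attention in its commonly published form, not the authors' implementation. The weight names (W_v, W_g, W_s, w_h), the dimensions, and the random inputs are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def adaptive_attention(V, h, s, W_v, W_g, W_s, w_h):
    """Sentinel-gated attention (illustrative sketch, names are assumptions).

    V: (k, d) image region features
    h: (d,)   decoder hidden state at the current step
    s: (d,)   visual sentinel vector at the current step
    """
    # Attention score for each of the k image regions.
    z = np.tanh(V @ W_v.T + W_g @ h) @ w_h        # shape (k,)
    # Attention score for the sentinel itself.
    z_s = np.tanh(W_s @ s + W_g @ h) @ w_h        # scalar
    # One softmax over the k regions plus the sentinel.
    alpha_hat = softmax(np.append(z, z_s))        # shape (k + 1,)
    beta = alpha_hat[-1]                          # weight given to the sentinel
    # Context mixes region features and the sentinel with those weights.
    context = alpha_hat[:-1] @ V + beta * s       # shape (d,)
    return context, alpha_hat[:-1], beta

# Toy inputs: k regions, feature size d, attention size a (all hypothetical).
rng = np.random.default_rng(0)
k, d, a = 5, 8, 6
V = rng.normal(size=(k, d))
h = rng.normal(size=d)
s = rng.normal(size=d)
W_v = rng.normal(size=(a, d))
W_g = rng.normal(size=(a, d))
W_s = rng.normal(size=(a, d))
w_h = rng.normal(size=a)

context, alpha, beta = adaptive_attention(V, h, s, W_v, W_g, W_s, w_h)
```

In this sketch the region weights and the sentinel weight sum to 1, so a larger beta means the decoder leans more on the sentinel and less on the image regions at that step.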
This article is indexed in databases including Wanfang Data.