Variable Size for Recurrent Attention Model and Application Research
Cite this article: LYU Dongjian, WANG Chunli. Variable Size for Recurrent Attention Model and Application Research[J]. Computer Engineering and Applications, 2022, 58(12): 243-248.
Authors: LYU Dongjian  WANG Chunli
Affiliation: College of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning 116026, China
Abstract: Visual attention models are used to automatically locate discriminative local regions of fine-grained images and to classify the images from the features captured there. However, the input image size at every step is fixed while the size of a discriminative region varies, so the model cannot capture all of an image's features accurately and classification accuracy drops. This paper proposes a variable size recurrent attention model (VSRAM). Unlike earlier recurrent attention networks (RAM) with a fixed input size, VSRAM optimizes an attention policy together with a size-generation policy, learning on its own the position and the size of the next input patch, which reduces the total input area and speeds up processing. Experimental results show that dynamically adjusting the input size matches the recognition accuracy of the fixed-size attention model while significantly reducing the total amount of computation and improving speed.

Keywords: fine-grained image classification  reinforcement learning  variable size
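
The abstract describes the variable-size glimpse only at a high level. The sketch below illustrates the core idea: a square patch whose centre and side length are both emitted by learned policy heads is cropped and rescaled to a fixed resolution before entering the recurrent core. This is a minimal sketch assuming a PyTorch-style implementation; the class names, layer sizes, and the GRU core are hypothetical choices for illustration, not details taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class VariableSizeGlimpse(nn.Module):
    """Crop a square patch whose centre and side length are both learned,
    then rescale it to a fixed resolution so the downstream network always
    sees a constant-size input regardless of the sampled patch size."""
    def __init__(self, out_size=32):
        super().__init__()
        self.out_size = out_size

    def forward(self, images, loc, size):
        # images: (B, C, H, W); loc in [-1, 1]^2; size in (0, 1] as a fraction of H/W
        B, C, H, W = images.shape
        patches = []
        for b in range(B):
            half_h = max(1, int(0.5 * size[b].item() * H))
            half_w = max(1, int(0.5 * size[b].item() * W))
            cy = int((loc[b, 0].item() + 1) / 2 * (H - 1))
            cx = int((loc[b, 1].item() + 1) / 2 * (W - 1))
            y0, y1 = max(0, cy - half_h), min(H, cy + half_h)
            x0, x1 = max(0, cx - half_w), min(W, cx + half_w)
            patch = images[b:b + 1, :, y0:y1, x0:x1]
            patches.append(F.interpolate(patch, size=(self.out_size, self.out_size),
                                         mode="bilinear", align_corners=False))
        return torch.cat(patches, dim=0)

class RecurrentAttentionStep(nn.Module):
    """One time step: encode the glimpse, update the recurrent state, and emit
    the mean location and mean size of the next glimpse."""
    def __init__(self, channels=3, out_size=32, hidden=256):
        super().__init__()
        self.glimpse = VariableSizeGlimpse(out_size)
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(channels * out_size * out_size, hidden),
            nn.ReLU())
        self.core = nn.GRUCell(hidden, hidden)
        self.loc_head = nn.Linear(hidden, 2)   # mean of the next location
        self.size_head = nn.Linear(hidden, 1)  # mean of the next patch size

    def forward(self, images, loc, size, h=None):
        g = self.encoder(self.glimpse(images, loc, size))
        h = self.core(g, h)
        next_loc = torch.tanh(self.loc_head(h))                   # keep in [-1, 1]
        next_size = 0.1 + 0.9 * torch.sigmoid(self.size_head(h))  # keep in [0.1, 1]
        return h, next_loc, next_size

# Example: one attention step on a batch of 96x96 images (values are illustrative).
# step = RecurrentAttentionStep()
# images = torch.randn(4, 3, 96, 96)
# loc0, size0 = torch.zeros(4, 2), torch.full((4, 1), 0.5)
# h, loc1, size1 = step(images, loc0, size0)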

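The keywords list reinforcement learning because the attention (location) policy and the size-sampling policy are not differentiable through the cropping operation and are therefore trained with a policy gradient. Below is a minimal REINFORCE-style loss, assuming Gaussian policies centred on the means produced by the heads above; the reward definition, the fixed standard deviations, and the batch-mean baseline are assumptions for illustration, not details from the paper.

import torch
from torch.distributions import Normal

def policy_gradient_loss(loc_mean, size_mean, loc_sample, size_sample,
                         logits, labels, sigma_loc=0.1, sigma_size=0.05):
    """REINFORCE loss for the glimpse policies (assumed formulation).

    loc_mean, size_mean     : means emitted by the policy heads
    loc_sample, size_sample : locations/sizes actually sampled and used
    logits, labels          : final classification output and ground truth
    Reward is 1 for a correct prediction and 0 otherwise; subtracting the
    batch-mean reward acts as a simple baseline to reduce gradient variance."""
    reward = (logits.argmax(dim=1) == labels).float()            # (B,)
    advantage = reward - reward.mean()                           # (B,)
    log_p_loc = Normal(loc_mean, sigma_loc).log_prob(loc_sample).sum(dim=1)
    log_p_size = Normal(size_mean, sigma_size).log_prob(size_sample).sum(dim=1)
    # Maximizing expected reward = minimizing the negative weighted log-probability.
    return -((log_p_loc + log_p_size) * advantage).mean()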