
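Fine-Grained Image Classification Based on Multi-channel Visual Attention
Cite this article: Wang Peisen, Song Yan, Dai Lirong. Fine-Grained Image Classification Based on Multi-channel Visual Attention[J]. Journal of Data Acquisition & Processing, 2019, 34(1): 157-166
Authors: Wang Peisen, Song Yan, Dai Lirong
Affiliation: National Engineering Laboratory of Speech and Language Information Processing, University of Science and Technology of China, Hefei, 230027
Funding: Supported by the National Natural Science Foundation of China (U1613211).
Abstract: Visual attention mechanisms have been widely applied in fine-grained image classification. Most existing methods build a single attention weight map and apply it to the features by simple weighting. To address this, this paper proposes a multi-channel visual attention mechanism implemented as an end-to-end trainable deep neural network. Multiple visual attention maps first describe different regions of the visual object, and the corresponding high-order statistics are then extracted to obtain the visual representation. On several standard fine-grained image classification benchmarks, the visual representation based on multi-channel visual attention outperforms the mainstream methods of recent years.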

Keywords: image classification; fine-grained image analysis; visual attention; image representation; deep learning
Received: 2018-04-28
Revised: 2018-08-14

Fine-Grained Image Classification with Multi-channel Visual Attention
Wang Peisen,Song Yan,Dai Lirong. Fine-Grained Image Classification with Multi-channel Visual Attention[J]. Journal of Data Acquisition & Processing, 2019, 34(1): 157-166
Authors: Wang Peisen, Song Yan, Dai Lirong
Affiliation: National Engineering Laboratory of Speech and Language Information Processing, University of Science and Technology of China, Hefei, 230027, China
Abstract: Visual attention mechanisms have been widely used in state-of-the-art fine-grained classification methods in recent years. However, most attention-based image classification systems use only single-layer or part-specific attention features and apply them by simple multiplication, which limits the information the attention can provide. This paper presents a fine-grained image classification system based on multi-channel visual attention. Multi-channel attention features are extracted from the image and applied to low-level features, and the mean value corresponding to each layer of attention is subtracted to obtain a high-order representation. The resulting model is a deep neural network architecture that can be optimized end to end. On multiple commonly used fine-grained classification datasets, the presented method outperforms state-of-the-art methods by a large margin.
Keywords: image classification; fine-grained image analysis; visual attention; image representation; deep learning
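
The abstract describes applying several attention maps to low-level features and subtracting a per-attention mean to obtain a high-order representation. The NumPy sketch below is one possible reading of that idea, not the authors' implementation: the function name `multi_channel_attention_pool`, the linear projection `proj` that produces the attention maps, the softmax over spatial locations, and the choice of an attention-weighted covariance as the "high-order" statistic are all assumptions made for illustration.

```python
import numpy as np


def softmax(z, axis):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def multi_channel_attention_pool(features, proj):
    """Hypothetical multi-channel attention pooling.

    features: (N, D) array of N local descriptors (e.g. a flattened conv map).
    proj:     (D, K) array, an assumed linear layer giving K attention channels.
    Returns the attention maps (N, K), the per-channel means (K, D), and the
    per-channel attention-weighted covariances (K, D, D).
    """
    # Each column is one attention map, normalized over the N locations.
    attn = softmax(features @ proj, axis=0)                 # (N, K)
    # Attention-weighted mean of the features for each channel.
    means = attn.T @ features                               # (K, D)
    # Subtract the mean that belongs to each attention channel.
    centered = features[None, :, :] - means[:, None, :]     # (K, N, D)
    # Second-order statistic: sum_n a_kn (x_n - mu_k)(x_n - mu_k)^T.
    weighted = centered * attn.T[:, :, None]                # (K, N, D)
    cov = np.einsum("knd,kne->kde", weighted, centered)     # (K, D, D)
    return attn, means, cov


# Toy input: a 7x7 feature map with 8 channels and 4 attention channels.
rng = np.random.default_rng(0)
features = rng.normal(size=(49, 8))
proj = rng.normal(size=(8, 4))

attn, means, cov = multi_channel_attention_pool(features, proj)
# Flatten all channels into one vector that a classifier could consume.
descriptor = cov.reshape(-1)
```

Because every step is a differentiable array operation, the same computation could sit inside a network and be trained end to end, which is the property the abstract emphasizes. In a real model `proj` would be learned rather than random.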