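Fine-Grained Image Classification Combining Dual Semantic Data Augmentation and Target Location
Cite this article: TAN Run, YE Wujian, LIU Yijun. Fine-Grained Image Classification Combining Dual Semantic Data Augmentation and Target Location[J]. Computer Engineering, 2022, 48(2): 237-242+249.
Authors: TAN Run  YE Wujian  LIU Yijun
Affiliation: School of Information Engineering, Guangdong University of Technology, Guangzhou 514000
Funding: Guangdong Province Key Area Research and Development Program (2018B030338001); Guangdong University of Technology Youth Hundred Talents Program (220413548).
Abstract: Fine-grained image classification aims to divide images belonging to the same basic category into finer subcategories. Large intra-class differences and small inter-class differences make the extraction of local key features the crux of the task. This paper proposes a fine-grained image classification algorithm that combines dual semantic data augmentation with target location. To fully extract discriminative local key features, an attention learning module and an information gain module are built in the training phase on the basis of bilinear attention pooling and the convolutional block attention module, respectively. The two modules capture data at two semantic levels, the local detail information of the target and its important contour, and model accuracy is improved through dual semantic data augmentation. In addition, a target location module is built in the testing phase so that the model focuses on the classification target as a whole, which further improves classification accuracy. Experimental results show that the algorithm achieves classification accuracies of 89.5%, 93.6%, and 94.7% on the CUB-200-2011, FGVC Aircraft, and Stanford Cars datasets, respectively, outperforming the baseline network Inception-V3, the bilinear attention pooling feature aggregation method, and algorithms such as B-CNN, RA-CNN, and MA-CNN.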
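
The bilinear attention pooling step named in the abstract can be illustrated with a minimal NumPy sketch. It assumes the commonly used formulation in which each attention map is multiplied element-wise with the feature maps and the product is globally average-pooled; the function name, array shapes, and the absence of any normalization are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def bilinear_attention_pooling(features, attention):
    """Pool feature maps under each attention map (illustrative sketch).

    features:  array of shape (C, H, W), backbone feature maps.
    attention: array of shape (M, H, W), one map per object part.
    Returns an (M, C) matrix whose row k is the average over all
    spatial positions of attention[k] * features.
    """
    _, h, w = features.shape
    if attention.shape[1:] != (h, w):
        raise ValueError("attention and features must share H and W")
    # Sum attention[m] * features[c] over the spatial grid, then divide
    # by the number of positions (global average pooling).
    return np.einsum("mhw,chw->mc", attention, features) / (h * w)
```

Each row of the result is a part-level descriptor. In a full model the rows would typically be concatenated and passed to a classifier, which is outside the scope of this sketch.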

Keywords: fine-grained image classification  data augmentation  bilinear network  attention learning  target location
Received: 2020-11-26
Revised: 2021-01-24

Fine-Grained Image Classification Combining Dual Semantic Data Augmentation and Target Location
TAN Run, YE Wujian, LIU Yijun. Fine-Grained Image Classification Combining Dual Semantic Data Augmentation and Target Location[J]. Computer Engineering, 2022, 48(2): 237-242+249.
Authors: TAN Run  YE Wujian  LIU Yijun
Affiliation: School of Information Engineering, Guangdong University of Technology, Guangzhou 514000, China
Abstract: Fine-grained image classification aims to classify images of the same basic category into more specific subcategories. These images are characterized by large intra-class differences and small inter-class differences, so extracting local key features is crucial. This paper proposes a fine-grained image classification algorithm that combines dual semantic data augmentation and target location. To extract discriminative local key features, two modules are constructed in the training phase to obtain data at two different semantic levels. The attention learning module is built on Bilinear Attention Pooling (BAP) to obtain local detail information of the target, and the information gain module is built on the Convolutional Block Attention Module (CBAM) to obtain the important contour of the target. Model accuracy is then improved through dual semantic data augmentation. In addition, a target location module is built in the testing phase to make the model focus on the classification target as a whole and further improve classification accuracy. Experimental results show that the proposed algorithm achieves classification accuracies of 89.5% on the CUB-200-2011 dataset, 93.6% on the FGVC Aircraft dataset, and 94.7% on the Stanford Cars dataset, outperforming the baseline network Inception-V3, the BAP feature aggregation method, and algorithms such as B-CNN, RA-CNN, and MA-CNN.
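
The abstract does not spell out how the test-phase target location module works. One common way to make a model focus on the whole object is to average the attention maps, threshold the result, and crop the image to the bounding box of the surviving region. The sketch below shows that idea only; the function names, the 0.5 threshold, and the assumption that the attention maps have already been resized to the image resolution are all hypothetical and not taken from the paper.

```python
import numpy as np


def locate_target(attention, threshold=0.5):
    """Return (top, bottom, left, right) of the attended region.

    attention: array of shape (M, H, W) with non-negative values.
    threshold: fraction of the peak saliency a pixel must reach
               (hypothetical default, not from the paper).
    bottom and right are exclusive, so the box can be used in slices.
    """
    saliency = attention.mean(axis=0)
    h, w = saliency.shape
    peak = saliency.max()
    if peak <= 0:
        # No attention signal at all: fall back to the full image.
        return 0, h, 0, w
    mask = saliency >= threshold * peak
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    return int(rows[0]), int(rows[-1]) + 1, int(cols[0]), int(cols[-1]) + 1


def crop_to_target(image, attention, threshold=0.5):
    """Crop image (H, W, ...) to the box found by locate_target."""
    top, bottom, left, right = locate_target(attention, threshold)
    return image[top:bottom, left:right]
```

In such a scheme the cropped region would usually be resized and classified again, with the two predictions combined; whether the paper does this is not stated in the abstract.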
Keywords: fine-grained image classification  data augmentation  bilinear network  attention learning  target location
This article is indexed in VIP (维普) and other databases.