Fine-Grained Image Classification Combining Dual Semantic Data Augmentation and Target Location

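Funding:	Key-Area Research and Development Program of Guangdong Province (2018B030338001); Youth Hundred Talents Program of Guangdong University of Technology (220413548).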

Received:	2020-11-26
Revised:	2021-01-24

Cite this article:	TAN Run, YE Wujian, LIU Yijun. Fine-Grained Image Classification Combining Dual Semantic Data Augmentation and Target Location[J]. Computer Engineering, 2022, 48(2): 237-242+249.

Authors:	TAN Run (谭润), YE Wujian (叶武剑), LIU Yijun (刘怡俊)

Affiliation:	School of Information Engineering, Guangdong University of Technology, Guangzhou 514000, China

Abstract:	Fine-grained image classification divides images of the same basic category into more specific subcategories. Because such images show large intra-class differences and small inter-class differences, extracting local key features is crucial. This paper proposes a fine-grained image classification algorithm that combines dual semantic data augmentation with target location. To extract discriminative local key features, two modules are constructed in the training phase to obtain data at two different semantic levels: an attention learning module, built on Bilinear Attention Pooling (BAP), captures local detail information of the target, and an information gain module, built on the Convolutional Block Attention Module (CBAM), captures the important contour of the target. Augmenting the training data with both kinds of data improves model accuracy. In addition, a target location module is built for the testing phase so that the model focuses on the classification target as a whole, which further improves classification accuracy. Experimental results show that the proposed algorithm achieves classification accuracies of 89.5% on CUB-200-2011, 93.6% on FGVC Aircraft, and 94.7% on Stanford Cars, outperforming the baseline network Inception-V3, the BAP feature aggregation method, and algorithms such as B-CNN, RA-CNN, and MA-CNN.

Keywords:	fine-grained image classification; data augmentation; bilinear network; attention learning; target location
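
The abstract names bilinear attention pooling and a test-time target location module but gives no implementation details. As a rough illustration only, the sketch below shows one common formulation of BAP (each attention map weights every feature channel element-wise, followed by spatial average pooling) and one simple, assumed way to locate a target (thresholding an attention map and taking the bounding box of the surviving cells). The function names, the 0.5 threshold ratio, and the thresholding rule are assumptions for illustration and are not taken from the paper.

```python
from typing import List, Tuple

Matrix = List[List[float]]  # a single H x W map


def bilinear_attention_pooling(features: List[Matrix],
                               attentions: List[Matrix]) -> List[List[float]]:
    """Return an M x C part-feature matrix (hypothetical helper).

    Entry [k][c] is the spatial average of attention map k multiplied
    element-wise with feature channel c.
    """
    height, width = len(features[0]), len(features[0][0])
    area = height * width
    pooled = []
    for att in attentions:          # M attention maps
        row = []
        for feat in features:       # C feature channels
            total = sum(att[i][j] * feat[i][j]
                        for i in range(height) for j in range(width))
            row.append(total / area)
        pooled.append(row)
    return pooled


def locate_target(attention: Matrix,
                  ratio: float = 0.5) -> Tuple[int, int, int, int]:
    """Return an inclusive (top, left, bottom, right) box (hypothetical helper).

    Keeps cells whose value is at least ratio * max. Assumes a non-negative
    attention map (for example, one taken after a ReLU).
    """
    peak = max(max(row) for row in attention)
    threshold = ratio * peak
    cells = [(i, j)
             for i, row in enumerate(attention)
             for j, value in enumerate(row)
             if value >= threshold]
    rows = [i for i, _ in cells]
    cols = [j for _, j in cells]
    return min(rows), min(cols), max(rows), max(cols)
```

In a pipeline of the kind the abstract describes, the box returned by `locate_target` would be used to crop the input image before a second forward pass. How the paper actually builds and applies its target location module is not stated here.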
This article is indexed in VIP (维普) and other databases.