Combining foreground feature reinforcement and region mask self-attention for fine-grained image classification
Cite this article:LIU Wanjun, ZHAO Siqi, QU Haicheng, WANG Yuping. Combining foreground feature reinforcement and region mask self-attention for fine-grained image classification[J]. CAAI Transactions on Intelligent Systems, 2022, 17(6): 1134-1144.
Authors:LIU Wanjun  ZHAO Siqi  QU Haicheng  WANG Yuping
Affiliation:School of Software, Liaoning Technical University, Huludao, Liaoning 125105, China
Abstract:To address interference from irrelevant background information and the difficulty of extracting features that distinguish subcategories in fine-grained image classification, this study proposes a method that combines foreground feature reinforcement with region mask self-attention. First, ResNet50 extracts global features from the input image. Next, a foreground feature reinforcement network locates the foreground object in the input image, removing background interference while enhancing the features of the foreground object so that it stands out. Finally, the feature-enhanced foreground object is passed through a region mask self-attention network, which learns rich and diverse features that set it apart from other subclasses. A multi-branch loss function constrains feature learning throughout training. Experiments show that the model reaches accuracies of 88.0%, 95.3%, and 93.6% on the fine-grained image datasets CUB-200-2011, Stanford Cars, and FGVC-Aircraft, respectively, outperforming other mainstream methods.

Keywords:fine-grained image classification  object localization  region-based mask  self-attention  diverse feature  feature reinforcement  residual network  deep learning
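
The abstract states that the foreground object is located in the input image before its features are enhanced, but it does not say how the coordinates are obtained. The sketch below illustrates one common heuristic, which is an assumption here and not the authors' network: sum a backbone feature map over its channels, keep the cells whose activation exceeds the mean, and scale the bounding box of those cells back to pixel coordinates. The function name `locate_foreground` and the nested-list input format are hypothetical and chosen only to keep the example dependency-free.

```python
from typing import List, Tuple

FeatureMap = List[List[List[float]]]  # channels x height x width


def locate_foreground(feature_map: FeatureMap,
                      image_hw: Tuple[int, int]) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) in pixels for the high-activation region.

    Assumption: cells whose channel-summed activation is above the mean
    belong to the foreground. This is a heuristic stand-in, not the paper's
    foreground feature reinforcement network.
    """
    channels = len(feature_map)
    h, w = len(feature_map[0]), len(feature_map[0][0])
    img_h, img_w = image_hw

    # Collapse the channel dimension into a single H x W activation map.
    activation = [[sum(feature_map[c][i][j] for c in range(channels))
                   for j in range(w)] for i in range(h)]
    mean = sum(sum(row) for row in activation) / (h * w)

    # Rows and columns that contain at least one above-mean cell.
    rows = [i for i in range(h) if any(v > mean for v in activation[i])]
    cols = [j for j in range(w)
            if any(activation[i][j] > mean for i in range(h))]
    if not rows:
        # Flat activation map: fall back to the whole image.
        return 0, 0, img_h, img_w

    # Scale feature-map cell indices back to image pixels.
    stride_y, stride_x = img_h / h, img_w / w
    top, bottom = int(rows[0] * stride_y), int((rows[-1] + 1) * stride_y)
    left, right = int(cols[0] * stride_x), int((cols[-1] + 1) * stride_x)
    return top, left, bottom, right
```

The returned box could then be used to crop the input so that later stages see only the foreground. Note that this sketch takes the bounding box of all above-mean cells; a more careful variant would keep only the largest connected region.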