
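Research on improved Transformer fine-grained image recognition algorithm

Cite this article:	Li Bingfeng, Liu Shuai, Yang Yi. Research on improved Transformer fine-grained image recognition algorithm[J]. Electronic Measurement Technology, 2024, 47(2): 114-120

Authors:	Li Bingfeng, Liu Shuai, Yang Yi

Affiliation:	School of Electrical Engineering and Automation, Henan Polytechnic University

Funding:	Supported by the Henan Province Science and Technology Research Project (222102210230) and the Henan Polytechnic University Doctoral Fund (B2018-33).

Abstract:	Fine-grained image recognition is difficult because inter-class differences are small and categories are hard to tell apart. This paper addresses the problem by strengthening the network's ability to express fine image details, and designs an improved Transformer-based fine-grained recognition algorithm for that purpose. First, a deformable convolutional token embedding adaptively adjusts the positions of its sampling points, which changes the range of the convolution and the shape of its kernel; this strengthens the model's perception of spatial information and yields more precise spatial information. Second, an efficient correlation channel attention mechanism selects channels automatically, shifting the attention computation from adjacent channels to semantically similar channels so that it captures semantically similar channel information. Together, the precise spatial information and the semantically similar channel information improve the model's perception of local features. Experiments show that, compared with the baseline algorithm, the proposed method improves recognition results on the CUB-200-2011, Stanford Cars, and Stanford Dogs datasets by 1.5%, 2.4%, and 1.5%, respectively. These results indicate that improving the expression of fine-grained image details effectively improves fine-grained image recognition.
Keywords:	fine-grained image recognition; Transformer; deformable convolution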
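
The abstract describes the deformable token embedding only in words, so the following is a minimal sketch of the underlying idea rather than the authors' implementation. It computes one output of a 3x3 deformable convolution: each kernel tap is shifted by a fractional offset and read with bilinear interpolation. The function names `bilinear_sample` and `deformable_conv_at` are hypothetical, and the offsets are passed in by hand here, whereas a real network would predict them with a small learned layer.

```python
import math


def bilinear_sample(feat, y, x):
    """Read a 2D feature map (list of rows) at fractional (y, x); zero outside."""
    h, w = len(feat), len(feat[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                # Weight falls off linearly with distance to the grid point.
                weight = (1 - abs(y - yi)) * (1 - abs(x - xi))
                value += weight * feat[yi][xi]
    return value


def deformable_conv_at(feat, kernel, offsets, cy, cx):
    """Response of a 3x3 deformable convolution at output location (cy, cx).

    kernel:  3x3 weights.
    offsets: 3x3 grid of (dy, dx) pairs, assumed given for this sketch.
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            # Regular grid position of this tap plus its own offset.
            y = cy + (i - 1) + dy
            x = cx + (j - 1) + dx
            out += kernel[i][j] * bilinear_sample(feat, y, x)
    return out


# Toy 5x5 map where feat[y][x] = 10*y + x, with an all-ones 3x3 kernel.
feat = [[10 * y + x for x in range(5)] for y in range(5)]
kernel = [[1.0] * 3 for _ in range(3)]
zero = [[(0.0, 0.0)] * 3 for _ in range(3)]
shifted = [[(0.5, 0.5)] * 3 for _ in range(3)]
print(deformable_conv_at(feat, kernel, zero, 2, 2))     # 198.0
print(deformable_conv_at(feat, kernel, shifted, 2, 2))  # 247.5
```

With all offsets at zero the result equals an ordinary 3x3 convolution; non-zero offsets move the sampling points, which is what lets the effective receptive field change shape.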
Research on improved Transformer fine-grained image recognition algorithm

Li Bingfeng, Liu Shuai, Yang Yi. Research on improved Transformer fine-grained image recognition algorithm[J]. Electronic Measurement Technology, 2024, 47(2): 114-120

Authors:	Li Bingfeng, Liu Shuai, Yang Yi

Affiliation:	School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo 454000, China

Abstract:	Fine-grained image recognition suffers from small inter-class differences that make categories hard to distinguish. This paper addresses the problem by improving the network's ability to express image detail features, and designs an improved Transformer-based algorithm for fine-grained recognition. First, a deformable convolutional token embedding adaptively adjusts its sampling points to change the range of the convolution and the shape of its kernel, which enhances the network's perception of spatial information and yields more precise spatial information. Second, an efficient correlation channel attention mechanism automatically selects channels, shifting the attention computation from neighboring channels to semantically similar channels in order to capture semantically similar channel information. The precise spatial information and the semantically similar channel information together enhance the network's perception of local features. Experimental results show that, compared with the baseline algorithm, the proposed method improves recognition results by 1.5%, 2.4%, and 1.5% on the CUB-200-2011, Stanford Cars, and Stanford Dogs datasets, respectively. These results indicate that the proposed approach improves the effectiveness of fine-grained image recognition by improving the expression of image detail features.

Keywords:	fine-grained image recognition; Transformer; deformable convolution
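
The abstract states that the channel attention moves from neighboring channels to semantically similar ones but gives no formulas, so the sketch below is only one plausible reading, not the paper's mechanism. It assumes cosine similarity between flattened channel maps as the similarity measure, a plain average of pooled descriptors in place of learned weights, and a sigmoid gate; the names `most_similar_channels` and `similar_channel_attention` and the parameter `k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def most_similar_channels(flat, c, k):
    """Indices of the k channels most similar to channel c (excluding c itself)."""
    others = [idx for idx in range(len(flat)) if idx != c]
    # Sort by descending similarity; ties are broken by channel index.
    others.sort(key=lambda idx: (-cosine_similarity(flat[c], flat[idx]), idx))
    return others[:k]


def similar_channel_attention(feature, k=1):
    """Return one attention weight in (0, 1) per channel.

    feature: list of C channels, each a 2D list of the same size.
    Each channel's weight mixes its own pooled value with those of its k most
    similar channels (assumed: uniform average, then sigmoid).
    """
    flat = [[v for row in ch for v in row] for ch in feature]
    pooled = [sum(f) / len(f) for f in flat]  # global average pooling
    weights = []
    for c in range(len(flat)):
        group = [c] + most_similar_channels(flat, c, k)
        mixed = sum(pooled[idx] for idx in group) / len(group)
        weights.append(1.0 / (1.0 + math.exp(-mixed)))
    return weights


# Channels 0 and 2 share a pattern, as do 1 and 3, although neither pair is adjacent.
feature = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 3.0]],
]
print(most_similar_channels([[v for r in ch for v in r] for ch in feature], 0, 1))
print(similar_channel_attention(feature, k=1))
```

In this toy input an adjacency-based scheme would pair channel 0 with channel 1, while the similarity-based grouping pairs it with channel 2, which is the contrast the abstract draws.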
