Fine-Grained Image Classification Method Based on Multi-View Fusion
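Cite this article: Huang Weifeng, Zhang Tian, Chang Dongliang, Yan Dong, Wang Jiaxi, Wang Dan, Ma Zhanyu. Fine-Grained Image Classification Method Based on Multi-View Fusion [J]. Journal of Signal Processing (信号处理), 2020, 36(9): 1607-1614. DOI: 10.16798/j.issn.1003-0530.2020.09.027
Authors: Huang Weifeng (黄伟锋), Zhang Tian (张甜), Chang Dongliang (常东良), Yan Dong (闫冬), Wang Jiaxi (王嘉希), Wang Dan (王丹), Ma Zhanyu (马占宇)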
Affiliation: South-to-North Water Diversion Middle Route Information Technology Co., Ltd
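Funding: National Key R&D Program of China (2019YFF0303300) and its Subject II (2019YFF0303302); National Natural Science Foundation of China (61773071, 61922015, U19B2036); Beijing Academy of Artificial Intelligence (BAAI2020ZJ0204); Beijing Nova Program Interdisciplinary Cooperation Project (Z191100001119140); China Scholarship Council scholarship (202006470036); Beijing University of Posts and Telecommunications Doctoral Student Innovation Fund (CX2020105)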
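Abstract: Fine-grained image classification aims to distinguish subclasses within a common category. Because datasets typically exhibit large intra-class variance and high inter-class similarity, it is more challenging than conventional image classification. In previous work, part-based and attention-based methods focused on mining discriminative regions in images while neglecting the subtle differences that separate easily confused categories. To address this problem, this paper proposes a fine-grained image classification method based on multi-view fusion. It comprises two branches: one mines local features of the image from feature maps, and the other learns global features of the image. An embedding loss is also introduced and combined with the conventional multi-class cross-entropy loss to enhance the discriminability of the features and thereby improve classification performance. Using only image-level labels, the proposed method achieves classification accuracies of 88.3%, 94.3%, and 92.4% on the three benchmark datasets CUB-200-2011, Stanford Cars, and FGVC Aircraft, respectively. The experimental results show that the proposed method has certain advantages over other fine-grained image classification methods.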

Keywords: fine-grained image classification; metric learning; convolutional neural network; attention mechanism
Received: 2020-07-08

Multi-View Comprehensive Based Fine-Grained Image Classification
Affiliation:South-to-North Water Diversion Middle Route Information Technology Co., Ltd
Abstract: Fine-grained image classification focuses on discriminating different sub-classes within a common category. Because of the large intra-class variance and large inter-class similarity that exist in the data, fine-grained image classification is considerably more challenging than traditional image classification. In previous studies, part-based and attention-based approaches focused only on mining discriminative regions in images, while ignoring the subtle differences used to distinguish easily confused categories. This paper proposes a multi-view comprehensive fine-grained image classification model with two branches: one mines local features of the image from feature maps, and the other learns global features of the image. A combination of embedding loss and softmax loss is introduced to enhance the discriminativeness of the features, thereby improving the classification performance of the model. The proposed method uses only image-level labels, and its classification accuracies on the three benchmarks CUB-200-2011, Stanford Cars, and FGVC Aircraft reach 88.3%, 94.3%, and 92.4%, respectively. Experimental results show that it has certain advantages for the fine-grained image classification task.
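The combination of a classification loss with an embedding loss over two branches can be illustrated with a minimal sketch. The abstract does not state which embedding loss is used, how the two branches are fused, or how the loss terms are weighted, so the triplet-style loss, the logit averaging in `fuse_views`, and the `weight` and `margin` parameters below are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Multi-class cross-entropy (softmax loss) for one sample."""
    return -math.log(softmax(logits)[label])


def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_embedding_loss(anchor, positive, negative, margin=1.0):
    """ASSUMPTION: a triplet loss stands in for the paper's embedding loss.

    Pulls same-class embeddings together and pushes different-class
    embeddings at least `margin` further away.
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(0.0, gap + margin)


def fuse_views(local_logits, global_logits):
    """ASSUMPTION: fuse the two branches by averaging their logits."""
    return [(l + g) / 2.0 for l, g in zip(local_logits, global_logits)]


def total_loss(local_logits, global_logits, label,
               anchor, positive, negative, weight=1.0, margin=1.0):
    """Cross-entropy on both branches plus a weighted embedding loss.

    `weight` is a hypothetical balancing coefficient.
    """
    ce = cross_entropy(local_logits, label) + cross_entropy(global_logits, label)
    emb = triplet_embedding_loss(anchor, positive, negative, margin)
    return ce + weight * emb


if __name__ == "__main__":
    local_branch = [2.0, 0.5, 0.1]    # logits from the local-feature branch
    global_branch = [1.5, 1.0, 0.2]   # logits from the global-feature branch
    fused = fuse_views(local_branch, global_branch)
    predicted = max(range(len(fused)), key=fused.__getitem__)
    loss = total_loss(local_branch, global_branch, 0,
                      anchor=[0.0, 0.0], positive=[0.1, 0.0], negative=[0.5, 0.5])
    print(predicted, round(loss, 4))
```

In this sketch the cross-entropy terms drive each branch toward the correct class, while the embedding term shapes the feature space so that samples from easily confused classes are separated by a margin.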