Zero-shot image classification method based on deep supervised alignment
Cite this article: Su-jia ZENG, Shan-min PANG, Wen-yu HAO. Zero-shot image classification method based on deep supervised alignment[J]. Journal of Zhejiang University (Engineering Science), 2022, 56(11): 2204-2214. DOI: 10.3785/j.issn.1008-973X.2022.11.011
Authors: Su-jia ZENG  Shan-min PANG  Wen-yu HAO
Affiliation: School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China
Funding: National Natural Science Foundation of China (61972312); General Industrial Project of the Key Research and Development Program of Shaanxi Province (2020GY-002)
Abstract: A zero-shot image classification method based on a deep supervised alignment network (DSAN) was proposed to address two problems in generalized zero-shot image classification: the poor class discrimination of attribute vectors and the bias toward classifying images into seen classes. Global supervised tags for class semantics were constructed and used together with expert-annotated attribute vectors to enhance the discrimination between class semantics. To align the manifold structures of the visual and semantic spaces, visual-feature and semantic-feature classification networks were designed to learn the class distributions of the two spaces separately, and the two distributions were then aligned indiscriminately. The principle of generative adversarial networks was used to eliminate the intrinsic heterogeneity between the two kinds of features, visual and class-semantic features were merged by element-wise addition, and a relation network was used to learn the nonlinear similarity between them. Experimental results showed that the harmonic mean classification accuracy of DSAN over seen and unseen classes exceeded that of the baseline model by 4.3%, 19.5% and 21.9% on the CUB, AWA1 and AWA2 datasets, respectively, and was 1.4% and 2.2% higher than that of the CRnet method on the SUN and APY datasets, respectively. These results demonstrate the effectiveness of the proposed method.
Keywords: zero-shot learning  attribute vector  relation network  cross-modal  generative adversarial network
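Two pieces of the abstract can be made concrete with a short sketch: the harmonic mean metric used to report generalized zero-shot accuracy, and the fusion-then-score step in which visual and class-semantic features are merged by element-wise addition before a relation network outputs a similarity. The sketch below is a minimal illustration, not the authors' implementation; the two-layer relation module and its weights `w1`, `w2` are hypothetical stand-ins for the learned network.

```python
import numpy as np

def harmonic_mean(acc_seen, acc_unseen):
    # Standard generalized zero-shot learning metric:
    # H = 2 * S * U / (S + U), where S and U are the per-class
    # accuracies on seen and unseen classes.
    return 2.0 * acc_seen * acc_unseen / (acc_seen + acc_unseen)

def relation_score(visual_feat, semantic_feat, w1, w2):
    # Hypothetical relation module: fuse the two modalities by
    # element-wise addition (as described in the abstract), then
    # map the fused vector to a scalar similarity in (0, 1).
    fused = visual_feat + semantic_feat            # element-wise add
    hidden = np.maximum(0.0, fused @ w1)           # ReLU layer
    return 1.0 / (1.0 + np.exp(-(hidden @ w2)))    # sigmoid similarity
```

For example, a model with 50% seen-class accuracy and 100% unseen-class accuracy scores H ≈ 0.667, which is why the harmonic mean penalizes methods that favor seen classes.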