Image Annotation Based on Middle-Layer Convolution Features of Deep Learning

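Cite as:	YU Ning, SONG Hai-yu, SUN Dong-yang, WANG Peng-jie, YAO Jin-xin. Image Annotation Based on Middle-Layer Convolution Features of Deep Learning[J]. Journal of Graphics, 2019, 40(5): 872. DOI: 10.11996/JG.j.2095-302X.2019050872

Authors:	YU Ning, SONG Hai-yu, SUN Dong-yang, WANG Peng-jie, YAO Jin-xin

Affiliation:	College of Computer Science and Engineering, Dalian Nationalities University, Dalian, Liaoning 116600; Anxundasheng Medical Technology Company, Beijing 100020

Funding:	National Natural Science Foundation of China (61300089); Natural Science Foundation of Liaoning Province (201602199, 2019-ZD-0182); Liaoning Province Higher Education Innovative Talents Support Program (LR2016071)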

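Abstract:	Image annotation models based on deep features are complex to train and costly in time and space. To address this, the paper proposes an image annotation method that represents an image's visual features with middle-layer features of a deep learning model and represents each semantic concept with the mean vector of its positive samples. First, the convolution output of a middle layer of a pre-trained deep learning model is taken directly as the low-level visual feature, and images are represented by sparse coding. Next, the positive-sample mean vector method is used to construct a visual feature vector for each textual word, yielding a visual feature vector library for the text vocabulary. Finally, the similarity between the test image and the visual feature vector of every textual word is computed, and the words with the highest similarity are taken as annotations. Experiments on multiple datasets demonstrate the effectiveness of the proposed method: in terms of F1, its annotation performance on the IAPR TC-12 dataset is 32% and 60% higher than that of 2PKNN and JEC, respectively, both using end-to-end deep features.
Keywords:	deep learning; image annotation; convolution; mean vector of positive samples; feature vector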
Image Annotation Based on Middle-Layer Convolution Features of Deep Learning

YU Ning,SONG Hai-yu,SUN Dong-yang,WANG Peng-jie,YAO Jin-xin. Image Annotation Based on Middle-Layer Convolution Features of Deep Learning[J]. Journal of Graphics, 2019, 40(5): 872. DOI: 10.11996/JG.j.2095-302X.2019050872

Authors:	YU Ning, SONG Hai-yu, SUN Dong-yang, WANG Peng-jie, YAO Jin-xin

Affiliation:	(1. College of Computer Science and Engineering, Dalian Nationalities University, Dalian, Liaoning 116600, China; 2. Anxundasheng Medical Technology Company, Beijing 100020, China)

Abstract:	Image annotation based on deep features typically requires complex model training and incurs high time and space costs. To overcome these shortcomings, an efficient and effective approach is proposed in which visual features are described by middle-layer features of a deep learning model and each semantic concept is represented by the mean vector of its positive samples. First, the convolution result output directly by a middle layer of a pre-trained deep learning model is used as the low-level visual feature, and sparse coding is used to represent each image. Then, a visual feature vector is constructed for each textual word by the mean vector method of positive samples, giving a visual feature vector database for the text vocabulary. Finally, the similarities between the visual feature vector of the test image and those of all textual words are computed, and the words with the largest similarities are selected as annotation words. Experimental results on several datasets demonstrate the effectiveness of the proposed method. In terms of F1-measure, on the IAPR TC-12 dataset the proposed method improves on 2PKNN and JEC with end-to-end deep features by 32% and 60%, respectively.

Keywords:	deep learning; image annotation; convolution; mean vector of positive samples; feature vector
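
The last two steps described in the abstract (one positive-sample mean vector per word, then ranking words by similarity to the test image) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the similarity measure, so cosine similarity is assumed here, and the feature vectors are taken as precomputed NumPy arrays rather than sparse-coded middle-layer convolution features. The function names `build_word_vectors` and `annotate` are hypothetical.

```python
import numpy as np


def build_word_vectors(features, labels, vocab_size):
    """Build one vector per word as the mean of its positive samples.

    features: (n_images, dim) array of training image feature vectors
              (assumed precomputed; feature extraction is out of scope here).
    labels:   one set of word indices per training image.
    """
    dim = features.shape[1]
    sums = np.zeros((vocab_size, dim))
    counts = np.zeros(vocab_size)
    for feat, words in zip(features, labels):
        for w in words:
            sums[w] += feat      # image is a positive sample for word w
            counts[w] += 1
    counts[counts == 0] = 1      # words without positive samples keep a zero vector
    return sums / counts[:, None]


def annotate(image_feature, word_vectors, top_k=5):
    """Return the indices of the top_k words most similar to the image.

    Cosine similarity is an assumption; the abstract only says "similarity".
    """
    eps = 1e-12                  # guards against division by zero
    img = image_feature / (np.linalg.norm(image_feature) + eps)
    norms = np.linalg.norm(word_vectors, axis=1, keepdims=True) + eps
    sims = (word_vectors / norms) @ img
    return np.argsort(-sims)[:top_k].tolist()


# Toy usage: three training images, a two-word vocabulary.
train_features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
train_labels = [{0}, {0}, {1}]
word_vectors = build_word_vectors(train_features, train_labels, vocab_size=2)
predicted = annotate(np.array([1.0, 0.0]), word_vectors, top_k=1)
```

Because each word is reduced to a single vector, annotating a test image in this sketch costs one similarity computation per vocabulary word, with no model training beyond averaging.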