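Few-shot Food Recognition Combining Triplet Convolutional Neural Network with Relation Network
Cite this article: LV Yong-qiang, MIN Wei-qing, DUAN Hua, JIANG Shu-qiang. Few-shot Food Recognition Combining Triplet Convolutional Neural Network with Relation Network[J]. Computer Science, 2020, 47(1): 136-143.
Authors: LV Yong-qiang  MIN Wei-qing  DUAN Hua  JIANG Shu-qiang
Affiliations: College of Mathematics and System Science, Shandong University of Science and Technology, Qingdao, Shandong 266590; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; College of Mathematics and System Science, Shandong University of Science and Technology, Qingdao, Shandong 266590
Funding: Humanities and Social Sciences Research Project of the Ministry of Education; National Natural Science Foundation of China; Natural Science Foundation of Shandong Province; Leading Talents and Outstanding Research Team Program of Shandong University of Science and Technology
Abstract: Food recognition has attracted wide attention in fields such as food health and smart homes. Most current food recognition work relies on deep neural networks trained on large-scale labeled samples. These approaches cannot effectively recognize categories with only a few samples, so few-shot food recognition is a problem in urgent need of a solution. Current metric-learning-based few-shot recognition methods focus on similarity information between samples and neglect finer-grained intra-class and inter-class distinctions. The mainstream approach to learning intra-class and inter-class discriminative information is the triplet convolutional neural network with a linear metric function; for food images, however, a linear metric function is not discriminative enough. To address this, a learnable relation network is introduced as the non-linear metric function of the triplet convolutional neural network, and a few-shot food recognition method based on a triplet neural network with a non-linear metric is proposed. The method uses a triplet neural network to learn feature embeddings of images and then adopts a more discriminative relation network as the non-linear metric function, learning finer-grained intra-class and inter-class discriminative information through end-to-end training. In addition, an online sampling scheme for triplet samples is proposed that makes model training more stable. Experiments on the Food-101, VIREO Food-172, and ChineseFoodNet food datasets show that the proposed method improves performance by 3.0% on average over few-shot learning methods based on Siamese networks, and by 1.0% on average over the triplet neural network with a linear metric function. The paper also investigates how the threshold (margin) of the loss function, the triplet sampling parameters, and the initialization method affect experimental performance.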

Keywords: Food recognition  Few-shot recognition  Non-linear metric  Triplet neural network

Few-shot Food Recognition Combining Triplet Convolutional Neural Network with Relation Network
LV Yong-qiang, MIN Wei-qing, DUAN Hua, JIANG Shu-qiang. Few-shot Food Recognition Combining Triplet Convolutional Neural Network with Relation Network[J]. Computer Science, 2020, 47(1): 136-143.
Authors:LV Yong-qiang  MIN Wei-qing  DUAN Hua  JIANG Shu-qiang
Affiliation: College of Mathematics and System Science, Shandong University of Science and Technology, Qingdao, Shandong 266590, China; Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
Abstract: Food recognition has attracted wide attention in the fields of food health and smart homes. Most existing work focuses on food recognition with large-scale labeled samples and therefore fails to robustly recognize food categories with few samples, which makes few-shot food recognition an urgent problem. Most metric-learning-based few-shot recognition methods emphasize the similarity values of image pairs without paying substantial attention to inter-class and intra-class variations. Most works use a triplet convolutional neural network with a linear metric function to learn inter-class and intra-class information; however, the linear metric function is not discriminative enough for measuring similarities between food images. To address this problem, this paper uses a learnable relation network as a non-linear metric and proposes a triplet network with a relation network to overcome the above two disadvantages of few-shot learning and the triplet network. The model adopts a triplet network as the feature embedding network for image feature learning and uses a more discriminative relation network as the non-linear metric to learn inter-class and intra-class information. The proposed model is trained end-to-end. In addition, this paper proposes an online mining rule for triplet samples, which makes the model stable in the training stage. Comprehensive experiments were conducted on three food datasets: Food-101, VIREO Food-172, and ChineseFoodNet. Compared with popular few-shot learning methods such as Relation Network and Matching Network, the proposed model achieves an average improvement of about 3.0%, and compared with the triplet network with a linear metric, it achieves an average improvement of about 1.0%. The paper also explores the influence of the margin in the loss function, the parameter settings of online triplet sampling, and the initialization method on experimental performance.
Keywords:Food recognition  Few-shot learning  Non-linear metric  Triplet network
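
The central idea of the abstract, swapping the fixed linear metric in a triplet loss for a learnable relation module, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the one-hidden-layer relation module with a sigmoid output, the hinge form of the loss, and the margin value 0.5 are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def euclidean_triplet_loss(anchor, pos, neg, margin=0.5):
    """Baseline triplet loss with a fixed distance metric (assumed Euclidean)."""
    d_pos = math.sqrt(sum((a - p) ** 2 for a, p in zip(anchor, pos)))
    d_neg = math.sqrt(sum((a - n) ** 2 for a, n in zip(anchor, neg)))
    # Penalize triplets where the positive is not closer than the negative by `margin`.
    return max(0.0, d_pos - d_neg + margin)

def relation_score(a, b, w1, b1, w2, b2):
    """Hypothetical relation module: concatenate two embeddings, apply one
    ReLU hidden layer, and squash the output to a similarity in (0, 1)."""
    x = list(a) + list(b)
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + bias)
              for row, bias in zip(w1, b1)]
    z = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-z))

def relation_triplet_loss(score_pos, score_neg, margin=0.5):
    """Triplet hinge loss on learned similarities (assumed form): the
    anchor-positive score should exceed the anchor-negative score by `margin`."""
    return max(0.0, margin - score_pos + score_neg)
```

In an end-to-end setup, the weights of the relation module (`w1`, `b1`, `w2`, `b2` here) would be trained jointly with the embedding network, so the metric itself adapts to the data rather than being fixed in advance.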
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).