Multimodal-guided Local Feature Selection for Few-shot Learning
LV Tian-Gen, HONG Ri-Chang, HE Jun, HU She-Jiao. Multimodal-guided Local Feature Selection for Few-shot Learning[J]. Journal of Software, 2023, 34(5): 2068-2082.
Authors:LV Tian-Gen  HONG Ri-Chang  HE Jun  HU She-Jiao
Affiliation:School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230031, China;School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230031, China;Institute of Dataspace, Hefei Comprehensive National Science Center, Hefei 230026, China
Foundation item:Key Program of the National Natural Science Foundation of China (61932009)
Abstract:Deep learning models have yielded impressive results in many tasks. However, this success hinges on the availability of a large amount of labeled samples for model training, and deep models tend to perform poorly in scenarios where labeled samples are scarce. To this end, few-shot learning (FSL) has recently been proposed to study how to learn quickly from a small number of samples, and it has achieved good performance with the adoption of meta-learning. Nevertheless, two issues remain: 1) existing FSL methods usually recognize novel concepts solely based on the visual characteristics of samples, without integrating information from other modalities; 2) by following the paradigm of meta-learning, a model learns generic and transferable knowledge from a large number of similar, simulated few-shot tasks, which inevitably leads to a feature space with good transferability but weak representation ability. To tackle these two issues, we introduce model pre-training and multimodal learning techniques into the FSL process and propose a new multimodal-guided local feature selection strategy for few-shot learning. Specifically, we first pre-train the model to recognize a set of known (seen) classes, each with abundant samples, which greatly improves the representation ability of the model. Then, in the meta-learning stage, we further optimize the pre-trained model on a set of randomly sampled few-shot tasks, which improves its transferability, that is, its ability to adapt to the challenging FSL setting. The proposed multimodal-guided local feature selection strategy, employed during meta-learning, leverages both visual and textual features, which helps construct more discriminative sample features and alleviates the degradation of the model's representation ability. The resulting sample features are finally used for few-shot learning. Experiments on three benchmark datasets, namely miniImageNet, CIFAR-FS, and FC100, demonstrate that the proposed FSL method achieves better results.
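The abstract describes the method only at a high level. Below is a minimal sketch of what the multimodal-guided local feature selection step could look like, assuming a PyTorch setup in which a convolutional backbone produces a spatial grid of local visual features and a text encoder provides class-name embeddings projected into the same space. All function and variable names (select_local_features, prototype_logits, top_k, etc.) are illustrative assumptions, not the authors' released code.

```python
# Sketch of multimodal-guided local feature selection for few-shot learning.
# Assumes: local_feats come from a CNN backbone as an (N, C, H, W) grid, and
# text_emb is an (N, C) class-name embedding already projected into the visual
# feature space. Names and the top-k selection rule are hypothetical.
import torch
import torch.nn.functional as F

def select_local_features(local_feats, text_emb, top_k=8):
    """Keep the top-k local visual features most similar to the class text embedding."""
    n, c, h, w = local_feats.shape
    locals_flat = local_feats.flatten(2).transpose(1, 2)                       # (N, H*W, C)
    sim = F.cosine_similarity(locals_flat, text_emb.unsqueeze(1), dim=-1)      # (N, H*W)
    idx = sim.topk(top_k, dim=1).indices                                       # (N, top_k)
    chosen = torch.gather(locals_flat, 1, idx.unsqueeze(-1).expand(-1, -1, c)) # (N, top_k, C)
    return chosen.mean(dim=1)                                                  # (N, C) pooled descriptor

def prototype_logits(support_feats, support_labels, query_feats, n_way):
    """Prototype-style classification over the selected features."""
    protos = torch.stack([support_feats[support_labels == k].mean(0)
                          for k in range(n_way)])                              # (n_way, C)
    return -torch.cdist(query_feats, protos)                                   # negative distance as logits
```

In an episodic run, support and query images would be encoded by the (pre-trained, then meta-tuned) backbone, filtered by select_local_features using the corresponding class-name embeddings, and classified with prototype_logits; the paper's actual selection and fusion details may differ from this sketch.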
Keywords:few-shot learning  multimodal fusion  image classification  representation learning
Received:2022-04-18; Revised:2022-05-29