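Unsupervised Fine-grained Video Categorization via Adaptation Learning Across Domains and Modalities
Cite this article: HE Xiang-Teng, PENG Yu-Xin. Unsupervised Fine-grained Video Categorization via Adaptation Learning Across Domains and Modalities[J]. Journal of Software, 2021, 32(11): 3482-3495.
Authors: HE Xiang-Teng  PENG Yu-Xin
Affiliation: Wangxuan Institute of Computer Technology, Peking University, Beijing 100080, China
Funding: National Natural Science Foundation of China (61925201, 61771025)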
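Abstract: Fine-grained video categorization aims to recognize fine-grained subcategories within a coarse-grained category and is a highly challenging task in computer vision. Because annotating video data is very costly while annotating images is comparatively cheap, and because fine-grained image categorization has already made notable progress, a natural idea is to adaptively transfer the knowledge learned in fine-grained image categorization to fine-grained video categorization in an unsupervised manner, without annotation. However, images and videos from different sources exhibit both a domain distinction and a modality distinction, so models for fine-grained image categorization cannot be applied directly to fine-grained video categorization. To achieve unsupervised fine-grained video categorization, an unsupervised discriminative adaptation network is proposed, which transfers the discriminative localization ability from fine-grained image categorization to fine-grained video categorization. Furthermore, a progressive pseudo-labeling strategy is proposed to iteratively guide the unsupervised discriminative adaptation network to learn the data distribution of the target-domain videos. The cross-domain and cross-modality adaptation ability of the method is verified on the CUB-200-2011 and Cars-196 image datasets and the YouTube Birds and YouTube Cars video datasets, and the experimental results demonstrate its advantage in unsupervised fine-grained video categorization.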

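Keywords: fine-grained video categorization  unsupervised discriminative adaptation network  domain distinction  modality distinction  domain adaptation
Received: 2019/9/9
Revised: 2020/3/9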

Unsupervised Fine-grained Video Categorization via Adaptation Learning Across Domains and Modalities
HE Xiang-Teng,PENG Yu-Xin.Unsupervised Fine-grained Video Categorization via Adaptation Learning Across Domains and Modalities[J].Journal of Software,2021,32(11):3482-3495.
Authors:HE Xiang-Teng  PENG Yu-Xin
Affiliation:Wangxuan Institute of Computer Technology, Peking University, Beijing 100080, China
Abstract:Fine-grained video categorization is a highly challenging task that aims to discriminate similar subcategories belonging to the same basic-level category. Given the significant advances in fine-grained image categorization and the high cost of labeling video data, it is natural to adapt the knowledge learned from images to videos in an unsupervised manner. However, models learned from images cannot be applied directly to recognize fine-grained instances in videos, because of the domain distinction and the modality distinction between images and videos. Therefore, this study proposes the unsupervised discriminative adaptation network (UDAN), which transfers the ability of discriminative localization from images to videos. A progressive pseudo-labeling strategy is adopted to iteratively guide UDAN to approximate the distribution of the target video data. To verify the effectiveness of the proposed UDAN approach, image-to-video adaptation tasks are performed, adapting the knowledge learned from the CUB-200-2011 and Cars-196 image datasets to the YouTube Birds and YouTube Cars video datasets. Experimental results demonstrate the advantage of the proposed UDAN approach for unsupervised fine-grained video categorization.
Keywords:fine-grained video categorization  unsupervised discriminative adaptation network  domain distinction  modality distinction  domain adaptation
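
The abstract names a progressive pseudo-labeling strategy but gives no details of it. As a rough illustration of the general idea only, the sketch below uses a generic confidence-threshold scheme: each round, target videos whose predicted class probability clears a threshold receive pseudo labels, the model is refit on them, and the threshold is relaxed. The function names, the linear threshold schedule, and the `predict`/`fit` callbacks are all assumptions made for this example and are not taken from the paper.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def select_pseudo_labels(
    probs: Sequence[Sequence[float]], threshold: float
) -> List[Tuple[int, int]]:
    """Return (sample_index, predicted_class) for confident predictions."""
    selected = []
    for i, p in enumerate(probs):
        label = max(range(len(p)), key=lambda k: p[k])  # argmax class
        if p[label] >= threshold:
            selected.append((i, label))
    return selected


def threshold_schedule(start: float, end: float, rounds: int) -> List[float]:
    """Linearly relax the confidence threshold (an assumed schedule)."""
    if rounds == 1:
        return [start]
    step = (start - end) / (rounds - 1)
    return [start - step * r for r in range(rounds)]


def progressive_pseudo_labeling(
    predict: Callable[[Sequence], Sequence[Sequence[float]]],
    fit: Callable[[List[Tuple[object, int]]], None],
    videos: Sequence,
    rounds: int = 3,
    start: float = 0.9,
    end: float = 0.6,
) -> Dict[int, int]:
    """Iteratively pseudo-label unlabeled target videos and refit the model.

    `predict` and `fit` are hypothetical callbacks standing in for a real
    model: `predict` maps videos to class probabilities, and `fit` retrains
    on (video, pseudo_label) pairs.
    """
    labeled: Dict[int, int] = {}
    for t in threshold_schedule(start, end, rounds):
        probs = predict(videos)
        for i, label in select_pseudo_labels(probs, t):
            labeled[i] = label  # newer predictions overwrite older ones
        fit([(videos[i], y) for i, y in sorted(labeled.items())])
    return labeled
```

Starting with a strict threshold keeps early pseudo labels reliable; relaxing it lets more target videos contribute once the model has adapted. Whether the paper follows this schedule is not stated in the abstract.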