深度学习跨模态图文检索研究综述 Survey of Research on Deep Learning Image-Text Cross-Modal Retrieval


Cite this article:	刘颖, 郭莹莹, 房杰, 范九伦, 郝羽, 刘继明. 深度学习跨模态图文检索研究综述[J]. 计算机科学与探索, 2022, 16(3): 489-511.

Authors:	刘颖, 郭莹莹, 房杰, 范九伦, 郝羽, 刘继明

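Affiliations:	Center for Image and Information Processing, Xi'an University of Posts and Telecommunications, Xi'an 710121; International Joint Research Center for Wireless Communication and Information Processing Technology of Shaanxi Province, Xi'an 710121; Key Laboratory of Electronic Information Application Technology for Crime Scene Investigation, Ministry of Public Security, Xi'an University of Posts and Telecommunications, Xi'an 710121, Center for Image and Information Processing, Xi'an University of Posts and Telecommunications, Xi'an 710121, Center for Image and Information Processing, Xi'an University of Posts and Telecommunications, Xi'an 710121; Key Laboratory of Electronic Information Application Technology for Crime Scene Investigation, Ministry of Public Security, Xi'an University of Posts and Telecommunications, Xi'an 710121, School of Communications and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121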

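Abstract:	With the rise of deep neural networks, multimodal learning has attracted wide attention. Cross-modal retrieval is an important branch of multimodal learning; its aim is to mine the relationships between samples of different modalities, that is, to use a sample of one modality to retrieve samples of another modality with similar semantics. In recent years, cross-modal retrieval has gradually become a frontier and hot topic of academic research at home and abroad, and it is an important direction for the future development of information retrieval. First, focusing on the latest progress in deep learning image-text cross-modal retrieval, the methods based on...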
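Keywords:	cross-modal retrieval; deep learning; feature learning; image-text matching; real value representation; binary representation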
Survey of Research on Deep Learning Image-Text Cross-Modal Retrieval

LIU Ying, GUO Yingying, FANG Jie, FAN Jiulun, HAO Yu, LIU Jiming. Survey of Research on Deep Learning Image-Text Cross-Modal Retrieval[J]. Journal of Frontiers of Computer Science and Technology, 2022, 16(3): 489-511.

Authors:	LIU Ying, GUO Yingying, FANG Jie, FAN Jiulun, HAO Yu, LIU Jiming

Affiliation:	(Center for Image and Information Processing, Xi'an University of Posts and Telecommunications, Xi'an 710121, China; International Joint Research Center for Wireless Communication and Information Processing Technology of Shaanxi Province, Xi'an 710121, China; Key Laboratory of Electronic Information Application Technology for Crime Scene Investigation, Ministry of Public Security, Xi'an University of Posts and Telecommunications, Xi'an 710121, China; School of Communications and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China)

Abstract:	With the rapid development of deep neural networks, multimodal learning has attracted wide attention. Cross-modal retrieval is an important branch of multimodal learning. Its fundamental purpose is to uncover the relationships between samples of different modalities, that is, to use a sample of one modality to retrieve samples of another modality with similar semantics. In recent years, cross-modal retrieval has gradually become a frontier and hot topic of academic research, and it is an important direction for the future development of information retrieval. This paper focuses on the latest progress in deep learning-based image-text cross-modal retrieval and systematically reviews the development trends of real-valued representation-based and binary representation-based learning methods. Real-valued representation-based methods aim to improve semantic relevance and thus retrieval accuracy, whereas binary representation-based methods aim to improve the efficiency of image-text cross-modal retrieval and reduce storage space. In addition, the commonly used public datasets in the field of image-text cross-modal retrieval are summarized, and the performance of various algorithms on different datasets is compared. In particular, this paper summarizes and analyzes specific applications of cross-modal retrieval techniques in the fields of public security, media, and medicine. Finally, in light of the state of the art, development trends and future research directions are discussed.

Keywords:	cross-modal retrieval; deep learning; feature learning; image-text matching; real value representation; binary representation
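
The trade-off the abstract draws between real-valued and binary representations can be illustrated with a minimal sketch. It is not taken from the paper: the four-dimensional embeddings are made-up toy values, the function names are hypothetical, and simple sign thresholding stands in for the learned hash functions that the surveyed binary methods would actually use. Real-valued vectors are ranked by cosine similarity; binary codes are ranked by Hamming distance, which needs only an XOR and a bit count and stores one bit per dimension.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two real-valued embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def binarize(v):
    """Pack a vector into an int bit code by sign thresholding.

    Assumption for illustration only: real hashing methods learn this mapping.
    """
    code = 0
    for x in v:
        code = (code << 1) | (1 if x > 0 else 0)
    return code


def hamming_distance(a, b):
    """Number of differing bits between two integer bit codes."""
    return bin(a ^ b).count("1")


def rank_real(query, gallery):
    """Gallery indices ordered by descending cosine similarity to the query."""
    return sorted(range(len(gallery)),
                  key=lambda i: -cosine_similarity(query, gallery[i]))


def rank_binary(query_code, gallery_codes):
    """Gallery indices ordered by ascending Hamming distance to the query."""
    return sorted(range(len(gallery_codes)),
                  key=lambda i: hamming_distance(query_code, gallery_codes[i]))


# Toy embeddings (hypothetical values): one text query, three candidate images.
text_query = [0.9, -0.2, 0.4, -0.7]
image_gallery = [
    [0.8, -0.1, 0.5, -0.6],   # close to the query
    [-0.7, 0.3, -0.4, 0.9],   # roughly opposite to the query
    [0.1, 0.9, -0.3, -0.2],   # weakly related
]

real_ranking = rank_real(text_query, image_gallery)
binary_ranking = rank_binary(binarize(text_query),
                             [binarize(v) for v in image_gallery])
print(real_ranking, binary_ranking)
```

In this toy case both rankings agree, but the binary codes discard magnitude information, which is why binary methods generally trade some accuracy for speed and storage.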
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).
