

Multimodal Retrieval using Mutual Information based Textual Query Reformulation
Affiliation:1. National University of Cuyo, Engineering Faculty, Mendoza, Argentina;2. CONICET, National Research Council, Argentina;1. Centro de Investigaciones Economicas, Administrativas y Sociales, Instituto Politecnico Nacional, Lauro Aguirre 120, col. Agricultura, Del. Miguel Hidalgo, 11360, Ciudad de Mexico, Mexico;2. Department of Automatic Control, Center for Research and Advanced Studies, Av. IPN 2508, Col. San Pedro Zacatenco, 07360 Mexico City, Mexico;1. Institute of Digital Healthcare, WMG, University of Warwick, Coventry CV4 7AL, United Kingdom;2. Faculty of Engineering and Natural Sciences, Computer Science and Engineering Department, Sabancı University, Orhanlı-Tuzla, Istanbul 34956, Turkey;1. Escuela Superior de Tizayuca, Universidad Autonoma del Estado de Hidalgo, Tizayuca, Hidalgo, Mexico;2. Centro de Investigacion en Computacion, Instituto Politecnico Nacional, Mexico City, Mexico;3. Unidad Profesional Interdisciplinaria de Biotecnologia, Instituto Politecnico Nacional, Mexico City, Mexico;1. Institute for Intelligent Systems Research and Innovation, Deakin University, Geelong, VIC, 3217, Australia;2. School of Engineering and Computer Science, Victoria University of Wellington, New Zealand
Abstract:Multimodal retrieval is a well-established approach to image retrieval. Images are usually accompanied by a text caption and by associated documents describing the image. Textual query expansion as a means of enhancing image retrieval is a comparatively under-explored area. In this paper, we first study the effect of expanding the textual query on the retrieval of both the image and its associated text. Our study reveals that judicious expansion of the textual query through keyphrase extraction can lead to better results, either for text retrieval alone or for both image and text retrieval. To establish this, we use two well-known keyphrase extraction techniques based on tf-idf and KEA. While query expansion increases retrieval efficiency, it is imperative that the expansion be semantically justified. We therefore propose a graph-based keyphrase extraction model that captures the relatedness between words in terms of both mutual information and relevance feedback. Most existing work has emphasized bridging the semantic gap by using textual and visual features, either individually or in combination. The way these text and image features are combined determines the efficacy of any retrieval. For this purpose, we adopt Fisher-LDA to determine appropriate weights for each modality, giving an informed decision process for selecting the feature set to be infused into the final query. Our proposed algorithm is shown to significantly outperform the aforementioned keyphrase extraction algorithms for query expansion. A rigorous set of experiments on the ImageCLEF-2011 Wikipedia Retrieval task dataset validates our claim that capturing the semantic relation between words through mutual information, followed by expansion of the textual query using relevance feedback, can simultaneously enhance both text and image retrieval.
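The mutual-information component of the expansion idea can be illustrated with a minimal sketch. The toy corpus, query, and scoring below are hypothetical and not the authors' implementation: candidate terms are scored by pointwise mutual information (PMI) of document-level co-occurrence with the query terms, and the highest-scoring terms are appended to the query.

```python
import math
from collections import Counter
from itertools import combinations

# Toy corpus of image captions (hypothetical data, for illustration only).
docs = [
    "white tiger resting zoo enclosure",
    "bengal tiger walking tall grass",
    "zoo visitors watching white tiger",
    "grass field grazing deer",
]

tokenized = [set(d.split()) for d in docs]
n_docs = len(tokenized)

# Document frequencies of single terms and of unordered term pairs.
term_df = Counter(t for doc in tokenized for t in doc)
pair_df = Counter(frozenset(p) for doc in tokenized
                  for p in combinations(sorted(doc), 2))

def pmi(a, b):
    """Pointwise mutual information of terms a and b over document co-occurrence."""
    joint = pair_df[frozenset((a, b))] / n_docs
    if joint == 0:
        return float("-inf")
    return math.log(joint / ((term_df[a] / n_docs) * (term_df[b] / n_docs)))

def expand_query(query_terms, k=2):
    """Append the k candidate terms with the highest summed PMI to the query."""
    vocab = set(term_df) - set(query_terms)
    def score(t):
        vals = [pmi(q, t) for q in query_terms]
        return sum(v for v in vals if v != float("-inf"))
    ranked = sorted(vocab, key=score, reverse=True)
    return list(query_terms) + ranked[:k]

print(expand_query(["tiger"]))
```

For the query "tiger", the terms appended come from captions that co-occur with it ("white", "zoo", "bengal", ...), while terms from unrelated captions ("deer", "field") score zero or negative PMI and are not selected.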
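The Fisher-LDA weighting step can likewise be sketched. The score matrix and relevance labels below are hypothetical: given (text-score, image-score) pairs from a first retrieval round, labelled relevant or non-relevant via relevance feedback, the Fisher discriminant direction supplies the weight given to each modality when the two scores are fused.

```python
import numpy as np

# Hypothetical (text_score, image_score) pairs from a first retrieval round,
# labelled relevant (1) / non-relevant (0) via relevance feedback.
scores = np.array([
    [0.90, 0.70], [0.80, 0.80], [0.70, 0.90], [0.85, 0.60],  # relevant
    [0.40, 0.50], [0.30, 0.20], [0.50, 0.30], [0.20, 0.40],  # non-relevant
])
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])

def fisher_weights(X, y):
    """Fisher discriminant direction w = S_W^{-1} (m1 - m0), normalised to sum to 1."""
    m1 = X[y == 1].mean(axis=0)
    m0 = X[y == 0].mean(axis=0)
    Sw = np.cov(X[y == 1].T) + np.cov(X[y == 0].T)  # within-class scatter
    w = np.linalg.solve(Sw, m1 - m0)
    return w / w.sum()

w = fisher_weights(scores, labels)   # per-modality weights, e.g. text vs. image
fused = scores @ w                   # fused relevance score per document
```

With this toy data the direction assigns both modalities a positive weight, and the fused scores of the relevant documents exceed those of the non-relevant ones, which is exactly the separation the weighting is meant to achieve.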
This article is indexed in ScienceDirect and other databases.
