Crossing textual and visual content in different application scenarios
Authors:Julien Ah-Pine  Marco Bressan  Stephane Clinchant  Gabriela Csurka  Yves Hoppenot  Jean-Michel Renders
Affiliation:(1) Xerox Research Centre Europe, 6, chemin de Maupertuis, 38240 Meylan, France
Abstract: This paper deals with multimedia information access. We propose two new approaches for hybrid text-image information processing that can be straightforwardly generalized to the broader multimodal scenario. Both approaches fall into the trans-media pseudo-relevance feedback category. Our first method uses a mixture model of the aggregate components, considering them as a single relevance concept. In our second approach, we define trans-media similarities as an aggregation of monomodal similarities between the elements of the aggregate and the new multimodal object. We also introduce the monomodal similarity measures for text and images that serve as basic components for both proposed trans-media similarities. We show how one can frame a large variety of problems so that they can be addressed with the proposed techniques: image annotation or captioning, text illustration, and multimedia retrieval and clustering. Finally, we present how these methods can be integrated into two applications: a travel blog assistant system and a tool for browsing Wikipedia that takes into account the multimedia nature of its content.
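The second approach in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which aggregation or which monomodal measures are used, so the plain average below, the function names `top_k` and `trans_media_scores`, and the toy similarity values are all assumptions made for illustration. A text query first retrieves its k nearest documents by text similarity (the pseudo-relevant aggregate), and every candidate image is then scored by its mean visual similarity to the images of that aggregate.

```python
from typing import List, Sequence


def top_k(scores: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k largest scores, best first (ties by index)."""
    return sorted(range(len(scores)), key=lambda i: (-scores[i], i))[:k]


def trans_media_scores(
    text_sim_to_query: Sequence[float],
    image_sim: Sequence[Sequence[float]],
    k: int,
) -> List[float]:
    """Score each document's image against a text query (hypothetical helper).

    text_sim_to_query[i] is the text similarity between the query and the
    text part of document i; image_sim[i][j] is the visual similarity
    between the images of documents i and j.
    Assumption: the aggregation is a simple mean over the k text neighbours.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    neighbours = top_k(text_sim_to_query, k)  # pseudo-relevant aggregate
    return [
        sum(image_sim[i][j] for i in neighbours) / len(neighbours)
        for j in range(len(image_sim))
    ]


# Toy data: four text-image documents; the query matches documents 0 and 1.
text_sim = [0.9, 0.8, 0.1, 0.0]
image_sim = [
    [1.0, 0.2, 0.7, 0.1],
    [0.2, 1.0, 0.6, 0.3],
    [0.7, 0.6, 1.0, 0.2],
    [0.1, 0.3, 0.2, 1.0],
]
scores = trans_media_scores(text_sim, image_sim, k=2)
```

In this toy example, document 2 has almost no textual overlap with the query, yet its image resembles the images of both text neighbours, so it receives a higher trans-media score than document 3. Swapping the roles of the two similarity inputs gives the image-to-text direction, as in image annotation.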
Contact Information: Gabriela Csurka

Dr. Julien Ah-Pine joined XRCE Grenoble as a Research Engineer in 2007. He is part of the Textual and Visual Pattern Analysis group, and his current research activities are related to multi-modal information retrieval and machine learning. He received his PhD degree in mathematics from Pierre and Marie Curie University (University of Paris 6). From 2003 to 2007, he was with Thales Communications, working on relational analysis, data and text mining methods, and social choice theory.

Dr. Marco Bressan is Area Manager of the Textual and Visual Pattern Analysis area at Xerox Research Centre Europe. His main research interests are statistical learning and classification; image and video semantic scene understanding; image enhancement and aesthetics; and object detection and recognition, particularly in uncontrolled environments. Prior to Xerox, several of his contributions in these fields were applied to a variety of scenarios including biometric solutions, data mining, CBIR and industrial vision. Dr. Bressan holds a BA in Applied Mathematics from the University of Buenos Aires, an M.Sc. in Computer Vision from the Computer Vision Centre in Spain and a Ph.D. in Computer Science and Artificial Intelligence from the Autonomous University of Barcelona. He is an active member of the network of Argentinean researchers abroad and one of the founders of the network of computer vision and cognitive science researchers.

Stephane Clinchant is a Ph.D. student at University Joseph Fourier (Grenoble, France) and at the Xerox Research Centre Europe, which he joined in 2005. Before joining XRCE, Stephane obtained a Master's degree in Computer Science in 2005 from the Ecole Nationale Superieure d’Electrotechnique, d’Informatique, d’Hydraulique et des Telecommunications (France). His current research interests mainly focus on Machine Learning for Natural Language Processing and Multimedia Information Access.

Dr. Gabriela Csurka is a research scientist in the Textual and Visual Pattern Analysis team at Xerox Research Centre Europe (XRCE). She obtained her Ph.D. degree (1996) in Computer Science from the University of Nice Sophia-Antipolis. Before joining XRCE in 2002, she worked in fields such as stereo vision and projective reconstruction at INRIA (Sophia Antipolis, Rhone Alpes and IRISA) and image and video watermarking at the University of Geneva and Institute Eurécom, Sophia Antipolis. The author of several publications in major journals and international conferences, she is also an active reviewer for both journals and conferences. Her current research interests concern the exploration of new technologies for image content and aesthetic analysis, cross-modal image categorization and semantic-based image segmentation.

Yves Hoppenot is in charge of the development and integration of new technologies in our European research Technology Showroom. He is a software expert for the production, office and services sectors. Yves joined the Xerox Research Centre Europe in 2001. He graduated from the Ecole Nationale Superieure des Telecommunications, Brest, in France, and received a Master of Science degree from the Tampere University of Technology in Finland.

Dr. Jean-Michel Renders joined XRCE Grenoble as a Research Engineer in 2001. His current research interests mainly focus on Machine Learning techniques applied to Statistical Natural Language Processing and Text Mining. Before joining XRCE, Jean-Michel obtained a PhD in Applied Sciences from the University of Brussels in 1993. He started his research activities in 1988, in the field of Robotics Dynamics and Control. He then joined the Joint Research Center of the European Communities to work on biological metaphors (Genetic Algorithms, Neural Networks and Immune Networks) applied to process control. After spending one year as a Visiting Scientist at York University (England), he spent 4 years applying Artificial Intelligence and Machine Learning techniques in industry (Tractebel - Suez). He then worked as a Data Mining Senior Consultant and led projects in most major Belgian banks and utilities.
Keywords:Text-image information processing  Trans-media similarities  Cross-content information retrieval and browsing  Image auto-annotation  Multimedia document generation
This article is indexed in SpringerLink and other databases.