Web image indexing by using associated texts
Authors:Zhiguo Gong  Leong Hou U  Chan Wa Cheang
Affiliation:(1) Faculty of Science and Technology, University of Macau, Macao, P.R. China
Abstract:To index Web images, the whole associated text is partitioned into a sequence of text blocks. The local relevance of a term to the corresponding image is then calculated from both its local occurrence in the block and the distance of the block to the image. The overall relevance of a term is thus the sum of all its local weight values, each multiplied by the distance factor of the corresponding text block. In the present approach, the associated text of a Web image is first partitioned into three parts: a page-oriented text (TM), a link-oriented text (LT), and a caption-oriented text (BT). Because of its large size and semantic divergence, the caption-oriented text is further partitioned into finer blocks based on the tree structure of the tag elements within the BT text. During processing, all heading nodes are pulled up so that they correlate with their semantic scopes, and a collapse algorithm is used to remove empty blocks. In our system, the relevance factors of the text blocks are determined by a greedy Two-Way-Merging algorithm.

Zhiguo Gong is an Associate Professor in the Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Macao, China. He received his BS, MS, and PhD from Hebei Normal University, Peking University, and the Chinese Academy of Sciences in 1983, 1988, and 1998, respectively. His research interests include Distributed Database, Multimedia Database, Digital Library, Web Information Retrieval, and Web Mining.

Leong Hou U is currently a Master's candidate in the Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Macao, China. He received his BS from National Chi Nan University, Taiwan in 2003. His research interests include Web Information Retrieval and Web Mining.

Chan Wa Cheang is currently a Master's candidate in the Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Macao, China. He received his BS from National Taiwan University, Taiwan in 2003. His research interests include Web Information Retrieval and Web Mining.
Keywords:Web images  Text-based  Indexing  Segmentation  Retrieval
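
The weighting scheme described in the abstract (overall relevance of a term = sum over text blocks of local weight times the block's distance factor) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the function names `local_weights` and `term_relevance`, the use of length-normalised term frequency as the local weight, and the example factor values. The abstract does not specify the local weight formula, and the paper's factors come from a greedy Two-Way-Merging algorithm that is not reproduced here.

```python
from collections import Counter


def local_weights(block_text):
    """Assumed local weight: term frequency normalised by block length."""
    terms = block_text.lower().split()
    if not terms:
        return {}  # an empty block contributes nothing
    counts = Counter(terms)
    total = len(terms)
    return {term: count / total for term, count in counts.items()}


def term_relevance(blocks, factors):
    """Sum local_weight * distance_factor for each term over all blocks."""
    if len(blocks) != len(factors):
        raise ValueError("each block needs exactly one distance factor")
    scores = {}
    for text, factor in zip(blocks, factors):
        for term, weight in local_weights(text).items():
            scores[term] = scores.get(term, 0.0) + weight * factor
    return scores


# Hypothetical blocks ordered from nearest to farthest from the image,
# with hand-picked factors that shrink with distance.
blocks = ["macau tower", "macau tower at night", "site map contact"]
factors = [1.0, 0.5, 0.1]
scores = term_relevance(blocks, factors)
```

In this toy setup, terms that occur in blocks close to the image ("macau", "tower") end up with higher scores than terms found only in a distant block ("site"), which is the behaviour the distance factors are meant to produce.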
This article is indexed in SpringerLink and other databases.