This paper reports a document retrieval technique that retrieves machine-printed Latin-based document images through word shape coding. Adopting the idea of image annotation, a word shape coding scheme is proposed, which converts each word image into a word shape code by using a few shape features. The text contents of imaged documents are thus captured by a document vector constructed with the converted word shape code and word frequency information. Similarities between different document images are then gauged based on the constructed document vectors. We divide the retrieval process into two stages. Based on the observation that documents of the same language share a large number of high-frequency language-specific stop words, the first stage retrieves documents with the same underlying language as that of the query document. The second stage then re-ranks the documents retrieved in the first stage based on the topic similarity. Experiments show that document images of different languages and topics can be retrieved properly by using the proposed word shape coding scheme.

This paper presents a graph-theoretic approach for interactive region-based image retrieval. When dealing with image matching problems, we use graphs to represent images, transform the region correspondence estimation problem into an inexact graph matching problem, and propose an optimization technique to derive the solution. We then define the image distance in terms of the estimated region correspondence. In the relevance feedback steps, with the estimated region correspondence, we propose to use a maximum likelihood method to re-estimate the ideal query and the image distance measurement. Experimental results show that the proposed graph-theoretic image matching criterion outperforms the other methods, which ignore spatial adjacency relationships within images. Furthermore, our maximum likelihood method combined with the estimated region correspondence improves the retrieval performance in feedback steps.

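To enable keyword-based retrieval of Uyghur document images, a keyword retrieval method based on coarse-to-fine hierarchical matching is proposed. An improved projection segmentation method splits the preprocessed document images into a library of word images, and template matching performs a coarse match of the keyword. On top of the coarse matches, histogram of oriented gradients (HOG) feature vectors are extracted from the word images, and a support vector machine (SVM) classifier is trained on these feature vectors to perform keyword image retrieval. In experiments on a database of 108 document images, the average retrieval precision was 91.14% and the average recall was 79.31%, indicating that the method can effectively perform keyword-based retrieval of Uyghur document images.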

This paper presents a document retrieval technique that is capable of searching document images without OCR (optical character recognition). The proposed technique retrieves document images by a new word shape coding scheme, which captures the document content by annotating each word image with a word shape code. In particular, we annotate word images by using a set of topological shape features including character ascenders/descenders, character holes, and character water reservoirs. With the annotated word shape codes, document images can be retrieved by either query keywords or a query document image. Experimental results show that the proposed document image retrieval technique is fast, efficient, and tolerant of various types of document degradation.
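The annotation idea in the abstract above can be illustrated with a toy sketch. The paper computes its features from word images; the sketch below instead works on plain text and looks up hand-written letter classes, purely as a stand-in. The function name `word_shape_code`, the class letters `A`/`D`/`x`, the hole marker `o`, and the letter sets are all assumptions made for illustration, and water reservoirs are not modelled.

```python
# Hypothetical letter classes for lowercase Latin letters (illustrative only).
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")
HOLES = set("abdegopq")


def word_shape_code(word):
    """Return a coarse shape code for a lowercase word.

    Each character contributes 'A' (ascender), 'D' (descender) or
    'x' (x-height only), followed by 'o' if the glyph encloses a hole.
    Characters outside a-z are coded as '?'.
    """
    parts = []
    for ch in word:
        if not ("a" <= ch <= "z"):
            parts.append("?")
            continue
        if ch in ASCENDERS:
            symbol = "A"
        elif ch in DESCENDERS:
            symbol = "D"
        else:
            symbol = "x"
        if ch in HOLES:
            symbol += "o"
        parts.append(symbol)
    return "".join(parts)
```

With these classes, "dog" and "bog" receive the same code, which shows why such codes are compact but ambiguous: different words can collide, so retrieval relies on the statistics of many words rather than on any single code.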

Information spotting in scanned historical document images is a very challenging task. The joint use of the mechanical press and of human controlled inking introduced great variability in ink level within a book or even within a page. Consequently, characters are often broken or merged together and thus become difficult to segment and recognize. The limitations of commercial OCR engines for information retrieval in historical document images have inspired alternative means of identifying given words in such documents. We present a word spotting method for scanned documents in order to find the word images that are similar to a query word, without assuming a correct segmentation of the words into characters. The connected components are first processed to transform a word pattern into a sequence of sub-patterns. Each sub-pattern is represented by a sequence of feature vectors. A modified Edit distance is proposed to perform a segmentation-driven string matching and to compute the Segmentation Driven Edit (SDE) distance between the words to be compared. The set of SDE operations is defined to obtain the word segmentations that are the most appropriate to evaluate their similarity. These operations cope effectively with broken and touching characters in words. The distortion of character shapes is handled by coupling the string matching process with local shape comparisons that are achieved by Dynamic Time Warping (DTW). The costs of the SDE operations are provided by the DTW distances. A sub-optimal version of the SDE string matching is also proposed to reduce the computation time; it did not lead to a great decrease in performance. A query can be entered either by example or as text typed on the keyboard. Textual queries can be used to directly spot the word without the need to synthesize its image, as long as character prototype images are available. Results are presented for different documents and compared with other methods, showing the efficiency of our method.
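The abstract above uses Dynamic Time Warping to compare sub-patterns represented as sequences of feature vectors. The sketch below is the textbook DTW recurrence only, not the SDE distance itself; the function name `dtw_distance`, the Euclidean local cost, and the unconstrained warping path are assumptions made for illustration.

```python
import math


def dtw_distance(seq_a, seq_b):
    """Classic DTW cost between two sequences of equal-dimension vectors.

    Local cost is the Euclidean distance between feature vectors
    (an assumption; the paper's local cost may differ).
    """
    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best alignment cost of seq_a[:i] with seq_b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # seq_a advances, seq_b element repeated
                cost[i][j - 1],      # seq_b advances, seq_a element repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because one element may be matched against several consecutive elements of the other sequence, a stretched copy of a pattern can align at zero cost, which is the property that makes DTW attractive for distorted character shapes.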

As one of the most pervasive methods of individual identification and document authentication, signatures present convincing evidence and provide an important form of indexing for effective document image processing and retrieval in a broad range of applications. However, detection and segmentation of free-form objects such as signatures from cluttered background is currently an open document analysis problem. In this paper, we focus on two fundamental problems in signature-based document image retrieval. First, we propose a novel multiscale approach to jointly detecting and segmenting signatures from document images. Rather than focusing on local features that typically have large variations, our approach captures the structural saliency using a signature production model and computes the dynamic curvature of 2D contour fragments over multiple scales. This detection framework is general and computationally tractable. Second, we treat the problem of signature retrieval in the unconstrained setting of translation, scale, and rotation invariant nonrigid shape matching. We propose two novel measures of shape dissimilarity based on anisotropic scaling and registration residual error and present a supervised learning framework for combining complementary shape information from different dissimilarity metrics using LDA. We quantitatively study state-of-the-art shape representations, shape matching algorithms, measures of dissimilarity, and the use of multiple instances as query in document image retrieval. We further demonstrate our matching techniques in offline signature verification. Extensive experiments using large real-world collections of English and Arabic machine-printed and handwritten documents demonstrate the excellent performance of our approaches.

Word searching in documents with non-structural layout, such as graphical documents, is a difficult task due to the arbitrary orientations of text words and the presence of graphical symbols. This paper presents an efficient indexing and retrieval approach for word searching in such documents. The proposed indexing scheme stores spatial information of the text characters of a document using a character spatial feature table (CSFT). The spatial feature of a text component is derived from the neighbor component information. The character labeling of multi-scaled and multi-oriented components is performed using support vector machines. For searching, the positional information of characters is obtained from the query string by splitting it into possible combinations of character pairs. Each of these character pairs is used to look up the position of the corresponding text in the document with the help of the CSFT. Next, the retrieved text components are joined and formed into a sequence by spatial information matching. A string matching algorithm is then applied to match the query word with the character pair sequence in documents. The experimental results are presented on two different datasets of graphical documents: a maps dataset and a seal/logo image dataset. The results show that the method searches efficiently for query words in unconstrained document layouts of arbitrary orientation.

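The vocabulary-tree-based bag-of-words representation is currently the mainstream algorithm in image retrieval. To address the lack of spatial context information in traditional vocabulary tree methods, an image retrieval method based on a spatial-context-weighted vocabulary tree is proposed. Within the vocabulary tree framework, the method first generates a spatial context description for each SIFT point. It then uses the spatial context similarity between SIFT points to weight the matching scores between SIFT features, which yields the similarity between images. Finally, image retrieval is completed by ranking the similarities. Experimental results show that the method substantially improves image retrieval performance and also scales well to large image databases.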

Image retrieval is an important problem for researchers in the computer vision and content-based image retrieval (CBIR) fields. Over the last decades, many image retrieval systems have represented an image as a set of extracted low-level features such as color, texture and shape. These systems then calculate similarity metrics between features in order to find images similar to a query image. The disadvantage of this approach is that images that are visually and semantically different may be similar in the low-level feature space. It is therefore necessary to develop tools to optimize the retrieval of information. Integration of vector space models is one solution to improve the performance of image retrieval. In this paper, we present an efficient and effective retrieval framework which includes a vectorization technique combined with a pseudo relevance model. The idea is to transform any similarity matching model (between images) into a vector space model providing a score. A study of several methodologies to obtain the vectorization is presented. Experiments have been undertaken on the Wang, Oxford5k and Inria Holidays datasets to show the performance of our proposed framework.

Text Retrieval from Document Images Based on Word Shape Analysis   Total citations: 2, self-citations: 1, citations by others: 2
In this paper, we propose a method of text retrieval from document images using a similarity measure based on word shape analysis. We directly extract image features instead of using optical character recognition. Document images are segmented into word units, and features called vertical bar patterns are then extracted from these word units through local extrema point detection. All vertical bar patterns are used to build document vectors. Lastly, we obtain the pair-wise similarity of document images by means of the scalar product of the document vectors. Four corpora of news articles were used to test the validity of our method. During the test, the similarity of document images obtained with this method was compared with the result obtained on the ASCII versions of those documents using the N-gram algorithm for text documents.
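The last step of the abstract above, pair-wise similarity as a scalar product of document vectors, can be sketched as follows. The pattern extraction itself is not shown; each document is assumed to arrive as a list of already-extracted pattern labels. The names `document_vector` and `similarity`, the use of raw counts as weights, and the unit-length normalization are assumptions made for illustration.

```python
import math
from collections import Counter


def document_vector(patterns):
    """Build a unit-length sparse vector from a list of pattern labels.

    Weights are raw occurrence counts divided by the vector's Euclidean
    norm (an assumed weighting; the paper may weight differently).
    """
    counts = Counter(patterns)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    if norm == 0:
        return {}
    return {label: c / norm for label, c in counts.items()}


def similarity(vec_a, vec_b):
    """Scalar product of two sparse vectors stored as dicts."""
    if len(vec_a) > len(vec_b):
        vec_a, vec_b = vec_b, vec_a  # iterate over the shorter vector
    return sum(w * vec_b.get(label, 0.0) for label, w in vec_a.items())
```

With unit-length vectors the scalar product equals the cosine of the angle between the documents, so identical pattern distributions score 1 and documents with no patterns in common score 0.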

In this paper, we propose a keyword retrieval system for locating words in historical Mongolian document images. Based on word spotting technology, a collection of historical Mongolian document images is converted into a collection of word images by word segmentation, and a number of profile-based features are extracted to represent word images. For each word image, a fixed-length feature vector is formed by taking an appropriate number of the complex coefficients of the discrete Fourier transform of each profile feature. The system supports online image-to-image matching by calculating similarities between a query word image and each word image in the collection, and a ranked result is returned in descending order of similarity. The query word image can be generated at retrieval time by synthesizing a sequence of glyphs. Experimental evaluations confirm the performance of the system.
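The fixed-length representation described above can be sketched with a direct evaluation of the discrete Fourier transform: whatever the width of the word image, keeping the first `k` coefficients of a profile yields a vector of length `k`. The function name `dft_descriptor`, the choice of the lowest `k` frequencies, and the division by the profile length are assumptions made for illustration; the abstract does not specify which coefficients are kept or how they are normalized.

```python
import cmath


def dft_descriptor(profile, k):
    """Return the first k DFT coefficients of a 1-D profile.

    Coefficients are divided by the profile length so that word images
    of different widths give comparable magnitudes (an assumption).
    """
    n = len(profile)
    if n == 0:
        raise ValueError("profile must not be empty")
    coeffs = []
    for freq in range(k):
        total = sum(
            value * cmath.exp(-2j * cmath.pi * freq * t / n)
            for t, value in enumerate(profile)
        )
        coeffs.append(total / n)
    return coeffs
```

For example, a projection profile with 7 columns and one with 40 columns both map to `k` complex numbers, so two word images can be compared with an ordinary vector distance.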

In this paper, we present a method of image indexing and retrieval which takes into account the relative positions of the regions within the image. Indexing is based on a segmentation of the image into fuzzy regions; we propose an algorithm which produces a fuzzy segmentation. The image retrieval is based on inexact graph matching, taking into account both the similarity between regions and the spatial relation between them. We propose, on the one hand, a solution to reduce the combinatorial complexity of the graph matching and, on the other hand, a measure of similarity between graphs that allows the result images to be ranked. A relevance feedback process based on region classifiers then allows good generalization to a large variety of regions. The method is adapted to partial queries, aiming for example at retrieving images containing a specific type of object. Applications may be of two types: first, an on-line search from a partial query, with relevance feedback that interactively guides the search, and second, off-line learning of categories from a set of examples of the object. The name of the system is FReBIR, for Fuzzy Region-Based Image Retrieval.

Content-based image retrieval by hierarchical linear subspace method   Total citations: 1, self-citations: 0, citations by others: 1
We describe a hierarchical linear subspace method to query large on-line image databases using image similarity as the basis of the queries. The method is based on the generic multimedia indexing (GEMINI) approach which is used in the IBM query through the image content search system. Our approach is demonstrated on image indexing, in which the subspaces correspond to different resolutions of the images. During content-based image retrieval, the search starts in the subspace with the lowest resolution of the images. In this subspace, the set of all possible similar images is determined. In the next subspace, additional metric information corresponding to a higher resolution is used to reduce this set. This procedure is repeated until the similar images can be determined. For evaluation we used three image databases and two different subspace sequences.
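The coarse-to-fine search described above can be sketched on 1-D signals. The sketch assumes that each lower resolution is obtained by averaging adjacent pairs of samples, in which case 2^level times the squared distance at a given level never exceeds the full-resolution squared distance, so filtering at coarse levels cannot discard a true match. The names `halve`, `pyramid` and `range_query`, the pair-averaging subspaces, and the range-query formulation are assumptions made for illustration, not the paper's subspace sequences.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def halve(signal):
    """Average adjacent pairs; assumes an even number of samples."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def pyramid(signal, levels):
    """Return [full resolution, half resolution, ...] with levels + 1 entries."""
    out = [signal]
    for _ in range(levels):
        out.append(halve(out[-1]))
    return out


def range_query(query, database, eps, levels=2):
    """Indices of items within Euclidean distance eps of the query.

    Signal lengths must be divisible by 2**levels. In a real system the
    database pyramids would be precomputed once rather than per query.
    """
    q_pyr = pyramid(query, levels)
    db_pyrs = [pyramid(item, levels) for item in database]
    candidates = range(len(database))
    # Start at the coarsest level and shrink the candidate set step by step.
    for level in range(levels, -1, -1):
        scale = 2 ** level  # makes the coarse distance a valid lower bound
        candidates = [
            i for i in candidates
            if scale * squared_distance(q_pyr[level], db_pyrs[i][level]) <= eps ** 2
        ]
    return candidates
```

Squared distances are compared instead of distances so that the lower-bound test is not disturbed by rounding in square roots.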

In this paper, we introduce a novel visual similarity measuring technique to retrieve face images in photo album databases for law enforcement. Though much work is being done on face similarity matching techniques, little attention is given to the design of face matching schemes suitable for visual retrieval in single model databases, where accuracy, robustness to scale and environmental changes, and computational efficiency are three important issues to be considered. This paper presents a robust face retrieval approach using structural and spatial point correspondence in which directional corner points (DCPs) are generated for efficient face coding and retrieval. A complete investigation of the proposed method is conducted, which covers face retrieval under controlled/ideal conditions, scale variations, environmental changes and subject actions. The system performance is compared with that of the eigenface method. The proposed DCP retrieval technique outperformed the eigenface method in most of the comparison experiments. This research demonstrates that the proposed DCP approach provides a new way of retrieving human faces in single model databases that is both robust to scale and environmental changes and computationally efficient.

A near-duplicate document image matching approach characterized by a graphical perspective is proposed in this paper. Document images are represented by graphs whose nodes correspond to the objects in the images. The image matching problem is thereby converted into graph matching. To deal with the instability of object segmentation, a multi-granularity object tree is constructed for a document image. Each level in the tree corresponds to one possible object segmentation, while different levels are characterized by various object granularities. Several graphs can be generated from the tree, and the objects associated with each graph may be of different granularities. The two graphs with the maximum similarity are found from the multi-granularity object trees of the two near-duplicate document images which are to be matched. The encouraging experimental results demonstrate the effectiveness of the proposed approach.

This paper presents an integrated approach to spot spoken keywords in digitized Tamil documents by combining word image matching and spoken word recognition techniques. The work involves the segmentation of document images into words, creation of an index of keywords, and construction of a word image hidden Markov model (HMM) and a speech HMM for each keyword. The word image HMMs are constructed using seven-dimensional profile and statistical moment features and are used to recognize a segmented word image for possible inclusion of the keyword in the index. The spoken query word is recognized by the maximum likelihood over the speech HMMs, using the 39-dimensional mel frequency cepstral coefficients derived from the speech samples of the keywords. The positional details of the search keyword obtained from the automatically updated index retrieve the relevant portion of text from the document during word spotting. Performance measures such as recall, precision, and F-measure are calculated for 40 test words from four groups of literary documents to illustrate the ability of the proposed scheme and highlight its worthiness in the emerging multilingual information retrieval scenario.

Image and Vision Computing, 2007, 25(11): 1802-1813
Sketch-based image retrieval systems need to handle two main problems. First of all, they have to recognize shapes similar but not necessarily identical to the user’s query. Hence, exact object identification techniques do not fit in this case. The second problem is the selection of the image features to compare with the user’s sketch. In domain-independent visual repositories, real-life images with non-uniform background and possible occluding objects make this second task particularly hard. We address the second problem by proposing a variant of the well-known Generalized Hough Transform (GHT), which is a robust object identification technique for unsegmented images. Moreover, we solve the first problem by modifying the GHT to deal with an inexact matching problem. In this paper, we show how this idea can be efficiently and accurately realized. Experimental results are shown with two different databases of real, unsegmented images.

RUBRIC: A System for Rule-Based Information Retrieval   Total citations: 1, self-citations: 0, citations by others: 1
A research prototype software system for conceptual information retrieval has been developed. The goal of the system, called RUBRIC, is to provide more automated and relevant access to unformatted textual databases. The approach is to use production rules from artificial intelligence to define a hierarchy of retrieval subtopics, with fuzzy context expressions and specific word phrases at the bottom. RUBRIC allows the definition of detailed queries starting at a conceptual level, partial matching of a query and a document, selection of only the highest ranked documents for presentation to the user, and detailed explanation of how and why a particular document was selected. Initial experiments indicate that a RUBRIC rule set better matches human retrieval judgment than a standard Boolean keyword expression, given equal amounts of effort in defining each. The techniques presented may be useful in stand-alone retrieval systems, front-ends to existing information retrieval systems, or real-time document filtering and routing.
