Abstract. For document images corrupted by various kinds of noise, directly binarized images may be severely blurred and degraded. A common treatment for this problem is to pre-smooth input images using noise-suppressing filters. This article proposes an image-smoothing method for prefiltering prior to document image binarization. Conceptually, we propose that the influence range of each pixel affecting its neighbors should depend on local image statistics. Technically, we suggest using coplanar matrices to capture the structural and textural distribution of similar pixels at each site. This property adapts the smoothing process to the contrast, orientation, and spatial size of local image structures. Experimental results demonstrate the effectiveness of the proposed method, which compares favorably with existing methods in reducing noise and preserving image features. In addition, due to the adaptive nature of the similar pixel definition, the proposed filter output is more robust to different noise levels than existing methods. Received: October 31, 2001 / October 09, 2002 Correspondence to: L. Fan (e-mail: fanlixin@ieee.org)

Binarization of document images with poor contrast, strong noise, complex patterns, and variable modalities in the gray-scale histograms is a challenging problem. A new binarization algorithm has been developed to address this problem for personal cheque images. The main contribution of this approach is optimizing the binarization of a part of the document image that suffers from noise interference, referred to as the Target Sub-Image (TSI), using information easily extracted from another noise-free part of the same image, referred to as the Model Sub-Image (MSI). Simple spatial features extracted from MSI are used as a model for handwriting strokes. This model captures the underlying characteristics of the writing strokes, and is invariant to the handwriting style or content. This model is then utilized to guide the binarization in the TSI. Another contribution is a new technique for the structural analysis of document images, which we call “Wavelet Partial Reconstruction” (WPR). The algorithm was tested on 4,200 cheque images and the results show significant improvement in binarization quality in comparison with other well-established algorithms. Received: October 10, 2001 / Accepted: May 7, 2002 This research was supported in part by NCR and NSERC's industrial postgraduate scholarship No. 239464. A simplified version of this paper has been presented at ICDAR 2001 [3].

Binary image representation is an essential format for document analysis. In general, different binarization techniques are implemented for different types of binarization problems. The majority of binarization techniques are complex and are composed of filters and existing operations. However, the few simple thresholding methods available cannot be applied to many binarization problems. In this paper, we propose a local binarization method based on a simple, novel thresholding method with dynamic and flexible windows. The proposed method is tested on selected samples from the DIBCO 2009 benchmark dataset using specialized evaluation techniques for binarization processes. To evaluate the performance of our proposed method, we compared it with the Niblack, Sauvola and NICK methods. The results of the experiments show that the proposed method adapts well to all types of binarization challenges, can deal with a larger number of binarization problems, and boosts the overall performance of the binarization.
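For orientation, the two best-known baselines named in the abstract above can be sketched in a few lines. This is a minimal illustration of the standard Niblack and Sauvola local thresholds, not the method proposed in the paper. The window radius, the parameter defaults (k = -0.2 for Niblack; k = 0.5 and a dynamic range of 128 for Sauvola) and all function names are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def window_values(image, row, col, radius):
    """Collect gray levels in a square window clipped to the image border."""
    rows, cols = len(image), len(image[0])
    return [
        image[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    ]


def niblack_threshold(values, k=-0.2):
    """Niblack: T = m + k * s, with local mean m and standard deviation s."""
    return mean(values) + k * pstdev(values)


def sauvola_threshold(values, k=0.5, dynamic_range=128.0):
    """Sauvola: T = m * (1 + k * (s / R - 1)), with R the assumed dynamic range of s."""
    m = mean(values)
    s = pstdev(values)
    return m * (1.0 + k * (s / dynamic_range - 1.0))


def binarize(image, threshold_fn, radius=1):
    """Mark a pixel as foreground (1) when it is darker than its local threshold."""
    return [
        [
            1 if image[r][c] < threshold_fn(window_values(image, r, c, radius)) else 0
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]
```

Calling `binarize(image, sauvola_threshold)` on a list of rows of gray levels returns a same-sized grid of 0/1 labels. Both baselines use a fixed window size, which is the limitation that methods with dynamic windows aim to relax.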

A new adaptive thresholding algorithm for extracting targets from the background in a given image sequence is proposed. Conventional histogram-based or fixed-value thresholding is deficient in detecting targets because of the poor contrast between targets and the background, or because of changes in illumination. This research solves these problems by learning the characteristics of the background from the given images and determining the proper thresholds based on this information. Experiments show that the proposed algorithm is superior to the optimal layering algorithm in target detection and tracking. Received: 28 December 1999 / Accepted: 8 August 2000

Document image segmentation is the first step in document image analysis and understanding. One major problem centres on the performance analysis of the evolving segmentation algorithms. The use of a standard document database maintained at the Universities/Research Laboratories helps to solve the problem of getting authentic data sources and other information, but some methodologies have to be used for performance analysis of the segmentation. We describe a new document model in terms of a bounding box representation of its constituent parts and suggest an empirical measure of performance of a segmentation algorithm based on this new graph-like model of the document. Besides the global error measures, the proposed method also produces segment-wise details of common segmentation problems such as horizontal and vertical split and merge as well as invalid and mismatched regions. Received July 14, 2000 / Revised June 12, 2001

In the literature, many feature types are proposed for document classification. However, an extensive and systematic evaluation of the various approaches has not yet been done. In particular, evaluations on OCR documents are very rare. In this paper we investigate seven text representations based on n-grams and single words. We compare their effectiveness in classifying OCR texts and the corresponding correct ASCII texts in two domains: business letters and abstracts of technical reports. Our results indicate that the use of n-grams is an attractive technique which can even compare to techniques relying on a morphological analysis. This holds for OCR texts as well as for correct ASCII texts. Received February 17, 1998 / Revised April 8, 1998
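The contrast between n-gram and single-word representations in the abstract above can be illustrated with a small sketch. It is not one of the seven representations evaluated in the paper. The character trigram choice, the overlap measure, the sample strings and all function names are assumptions made for the example.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def word_features(text):
    """Count whole lowercase words (the single-word representation)."""
    return Counter(text.lower().split())


def overlap(reference, observed):
    """Fraction of the reference features that also occur in the observed text."""
    total = sum(reference.values())
    shared = sum((reference & observed).values())
    return shared / total if total else 0.0


# Hypothetical OCR confusion: the letter "m" is read as "rn".
clean_text = "invoice number"
ocr_text = "invoice nurnber"

word_overlap = overlap(word_features(clean_text), word_features(ocr_text))
ngram_overlap = overlap(char_ngrams(clean_text), char_ngrams(ocr_text))
```

In this toy case a single misrecognized character destroys one of the two word features, while most of the trigrams survive. That is one plausible reason for n-grams to remain usable on OCR text, although the abstract itself does not state a mechanism.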

This paper describes a novel method for extracting text from document pages of mixed content. The method works by detecting pieces of text lines in small overlapping columns of width , shifted with respect to each other by image elements (good default values are: of the image width, ) and by merging these pieces in a bottom-up fashion to form complete text lines and blocks of text lines. The algorithm requires about 1.3 s for a 300 dpi image on a PC with a Pentium II CPU, 300 MHz, MotherBoard Intel440LX. The algorithm is largely independent of the layout of the document, the shape of the text regions, and the font size and style. The main assumptions are that the background be uniform and that the text sit approximately horizontally. For a skew of up to about 10 degrees no skew correction mechanism is necessary. The algorithm has been tested on the UW English Document Database I of the University of Washington and its performance has been evaluated by a suitable measure of segmentation accuracy. Also, a detailed analysis of the segmentation accuracy achieved by the algorithm as a function of noise and skew has been carried out. Received April 4, 1999 / Revised June 1, 1999

Knowledge-based systems for document analysis and understanding (DAU) are quite useful whenever analysis has to deal with changing free-form document types which require different analysis components. In this case, declarative modeling is a good way to achieve flexibility. An important application domain for such systems is the business letter domain. Here, high accuracy and the correct assignment to the right people and the right processes are crucial success factors. Our solution is a comprehensive knowledge-centered approach: we model not only comparatively static knowledge concerning document properties and analysis results within the same declarative formalism, but we also include the analysis task and the current context of the system environment within the same formalism. This allows an easy definition of new analysis tasks and also an efficient and accurate analysis by using expectations about incoming documents as context information. The approach described has been implemented within the VOPR (Virtual Office PRototype) system. This DAU system gains the required context information from a commercial workflow management system (WfMS) by constant exchanges of expectations and analysis tasks. Further interaction between these two systems covers the delivery of results from DAU to the WfMS and the delivery of corrected results vice versa. Received June 19, 1999 / Revised November 8, 2000

We present a useful method for assessing the quality of a typewritten document image and automatically selecting an optimal restoration method based on that assessment. We use five quality measures that assess the severity of background speckle, touching characters, and broken characters. A linear classifier uses these measures to select a restoration method. On a 139-document corpus, our methodology reduced the corpus OCR character error rate from 20.27% to 12.60%. Received November 10, 1998 / Revised October 27, 1999

A segmentation algorithm using a water flow model [Kim et al., Pattern Recognition 35 (2002) 265–277] has already been presented, in which a document image can be efficiently divided into two regions, characters and background, owing to the property of locally adaptive thresholding. However, this method provides no criterion for when to stop the iterative process and requires a long processing time. In addition, characters on poor-contrast backgrounds often fail to be separated successfully. To overcome these drawbacks of the existing method, the current paper presents an improved approach that includes extraction of regions of interest (ROIs), an automatic stopping criterion, and hierarchical thresholding. Experimental results show that the proposed method can achieve a satisfactory binarization quality, especially for document images with a poor-contrast background, and is significantly faster than the existing method.

The most noticeable characteristic of a construction tender document is that its hierarchical architecture is not obviously expressed but is implied in the citing information. Currently available methods cannot deal with such documents. In this paper, the intra-page and inter-page relationships are analyzed in detail. The creation of citing relationships is essential to extracting the logical structure of tender documents. The hierarchy of tender documents naturally leads to extracting and displaying the logical structure as a tree structure. This method is successfully implemented in VHTender, and is the key to the efficiency and flexibility of the whole system. Received February 28, 2000 / Revised October 20, 2000

Image-processing systems, each consisting of massively parallel photodetectors and digital processing elements on a monolithic circuit, are currently being developed by several researchers. Some early-vision-like processing algorithms are installed in these vision systems. However, they are not sufficient for applications because their output is in the form of pattern information, so that, in order to respond to input, feature values must be extracted from the pattern. In the present paper, we propose a robust method for extracting feature values associated with images in a massively parallel vision system.

In this paper a system for analysis and automatic indexing of imaged documents for high-volume applications is described. This system, named STRETCH (STorage and RETrieval by Content of imaged documents), is based on an Archiving and Retrieval Engine, which overcomes the bottleneck of document profiling by bypassing some limitations of existing pre-defined indexing schemes. The engine exploits a structured document representation and can activate appropriate methods to characterise and automatically index heterogeneous documents with variable layout. The originality of STRETCH lies principally in the possibility for unskilled users to define the indexes relevant to the document domains of their interest by simply presenting visual examples and applying reliable automatic information extraction methods (document classification, flexible reading strategies) to index the documents automatically, thus creating archives as desired. STRETCH offers ease of use and application programming and the ability to dynamically adapt to new types of documents. The system has been tested in two applications in particular, one concerning passive invoices and the other bank documents. In these applications, several classes of documents are involved. The indexing strategy first automatically classifies the document, thus avoiding pre-sorting, then locates and reads the information pertaining to the specific document class. Experimental results are encouraging overall; in particular, document classification results fulfill the requirements of high-volume application. Integration into production lines is under way. Received March 30, 2000 / Revised June 26, 2001

Document image processing is a crucial process in office automation and begins at the ‘OCR’ phase with difficulties in document ‘analysis’ and ‘understanding’. This paper presents a hybrid and comprehensive approach to document structure analysis. It is hybrid in the sense that it makes use of layout (geometrical) as well as textual features of a given document. These features are the basis for potential conditions, which in turn are used to express fuzzy matched rules of an underlying rule base. Rules can be formulated based on features which might be observed within one specific layout object. However, rules can also express dependencies between different layout objects. In addition to its rule-driven analysis, which allows an easy adaptation to specific domains with their specific logical objects, the system contains domain-independent markup algorithms for common objects (e.g., lists). Received June 19, 2000 / Revised November 8, 2000

Searching for documents by their type or genre is a natural way to enhance the effectiveness of document retrieval. The layout of a document contains a significant amount of information that can be used to classify it by type in the absence of domain-specific models. Our approach to classification is based on “visual similarity” of layout structure and is implemented by building a supervised classifier, given examples of each class. We use image features such as percentages of text and non-text (graphics, images, tables, and rulings) content regions, column structures, relative point sizes of fonts, density of content area, and statistics of features of connected components which can be derived without class knowledge. In order to obtain class labels for training samples, we conducted a study where subjects ranked document pages with respect to their resemblance to representative page images. Class labels can also be assigned based on known document types, or can be defined by the user. We implemented our classification scheme using decision tree classifiers and self-organizing maps. Received June 15, 2000 / Revised November 15, 2000

The combination of SGML and database technology makes it possible to refine both declarative and navigational access mechanisms for structured document collections. With regard to declarative access, the user can formulate complex information needs without knowing a query language, the respective document type definition (DTD) or the underlying modelling. Navigational access is eased by hyperlink-rendition mechanisms going beyond plain link-integrity checking. With our approach, the database-internal representation of documents is configurable. It allows for an efficient implementation of operations, because DTD knowledge is not needed for document structure recognition. We show how the number of method invocations and the cost of parsing can be significantly reduced. Edited by Y.C. Tay. Received April 22, 1996 / Accepted March 16, 1997

In this work, a multi-scale binarization framework is introduced, which can be used along with any adaptive threshold-based binarization method. This framework is able to improve the binarization results and to restore weak connections and strokes, especially in the case of degraded historical documents. This is achieved thanks to the localized nature of the framework in the spatial domain. The framework requires several binarizations at different scales, which is addressed by introducing fast grid-based models. This enables us to explore high scales which are usually unreachable by traditional approaches. In order to expand our set of adaptive methods, an adaptive modification of Otsu's method, called AdOtsu, is introduced. In addition, in order to restore document images suffering from bleed-through degradation, we combine the framework with recursive adaptive methods. The framework shows promising performance in subjective and objective evaluations performed on available datasets.
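As background for the AdOtsu modification mentioned above, the classical global Otsu threshold can be sketched as follows. This is the standard, non-adaptive method that maximizes the between-class variance of the gray-level histogram. It is not the AdOtsu variant or the multi-scale framework from the paper, and the function name and the 256-level assumption are illustrative.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form the dark class; the rest form the bright class.
    Assumes integer gray levels in the range [0, levels).
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    weight_dark, sum_dark = 0, 0
    for t in range(levels):
        weight_dark += histogram[t]
        sum_dark += t * histogram[t]
        if weight_dark == 0:
            continue  # no dark pixels yet, so no valid split
        weight_bright = total - weight_dark
        if weight_bright == 0:
            break  # every pixel is already in the dark class
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        # Between-class variance, up to a constant factor of 1 / total**2.
        variance = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if variance > best_variance:
            best_threshold, best_variance = t, variance
    return best_threshold
```

An adaptive variant would, roughly speaking, compute such a threshold from local rather than global statistics. The details of how AdOtsu does this are not given in the abstract and are not reproduced here.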

A new method based on an optics model for highly reliable surface inspection of industrial parts has been developed. This method uses multiple images taken under different camera conditions. Phong's model is employed for surface reflection, and then the albedo and the reflection model parameters are estimated by the least squares method. The developed method has advantages over conventional binarization in that it can easily determine the threshold of product acceptability and cope with changes in light intensity when detecting defects.
