Found 20 similar documents; search time: 0 ms
1.
Lixin Fan, Liying Fan, Chew Lim Tan 《International Journal on Document Analysis and Recognition》2003,5(2-3):88-101
Abstract. For document images corrupted by various kinds of noise, directly binarized images may be severely blurred and degraded.
A common treatment for this problem is to pre-smooth input images using noise-suppressing filters. This article proposes an image-smoothing method for prefiltering document images prior to binarization. Conceptually, we propose that the influence
range of each pixel affecting its neighbors should depend on local image statistics. Technically, we suggest using coplanar matrices to capture the structural and textural distribution of similar pixels at each site. This property adapts the smoothing process
to the contrast, orientation, and spatial size of local image structures. Experimental results demonstrate the effectiveness
of the proposed method, which compares favorably with existing methods in reducing noise and preserving image features. In
addition, due to the adaptive nature of the similar-pixel definition, the proposed filter output is more robust to different noise levels than existing methods.
Received: October 31, 2001 / October 09, 2002
Correspondence to: L. Fan (e-mail: fanlixin@ieee.org)
2.
Amer Dawoud, Mohamed Kamel 《International Journal on Document Analysis and Recognition》2002,5(1):28-38
Binarization of document images with poor contrast, strong noise, complex patterns, and variable modalities in the gray-scale
histograms is a challenging problem. A new binarization algorithm has been developed to address this problem for personal
cheque images. The main contribution of this approach is optimizing the binarization of a part of the document image that
suffers from noise interference, referred to as the Target Sub-Image (TSI), using information easily extracted from another
noise-free part of the same image, referred to as the Model Sub-Image (MSI). Simple spatial features extracted from MSI are
used as a model for handwriting strokes. This model captures the underlying characteristics of the writing strokes, and is
invariant to the handwriting style or content. This model is then utilized to guide the binarization in the TSI. Another contribution
is a new technique for the structural analysis of document images, which we call “Wavelet Partial Reconstruction” (WPR). The
algorithm was tested on 4,200 cheque images and the results show significant improvement in binarization quality in comparison
with other well-established algorithms.
Received: October 10, 2001 / Accepted: May 7, 2002
This research was supported in part by NCR and NSERC's industrial postgraduate scholarship No. 239464.
A simplified version of this paper has been presented at ICDAR 2001 [3].
3.
An adaptive local binarization method for document images based on a novel thresholding method and dynamic windows (Total citations: 1; self-citations: 0; citations by others: 1)
Bilal Bataineh, Siti Norul Huda Sheikh Abdullah, Khairuddin Omar 《Pattern recognition letters》2011,32(14):1805-1813
Binary image representation is an essential format for document analysis. In general, different available binarization techniques are applied to different types of binarization problems. The majority of binarization techniques are complex and are compounded from filters and existing operations. However, the few simple thresholding methods available cannot be applied to many binarization problems. In this paper, we propose a local binarization method based on a simple, novel thresholding method with dynamic and flexible windows. The proposed method is tested on samples from the DIBCO 2009 benchmark dataset using specialized evaluation techniques for binarization processes. To evaluate the performance of our proposed method, we compared it with the Niblack, Sauvola and NICK methods. The experimental results show that the proposed method adapts well to all types of binarization challenges, can deal with a larger number of binarization problems, and boosts overall binarization performance.
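The Niblack and Sauvola baselines mentioned above compute, for each pixel, a threshold from the mean m and standard deviation s of a local window: Niblack uses T = m + k·s (k typically −0.2), while Sauvola uses T = m·(1 + k·(s/R − 1)) with k ≈ 0.5 and dynamic range R = 128. A minimal pure-Python sketch of these standard formulas (not the paper's proposed method; window size and constants are the commonly cited defaults):

```python
import math

def local_stats(img, x, y, w):
    """Mean and standard deviation of the w x w window centred at (x, y)."""
    h, wd = len(img), len(img[0])
    vals = [img[j][i]
            for j in range(max(0, y - w // 2), min(h, y + w // 2 + 1))
            for i in range(max(0, x - w // 2), min(wd, x + w // 2 + 1))]
    m = sum(vals) / len(vals)
    s = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
    return m, s

def niblack_threshold(m, s, k=-0.2):
    # Niblack: T = m + k * s (k is typically negative for dark text on light paper)
    return m + k * s

def sauvola_threshold(m, s, k=0.5, R=128.0):
    # Sauvola: T = m * (1 + k * (s / R - 1)); R is the dynamic range of s
    return m * (1 + k * (s / R - 1))

def binarize(img, w=15, rule=sauvola_threshold):
    """Set pixels at or below the local threshold to 0 (ink), others to 255."""
    out = []
    for y, row in enumerate(img):
        out.append([0 if img[y][x] <= rule(*local_stats(img, x, y, w)) else 255
                    for x in range(len(row))])
    return out
```

A production implementation would compute the window statistics from integral images instead of rescanning every window, which is what makes these methods practical on full pages.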
4.
A new adaptive thresholding algorithm concerning extraction of targets from the background in a given image sequence is proposed.
Conventional histogram-based or fixed-value thresholding is deficient in detecting targets due to the poor contrast between targets and the background, or to changes in illumination. This research solves these problems by learning the characteristics of the background from the given images and determining proper thresholds based on this information.
Experiments show that the proposed algorithm is superior to the optimal layering algorithm in target detection and tracking.
Received: 28 December 1999 / Accepted: 8 August 2000
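The abstract above does not spell out the algorithm, so the following is only a plausible sketch of the background-learning idea: estimate the background's mean and spread from target-free sample frames, then flag pixels that deviate by more than k standard deviations. The function names and the k·σ decision rule are illustrative assumptions, not the paper's method:

```python
import statistics

def learn_background(frames):
    """Estimate background statistics from target-free sample images."""
    pix = [v for frame in frames for row in frame for v in row]
    return statistics.mean(pix), statistics.pstdev(pix)

def detect_targets(frame, bg_mean, bg_std, k=3.0):
    """Mark pixels deviating from the learned background by more than k sigma."""
    return [[1 if abs(v - bg_mean) > k * bg_std else 0 for v in row]
            for row in frame]
```

Because the threshold is derived from the observed background rather than fixed in advance, it adapts automatically when illumination shifts between sequences, which is the failure mode the abstract attributes to fixed-value thresholding.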
5.
Amit Kumar Das, Sanjoy Kumar Saha, Bhabatosh Chanda 《International Journal on Document Analysis and Recognition》2002,4(3):183-190
Document image segmentation is the first step in document image analysis and understanding. One major problem centres on the performance analysis of evolving segmentation algorithms. The use of standard document databases maintained at universities and research laboratories helps to solve the problem of obtaining authentic data sources and other information, but methodologies are still needed for analysing segmentation performance. We describe a new document model in terms
of a bounding box representation of its constituent parts and suggest an empirical measure of performance of a segmentation
algorithm based on this new graph-like model of the document. Besides the global error measures, the proposed method also
produces segment-wise details of common segmentation problems such as horizontal and vertical splits and merges, as well as invalid and mismatched regions.
Received July 14, 2000 / Revised June 12, 2001
6.
7.
Markus Junker, Rainer Hoch 《International Journal on Document Analysis and Recognition》1998,1(2):116-122
In the literature, many feature types are proposed for document classification. However, an extensive and systematic evaluation
of the various approaches has not yet been done. In particular, evaluations on OCR documents are very rare. In this paper
we investigate seven text representations based on n-grams and single words. We compare their effectiveness in classifying OCR texts and the corresponding correct ASCII texts
in two domains: business letters and abstracts of technical reports. Our results indicate that the use of n-grams is an attractive technique that compares well even with techniques relying on morphological analysis. This holds for OCR texts as well as for correct ASCII texts.
Received February 17, 1998 / Revised April 8, 1998
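To illustrate why character n-grams tolerate OCR errors better than whole-word features, the sketch below (an illustrative sketch, not one of the seven representations evaluated in the paper) compares n-gram profiles with cosine similarity: a single mis-recognized character only disturbs the few n-grams that overlap it, leaving the rest of the profile intact.

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Character n-gram frequencies of a text, after whitespace normalisation."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine_sim(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[g] * b[g] for g in a if g in b)
    na = sum(v * v for v in a.values()) ** 0.5
    nb = sum(v * v for v in b.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0
```

An OCR corruption such as "paynent" for "payment" still shares most of its trigrams with the correct text, so its profile stays far closer to the right class than to an unrelated one.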
8.
Pietro Parodi, Roberto Fontana 《International Journal on Document Analysis and Recognition》1999,2(2-3):67-79
This paper describes a novel method for extracting text from document pages of mixed content. The method works by detecting pieces of text lines in small overlapping columns of fixed width, shifted with respect to each other by a fixed number of image elements (the recommended defaults are expressed as fractions of the image width), and by merging these pieces in a bottom-up fashion to form complete text lines and blocks of text lines. The algorithm requires
about 1.3 s for a 300 dpi image on a PC with a 300 MHz Pentium II CPU and an Intel 440LX motherboard. The algorithm is largely independent
of the layout of the document, the shape of the text regions, and the font size and style. The main assumptions are that the
background be uniform and that the text sit approximately horizontally. For a skew of up to about 10 degrees no skew correction
mechanism is necessary. The algorithm has been tested on the UW English Document Database I of the University of Washington
and its performance has been evaluated by a suitable measure of segmentation accuracy. Also, a detailed analysis of the segmentation
accuracy achieved by the algorithm as a function of noise and skew has been carried out.
Received April 4, 1999 / Revised June 1, 1999
9.
Claudia Wenzel, Heiko Maus 《International Journal on Document Analysis and Recognition》2001,3(4):248-260
Knowledge-based systems for document analysis and understanding (DAU) are quite useful whenever analysis has to deal with changing free-form document types that require different analysis components. In this case, declarative modeling is
a good way to achieve flexibility. An important application domain for such systems is the business letter domain. Here, high
accuracy and correct assignment to the right people and processes are crucial success factors. Our solution is a comprehensive knowledge-centered approach: we model not only comparatively static knowledge concerning
document properties and analysis results within the same declarative formalism, but we also include the analysis task and
the current context of the system environment within the same formalism. This allows an easy definition of new analysis tasks
and also an efficient and accurate analysis by using expectations about incoming documents as context information. The approach
described has been implemented within the VOPR (Virtual Office PRototype) system. This DAU system
gains the required context information from a commercial workflow management system (WfMS) by constant exchanges of expectations
and analysis tasks. Further interaction between these two systems covers the delivery of results from DAU to the WfMS and
the delivery of corrected results vice versa.
Received June 19, 1999 / Revised November 8, 2000
10.
Michael Cannon, Judith Hochberg, Patrick Kelly 《International Journal on Document Analysis and Recognition》1999,2(2-3):80-89
We present a useful method for assessing the quality of a typewritten document image and automatically selecting an optimal
restoration method based on that assessment. We use five quality measures that assess the severity of background speckle,
touching characters, and broken characters. A linear classifier uses these measures to select a restoration method. On a 139-document
corpus, our methodology reduced the corpus OCR character error rate from 20.27% to 12.60%.
Received November 10, 1998 / Revised October 27, 1999
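The selection step described above, a linear classifier mapping quality measures to a restoration method, can be sketched as a per-class linear score followed by an argmax. The five measures, the candidate methods, and all weights below are illustrative placeholders, not the trained classifier from the paper:

```python
def linear_select(measures, classes):
    """Pick the restoration method whose linear score is highest.

    `measures` is the document's quality-measure vector; `classes` maps each
    candidate restoration method to a (weights, bias) pair.
    """
    def score(wb):
        w, b = wb
        return sum(wi * mi for wi, mi in zip(w, measures)) + b
    return max(classes, key=lambda name: score(classes[name]))

# Hypothetical classes: weights chosen so heavy speckle favours despeckling.
RESTORERS = {
    "no_restoration": ([-1, -1, -1, -1, -1], 2.0),
    "despeckle": ([3, 0, 0, 0, 0], 0.0),
    "close_gaps": ([0, 0, 3, 0, 0], 0.0),
}
```

In the paper's setting, the weights would be learned from documents whose best restoration method is known, for example by minimising the OCR error rate after restoration.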
11.
A segmentation algorithm using a water flow model [Kim et al., Pattern Recognition 35 (2002) 265–277] has already been presented, in which a document image can be efficiently divided into two regions, characters and background, due to the property of locally adaptive thresholding. However, this method provides no criterion for stopping the iterative process and requires a long processing time. In addition, characters on poor-contrast backgrounds often fail to be separated successfully. Accordingly, to overcome these drawbacks of the existing method, the current paper presents an improved approach that includes extraction of regions of interest (ROIs), an automatic stopping criterion, and hierarchical thresholding. Experimental results show that the proposed method can achieve a satisfactory binarization quality, especially for document images with a poor-contrast background, and is significantly faster than the existing method.
12.
Shuhua Wang, Yang Cao, Shijie Cai 《International Journal on Document Analysis and Recognition》2001,4(1):27-34
The most noticeable characteristic of a construction tender document is that its hierarchical architecture is not explicitly expressed but is implied in the citing information. Currently available methods cannot deal with such documents. In this paper, the intra-page and inter-page relationships are analyzed in detail. The creation of citing relationships is essential to extracting the logical structure of tender documents. The hierarchy of tender documents naturally leads to extracting and displaying the logical structure as a tree structure. This method has been successfully implemented in VHTender, and is the key to the efficiency and flexibility of the whole system.
Received February 28, 2000 / Revised October 20, 2000
13.
Image-processing systems, each consisting of massively parallel photodetectors and digital processing elements on a monolithic
circuit, are currently being developed by several researchers. Some early-vision-like processing algorithms are implemented in these vision systems. However, they are not sufficient for applications because their output is in the form of pattern information, so that, in order to respond to the input, feature values must be extracted from the pattern. In the present paper,
we propose a robust method for extracting feature values associated with images in a massively parallel vision system.
14.
E. Appiani, F. Cesarini, A.M. Colla, M. Diligenti, M. Gori, S. Marinai, G. Soda 《International Journal on Document Analysis and Recognition》2001,4(2):69-83
In this paper a system for analysis and automatic indexing of imaged documents for high-volume applications is described.
This system, named STRETCH (STorage and RETrieval by Content of imaged documents), is based on an Archiving and Retrieval Engine, which overcomes the bottleneck of document profiling by bypassing some limitations of existing pre-defined indexing schemes.
The engine exploits a structured document representation and can activate appropriate methods to characterise and automatically
index heterogeneous documents with variable layout. The originality of STRETCH lies principally in the possibility for unskilled
users to define the indexes relevant to the document domains of their interest by simply presenting visual examples and applying
reliable automatic information extraction methods (document classification, flexible reading strategies) to index the documents
automatically, thus creating archives as desired. STRETCH offers ease of use and application programming and the ability to
dynamically adapt to new types of documents. The system has been tested in two applications in particular, one concerning
passive invoices and the other bank documents. In these applications, several classes of documents are involved. The indexing
strategy first automatically classifies the document, thus avoiding pre-sorting, then locates and reads the information pertaining
to the specific document class. Experimental results are encouraging overall; in particular, document classification results
fulfill the requirements of high-volume applications. Integration into production lines is under way.
Received March 30, 2000 / Revised June 26, 2001
15.
Stefan Klink, Thomas Kieninger 《International Journal on Document Analysis and Recognition》2001,4(1):18-26
Document image processing is a crucial process in office automation and begins at the ‘OCR’ phase with difficulties in document
‘analysis’ and ‘understanding’. This paper presents a hybrid and comprehensive approach to document structure analysis. Hybrid
in the sense that it makes use of layout (geometrical) as well as textual features of a given document. These features are
the basis for potential conditions, which in turn are used to express fuzzy-matched rules of an underlying rule base. Rules
can be formulated based on features which might be observed within one specific layout object. However, rules can also express
dependencies between different layout objects. In addition to its rule driven analysis, which allows an easy adaptation to
specific domains with their specific logical objects, the system contains domain-independent markup algorithms for common
objects (e.g., lists).
Received June 19, 2000 / Revised November 8, 2000
16.
Christian Shin, David Doermann, Azriel Rosenfeld 《International Journal on Document Analysis and Recognition》2001,3(4):232-247
Searching for documents by their type or genre is a natural way to enhance the effectiveness of document retrieval. The layout
of a document contains a significant amount of information that can be used to classify it by type in the absence of domain-specific
models. Our approach to classification is based on “visual similarity” of layout structure and is implemented by building
a supervised classifier, given examples of each class. We use image features such as percentages of text and non-text (graphics,
images, tables, and rulings) content regions, column structures, relative point sizes of fonts, density of content area, and
statistics of features of connected components which can be derived without class knowledge. In order to obtain class labels
for training samples, we conducted a study where subjects ranked document pages with respect to their resemblance to representative
page images. Class labels can also be assigned based on known document types, or can be defined by the user. We implemented
our classification scheme using decision tree classifiers and self-organizing maps.
Received June 15, 2000 / Revised November 15, 2000
17.
Structured document storage and refined declarative and navigational access mechanisms in HyperStorM (Total citations: 2; self-citations: 0; citations by others: 2)
Klemens Böhm, Karl Aberer, Erich J. Neuhold, Xiaoya Yang 《The VLDB Journal The International Journal on Very Large Data Bases》1997,6(4):296-311
The combination of SGML and database technology makes it possible to refine both declarative and navigational access mechanisms for structured document collections: with regard to declarative access, the user can formulate complex information needs without
knowing a query language, the respective document type definition (DTD) or the underlying modelling. Navigational access is
eased by hyperlink-rendition mechanisms going beyond plain link-integrity checking. With our approach, the database-internal
representation of documents is configurable. It allows for an efficient implementation of operations, because DTD knowledge
is not needed for document structure recognition. We show how the number of method invocations and the cost of parsing can
be significantly reduced.
Edited by Y.C. Tay. Received April 22, 1996 / Accepted March 16, 1997
18.
Reza Farrahi Moghaddam 《Pattern recognition》2010,43(6):2186-2198
In this work, a multi-scale binarization framework is introduced, which can be used along with any adaptive threshold-based binarization method. This framework is able to improve binarization results and to restore weak connections and strokes, especially in the case of degraded historical documents. This is achieved thanks to the localized nature of the framework in the spatial domain. The framework requires several binarizations at different scales, which is addressed by the introduction of fast grid-based models. This enables us to explore high scales that are usually unreachable with traditional approaches. In order to expand our set of adaptive methods, an adaptive modification of Otsu's method, called AdOtsu, is introduced. In addition, in order to restore document images suffering from bleed-through degradation, we combine the framework with recursive adaptive methods. The framework shows promising performance in subjective and objective evaluations performed on available datasets.
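Otsu's method, which AdOtsu adapts, picks the global threshold t maximising the between-class variance w0·w1·(μ0 − μ1)² of the grayscale histogram. A compact sketch of the classical global version follows; the paper's AdOtsu modification and grid-based multi-scale machinery are not reproduced here:

```python
def otsu_threshold(hist):
    """Global Otsu threshold from a 256-bin grayscale histogram.

    Scans all thresholds t, maintaining class 0 = pixels <= t, and returns the
    t that maximises the between-class variance w0 * w1 * (mu0 - mu1) ** 2
    (computed with raw counts, which has the same argmax as the
    probability-weighted form).
    """
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0.0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty; variance undefined
        mu0, mu1 = sum0 / w0, (total_sum - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

An adaptive variant in the spirit of the paper would apply this computation to local histograms on a grid rather than to the whole image at once.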
19.
20.
Norifumi Katafuchi, Mutsuo Sano, Shuichi Ohara, Masashi Okudaira 《Machine Vision and Applications》2000,12(4):170-176
A new method based on an optics model for highly reliable surface inspection of industrial parts has been developed. This
method uses multiple images taken under different camera conditions. Phong's model is employed for surface reflection, and
then the albedo and the reflection model parameters are estimated by the least squares method. The developed method has advantages
over conventional binarization in that it can easily determine the threshold of product acceptability and cope with changes
in light intensity when detecting defects.
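The abstract does not give the fitting equations, so as a hedged illustration the sketch below fits only the diffuse term of Phong's model by least squares: with the per-image shading factors s_j treated as known, I_j ≈ albedo · s_j has the closed-form solution albedo = Σ s_j I_j / Σ s_j². The function names and the defect tolerance are assumptions for illustration, not the paper's parameters:

```python
def estimate_albedo(shadings, intensities):
    """Least-squares fit of I_j ~= albedo * s_j across multiple images.

    shadings: known diffuse shading factors s_j (light/geometry) per camera
    condition j; intensities: observed pixel intensities I_j. Minimises
    sum_j (I_j - albedo * s_j) ** 2, whose closed form is below.
    """
    num = sum(s * i for s, i in zip(shadings, intensities))
    den = sum(s * s for s in shadings)
    return num / den

def is_defective(shadings, intensities, expected_albedo, tol=0.1):
    """Flag a surface point whose fitted albedo deviates from the product norm."""
    a = estimate_albedo(shadings, intensities)
    return abs(a - expected_albedo) > tol * expected_albedo
```

Deciding acceptability on the fitted albedo rather than on raw intensity is what makes such a test insensitive to changes in light intensity, which is the advantage the abstract claims over conventional binarization.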