20 similar documents found; search took 15 milliseconds.
1.
Stefan Klink, Thomas Kieninger 《International Journal on Document Analysis and Recognition》2001,4(1):18-26
Document image processing is a crucial step in office automation; it begins at the OCR phase and extends to the more difficult
tasks of document analysis and understanding. This paper presents a hybrid, comprehensive approach to document structure
analysis: hybrid in the sense that it makes use of both layout (geometric) and textual features of a given document. These
features form the basis of conditions that, in turn, are used to express fuzzy-matched rules in an underlying rule base.
Rules can be formulated from features observed within a single layout object, but they can also express dependencies between
different layout objects. In addition to its rule-driven analysis, which allows easy adaptation to specific domains with
their specific logical objects, the system contains domain-independent markup algorithms for common objects (e.g., lists).
Received June 19, 2000 / Revised November 8, 2000
2.
Amit Kumar Das, Sanjoy Kumar Saha, Bhabatosh Chanda 《International Journal on Document Analysis and Recognition》2002,4(3):183-190
Document image segmentation is the first step in document image analysis and understanding. A major open problem is the
performance analysis of evolving segmentation algorithms. Standard document databases maintained at universities and
research laboratories help solve the problem of obtaining authentic data sources and other information, but methodologies
are still needed for evaluating segmentation performance. We describe a new document model in terms
of a bounding box representation of its constituent parts and suggest an empirical measure of performance of a segmentation
algorithm based on this new graph-like model of the document. Besides the global error measures, the proposed method also
produces segment-wise details of common segmentation problems such as horizontal and vertical split and merge as well as invalid
and mismatched regions.
Received July 14, 2000 / Revised June 12, 2001
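As an illustration of the kind of bounding-box overlap analysis described above, the sketch below labels each ground-truth region by counting overlapping hypothesis boxes. The IoU threshold and the label names are assumptions for illustration, not the paper's actual measure:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x0, y0, x1, y1)."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x1 - x0) * max(0, y1 - y0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def classify_errors(truth, hypo, thr=0.1):
    """Label each ground-truth box as 'match' (exactly one overlapping
    hypothesis box), 'split' (several), or 'missed' (none)."""
    labels = []
    for t in truth:
        hits = [h for h in hypo if iou(t, h) > thr]
        labels.append("match" if len(hits) == 1 else
                      "split" if hits else "missed")
    return labels
```

Merge errors can be detected symmetrically, by checking each hypothesis box against several ground-truth boxes.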
3.
Shuhua Wang, Yang Cao, Shijie Cai 《International Journal on Document Analysis and Recognition》2001,4(1):27-34
The most noticeable characteristic of a construction tender document is that its hierarchical architecture is not obviously
expressed but is implied in the citing information. Currently available methods cannot deal with such documents. In this paper,
the intra-page and inter-page relationships are analyzed in detail. The creation of citing relationships is essential to extracting
the logical structure of tender documents. The hierarchy of tender documents naturally leads to extracting and displaying
the logical structure as a tree. This method is successfully implemented in VHTender, and is the key to the efficiency
and flexibility of the whole system.
Received February 28, 2000 / Revised October 20, 2000
4.
Michael Cannon, Judith Hochberg, Patrick Kelly 《International Journal on Document Analysis and Recognition》1999,2(2-3):80-89
We present a useful method for assessing the quality of a typewritten document image and automatically selecting an optimal restoration method based on that assessment. We use five quality measures that assess the severity of background speckle, touching characters, and broken characters. A linear classifier uses these measures to select a restoration method. On a 139-document corpus, our methodology reduced the corpus OCR character error rate from 20.27% to 12.60%. Received November 10, 1998 / Revised October 27, 1999
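A measure-driven linear classifier of this kind can be sketched as follows. The three quality measures, the weight vectors, and the restoration-method names are invented placeholders, not the trained classifier from the paper:

```python
# Each candidate restoration method gets one weight vector over three
# quality measures: (speckle severity, touching-character severity,
# broken-character severity). All values below are illustrative.
METHODS = ["none", "despeckle", "thicken_strokes"]

WEIGHTS = {
    "none":            (-1.0, -1.0, -1.0),
    "despeckle":       ( 2.0, -0.5, -0.5),
    "thicken_strokes": (-0.5, -0.5,  2.0),
}

def select_restoration(measures):
    """Return the restoration method with the highest linear score."""
    def score(m):
        return sum(w * x for w, x in zip(WEIGHTS[m], measures))
    return max(METHODS, key=score)
```

For instance, a page scoring high on speckle severity selects the despeckling method, while heavy broken-character damage selects stroke thickening.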
5.
6.
Identifying facsimile duplicates using radial pixel densities (cited 2 times: 0 self-citations, 2 by others)
P. Chatelain 《International Journal on Document Analysis and Recognition》2002,4(4):219-225
A method for detecting full layout facsimile duplicates based on radial pixel densities is proposed. It caters for facsimiles,
including text and/or graphics. Pages may be positioned upright or inverted on the scanner bed. The method is not dependent
on the computation of text skew or text orientation. Using a database of original documents, 92% of non-duplicates and upright
duplicates as well as 89% of inverted duplicates could be correctly identified. The method is vulnerable to double scanning.
This occurs when documents are copied using a photocopier and the copies are subsequently transmitted using a facsimile machine.
Received September 29, 2000 / Revised August 23, 2001
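A minimal sketch of a radial pixel-density signature (an illustrative toy, not the published method): black-pixel density is measured in concentric rings around the page centre. Because a 180-degree rotation preserves each pixel's distance from the centre, the signature is identical for upright and inverted scans, which is why such features tolerate inverted pages:

```python
import math

def radial_density(image, n_rings=4):
    """Fraction of black (1) pixels in each of n_rings concentric rings
    around the image centre. image is a 2-D list of 0/1 values."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    r_max = math.hypot(cy, cx) or 1.0
    black = [0] * n_rings
    total = [0] * n_rings
    for y in range(h):
        for x in range(w):
            r = math.hypot(y - cy, x - cx) / r_max
            ring = min(int(r * n_rings), n_rings - 1)
            total[ring] += 1
            black[ring] += image[y][x]
    return [b / t if t else 0.0 for b, t in zip(black, total)]
```

Rotating a page by 180 degrees maps pixel (y, x) to (h-1-y, w-1-x), leaving every ring assignment unchanged, so the two signatures agree exactly.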
7.
Pietro Parodi, Roberto Fontana 《International Journal on Document Analysis and Recognition》1999,2(2-3):67-79
This paper describes a novel method for extracting text from document pages of mixed content. The method works by detecting
pieces of text lines in small overlapping columns, shifted with respect to each other by a fixed number of image elements
(good default values are given as fractions of the image width), and by merging these pieces in a bottom-up fashion to form
complete text lines and blocks of text lines. The algorithm requires about 1.3 s for a 300 dpi image on a PC with a 300 MHz
Pentium II CPU (Intel 440LX motherboard). The algorithm is largely independent
of the layout of the document, the shape of the text regions, and the font size and style. The main assumptions are that the
background be uniform and that the text sit approximately horizontally. For a skew of up to about 10 degrees no skew correction
mechanism is necessary. The algorithm has been tested on the UW English Document Database I of the University of Washington
and its performance has been evaluated by a suitable measure of segmentation accuracy. Also, a detailed analysis of the segmentation
accuracy achieved by the algorithm as a function of noise and skew has been carried out.
Received April 4, 1999 / Revised June 1, 1999
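The bottom-up merging step can be illustrated roughly as follows: fragments (bounding boxes of text-line pieces found in the columns) are joined when their vertical extents overlap and they are horizontally adjacent. The gap threshold is an invented value; this is only a sketch of the idea, not the published algorithm:

```python
def merge_fragments(fragments, max_gap=5):
    """Merge text-line fragments (x0, y0, x1, y1) left to right:
    a fragment joins an existing line when their vertical extents
    overlap and its left edge is within max_gap of the line's right
    edge. Returns the merged line boxes."""
    lines = []
    for f in sorted(fragments):  # left-to-right by x0
        for i, l in enumerate(lines):
            v_overlap = min(l[3], f[3]) - max(l[1], f[1])
            if v_overlap > 0 and f[0] <= l[2] + max_gap:
                lines[i] = (min(l[0], f[0]), min(l[1], f[1]),
                            max(l[2], f[2]), max(l[3], f[3]))
                break
        else:
            lines.append(f)
    return lines
```

Two horizontally adjacent pieces at the same height merge into one line box, while a piece on a different line stays separate.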
8.
Structured document storage and refined declarative and navigational access mechanisms in HyperStorM (cited 2 times: 0 self-citations, 2 by others)
Klemens Böhm, Karl Aberer, Erich J. Neuhold, Xiaoya Yang 《The VLDB Journal: The International Journal on Very Large Data Bases》1997,6(4):296-311
The combination of SGML and database technology makes it possible to refine both declarative and navigational access
mechanisms for structured document collections. With regard to declarative access, the user can formulate complex information
needs without knowing a query language, the respective document type definition (DTD), or the underlying modelling. Navigational access is
eased by hyperlink-rendition mechanisms going beyond plain link-integrity checking. With our approach, the database-internal
representation of documents is configurable. It allows for an efficient implementation of operations, because DTD knowledge
is not needed for document structure recognition. We show how the number of method invocations and the cost of parsing can
be significantly reduced.
Edited by Y.C. Tay. Received April 22, 1996 / Accepted March 16, 1997
9.
E. Appiani, F. Cesarini, A.M. Colla, M. Diligenti, M. Gori, S. Marinai, G. Soda 《International Journal on Document Analysis and Recognition》2001,4(2):69-83
In this paper a system for analysis and automatic indexing of imaged documents for high-volume applications is described.
This system, named STRETCH (STorage and RETrieval by Content of imaged documents), is based on an Archiving and Retrieval Engine, which overcomes the bottleneck of document profiling by bypassing some limitations of existing pre-defined indexing schemes.
The engine exploits a structured document representation and can activate appropriate methods to characterise and automatically
index heterogeneous documents with variable layout. The originality of STRETCH lies principally in the possibility for unskilled
users to define the indexes relevant to the document domains of their interest by simply presenting visual examples and applying
reliable automatic information extraction methods (document classification, flexible reading strategies) to index the documents
automatically, thus creating archives as desired. STRETCH offers ease of use and application programming and the ability to
dynamically adapt to new types of documents. The system has been tested in two applications in particular, one concerning
passive invoices and the other bank documents. In these applications, several classes of documents are involved. The indexing
strategy first automatically classifies the document, thus avoiding pre-sorting, then locates and reads the information pertaining
to the specific document class. Experimental results are encouraging overall; in particular, document classification results
fulfill the requirements of high-volume applications. Integration into production lines is under way.
Received March 30, 2000 / Revised June 26, 2001
10.
An information retrieval system that captures both visual and textual contents from paper documents can derive maximal benefits
from DAR techniques while demanding little human assistance to achieve its goals. This article discusses technical problems,
along with solution methods, and their integration into a well-performing system. The discussion focuses on particularly difficult
applications, for example Chinese and Japanese documents. Solution methods are highlighted, with the emphasis placed
upon some new ideas, including window-based binarization using scale measures, document layout analysis for solving the multiple
constraint problem, and full-text searching techniques capable of evading machine recognition errors.
Received May 25, 2000 / Revised November 7, 2000
11.
Christian Shin, David Doermann, Azriel Rosenfeld 《International Journal on Document Analysis and Recognition》2001,3(4):232-247
Searching for documents by their type or genre is a natural way to enhance the effectiveness of document retrieval. The layout
of a document contains a significant amount of information that can be used to classify it by type in the absence of domain-specific
models. Our approach to classification is based on “visual similarity” of layout structure and is implemented by building
a supervised classifier, given examples of each class. We use image features such as percentages of text and non-text (graphics,
images, tables, and rulings) content regions, column structures, relative point sizes of fonts, density of content area, and
statistics of features of connected components, which can be derived without class knowledge. In order to obtain class labels
for training samples, we conducted a study where subjects ranked document pages with respect to their resemblance to representative
page images. Class labels can also be assigned based on known document types, or can be defined by the user. We implemented
our classification scheme using decision tree classifiers and self-organizing maps.
Received June 15, 2000 / Revised November 15, 2000
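A toy version of genre classification from layout features might look like this. The feature set is reduced to coarse area fractions, and the decision stump's thresholds and genre names are made up for illustration; the paper uses decision trees and self-organizing maps over a much richer feature set:

```python
def layout_features(page):
    """page: 2-D grid labelling each cell 'text', 'graphic', or 'white'.
    Returns coarse layout features: text fraction, graphic fraction,
    and overall content density."""
    cells = [c for row in page for c in row]
    n = len(cells)
    text = sum(c == "text" for c in cells) / n
    graphic = sum(c == "graphic" for c in cells) / n
    return {"text": text, "graphic": graphic, "density": text + graphic}

def classify_genre(f):
    """Toy decision stump over layout features (thresholds invented)."""
    if f["density"] < 0.2:
        return "title_page"
    return "article" if f["text"] >= f["graphic"] else "illustrated"
```

The point is that these features need no class knowledge: they come straight from the page image, and the class boundaries are learned from labelled examples.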
12.
Markus Junker, Rainer Hoch 《International Journal on Document Analysis and Recognition》1998,1(2):116-122
In the literature, many feature types are proposed for document classification. However, an extensive and systematic evaluation
of the various approaches has not yet been done. In particular, evaluations on OCR documents are very rare. In this paper
we investigate seven text representations based on n-grams and single words. We compare their effectiveness in classifying OCR texts and the corresponding correct ASCII texts
in two domains: business letters and abstracts of technical reports. Our results indicate that n-grams are an attractive technique that compares well even with techniques relying on morphological analysis. This holds for
OCR texts as well as for correct ASCII texts.
Received February 17, 1998 / Revised April 8, 1998
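The robustness of character n-grams to OCR errors is easy to see in a sketch: a single misrecognized character corrupts only the n grams that span it, so most of the n-gram profile survives. The functions below are illustrative, not the representations evaluated in the paper:

```python
def char_ngrams(text, n=3):
    """Frequency dict of character n-grams of the lowercased text."""
    text = text.lower()
    grams = {}
    for i in range(len(text) - n + 1):
        g = text[i:i + n]
        grams[g] = grams.get(g, 0) + 1
    return grams

def overlap(a, b):
    """Shared-mass similarity of two n-gram frequency dicts (0..1)."""
    shared = sum(min(a.get(g, 0), b.get(g, 0)) for g in a)
    total = max(sum(a.values()), sum(b.values()))
    return shared / total if total else 0.0
```

A classic OCR confusion, 'm' read as 'rn', damages only the trigrams touching that position, leaving most of the trigram mass shared, whereas a whole-word representation would lose the word entirely.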
13.
Hon-Son Don 《International Journal on Document Analysis and Recognition》2001,4(2):131-138
A new thresholding method, called the noise attribute thresholding (NAT) method, for document image binarization is presented
in this paper. The method uses noise attribute features extracted from the images to select threshold values for image
thresholding. These features are based on the properties of noise in the images and are independent of the
strength of the signals (objects and background) in the image. A simple noise model is given to explain these noise properties.
The NAT method has been applied to the problem of removing text and figures printed on the back of the paper. Conventional
global thresholding methods cannot solve this kind of problem satisfactorily. Experimental results show that the NAT method
is very effective.
Received July 5, 1999 / Revised July 7, 2000
14.
Lixin Fan, Liying Fan, Chew Lim Tan 《International Journal on Document Analysis and Recognition》2003,5(2-3):88-101
Abstract. For document images corrupted by various kinds of noise, directly binarized images may be severely blurred and degraded.
A common treatment for this problem is to pre-smooth the input images using noise-suppressing filters. This article proposes an
image-smoothing method for prefiltering before document image binarization. Conceptually, we propose that the influence
range of each pixel affecting its neighbors should depend on local image statistics. Technically, we suggest using coplanar matrices to capture the structural and textural distribution of similar pixels at each site. This property adapts the smoothing process
to the contrast, orientation, and spatial size of local image structures. Experimental results demonstrate the effectiveness
of the proposed method, which compares favorably with existing methods in reducing noise and preserving image features. In
addition, due to the adaptive nature of the similar pixel definition, the proposed filter output is more robust regarding
different noise levels than existing methods.
Received October 31, 2001 / October 9, 2002
Correspondence to: L. Fan (e-mail: fanlixin@ieee.org)
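The "similar pixel" idea can be caricatured with a filter that averages each pixel only with neighbours of similar intensity, so the smoothing adapts to local structure and edges survive. This is a crude stand-in for the coplanar-matrix formulation in the paper; the tolerance value is an assumption:

```python
def similarity_smooth(img, tol=30):
    """Smooth each pixel by averaging the 8-neighbourhood values whose
    intensity lies within tol of the centre pixel. Pixels across a
    strong edge are excluded, so edges are preserved while in-region
    noise is averaged away. img is a 2-D list of intensities."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[ny][nx]
                    for ny in range(max(0, y - 1), min(h, y + 2))
                    for nx in range(max(0, x - 1), min(w, x + 2))
                    if abs(img[ny][nx] - img[y][x]) <= tol]
            out[y][x] = sum(vals) // len(vals)
    return out
```

On a two-region test image, intensities within each region are averaged toward a common value while the boundary between the 50-level and 200-level regions stays sharp.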
15.
16.
17.
Christopher D. Shaw, James M. Kukla, Ian Soboroff, David S. Ebert, Charles K. Nicholas, Amen Zwa, Ethan L. Miller, D. Aaron Roberts 《International Journal on Digital Libraries》1999,2(2-3):144-156
This paper describes a minimally immersive three-dimensional volumetric interactive information visualization system for management
and analysis of document corpora. The system, SFA, uses glyph-based volume rendering, enabling more complex data relationships
and information attributes to be visualized than traditional 2D and surface-based visualization systems. Two-handed interaction
using three-space magnetic trackers and stereoscopic viewing are combined to produce a minimally immersive interactive system
that enhances the user’s three-dimensional perception of the information space. This new system capitalizes on the human visual
system’s pre-attentive learning capabilities to quickly analyze the displayed information. SFA is integrated with a document
management and information retrieval engine named Telltale. Together, these systems integrate visualization and document analysis
technologies to solve the problem of analyzing large document corpora. We describe the usefulness of this system for the analysis
and visualization of document similarity within a corpus of textual documents, and present an example exploring authorship
of ancient Biblical texts.
Received: 15 December 1997 / Revised: June 1999
18.
19.
Shape understanding by contour-driven retiling (cited 1 time: 0 self-citations, 1 by others)
Published online: 14 February 2003
20.
A. Belaïd 《International Journal on Document Analysis and Recognition》1998,1(3):125-146
This paper describes a framework for retrospective document conversion in the library domain. Drawing on the experience and
insight gained from projects launched over the present decade by the European Commission, it outlines the requirements for
solving the problem of retroconversion and traces the main phases of associated processing. To highlight the main problems
encountered in this area, the paper also outlines studies conducted by our group within the MORE project on the retroconversion
of old catalogues belonging to two different libraries: the French National Library and the Royal Belgian Library. For the
French library, the idea was to study the feasibility of a recognition approach that avoids the use of OCR and bases its
strategy mainly on visual features. The challenge was to recognize a logical structure from its physical aspects. The modest
results obtained from experiments in this first study led us, in the second study, to base the structural recognition
methodology more on the logical aspects by focusing the analysis on the content. Furthermore, for the Belgian references,
the aim was to convert reference catalogues into the more conventional UNIMARC format while respecting industrial constraints.
Without manual intervention, a 75% rate of correct recognition was obtained on 11 catalogues containing about 4548 references.
Received March 10, 1998 / Revised August 12, 1998