A total of 20 similar documents were found; search time: 15 ms.
1.
Abstract. We propose a new adaptive strategy for text recognition that attempts to derive knowledge about the dominant font on a given page. The strategy uses the linguistic observation that over half of all words in a typical English passage are contained in a small set of fewer than 150 stop words. A small dictionary of such words is compiled from the Brown corpus. An arbitrary text page first goes through layout analysis, which produces a word segmentation. A fast procedure is then applied to locate the most likely candidates for those words, using only the widths of the word images. The identity of each word is determined using a word-shape classifier. Using the word images together with their identities, character prototypes can be extracted with a previously proposed method. We describe experiments using simulated and real images. In an experiment with 400 real page images, we show that, on average, eight distinct characters can be learned from each page, and that the method is successful on 90% of all pages. These characters can serve as useful seeds to bootstrap font learning.
Received October 8, 1999 / Revised March 29, 2000
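As a rough illustration of the width-only candidate search described in this abstract, consider the following sketch. The ten-word list and the per-character width model are purely illustrative assumptions; the paper's dictionary holds nearly 150 stop words compiled from the Brown corpus.

STOP_WORDS = ["the", "of", "and", "to", "a", "in", "that", "is", "was", "for"]

NARROW = set("iljtfr")  # letters modeled as narrower than the rest

def expected_width(word, unit=10.0):
    # Crude per-character width model in arbitrary pixel units.
    return sum((1.0 if c in NARROW else 1.5) * unit for c in word)

def stop_word_candidates(word_widths, tolerance=0.15):
    """word_widths: list of (word_id, pixel_width) pairs from layout analysis.
    Returns, for each stop word, the image words whose measured width falls
    within `tolerance` (relative) of the expected width."""
    candidates = {}
    for sw in STOP_WORDS:
        exp = expected_width(sw)
        candidates[sw] = [wid for wid, w in word_widths
                          if abs(w - exp) / exp <= tolerance]
    return candidates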
2.
3.
Handprinted word recognition on a NIST data set (cited 1 time: 0 self-citations, 1 external)
Paul Gader Michael Whalen Margaret Ganzberger Dan Hepp 《Machine Vision and Applications》1995,8(1):31-40
An approach to handprinted word recognition is described. The approach is based on the use of generating multiple possible segmentations of a word image into characters and matching these segmentations to a lexicon of candidate strings. The segmentation process uses a combination of connected component analysis and distance transform-based, connected character splitting. Neural networks are used to assign character confidence values to potential character within word images. Experimental results are provided for both character and word recognition modules on data extracted from the NIST handprinted character database. 相似文献
4.
Gyeonghwan Kim Venu Govindaraju Sargur N. Srihari 《International Journal on Document Analysis and Recognition》1999,2(1):37-44
This paper presents an end-to-end system for reading handwritten page images. Five functional modules included in the system are introduced: (i) pre-processing, which introduces an image representation for easy manipulation of large page images, together with image-handling procedures based on that representation; (ii) line separation, which detects text lines and extracts line images from a page image; (iii) word segmentation, which locates word gaps and isolates words from a line image efficiently and intelligently; (iv) word recognition, which covers handwritten word recognition algorithms; and (v) linguistic post-processing, which uses linguistic constraints to intelligently parse and recognize text. The key ideas in each module, developed to handle the diversity of handwriting in its various aspects while keeping the system reliable and robust, are described. Preliminary experiments show promising results in terms of speed and accuracy.
Received October 30, 1998 / Revised January 15, 1999
5.
Sonia Garcia-Salicetti Bernadette Dorizzi Patrick Gallinari Zsolt Wimmer 《International Journal on Document Analysis and Recognition》2001,4(1):56-68
In this paper, we present a hybrid online handwriting recognition system based on hidden Markov models (HMMs). It is devoted to word recognition using large vocabularies. An adaptive segmentation of words into letters is integrated with recognition and lies at the heart of the training phase. A word model is a left-right HMM in which each state is a predictive multilayer perceptron that performs local regression on the drawing (i.e., the written word), relying on a context of observations. A discriminative training paradigm related to maximum mutual information is used, and its potential is shown on a database of 9,781 words.
Received June 19, 2000 / Revised October 16, 2000
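As a minimal sketch of how such a left-right word model scores an observation sequence, here is a plain Viterbi pass in which log_emit(state, t) stands in for the log-likelihood produced by the state's predictive network; this illustrates the topology only, not the paper's discriminative training.

import numpy as np

def viterbi_log_score(T, n_states, log_emit, stay_p=0.6):
    """Score T frames against a left-right model: from each state one may
    stay put or advance to the next state."""
    neg = float("-inf")
    alpha = np.full((T, n_states), neg)
    alpha[0, 0] = log_emit(0, 0)
    for t in range(1, T):
        for s in range(n_states):
            stay = alpha[t - 1, s] + np.log(stay_p)
            move = alpha[t - 1, s - 1] + np.log(1 - stay_p) if s > 0 else neg
            alpha[t, s] = max(stay, move) + log_emit(s, t)
    return alpha[T - 1, n_states - 1]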
6.
E. Kavallieratou N. Fakotakis G. Kokkinakis 《International Journal on Document Analysis and Recognition》2002,4(4):226-242
In this paper, an integrated offline recognition system for unconstrained handwriting is presented. The proposed system consists of seven main modules: skew angle estimation and correction, printed-handwritten text discrimination, line segmentation, slant removal, word segmentation, character segmentation, and character recognition, stemming from implementations of already existing algorithms as well as novel ones. The system has been tested on the NIST, IAM-DB, and GRUHD databases and has achieved accuracies varying from 65.6% to 100%, depending on the database and the experiment.
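As one illustration of the first module, the sketch below shows a standard projection-profile approach to skew estimation: rotate over candidate angles and keep the angle whose horizontal projection is most sharply peaked. This is a common textbook technique, not necessarily the authors' algorithm.

import numpy as np
from scipy.ndimage import rotate

def estimate_skew(binary_img, max_deg=10.0, step=0.5):
    best_angle, best_score = 0.0, -1.0
    for angle in np.arange(-max_deg, max_deg + step, step):
        rot = rotate(binary_img.astype(float), angle, reshape=False, order=0)
        profile = rot.sum(axis=1)       # ink per row
        score = float(np.var(profile))  # aligned text lines -> peaky profile
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle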
7.
S. Adam J.M. Ogier C. Cariou R. Mullot J. Labiche J. Gardes 《International Journal on Document Analysis and Recognition》2000,3(2):89-101
In this paper, we consider the general problem of technical document interpretation, as applied to documents of the French telephone operator, France Télécom. More precisely, we focus on the computation of a new set of features allowing the classification of multi-oriented and multi-scaled patterns. This set of invariants is based on the Fourier–Mellin transform. The interest of this computation lies in the excellent classification rate obtained with the method, and also in using the Fourier–Mellin transform in a “filtering mode”, with which we can address the well-known and difficult problem of connected character recognition.
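A rough numerical sketch of rotation- and scale-invariant features in the spirit of the Fourier–Mellin transform: resample the pattern on a log-polar grid, where rotation and scaling become translations, then keep FFT magnitudes, which are invariant to such translations. The grid sizes are arbitrary choices, not the paper's.

import numpy as np
from scipy.ndimage import map_coordinates

def fourier_mellin_features(img, n_r=64, n_theta=64, n_keep=16):
    cy, cx = (np.asarray(img.shape) - 1) / 2.0
    r_max = min(cy, cx)
    log_r = np.exp(np.linspace(0.0, np.log(r_max), n_r))  # log-spaced radii
    theta = np.linspace(0.0, 2 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(log_r, theta, indexing="ij")
    coords = np.stack([cy + rr * np.sin(tt), cx + rr * np.cos(tt)])
    logpolar = map_coordinates(img.astype(float), coords, order=1)
    spectrum = np.abs(np.fft.fft2(logpolar))   # magnitude kills translations
    return spectrum[:n_keep, :n_keep].ravel()  # low-frequency descriptor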
8.
Automatic text segmentation and text recognition for video indexing (cited 13 times: 0 self-citations, 13 external)
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in videos, since it enables content-based browsing. We present new methods for automatic segmentation of text in digital videos. The proposed algorithms exploit typical characteristics of text in videos in order to enable and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their complete duration of occurrence in a video, and the integration of the multiple bitmaps of a character over time into a single bitmap. The output of the text segmentation step is passed directly to a standard OCR software package to translate the segmented text into ASCII. A straightforward indexing and retrieval scheme is also introduced and used in the experiments to demonstrate that the proposed text segmentation algorithms, together with existing text recognition algorithms, are suitable for indexing and retrieving relevant video sequences from a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as for recognizing higher-level semantics in videos.
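The temporal-integration step admits a simple sketch: once a character's bitmaps have been tracked and aligned over all frames of its occurrence, combine them pixel-wise before OCR. Plain averaging plus thresholding below stands in for whatever integration scheme the authors actually use.

import numpy as np

def integrate_bitmaps(aligned_frames, threshold=0.5):
    """aligned_frames: equally sized grayscale crops of one character, one
    per frame. Background noise varies from frame to frame while the
    character does not, so the mean image is cleaner than any single frame."""
    stack = np.stack([np.asarray(f, dtype=float) for f in aligned_frames])
    mean = stack.mean(axis=0)
    return (mean >= threshold * max(float(mean.max()), 1e-9)).astype(np.uint8)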
9.
Bir Bhanu Yingqiang Lin Grinnell Jones Jing Peng 《Machine Vision and Applications》2000,11(6):289-299
Target recognition is a multilevel process requiring a sequence of algorithms at low, intermediate, and high levels. Generally, such systems are open loop, with no feedback between levels, and assuring their performance at a given probability of correct identification (PCI) and probability of false alarm (Pf) is a key challenge in computer vision and pattern recognition research. In this paper, a robust closed-loop system for recognition of SAR images based on reinforcement learning is presented. The parameters of model-based SAR target recognition are learned. The method meets performance specifications by using PCI and Pf as feedback for the learning system. It has been experimentally validated by learning the parameters of the recognition system for SAR imagery, successfully recognizing articulated targets, targets in different configurations, and targets at different depression angles.
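A toy sketch of the closed-loop idea: treat candidate parameter settings of the recognizer as actions and use the measured PCI and Pf as the reinforcement signal. The evaluate callback and the reward shape are assumptions for illustration, not the paper's formulation.

import random

def tune(evaluate, param_grid, episodes=200, epsilon=0.1, seed=0):
    """param_grid: hashable parameter settings. evaluate(p) runs the
    recognizer on validation imagery and returns (pci, pf)."""
    rng = random.Random(seed)
    q = {p: 0.0 for p in param_grid}         # action-value estimates
    counts = {p: 0 for p in param_grid}
    for _ in range(episodes):
        if rng.random() < epsilon:
            p = rng.choice(param_grid)       # explore
        else:
            p = max(q, key=q.get)            # exploit best known setting
        pci, pf = evaluate(p)
        reward = pci - 2.0 * pf              # penalize false alarms harder
        counts[p] += 1
        q[p] += (reward - q[p]) / counts[p]  # incremental mean update
    return max(q, key=q.get)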
10.
Giovanni Seni John Seybold 《International Journal on Document Analysis and Recognition》1999,2(1):24-29
Out-of-order diacriticals introduce significant complexity into the design of an online handwriting recognizer because they require some reordering of the time-domain information. In cursive writing it is common to write the body of an 'i' or 't' during the writing of the word and then to return and dot or cross the letter once the word is complete. The difficulty arises because, when scoring one of these letters, we have to look ahead to find the mark occurring later in the writing stream that completes the letter. We must also remember that the mark has been used, so that it is not used again for a different letter, and we should penalize a word if marks that look like diacriticals are left unused. One approach to this problem is to scan the writing some distance into the future to identify candidate diacriticals, remove them in a preprocessing step, and associate them with the matching letters earlier in the word. Done as a preliminary operation, this approach is error-prone: marks that are not diacriticals may be incorrectly identified and removed, and true diacriticals may be missed. This paper describes a novel extension to a forward search algorithm that provides a natural mechanism for considering alternative treatments of potential diacriticals, so as to see whether it is better to treat a given mark as a diacritical or not, and to compare the two outcomes directly by score.
Received October 30, 1998 / Revised January 25, 1999
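The branch-and-score mechanism might be pictured as a beam search that keeps both treatments of a potential diacritical alive and lets the scores decide. The two scoring callbacks are stubs for a real letter scorer; the mark bookkeeping prevents one dot or cross from serving two letters.

def diacritical_beam_search(n_letters, candidate_marks,
                            score_with, score_without, beam=10):
    """candidate_marks(pos) yields ids of delayed marks near letter pos;
    score_with(pos, mark) scores the letter bound to that mark,
    score_without(pos) scores it bare. Higher scores are better."""
    hyps = [(0.0, frozenset())]              # (score, consumed mark ids)
    for pos in range(n_letters):
        nxt = []
        for score, used in hyps:
            nxt.append((score + score_without(pos), used))
            for mark in candidate_marks(pos):
                if mark not in used:         # each mark is used at most once
                    nxt.append((score + score_with(pos, mark), used | {mark}))
        hyps = sorted(nxt, key=lambda h: -h[0])[:beam]
    return hyps[0]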
11.
Fast template matching using bounded partial correlation (cited 8 times: 0 self-citations, 8 external)
This paper describes a novel, fast template-matching technique, referred to as bounded partial correlation (BPC), based on the normalised cross-correlation (NCC) function. The technique consists of checking, at each search position, a suitable elimination condition relying on the evaluation of an upper bound on the NCC function. The check allows rapid skipping of positions that cannot provide a better degree of match than the current best-matching one. The upper-bounding function incorporates partial information from the actual cross-correlation function and can be calculated very efficiently using a recursive scheme. We also show a simple improvement to the basic BPC formulation that provides additional computational benefits and renders the technique more robust with respect to the choice of parameters.
Received: 2 November 2000 / Accepted: 25 July 2001
Correspondence to: L. Di Stefano
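A simplified sketch of the elimination test: correlate the first rows of the template exactly, bound the contribution of the remaining rows via the Cauchy–Schwarz inequality, and skip the position if even this optimistic total cannot beat the best NCC found so far. The recursive running-sum machinery that makes the real BPC technique fast is omitted here.

import numpy as np

def bpc_match(image, tmpl, split=0.5):
    tmpl = np.asarray(tmpl, dtype=float)
    H, W = tmpl.shape
    k = int(H * split)                        # rows correlated exactly
    t_norm = np.linalg.norm(tmpl)
    t_tail = np.linalg.norm(tmpl[k:])
    best, best_pos = -1.0, None
    for y in range(image.shape[0] - H + 1):
        for x in range(image.shape[1] - W + 1):
            win = np.asarray(image[y:y+H, x:x+W], dtype=float)
            w_norm = np.linalg.norm(win)
            if w_norm == 0.0:
                continue
            part = float((win[:k] * tmpl[:k]).sum())
            # Cauchy-Schwarz upper bound on the full correlation:
            bound = (part + t_tail * np.linalg.norm(win[k:])) / (w_norm * t_norm)
            if bound <= best:
                continue                      # cannot beat current best: skip
            ncc = float((win * tmpl).sum()) / (w_norm * t_norm)
            if ncc > best:
                best, best_pos = ncc, (y, x)
    return best_pos, best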
12.
13.
S. Jaeger S. Manke J. Reichert A. Waibel 《International Journal on Document Analysis and Recognition》2001,3(3):169-180
This paper presents the online handwriting recognition system NPen++, developed at the University of Karlsruhe and Carnegie Mellon University. The NPen++ recognition engine is based on a multi-state time delay neural network and yields recognition rates of 96% for a 5,000-word dictionary, 93.4% for a 20,000-word dictionary, and 91.2% for a 50,000-word dictionary. The proposed tree search and pruning technique reduces the search space considerably without losing much recognition performance compared to an exhaustive search. This enables the NPen++ recognizer to run in real time with large dictionaries. Initial recognition rates for whole sentences are promising and show that the MS-TDNN architecture is suited to recognizing handwritten data ranging from single characters to whole sentences.
Received September 3, 2000 / Revised October 9, 2000
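Dictionary-directed search with pruning can be sketched as a trie over the vocabulary with a beam cut at each character step; char_logp stands in for the network's per-character log scores, and one character per step is a simplifying assumption, not the NPen++ decoder.

def build_trie(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node["$"] = True                      # end-of-word marker
    return root

def trie_beam_search(n_steps, char_logp, trie, beam_width=50):
    hyps = [("", trie, 0.0)]                  # (prefix, trie node, log score)
    for t in range(n_steps):
        nxt = []
        for prefix, node, lp in hyps:
            for ch, child in node.items():
                if ch != "$":                 # extend only valid prefixes
                    nxt.append((prefix + ch, child, lp + char_logp(t, ch)))
        nxt.sort(key=lambda h: -h[2])
        hyps = nxt[:beam_width]               # prune the search space
    return [(p, lp) for p, node, lp in hyps if "$" in node]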
14.
Hwan-Chul Park Se-Young Ok Young-Jung Yu Hwan-Gue Cho 《International Journal on Document Analysis and Recognition》2001,4(2):115-130
Automatic character recognition and image understanding of a given paper document are among the main objectives of the computer vision field. A basic step towards these goals is to isolate characters and group words from the isolated characters. In this paper, we propose a new method for extracting characters from a mixed text/graphics machine-printed document and an algorithm for distinguishing words among the isolated characters. For character extraction, we exploit several features of characters (size, elongation, and density) and propose a characteristic value for classification based on the run-length frequency of the image component. In the context of word grouping, previous work has largely been concerned with words placed on a horizontal or vertical line. Our word grouping algorithm can group words that lie on inclined lines, intersecting lines, and even curved lines. To do this, we introduce the 3D neighborhood graph model, which is very useful and efficient for character classification and word grouping. In the 3D neighborhood graph model, each connected component of a text image segment is mapped into 3D space according to the area of its bounding box and its position in the document. We conducted tests with more than 20 English documents and more than ten oriental documents scanned from books, brochures, and magazines. Experimental results show that more than 95% of words are successfully extracted from general documents, even in very complicated oriental documents.
Received August 3, 2001 / Accepted August 8, 2001
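The character-extraction features named in the abstract (size, elongation, density) are straightforward to compute per connected component; the sketch below uses SciPy labeling, and the feature definitions are illustrative rather than the authors' exact ones.

import numpy as np
from scipy.ndimage import label, find_objects

def component_features(binary_img):
    labels, n = label(binary_img)
    feats = []
    for i, sl in enumerate(find_objects(labels), start=1):
        h = sl[0].stop - sl[0].start
        w = sl[1].stop - sl[1].start
        area = int((labels[sl] == i).sum())
        feats.append({
            "bbox": (sl[0].start, sl[1].start, h, w),
            "size": h * w,                         # bounding-box area
            "elongation": max(h, w) / max(1, min(h, w)),
            "density": area / float(h * w),        # ink pixels per bbox pixel
        })
    return feats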
15.
Mathematical expression recognition: a survey (cited 15 times: 0 self-citations, 15 external)
Kam-Fai Chan Dit-Yan Yeung 《International Journal on Document Analysis and Recognition》2000,3(1):3-15
Abstract. Automatic recognition of mathematical expressions is one of the key vehicles in the drive towards transcribing documents in scientific and engineering disciplines into electronic form. The problem typically consists of two major stages: symbol recognition and structural analysis. In this survey paper, we review most of the existing work with respect to each of these two stages of the recognition process. In particular, we emphasize the similarities and differences between systems. Moreover, some important issues in mathematical expression recognition are addressed in depth. Together, these discussions provide a clear overall picture of how this research area has developed to date.
Received February 22, 2000 / Revised June 12, 2000
16.
John F. Pitrelli Amit Roy 《International Journal on Document Analysis and Recognition》2003,5(2-3):126-137
We discuss the development of a word-unigram language model for online handwriting recognition. First, we tokenize a text corpus into words, contrasting our method with tokenization methods designed for other purposes. Second, we select for our model a subset of the words found, discussing deviations from an N-most-frequent-words approach. From a 600-million-word corpus, we generated a 53,000-word model that eliminates 45% of the word-recognition errors made by a character-level-model baseline system. We anticipate that our methods will be applicable to offline recognition as well, and to some extent to other recognizers, such as speech recognizers and video retrieval systems.
Received: November 1, 2001 / Revised version: July 22, 2002
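The baseline that the paper's selection method deviates from, an N-most-frequent-words unigram model, is simple to state; here is a minimal sketch with a deliberately naive tokenizer.

from collections import Counter
import re

def build_unigram_model(corpus_text, n_words=53000):
    tokens = re.findall(r"[A-Za-z']+", corpus_text.lower())  # naive tokenizer
    counts = Counter(tokens)
    total = sum(counts.values())
    # Keep the N most frequent types with relative-frequency probabilities.
    return {w: c / total for w, c in counts.most_common(n_words)}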
17.
V. Vuori J. Laaksonen E. Oja J. Kangas 《International Journal on Document Analysis and Recognition》2001,3(3):150-159
This paper describes an adaptive recognition system for isolated handwritten characters and the experiments carried out with it. The characters used in our experiments are alphanumeric, including both the upper- and lower-case versions of the Latin alphabet and three Scandinavian diacriticals. The writers are allowed to use their own natural style of writing. The recognition system is based on the k-nearest-neighbor rule. The six character dissimilarity measures applied by the system are all based on dynamic time warping. The aim of the first experiments is to choose the best combination of simple preprocessing and normalization operations and dissimilarity measure for a multi-writer system. However, the main focus of the work is on online adaptation. The purpose of the adaptation is to turn a writer-independent system into a writer-dependent one and to increase recognition performance. The adaptation is carried out by modifying the prototype set of the classifier according to its recognition performance and the user's writing style. The ways of adaptation include: (1) adding new prototypes; (2) inactivating confusing prototypes; and (3) reshaping existing prototypes. The reshaping algorithm is based on Learning Vector Quantization. Four different adaptation strategies, according to which the modifications of the prototype set are performed, have been studied both offline and online. Adaptation is carried out in a self-supervised fashion during normal use and thus remains unnoticed by the user.
Received June 30, 1999 / Revised September 29, 2000
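Two of the system's core mechanisms lend themselves to short sketches: a DTW dissimilarity between point sequences (given as NumPy arrays), and an LVQ-style reshaping step that pulls a correctly matched prototype toward the user's sample or pushes a confused one away. The learning rate and the equal-length assumption are simplifications, not the paper's exact procedure.

import numpy as np

def dtw(a, b):
    """Dynamic-time-warping distance between two sequences of 2-D points."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def lvq_reshape(prototype, sample, correct, lr=0.05):
    """Move the prototype toward a same-class sample (correct=True) or away
    from a confusing one; assumes both are resampled to equal length."""
    delta = lr * (sample - prototype)
    return prototype + delta if correct else prototype - delta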
18.
19.
Cheng-Lin Liu Hiroshi Sako Hiromichi Fujisawa 《International Journal on Document Analysis and Recognition》2002,4(3):191-204
This paper describes a performance evaluation study in which several efficient classifiers are tested on handwritten digit recognition. The evaluated classifiers include a statistical classifier (the modified quadratic discriminant function, MQDF), three neural classifiers, and an LVQ (learning vector quantization) classifier. They are efficient in that high accuracies can be achieved at moderate memory and computation cost. Performance is measured in terms of classification accuracy, sensitivity to training sample size, ambiguity rejection, and outlier resistance. The outlier resistance of the neural classifiers is enhanced by training with synthesized outlier data. The classifiers are tested on a large data set extracted from NIST SD19. The test accuracies of the evaluated classifiers are comparable to or higher than those of the nearest-neighbor (1-NN) rule and regularized discriminant analysis (RDA). It is shown that the neural classifiers are more susceptible to small sample sizes than MQDF, although they yield higher accuracies for large sample sizes. Among the neural classifiers, the polynomial classifier (PC) gives the highest accuracy and performs best in ambiguity rejection. On the other hand, MQDF is superior in outlier rejection even though it is not trained with outlier data. The results indicate that these pattern classifiers have complementary advantages and should be appropriately combined to achieve higher performance.
Received: July 18, 2001 / Accepted: September 28, 2001
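For reference, the MQDF idea admits a compact sketch: keep only the k leading eigenvectors of each class covariance and replace the minor eigenvalues with a constant delta, which regularizes the quadratic discriminant. The constants here are illustrative, not the paper's tuned values.

import numpy as np

def mqdf_score(x, mean, eigvals, eigvecs, k=8, delta=0.1):
    """eigvals/eigvecs: eigen-decomposition of the class covariance, sorted
    by decreasing eigenvalue. Returns a score where higher is better."""
    d = x - mean
    proj = eigvecs[:, :k].T @ d                   # leading-subspace components
    major = np.sum(proj**2 / eigvals[:k]) + np.sum(np.log(eigvals[:k]))
    resid = d @ d - np.sum(proj**2)               # energy outside the subspace
    minor = resid / delta + (len(d) - k) * np.log(delta)
    return -(major + minor)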
20.
Y. Nakajima S. Mori S. Takegami S. Sato 《International Journal on Document Analysis and Recognition》1999,2(1):19-23
Two methods for stroke segmentation from a global point of view are presented and compared. One is based on thinning and the other on contour curve fitting. In both cases the input image is binarized. In the former, Hilditch's thinning method is used; crossing points are then sought, around each of which a domain is constructed. Outside the domains, a set of line segments is identified. These lines are connected and approximated by cubic B-spline curves, and smoothly connected lines are selected as segmented curves. This method works well for a limited class of crossing lines, as shown experimentally. In the latter, the contour line is approximated by a cubic B-spline curve, along which curvature is measured. The contour line is segmented at the extreme points of the curvature graph, and the stroke segments are obtained from the segmented contour. Experimental results are shown for some difficult cases.
Received October 31, 1998 / Revised January 12, 1999
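The contour-based method maps naturally onto SciPy's spline routines: fit a closed cubic B-spline to the contour, evaluate curvature along it, and cut at curvature extrema. The peak picking below is deliberately naive and the thresholds are illustrative, not the authors' settings.

import numpy as np
from scipy.interpolate import splprep, splev

def curvature_cut_points(contour_xy, n_samples=400, k_thresh=0.2):
    """contour_xy: (N, 2) array of contour points of one binarized shape.
    Returns spline parameters and curvatures at candidate cut points."""
    tck, _ = splprep(contour_xy.T, s=1.0, per=True, k=3)
    u = np.linspace(0.0, 1.0, n_samples, endpoint=False)
    dx, dy = splev(u, tck, der=1)
    ddx, ddy = splev(u, tck, der=2)
    kappa = (dx * ddy - dy * ddx) / (dx**2 + dy**2) ** 1.5
    mag = np.abs(kappa)
    # Local extrema of |curvature| above a threshold mark segment boundaries.
    peaks = [i for i in range(1, n_samples - 1)
             if mag[i] > k_thresh and mag[i] >= mag[i - 1]
             and mag[i] >= mag[i + 1]]
    return u[peaks], kappa[peaks]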