Found 20 similar documents; search took 10 ms
1.
Automatic text segmentation and text recognition for video indexing (total citations: 13, self-citations: 0, citations by others: 13)
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval
is the text appearing in them. It enables content-based browsing. We present our new methods for automatic segmentation of
text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable
and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their
complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single
bitmap. The output of the text segmentation step is then directly passed to a standard OCR software package in order to translate
the segmented text into ASCII. Also, a straightforward indexing and retrieval scheme is introduced. It is used in the experiments
to demonstrate that the proposed text segmentation algorithms together with existing text recognition algorithms are suitable
for indexing and retrieval of relevant video sequences in and from a video database. Our experimental results are very encouraging
and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher level semantics
in videos.
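The abstract above mentions integrating the multiple bitmaps of a tracked character into a single bitmap but does not say how. A minimal sketch, assuming a pixel-wise majority vote over aligned binary bitmaps (the function name `integrate_bitmaps` and the voting rule are illustrative assumptions, not the authors' method):

```python
def integrate_bitmaps(bitmaps):
    """Merge equally sized binary bitmaps (lists of rows of 0/1) into one.

    Assumption: a pixel is kept as text if it is set in more than half of
    the frames. The source does not specify the actual integration rule.
    """
    if not bitmaps:
        raise ValueError("at least one bitmap is required")
    rows, cols = len(bitmaps[0]), len(bitmaps[0][0])
    threshold = len(bitmaps) / 2
    return [
        [1 if sum(b[r][c] for b in bitmaps) > threshold else 0
         for c in range(cols)]
        for r in range(rows)
    ]


# Three frames of the same tracked character, each with one noisy pixel.
frames = [
    [[1, 0], [1, 1]],
    [[1, 1], [0, 1]],
    [[1, 0], [1, 1]],
]
merged = integrate_bitmaps(frames)
```

The vote suppresses pixels that flicker with the changing video background, which is one plausible reason for integrating over time before passing the result to OCR.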
2.
Cheng-Lin Liu Hiroshi Sako Hiromichi Fujisawa 《International Journal on Document Analysis and Recognition》2002,4(3):191-204
This paper describes a performance evaluation study in which some efficient classifiers are tested in handwritten digit recognition.
The evaluated classifiers include a statistical classifier (modified quadratic discriminant function, MQDF), three neural
classifiers, and an LVQ (learning vector quantization) classifier. They are efficient in that high accuracies can be achieved
at moderate memory space and computation cost. The performance is measured in terms of classification accuracy, sensitivity
to training sample size, ambiguity rejection, and outlier resistance. The outlier resistance of neural classifiers is enhanced
by training with synthesized outlier data. The classifiers are tested on a large data set extracted from NIST SD19. The results show that
the test accuracies of the evaluated classifiers are comparable to or higher than those of the nearest neighbor (1-NN) rule
and regularized discriminant analysis (RDA). It is shown that neural classifiers are more susceptible to small sample size
than MQDF, although they yield higher accuracies on large sample size. As a neural classifier, the polynomial classifier (PC)
gives the highest accuracy and performs best in ambiguity rejection. On the other hand, MQDF is superior in outlier rejection
even though it is not trained with outlier data. The results indicate that pattern classifiers have complementary advantages
and they should be appropriately combined to achieve higher performance.
Received: July 18, 2001 / Accepted: September 28, 2001
3.
Abstract. We propose a new adaptive strategy for text recognition that attempts to derive knowledge about the dominant font on a given
page. The strategy uses a linguistic observation that over half of all words in a typical English passage are contained in
a small set of less than 150 stop words. A small dictionary of such words is compiled from the Brown corpus. An arbitrary
text page first goes through layout analysis that produces word segmentation. A fast procedure is then applied to locate the
most likely candidates for those words, using only widths of the word images. The identity of each word is determined using
a word shape classifier. Using the word images together with their identities, character prototypes can be extracted using
a previously proposed method. We describe experiments using simulated and real images. In an experiment using 400 real page
images, we show that on average, eight distinct characters can be learned from each page, and the method is successful on
90% of all the pages. These can serve as useful seeds to bootstrap font learning.
Received October 8, 1999 / Revised March 29, 2000
4.
This paper investigates the automatic reading of unconstrained omni-writer handwritten texts. It shows how to endow the reading system with the learning faculties necessary to adapt recognition to each writer's handwriting. In the first part of this paper, we explain how the recognition system can be adapted to the current handwriting by exploiting the graphical context defined by the writer's invariants. This adaptation is guaranteed by activating interaction links over the whole text between the recognition procedures for word entities and those for letter entities. In the second part, we justify the need for an open multiple-agent architecture to support the implementation of such a principle of adaptation. The proposed platform allows expert treatments dedicated to handwriting analysis to be plugged in. We show that this platform helps to implement specific collaboration or cooperation schemes between agents, which bring out new trends in the automatic reading of handwritten texts.
5.
V. Vuori J. Laaksonen E. Oja J. Kangas 《International Journal on Document Analysis and Recognition》2001,3(3):150-159
This paper describes an adaptive recognition system for isolated handwritten characters and the experiments carried out with
it. The characters used in our experiments are alphanumeric characters, including both the upper- and lower-case versions
of the Latin alphabets and three Scandinavian diacriticals. The writers are allowed to use their own natural style of writing.
The recognition system is based on the k-nearest neighbor rule. The six character similarity measures applied by the system are all based on dynamic time warping.
The aim of the first experiments is to choose the best combination of the simple preprocessing and normalization operations
and the dissimilarity measure for a multi-writer system. However, the main focus of the work is on online adaptation. The
purpose of the adaptations is to turn a writer-independent system into writer-dependent and increase recognition performance.
The adaptation is carried out by modifying the prototype set of the classifier according to its recognition performance and
the user's writing style. The ways of adaptation include: (1) adding new prototypes; (2) inactivating confusing prototypes;
and (3) reshaping existing prototypes. The reshaping algorithm is based on the Learning Vector Quantization. Four different
adaptation strategies, according to which the modifications of the prototype set are performed, have been studied both offline
and online. Adaptation is carried out in a self-supervised fashion during normal use and thus remains unnoticed by the user.
Received June 30, 1999 / Revised September 29, 2000
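The abstract above describes a k-nearest neighbor classifier whose similarity measures are based on dynamic time warping, adapted by adding prototypes. A minimal sketch of that combination follows, assuming pen strokes are lists of (x, y) points and using plain Euclidean point cost; the names `dtw_distance`, `knn_classify` and `add_prototype` are hypothetical, and the paper's six actual measures and its preprocessing are not reproduced here:

```python
import math
from collections import Counter


def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def knn_classify(sample, prototypes, k=3):
    """Majority label among the k prototypes closest to sample under DTW.

    prototypes is a list of (label, point_sequence) pairs.
    """
    nearest = sorted(prototypes, key=lambda p: dtw_distance(sample, p[1]))[:k]
    votes = Counter(label for label, _ in nearest)
    return votes.most_common(1)[0][0]


def add_prototype(prototypes, label, sample):
    """Adaptation by adding a new prototype (strategy 1 in the abstract)."""
    prototypes.append((label, sample))


prototypes = [
    ("l", [(0, 0), (0, 1), (0, 2)]),   # vertical stroke
    ("-", [(0, 0), (1, 0), (2, 0)]),   # horizontal stroke
]
sample = [(0, 0), (0, 1), (0, 1.5), (0, 2)]
predicted = knn_classify(sample, prototypes, k=1)
```

DTW lets the four-point sample align with a three-point prototype, which is why it suits strokes written at different speeds. Inactivating or reshaping prototypes (strategies 2 and 3) would need per-prototype performance bookkeeping that this sketch omits.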
6.
Sonia Garcia-Salicetti Bernadette Dorizzi Patrick Gallinari Zsolt Wimmer 《International Journal on Document Analysis and Recognition》2001,4(1):56-68
In this paper, we present a hybrid online handwriting recognition system based on hidden Markov models (HMMs). It is devoted
to word recognition using large vocabularies. An adaptive segmentation of words into letters is integrated with recognition,
and is at the heart of the training phase. A word-model is a left-right HMM in which each state is a predictive multilayer
perceptron that performs local regression on the drawing (i.e., the written word) relying on a context of observations. A
discriminative training paradigm related to maximum mutual information is used, and its potential is shown on a database of
9,781 words.
Received June 19, 2000 / Revised October 16, 2000
7.
Xiangyun Ye Mohamed Cheriet Ching Y. Suen 《International Journal on Document Analysis and Recognition》2001,4(2):84-96
The automation of business form processing is attracting intensive research interests due to its wide application and its
reduction of the heavy workload due to manual processing. Preparing clean and clear images for the recognition engines is
often taken for granted as a trivial task that requires little attention. In reality, handwritten data usually touch or cross
the preprinted form frames and texts, creating tremendous problems for the recognition engines. In this paper, we contribute
answers to two questions: “Why do we need cleaning and enhancement procedures in form processing systems?” and “How can we
clean and enhance the hand-filled items with easy implementation and high processing speed?” Here, we propose a generic system
including only cleaning and enhancing phases. In the cleaning phase, the system registers a template to the input form by
aligning corresponding landmarks. A unified morphological scheme is proposed to remove the form frames and restore the broken
handwriting from gray or binary images. When the handwriting is found touching or crossing preprinted texts, morphological
operations based on statistical features are used to clean it. In applications where a black-and-white scanning mode is adopted,
handwriting may contain broken or hollow strokes due to improper thresholding parameters. Therefore, we have designed a module
to enhance the image quality based on morphological operations. Subjective and objective evaluations have been studied to
show the effectiveness of the proposed procedures.
Received January 19, 2000 / Revised March 20, 2001
8.
9.
10.
Giovanni Seni John Seybold 《International Journal on Document Analysis and Recognition》1999,2(1):24-29
Out-of-order diacriticals introduce significant complexity to the design of an online handwriting recognizer, because they
require some reordering of the time domain information. It is common in cursive writing to write the body of an `i' or `t'
during the writing of the word, and then to return and dot or cross the letter once the word is complete. The difficulty arises
because we have to look ahead, when scoring one of these letters, to find the mark occurring later in the writing stream that
completes the letter. We should also remember that we have used this mark, so that we don't use it again for a different letter,
and we should also penalize a word if there are some marks that look like diacriticals that are not used. One approach to
this problem is to scan the writing some distance into the future to identify candidate diacriticals, remove them in a preprocessing
step, and associate them with the matching letters earlier in the word. If done as a preliminary operation, this approach
is error-prone: marks that are not diacriticals may be incorrectly identified and removed, and true diacriticals may be skipped.
This paper describes a novel extension to a forward search algorithm that provides a natural mechanism for considering alternative
treatments of potential diacriticals, to see whether it is better to treat a given mark as a diacritical or not, and directly
compare the two outcomes by score.
Received October 30, 1998 / Revised January 25, 1999
11.
E. Kavallieratou N. Fakotakis G. Kokkinakis 《International Journal on Document Analysis and Recognition》2002,4(4):226-242
In this paper, an integrated offline recognition system for unconstrained handwriting is presented. The proposed system consists
of seven main modules: skew angle estimation and correction, printed-handwritten text discrimination, line segmentation, slant
removal, word segmentation, and character segmentation and recognition, stemming from the implementation of already existing
algorithms as well as novel algorithms. This system has been tested on the NIST, IAM-DB, and GRUHD databases and has achieved
accuracy that varies from 65.6% to 100% depending on the database and the experiment.
12.
Segmentation and recognition of Chinese bank check amounts (total citations: 1, self-citations: 0, citations by others: 1)
M.L. Yu P.C.K. Kwok C.H. Leung K.W. Tse 《International Journal on Document Analysis and Recognition》2001,3(4):207-217
This paper describes a system for the recognition of legal amounts on bank checks written in the Chinese language. It consists
of subsystems that perform preprocessing, segmentation, and recognition of the legal amount. In each step of the segmentation
and recognition phases, a list of possible choices is obtained. An approach is adopted whereby a large number of choices
can be processed effectively and efficiently in order to achieve the best recognition result. The contribution of this paper
is the proposal of a grammar checker for Chinese bank check amounts. It is found to be very effective in reducing the substitution
error rate. The recognition rate of the system is 74.0%, the error rate is 10.4%, and the reliability is 87.7%.
Received June 9, 2000 / Revised January 10, 2001
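The three figures quoted in the abstract above are consistent with one common definition of reliability in check recognition: the share of accepted (non-rejected) results that are correct. The abstract does not state its definition, so the formula below is an assumption, as is treating the remaining cases as rejections:

```python
def reliability(recognition_rate, error_rate):
    """Fraction of accepted results that are correct.

    Assumed definition: recognition / (recognition + error).
    The source abstract does not spell this out.
    """
    return recognition_rate / (recognition_rate + error_rate)


recognition_rate = 74.0  # percent, from the abstract
error_rate = 10.4        # percent, from the abstract

# Assumption: whatever is neither recognized nor misrecognized is rejected.
rejection_rate = 100.0 - recognition_rate - error_rate

reliability_pct = round(100 * reliability(recognition_rate, error_rate), 1)
print(reliability_pct)  # 87.7, matching the reliability quoted in the abstract
```

Under these assumptions the implied rejection rate is about 15.6%.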
13.
Offline handwritten Amharic word recognition (total citations: 1, self-citations: 0, citations by others: 1)
This paper describes two approaches for Amharic word recognition in unconstrained handwritten text using HMMs. The first approach builds word models from concatenated features of constituent characters; in the second, HMMs of constituent characters are concatenated to form a word model. In both cases, the features used for training and recognition are a set of primitive strokes and their spatial relationships. The recognition system does not require segmentation of characters but does require text line detection and extraction of structural features, which is done by making use of the direction field tensor. The performance of the recognition system is tested on a dataset of unconstrained handwritten documents collected from various sources, and promising results are obtained.
14.
John F. Pitrelli Amit Roy 《International Journal on Document Analysis and Recognition》2003,5(2-3):126-137
We discuss development of a word-unigram language model for online handwriting recognition. First, we tokenize a text corpus
into words, contrasting with tokenization methods designed for other purposes. Second, we select for our model a subset of
the words found, discussing deviations from an N-most-frequent-words approach. From a 600-million-word corpus, we generated a 53,000-word model which eliminates 45% of word-recognition
errors made by a character-level-model baseline system. We anticipate that our methods will be applicable to offline recognition
as well, and to some extent to other recognizers, such as speech recognizers and video retrieval systems.
Received: November 1, 2001 / Revised version: July 22, 2002
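For orientation, the plain N-most-frequent-words baseline that the abstract above says the authors deviate from can be sketched as follows. The regular-expression tokenizer and the name `build_unigram_model` are illustrative assumptions; the paper's own tokenization and word-selection rules are not described in the abstract and are not reproduced here:

```python
import re
from collections import Counter


def build_unigram_model(corpus_text, vocab_size):
    """Return {word: probability} for the vocab_size most frequent words.

    Assumptions: lower-cased tokens made of letters and apostrophes, and
    probabilities renormalized over the kept vocabulary only.
    """
    tokens = re.findall(r"[a-z']+", corpus_text.lower())
    counts = Counter(tokens)
    kept = counts.most_common(vocab_size)
    total = sum(count for _, count in kept)
    return {word: count / total for word, count in kept}


model = build_unigram_model("the cat and the dog and the bird", vocab_size=2)
```

A recognizer could combine `model[word]` with its per-word handwriting score to rank candidate words; how the authors' system weights the two is not stated in the abstract.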
15.
Alceu de S. Britto Jr Robert Sabourin Flavio Bortolozzi Ching Y. Suen 《International Journal on Document Analysis and Recognition》2003,5(2-3):102-117
In this paper, a two-stage HMM-based recognition method allows us to compensate for the possible loss in terms of recognition
performance caused by the necessary trade-off between segmentation and recognition in an implicit segmentation-based strategy.
The first stage consists of an implicit segmentation process that takes into account some contextual information to provide
multiple segmentation-recognition hypotheses for a given preprocessed string. These hypotheses are verified and re-ranked
in a second stage by using an isolated digit classifier. This method enables the use of two sets of features and numeral models:
one taking into account both the segmentation and recognition aspects in an implicit segmentation-based strategy, and the
other considering just the recognition aspects of isolated digits. These two stages have been shown to be complementary, in
the sense that the verification stage compensates for the loss in terms of recognition performance brought about by the necessary
tradeoff between segmentation and recognition carried out in the first stage. The experiments on 12,802 handwritten numeral
strings of different lengths have shown that the use of a two-stage recognition strategy is a promising idea. The verification
stage brought about an average improvement of 9.9% on the string recognition rates. On touching digit pairs, the method achieved
a recognition rate of 89.6%.
Received June 28, 2002 / Revised July 03, 2002
16.
S. Jaeger S. Manke J. Reichert A. Waibel 《International Journal on Document Analysis and Recognition》2001,3(3):169-180
This paper presents the online handwriting recognition system NPen++ developed at the University of Karlsruhe and Carnegie
Mellon University. The NPen++ recognition engine is based on a multi-state time delay neural network and yields recognition
rates from 96% for a 5,000 word dictionary to 93.4% on a 20,000 word dictionary and 91.2% for a 50,000 word dictionary. The
proposed tree search and pruning technique reduces the search space considerably without losing too much recognition performance
compared to an exhaustive search. This enables the NPen++ recognizer to be run in real-time with large dictionaries. Initial
recognition rates for whole sentences are promising and show that the MS-TDNN architecture is suited to recognizing handwritten
data ranging from single characters to whole sentences.
Received September 3, 2000 / Revised October 9, 2000
17.
Handprinted word recognition on a NIST data set (total citations: 1, self-citations: 0, citations by others: 1)
Paul Gader Michael Whalen Margaret Ganzberger Dan Hepp 《Machine Vision and Applications》1995,8(1):31-40
An approach to handprinted word recognition is described. The approach is based on generating multiple possible segmentations of a word image into characters and matching these segmentations to a lexicon of candidate strings. The segmentation process uses a combination of connected component analysis and distance transform-based, connected character splitting. Neural networks are used to assign character confidence values to potential characters within word images. Experimental results are provided for both character and word recognition modules on data extracted from the NIST handprinted character database.
18.
An optical character recognition (OCR) framework is developed and applied to handprinted numeric field recognition. The
numeric fields were extracted from binary images of VISA credit card application forms. The images include personal identity
numbers and telephone numbers. The proposed OCR framework is a cascade of neural networks. The first stage is a self-organizing
feature map algorithm. The second stage maps distance values into allograph membership values using a gradient descent learning
algorithm. The third stage is a multi-layer feedforward network. In this paper, we present experimental results which demonstrate
the ability to read handprinted numeric fields. Experiments were performed on a test data set from the CCL/ITRI database which
consists of over 90,390 handwritten numeric digits.
19.
In this paper we describe a database that consists of handwritten English sentences. It is based on the Lancaster-Oslo/Bergen
(LOB) corpus. This corpus is a collection of texts that comprise about one million word instances. The database includes 1,066
forms produced by approximately 400 different writers. A total of 82,227 word instances out of a vocabulary of 10,841 words
occur in the collection. The database consists of full English sentences. It can serve as a basis for a variety of handwriting
recognition tasks. However, it is expected that the database would be particularly useful for recognition tasks where linguistic
knowledge beyond the lexicon level is used, because this knowledge can be automatically derived from the underlying corpus.
The database also includes a few image-processing procedures for extracting the handwritten text from the forms and the segmentation
of the text into lines and words.
Received September 28, 2001 / Revised October 10, 2001
20.
Dot-matrix text recognition is a difficult problem, especially when characters are broken into several disconnected components.
We present a dot-matrix text recognition system which uses the fact that dot-matrix fonts are fixed-pitch, in order to overcome
the difficulty of the segmentation process. After finding the most likely pitch of the text, a decision is made as to whether
the text is written in a fixed-pitch or proportional font. Fixed-pitch text is segmented using a pitch-based segmentation
process that can successfully segment both touching and broken characters. We report performance results for the pitch estimation,
fixed-pitch decision and segmentation, and recognition processes.
Received October 18, 1999 / Revised April 21, 2000