20 similar documents found (search time: 15 ms)
1.
Toshio Sato Takeo Kanade Ellen K. Hughes Michael A. Smith Shin'ichi Satoh 《Multimedia Systems》1999,7(5):385-395
The automatic extraction and recognition of news captions and annotations can be of great help in locating topics of interest in digital news video libraries. To achieve this goal, we present a technique, called Video OCR (Optical Character Reader), which detects, extracts, and reads text areas in digital video data. In this paper, we address the problems involved, describe the method by which Video OCR operates, and suggest applications for its use in digital news archives. To solve two problems of character recognition for videos, namely low-resolution characters and extremely complex backgrounds, we apply an interpolation filter, multi-frame integration, and character extraction filters. Character segmentation is performed by a recognition-based segmentation method, and intermediate character recognition results are used to improve the segmentation. We also include a method for locating text areas using text-like properties and a language-based postprocessing technique to increase word recognition rates. The overall recognition results are satisfactory for use in news indexing. Performing Video OCR on news video and combining its results with other video understanding techniques will improve the overall understanding of the news video content.
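As a rough illustration of the two video-specific techniques named above, the sketch below shows per-pixel multi-frame integration and a plain bilinear interpolation filter in Python with NumPy. It is a minimal sketch of the general idea only, not the paper's implementation; the function names and the min-integration polarity choice (static bright captions against a brighter, moving background) are assumptions.

```python
import numpy as np

def integrate_frames(frames):
    """Multi-frame integration: captions stay fixed across frames while the
    background moves, so a per-pixel minimum over a stack of grayscale
    frames suppresses bright, transient background clutter while keeping
    pixels that are bright in every frame (the static text).
    `frames` is a list of equally sized 2-D uint8 arrays."""
    stack = np.stack(frames).astype(np.float32)
    return stack.min(axis=0)

def upscale_bilinear(img, factor=4):
    """Simple interpolation filter: enlarge a low-resolution text region
    so that strokes span enough pixels for OCR (bilinear interpolation)."""
    h, w = img.shape
    ys = np.linspace(0, h - 1, h * factor)
    xs = np.linspace(0, w - 1, w * factor)
    y0 = np.floor(ys).astype(int); x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1); x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[:, None]; wx = (xs - x0)[None, :]
    top = img[np.ix_(y0, x0)] * (1 - wx) + img[np.ix_(y0, x1)] * wx
    bot = img[np.ix_(y1, x0)] * (1 - wx) + img[np.ix_(y1, x1)] * wx
    return top * (1 - wy) + bot * wy
```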
2.
A labelling approach for the automatic recognition of tables of contents (ToC) is described in this paper. A prototype is used for the electronic consultation of scientific papers in a digital library system named Calliope. The method operates on a roughly structured ASCII file produced by OCR. The recognition approach works by text labelling without using any a priori model. Labelling is based on part-of-speech (PoS) tagging, which is initiated by a primary labelling of text components using specific dictionaries. Significant tags are first grouped into homogeneous classes according to their grammatical categories and then reduced to canonical forms corresponding to the article fields “title” and “authors”. Non-labelled tokens are integrated into one field or the other by either applying PoS correction rules or using a structure model generated from well-detected articles. The prototype performs well on different ToC layouts and character recognition qualities. Without manual intervention, a 96.3% rate of correct segmentation was obtained on 38 journals comprising 2,020 articles, along with a 93.0% rate of correct field extraction.
Received April 5, 2000 / Revised February 19, 2001
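To make the primary-labelling step concrete, here is a minimal Python sketch of dictionary-initiated labelling of ToC lines into the “title” and “authors” fields. The dictionaries and the 0.8 threshold are invented for illustration; the actual system builds PoS tagging and correction rules on top of such labels.

```python
# Illustrative dictionaries; the real system uses larger, domain-specific ones.
FIRST_NAMES = {"john", "marie", "wei", "anna"}
CONNECTORS = {"and", ",", "&"}

def label_toc_line(tokens):
    """Primary labelling of one ToC line: if most tokens look like name
    components (found in a name dictionary, connectors, or short
    capitalised words), call the line "authors"; otherwise "title"."""
    name_like = sum(
        t.lower() in FIRST_NAMES or t in CONNECTORS or
        (t[:1].isupper() and len(t) <= 12)
        for t in tokens)
    return "authors" if tokens and name_like / len(tokens) > 0.8 else "title"

print(label_toc_line(["John", "Smith", "and", "Marie", "Curie"]))   # authors
print(label_toc_line(["A", "survey", "of", "table", "recognition"]))  # title
```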
3.
Dot-matrix text recognition is a difficult problem, especially when characters are broken into several disconnected components.
We present a dot-matrix text recognition system which uses the fact that dot-matrix fonts are fixed-pitch, in order to overcome
the difficulty of the segmentation process. After finding the most likely pitch of the text, a decision is made as to whether
the text is written in a fixed-pitch or proportional font. Fixed-pitch text is segmented using a pitch-based segmentation
process that can successfully segment both touching and broken characters. We report performance results for the pitch estimation,
fixed-pitch decision and segmentation, and recognition processes.
Received October 18, 1999 / Revised April 21, 2000
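A minimal sketch of pitch estimation and pitch-based segmentation, assuming the standard trick of autocorrelating the vertical projection profile, which stays periodic even when dot-matrix characters are broken. The paper's actual estimator and its fixed-pitch decision step are more involved.

```python
import numpy as np

def estimate_pitch(binary_line, min_pitch=4, max_pitch=40):
    """Estimate the most likely character pitch of a text-line image
    (2-D 0/1 array, text = 1) as the lag that maximises the
    autocorrelation of its vertical projection profile."""
    profile = binary_line.sum(axis=0).astype(float)
    profile -= profile.mean()
    ac = np.correlate(profile, profile, mode="full")[len(profile) - 1:]
    lo, hi = min_pitch, min(max_pitch, len(ac) - 1)
    return lo + int(np.argmax(ac[lo:hi + 1]))

def segment_by_pitch(binary_line, pitch, x0=0):
    """Pitch-based segmentation: cut the line at multiples of the pitch,
    which separates touching characters and keeps broken ones together."""
    w = binary_line.shape[1]
    return [binary_line[:, x:min(x + pitch, w)] for x in range(x0, w, pitch)]
```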
4.
Automatic text segmentation and text recognition for video indexing (total citations: 13; self-citations: 0; by others: 13)
Efficient indexing and retrieval of digital video is an important function of video databases. One powerful index for retrieval is the text appearing in videos, as it enables content-based browsing. We present our new methods for the automatic segmentation of text in digital videos. The algorithms we propose exploit typical characteristics of text in videos in order to enable and enhance segmentation performance. The unique features of our approach are the tracking of characters and words over their complete duration of occurrence in a video and the integration of the multiple bitmaps of a character over time into a single bitmap. The output of the text segmentation step is passed directly to a standard OCR software package to convert the segmented text into ASCII. A straightforward indexing and retrieval scheme is also introduced; it is used in the experiments to demonstrate that the proposed text segmentation algorithms, together with existing text recognition algorithms, are suitable for the indexing and retrieval of relevant video sequences from a video database. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher-level semantics in videos.
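A straightforward indexing scheme of the kind mentioned above can be as simple as an inverted index from recognised words to video positions. The sketch below assumes OCR results arrive as (video_id, first_frame, last_frame, text) tuples; the tuple layout is illustrative, not the paper's.

```python
from collections import defaultdict

def build_text_index(ocr_results):
    """Map each recognised word to its (video_id, first_frame, last_frame)
    occurrences so that a keyword query returns candidate sequences."""
    index = defaultdict(list)
    for video_id, first, last, text in ocr_results:
        for word in text.lower().split():
            index[word].append((video_id, first, last))
    return index

index = build_text_index([("news01", 120, 180, "Election results"),
                          ("news02", 40, 95, "storm warning issued")])
print(index["results"])   # [('news01', 120, 180)]
```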
5.
Hideaki Goto Hirotomo Aso 《International Journal on Document Analysis and Recognition》2002,4(4):258-268
Recent remarkable progress in computer systems and printing devices has made it easier to produce printed documents with
various designs. Text characters are often printed on colored backgrounds, and sometimes on complex backgrounds such as photographs,
computer graphics, etc. Some methods have been developed for character pattern extraction from document images and scene images
with complex backgrounds. However, the previous methods are suitable only for extracting rather large characters, and the
processes often fail to extract small characters with thin strokes. This paper proposes a new method by which character patterns
can be extracted from document images with complex backgrounds. The method is based on local multilevel thresholding, pixel labeling, and region growing. This framework is very useful for extracting character patterns from badly illuminated document
images. The performance of extracting small character patterns has been improved by suppressing the influence of mixed-color
pixels around character edges. Experimental results show that the method is capable of extracting very small character patterns
from main text blocks in various documents, separating characters and complex backgrounds, as long as the thickness of the
character strokes is more than about 1.5 pixels.
Received July 23, 2001 / Accepted November 5, 2001
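The following sketch illustrates the local-thresholding part of the framework: each block of the page is binarised with its own Otsu threshold, which copes with coloured and unevenly lit backgrounds. It is a single-level simplification under an assumed dark-text polarity; the paper combines multilevel thresholds with pixel labeling and region growing.

```python
import numpy as np

def otsu_threshold(block):
    """Otsu's threshold for one grayscale block of uint8 values."""
    hist = np.bincount(block.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    omega = np.cumsum(p)                    # class-0 probability
    mu = np.cumsum(p * np.arange(256))      # class-0 cumulative mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu[-1] * omega - mu) ** 2 / (omega * (1 - omega))
    return int(np.argmax(np.nan_to_num(sigma_b)))  # between-class variance

def local_binarize(img, block=32):
    """Binarise each block with its own threshold so that text on coloured
    or unevenly lit backgrounds survives; text = True where a pixel is
    below the local threshold (dark-text assumption)."""
    out = np.zeros(img.shape, dtype=bool)
    for y in range(0, img.shape[0], block):
        for x in range(0, img.shape[1], block):
            b = img[y:y + block, x:x + block]
            out[y:y + block, x:x + block] = b < otsu_threshold(b)
    return out
```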
6.
Mathematical formula extraction from printed documents based on Parzen windows (total citations: 3; self-citations: 0; by others: 3)
Mathematical formula extraction is the first step in formula recognition, and related research is still scarce. This paper studies the extraction of mathematical formulas from printed documents and proposes an extraction method that combines Parzen windows with heuristic rules. Isolated (displayed) formulas are extracted from the document using the Parzen-window method, while embedded (inline) formulas are extracted from text lines using heuristic rules. Experiments show that combining the two extraction methods achieves good results.
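For readers unfamiliar with Parzen windows, the sketch below shows a one-dimensional Gaussian-kernel Parzen density estimate and how competing class densities could be used to decide whether a line is a displayed formula. The feature values are made up; the paper's features and decision rule are not reproduced here.

```python
import numpy as np

def parzen_density(samples, query, h=1.0):
    """Parzen-window (kernel) density estimate with a Gaussian window:
    p(x) = (1/n) * sum_i N(x; x_i, h^2)."""
    samples = np.asarray(samples, dtype=float)
    d = (query - samples) / h
    return np.mean(np.exp(-0.5 * d ** 2)) / (h * np.sqrt(2 * np.pi))

# Classify a line by comparing its density under "formula" and "text"
# training samples (hypothetical one-dimensional feature values):
formula_feats, text_feats = [2.1, 2.4, 2.0], [1.0, 0.9, 1.2]
x = 2.2
print("formula" if parzen_density(formula_feats, x) >
      parzen_density(text_feats, x) else "text")
```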
7.
Recognizing acronyms and their definitions (total citations: 1; self-citations: 0; by others: 1)
Kazem Taghva Jeff Gilbreth 《International Journal on Document Analysis and Recognition》1999,1(4):191-198
This paper introduces an automatic method for finding acronyms and their definitions in free text. The method is based on
an inexact pattern matching algorithm applied to text surrounding the possible acronym. Evaluation shows both high recall
and precision for a set of documents randomly selected from a larger set of full-text documents.
Received October 1, 1997 / Revised September 8, 1998
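A much-simplified sketch of inexact first-letter matching in the text preceding a parenthesised candidate acronym. The parenthesis pattern, the window size, and the greedy matching are assumptions of this sketch, not the paper's algorithm.

```python
import re

def find_acronyms(text, window=2):
    """For each parenthesised candidate acronym, scan the preceding words
    (up to `window` words per acronym letter) for a subsequence whose
    initials spell the acronym."""
    results = []
    for m in re.finditer(r"\(([A-Z]{2,8})\)", text):
        acro = m.group(1)
        words = re.findall(r"[A-Za-z']+", text[:m.start()])
        cand = words[-window * len(acro):]
        i, start = 0, None
        for j, w in enumerate(cand):
            if i < len(acro) and w[0].upper() == acro[i]:
                if start is None:
                    start = j
                i += 1
        if i == len(acro):
            results.append((acro, " ".join(cand[start:])))
    return results

print(find_acronyms("The optical character reader (OCR) module..."))
# [('OCR', 'optical character reader')]
```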
8.
E. Kavallieratou N. Fakotakis G. Kokkinakis 《International Journal on Document Analysis and Recognition》2002,4(4):226-242
In this paper, an integrated offline recognition system for unconstrained handwriting is presented. The proposed system consists of seven main modules: skew angle estimation and correction, printed-handwritten text discrimination, line segmentation, slant removal, word segmentation, character segmentation, and character recognition, drawing on both existing and novel algorithms. The system has been tested on the NIST, IAM-DB, and GRUHD databases and achieves accuracy that varies from 65.6% to 100%, depending on the database and the experiment.
9.
Alceu de S. Britto Jr Robert Sabourin Flavio Bortolozzi Ching Y. Suen 《International Journal on Document Analysis and Recognition》2003,5(2-3):102-117
In this paper, a two-stage HMM-based recognition method is presented that compensates for the possible loss in recognition performance caused by the necessary trade-off between segmentation and recognition in an implicit segmentation-based strategy.
The first stage consists of an implicit segmentation process that takes into account some contextual information to provide
multiple segmentation-recognition hypotheses for a given preprocessed string. These hypotheses are verified and re-ranked
in a second stage by using an isolated digit classifier. This method enables the use of two sets of features and numeral models:
one taking into account both the segmentation and recognition aspects in an implicit segmentation-based strategy, and the
other considering just the recognition aspects of isolated digits. The two stages have proven complementary: the verification stage recovers the recognition performance sacrificed to the segmentation-recognition trade-off in the first stage. The experiments on 12,802 handwritten numeral
strings of different lengths have shown that the use of a two-stage recognition strategy is a promising idea. The verification
stage brought about an average improvement of 9.9% on the string recognition rates. On touching digit pairs, the method achieved
a recognition rate of 89.6%.
Received June 28, 2002 / Revised July 3, 2002
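The verification idea can be pictured as a re-ranking step: each first-stage hypothesis carries an HMM score, an isolated-digit classifier re-scores its segments, and the two scores are combined. The log-linear combination, the `alpha` weight, and the `verify` interface below are assumptions for illustration.

```python
def rerank(hypotheses, verify, alpha=0.5):
    """Second-stage verification: re-score each segmentation-recognition
    hypothesis (digit string, segment images, first-stage HMM score) with
    an isolated-digit verifier and return the best digit string."""
    rescored = []
    for digits, segments, hmm_score in hypotheses:
        v = sum(verify(seg, d) for seg, d in zip(segments, digits))
        v /= max(len(segments), 1)
        rescored.append((alpha * hmm_score + (1 - alpha) * v, digits))
    return max(rescored)[1]

# Dummy verifier: pretend segment "images" are strings ending in the digit.
best = rerank([("23", ["img2", "img3"], -4.1),
               ("28", ["img2", "img8"], -3.9)],
              verify=lambda seg, d: 1.0 if seg.endswith(d) else 0.0)
print(best)   # '28'
```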
10.
Identifying facsimile duplicates using radial pixel densities (total citations: 2; self-citations: 0; by others: 2)
P. Chatelain 《International Journal on Document Analysis and Recognition》2002,4(4):219-225
A method for detecting full-layout facsimile duplicates based on radial pixel densities is proposed. It caters for facsimiles containing text and/or graphics. Pages may be positioned upright or inverted on the scanner bed. The method is not dependent
on the computation of text skew or text orientation. Using a database of original documents, 92% of non-duplicates and upright
duplicates as well as 89% of inverted duplicates could be correctly identified. The method is vulnerable to double scanning.
This occurs when documents are copied using a photocopier and the copies are subsequently transmitted using a facsimile machine.
Received September 29, 2000 / Revised August 23, 2001
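A minimal sketch of a radial-pixel-density signature: black-pixel fractions in concentric rings around the page centre. Rings are invariant under 180° rotation, which is consistent with the method's tolerance of inverted pages; the number of rings and the comparison threshold below are assumptions.

```python
import numpy as np

def radial_density_signature(binary_page, n_rings=16):
    """Fraction of black pixels in each of n_rings concentric rings around
    the page centre; identical for upright and upside-down scans."""
    h, w = binary_page.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    ring = (r / r.max() * n_rings).astype(int).clip(0, n_rings - 1)
    return np.array([binary_page[ring == k].mean() for k in range(n_rings)])

def is_duplicate(sig_a, sig_b, tol=0.02):
    """Flag two pages as duplicates when their signatures are close."""
    return float(np.abs(sig_a - sig_b).mean()) < tol
```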
11.
Chi Fang Changsong Liu Liangrui Peng Xiaoqing Ding 《International Journal on Document Analysis and Recognition》2002,4(3):177-182
Performance evaluation is crucial for improving the performance of OCR systems. However, it is tedious and complicated work to do by hand. Therefore, we have developed an automatic performance evaluation system for a printed Chinese character
recognition (PCCR) system. Our system is characterized by using real-world data as test data and automatically obtaining the
performance of the PCCR system by comparing the correct text and the recognition result of the document image. In addition,
our performance evaluation system also evaluates the segmentation module, the classification module, and the post-processing module of the PCCR system. For this purpose, a segmentation-error-tolerant character-string
matching algorithm is proposed to obtain the correspondence between the correct text and the recognition result. The experiments
show that our performance evaluation system is an accurate and powerful tool for studying deficiencies in the PCCR system.
Although our approach is aimed at the PCCR system, the idea can also be applied to other OCR systems.
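The core of such an evaluation system is the alignment between the correct text and the recognition output. The sketch below shows plain edit-distance alignment with a backtrace; the paper's matching algorithm additionally tolerates segmentation merge/split errors, which this simplified version does not model.

```python
def align(truth, ocr):
    """Align ground truth and OCR output by edit distance; the backtrace
    yields character correspondences from which substitution errors can
    be counted."""
    n, m = len(truth), len(ocr)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1): d[i][0] = i
    for j in range(m + 1): d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i-1][j] + 1, d[i][j-1] + 1,
                          d[i-1][j-1] + (truth[i-1] != ocr[j-1]))
    pairs, i, j = [], n, m
    while i and j:
        if d[i][j] == d[i-1][j-1] + (truth[i-1] != ocr[j-1]):
            pairs.append((truth[i-1], ocr[j-1])); i, j = i - 1, j - 1
        elif d[i][j] == d[i-1][j] + 1:
            i -= 1
        else:
            j -= 1
    return pairs[::-1]

print(align("中文识别", "中文训别"))
# [('中', '中'), ('文', '文'), ('识', '训'), ('别', '别')]
```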
12.
Segmentation and recognition of Chinese bank check amounts (total citations: 1; self-citations: 0; by others: 1)
M.L. Yu P.C.K. Kwok C.H. Leung K.W. Tse 《International Journal on Document Analysis and Recognition》2001,3(4):207-217
This paper describes a system for the recognition of legal amounts on bank checks written in the Chinese language. It consists
of subsystems that perform preprocessing, segmentation, and recognition of the legal amount. In each step of the segmentation
and recognition phases, a list of possible choices is obtained. An approach is adopted whereby a large number of choices
can be processed effectively and efficiently in order to achieve the best recognition result. The contribution of this paper
is the proposal of a grammar checker for Chinese bank check amounts. It is found to be very effective in reducing the substitution
error rate. The recognition rate of the system is 74.0%, the error rate is 10.4%, and the reliability is 87.7%.
Received June 9, 2000 / Revised January 10, 2001
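To give a flavour of what a grammar checker for legal amounts can reject, here is a toy validity check over the Chinese legal-amount alphabet. It models only digit/unit alternation (compound magnitudes such as 拾万 are deliberately not handled) and is far weaker than the paper's grammar checker.

```python
DIGITS = set("壹贰叁肆伍陆柒捌玖")
UNITS = set("拾佰仟万亿")
CURRENCY = set("元圆角分整")

def plausible_legal_amount(s):
    """Toy check: every non-zero digit must be followed by a magnitude or
    currency unit, every magnitude unit must follow a digit, and all
    characters must come from the legal-amount alphabet. Such constraints
    already reject many OCR substitution errors outright."""
    for i, ch in enumerate(s):
        nxt = s[i + 1] if i + 1 < len(s) else ""
        if ch in DIGITS and nxt not in UNITS | CURRENCY:
            return False
        if ch in UNITS and (i == 0 or s[i - 1] not in DIGITS):
            return False
        if ch not in DIGITS | UNITS | CURRENCY and ch != "零":
            return False
    return True

print(plausible_legal_amount("贰拾叁元整"))   # True
print(plausible_legal_amount("贰叁拾元整"))   # False: two digits adjacent
```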
13.
Mathematical expression recognition: a survey (total citations: 15; self-citations: 0; by others: 15)
Kam-Fai Chan Dit-Yan Yeung 《International Journal on Document Analysis and Recognition》2000,3(1):3-15
Automatic recognition of mathematical expressions is one of the key vehicles in the drive towards transcribing documents
in scientific and engineering disciplines into electronic form. This problem typically consists of two major stages, namely,
symbol recognition and structural analysis. In this survey paper, we will review most of the existing work with respect to
each of the two major stages of the recognition process. In particular, we try to put emphasis on the similarities and differences
between systems. Moreover, some important issues in mathematical expression recognition will be addressed in depth. All these
together serve to provide a clear overall picture of how this research area has developed to date.
Received February 22, 2000 / Revised June 12, 2000
14.
B.B. Chaudhuri U. Garain 《International Journal on Document Analysis and Recognition》2001,3(3):138-149
Extraction of some meta-information from printed documents without carrying out optical character recognition (OCR) is considered.
It can be statistically verified that important terms in technical articles are mainly printed in italic, bold, and all-capital
style. A quick approach to detecting them is proposed here, based on global shape heuristics of these styles that hold for any font. Important words in a document are sometimes printed in a larger size as well, and a smart approach for determining the font size is also presented. Detection of type styles helps in improving OCR performance, especially for reading italicized text. Another advantage of identifying word type styles and font size is discussed in the context of extracting (i) logical labels and (ii) important terms from the document. Experimental results on the performance of the approach
on a large number of good quality, as well as degraded, document images are presented.
Received July 12, 2000 / Revised October 1, 2000
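One simple way to realise the font-size determination described above is to take the modal word-box height as the body-text size and flag clearly taller words as large-print candidates. The binning and the 1.4 factor below are illustrative assumptions, not the paper's values.

```python
from collections import Counter

def dominant_font_size(word_boxes, bin_px=2):
    """Quantise word bounding-box heights into small bins, take the modal
    bin as the body-text size, and return words noticeably taller than it
    as candidates for logical labels such as titles.
    `word_boxes` holds (x0, y0, x1, y1) tuples."""
    heights = [y1 - y0 for (x0, y0, x1, y1) in word_boxes]
    mode_bin, _ = Counter(h // bin_px for h in heights).most_common(1)[0]
    body = mode_bin * bin_px + bin_px / 2
    large = [b for b, h in zip(word_boxes, heights) if h > 1.4 * body]
    return body, large

boxes = [(0, 0, 40, 12), (50, 0, 90, 13), (0, 20, 200, 24)]
body, large = dominant_font_size(boxes)
print(body, len(large))   # 13.0 1
```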
15.
In this paper we describe a database that consists of handwritten English sentences. It is based on the Lancaster-Oslo/Bergen
(LOB) corpus. This corpus is a collection of texts that comprise about one million word instances. The database includes 1,066
forms produced by approximately 400 different writers. A total of 82,227 word instances out of a vocabulary of 10,841 words
occur in the collection. The database consists of full English sentences. It can serve as a basis for a variety of handwriting
recognition tasks. However, the database is expected to be particularly useful for recognition tasks where linguistic
knowledge beyond the lexicon level is used, because this knowledge can be automatically derived from the underlying corpus.
The database also includes a few image-processing procedures for extracting the handwritten text from the forms and for segmenting the text into lines and words.
Received September 28, 2001 / Revised October 10, 2001
16.
Offline handwritten Amharic word recognition (total citations: 1; self-citations: 0; by others: 1)
This paper describes two approaches to Amharic word recognition in unconstrained handwritten text using HMMs. The first approach builds word models from concatenated features of the constituent characters; in the second, HMMs of the constituent characters are concatenated to form word models. In both cases, the features used for training and recognition are a set of primitive strokes and their spatial relationships. The recognition system does not require segmentation of characters, but it does require text line detection and the extraction of structural features, both of which are done using the direction field tensor. The performance of the recognition system is tested on a dataset of unconstrained handwritten documents collected from various sources, and promising results are obtained.
17.
Pietro Parodi Roberto Fontana 《International Journal on Document Analysis and Recognition》1999,2(2-3):67-79
This paper describes a novel method for extracting text from document pages of mixed content. The method works by detecting
pieces of text lines in small overlapping columns of fixed width, shifted with respect to each other by a fixed number of image elements, and by merging these pieces in a bottom-up fashion to form complete text lines and blocks of text lines. The algorithm requires about 1.3 s for a 300 dpi image on a 300 MHz Pentium II PC (Intel 440LX motherboard). The algorithm is largely independent
of the layout of the document, the shape of the text regions, and the font size and style. The main assumptions are that the
background be uniform and that the text sit approximately horizontally. For a skew of up to about 10 degrees no skew correction
mechanism is necessary. The algorithm has been tested on the UW English Document Database I of the University of Washington
and its performance has been evaluated by a suitable measure of segmentation accuracy. Also, a detailed analysis of the segmentation
accuracy achieved by the algorithm as a function of noise and skew has been carried out.
Received April 4, 1999 / Revised June 1, 1999
18.
We describe a process of word recognition that has high tolerance for poor image quality, tunability to the lexical content
of the documents to which it is applied, and high speed of operation. This process relies on the transformation of text images
into character shape codes, and on special lexica that contain information on the shape of words. We rely on the structure
of English and the high efficiency of mapping between shape codes and the characters in the words. Remaining ambiguity is
reduced by template matching using exemplars derived from surrounding text, taking advantage of the local consistency of font,
face and size as well as image quality. This paper describes the effects of lexical content, structure and processing on the
performance of a word recognition engine. Word recognition performance is shown to be enhanced by the application of an appropriate
lexicon. Recognition speed is shown to be essentially independent of the details of lexical content, provided the overlap between the words occurring in the document and the lexicon is high. Word recognition accuracy depends on both the overlap and the specificity of the lexicon.
Received May 1, 1998 / Revised October 20, 1998
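The character-shape-code transformation can be sketched as a collapse of each character into a coarse vertical-extent class. The class alphabet below follows common convention for such coders; the paper's exact coding may differ.

```python
ASCENDERS = set("bdfhklt") | set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
DESCENDERS = set("gjpqy")

def shape_code(word):
    """Collapse each character to a coarse class: rises above the x-height
    band ('A'), descends below the baseline ('g'), dotted x-height
    character ('i'), or plain x-height ('x')."""
    out = []
    for c in word:
        if c in "ij":
            out.append("i")
        elif c in ASCENDERS:
            out.append("A")
        elif c in DESCENDERS:
            out.append("g")
        else:
            out.append("x")
    return "".join(out)

# Words that collide under shape coding must be disambiguated later, e.g.
# by template matching against exemplars from the surrounding text:
print(shape_code("clay"), shape_code("slug"))   # xAxg xAxg
```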
19.
20.
We propose a new adaptive strategy for text recognition that attempts to derive knowledge about the dominant font on a given page. The strategy uses the linguistic observation that over half of all words in a typical English passage are contained in a small set of fewer than 150 stop words. A small dictionary of such words is compiled from the Brown corpus. An arbitrary
text page first goes through layout analysis that produces word segmentation. A fast procedure is then applied to locate the
most likely candidates for those words, using only widths of the word images. The identity of each word is determined using
a word shape classifier. Using the word images together with their identities, character prototypes can be extracted using
a previously proposed method. We describe experiments using simulated and real images. In an experiment using 400 real page
images, we show that on average, eight distinct characters can be learned from each page, and the method is successful on
90% of all the pages. These can serve as useful seeds to bootstrap font learning.
Received October 8, 1999 / Revised March 29, 2000
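The "widths only" candidate-location step can be sketched as follows: each stop word carries an expected normalised width, and word images whose measured widths fall within a tolerance become candidates for the word shape classifier. All numeric values here are hypothetical, including the assumption that widths are normalised by the page's modal character width.

```python
def match_stop_words(word_widths, stop_word_shapes, tol=0.08):
    """Locate likely stop-word instances using only word-image widths:
    page words whose normalised width falls within `tol` of a dictionary
    entry's expected width become candidates for shape classification."""
    candidates = []
    for idx, w in enumerate(word_widths):
        for word, expected in stop_word_shapes.items():
            if abs(w - expected) <= tol:
                candidates.append((idx, word))
    return candidates

# Hypothetical normalised widths for a few stop words:
shapes = {"the": 1.6, "and": 1.8, "of": 1.1}
print(match_stop_words([1.62, 3.0, 1.12], shapes))
# [(0, 'the'), (2, 'of')]
```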