20 similar documents found (search time: 15 ms)
1.
S. Basu, M. Kundu, D.K. Basu 《Pattern recognition》2007,40(6):1825-1839
A novel text line extraction technique is presented for multi-skewed document images of handwritten English or Bengali text. It assumes that hypothetical water flows, from both left and right sides of the image frame, face obstruction from characters of text lines. The stripes of areas left unwetted on the image frame are finally labelled for extraction of text lines. The success rates of the technique, as observed experimentally, are 90.34% and 91.44% for handwritten Bengali and English document images, respectively. The work may contribute significantly to the development of applications related to optical character recognition of Bengali/English text.
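As a rough illustration of this water-flow idea, here is a minimal Python sketch; it is a simplification (the paper pours water at an angle to cope with multiple skews, while this sketch flows it straight across each row), and the function name is illustrative:

```python
import numpy as np
from scipy import ndimage

def water_flow_line_stripes(binary):
    """binary: 2-D array, 1 = ink (text pixels), 0 = background."""
    h, w = binary.shape
    wetted = np.zeros((h, w), dtype=bool)
    for r in range(h):
        cols = np.flatnonzero(binary[r])
        if cols.size == 0:
            wetted[r, :] = True              # an empty row is flooded completely
        else:
            wetted[r, :cols[0]] = True       # water entering from the left edge
            wetted[r, cols[-1] + 1:] = True  # water entering from the right edge
    # Areas the water never reached form horizontal stripes; each
    # connected stripe is labelled as one candidate text line.
    labels, n_lines = ndimage.label(~wetted)
    return labels, n_lines
```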
2.
Layout extraction of mixed mode documents
Proper processing and efficient representation of the digitized images of printed documents require the separation of the various information types: text, graphics, and image elements. For most applications it is sufficient to separate text and nontext, because text contains the most information. This paper describes the implementation and performance of a robust algorithm for text extraction and segmentation that is completely independent of text orientation and can deal with text in various font styles and sizes. Text objects can be nested in nontext areas, and inverse printing can also be analyzed. It should be mentioned that the classification is based only on rough image features, and individual characters are not recognized. The three main processing steps of the system are the generation of connected components, neighborhood analysis, and generation of text lines and blocks. As output, connected components are classified as text or nontext. Text components are grouped as characters, words, lines, and blocks. Nontext objects are accumulated as a separate nontext block.
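A minimal sketch of the component-classification step described above, assuming SciPy; the size and aspect-ratio thresholds are illustrative placeholders, not values from the paper:

```python
import numpy as np
from scipy import ndimage

def classify_components(binary, max_text_area=2000, max_aspect=10.0):
    """Rough text/nontext split on connected components of a binary page."""
    labels, n = ndimage.label(binary)
    slices = ndimage.find_objects(labels)
    text, nontext = [], []
    for i, sl in enumerate(slices, start=1):
        h = sl[0].stop - sl[0].start
        w = sl[1].stop - sl[1].start
        area = int((labels[sl] == i).sum())
        aspect = max(h, w) / max(1, min(h, w))
        # Characters tend to be small and compact; large or very
        # elongated components are treated as nontext here.
        if area <= max_text_area and aspect <= max_aspect:
            text.append(sl)
        else:
            nontext.append(sl)
    return text, nontext
```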
3.
4.
G. Louloudis, B. Gatos, I. Pratikakis, C. Halatsis 《Pattern recognition》2009,42(12):3169-3183
In this paper, we present a segmentation methodology of handwritten documents into their distinct entities, namely, text lines and words. Text line segmentation is achieved by applying a Hough transform on a subset of the document image connected components. A post-processing step includes the correction of possible false alarms, the detection of text lines that the Hough transform failed to create and, finally, the efficient separation of vertically connected characters using a novel method based on skeletonization. Word segmentation is addressed as a two-class problem. The distances between adjacent overlapped components in a text line are calculated using the combination of two distance metrics, and each of them is categorized either as an inter- or an intra-word distance in a Gaussian mixture modeling framework. The performance of the proposed methodology is measured with a consistent and concrete evaluation methodology that uses suitable performance measures to compare the text line segmentation and word segmentation results against the corresponding ground truth annotation. The efficiency of the proposed methodology is demonstrated by experimentation conducted on two different datasets: (a) the test set of the ICDAR2007 handwriting segmentation competition and (b) a set of historical handwritten documents.
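To make the Hough step concrete, here is a hypothetical sketch that votes component centroids into (rho, theta) cells; centroids accumulating in the same strong cell are grouped as one text line. The angle range and bin resolution are assumptions for illustration:

```python
import numpy as np

def hough_lines_from_centroids(points, angle_deg=np.arange(85, 96), rho_res=5):
    """points: list of (x, y) component centroids."""
    thetas = np.deg2rad(angle_deg)
    pts = np.asarray(points, dtype=float)
    # rho = x*cos(theta) + y*sin(theta) for every point/angle pair
    rhos = pts[:, 0, None] * np.cos(thetas) + pts[:, 1, None] * np.sin(thetas)
    bins = np.round(rhos / rho_res).astype(int)
    accumulator = {}
    for p, row in enumerate(bins):
        for t, b in enumerate(row):
            accumulator.setdefault((b, t), []).append(p)
    # The strongest cells group centroids lying on the same text line.
    return sorted(accumulator.values(), key=len, reverse=True)
```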
5.
In this paper, Delaunay triangulation is applied for the extraction of text areas in a document image. By representing the location of connected components in a document image with their centroids, the page structure is described as a set of points in two-dimensional space. When imposing Delaunay triangulation on these points, the text regions in the Delaunay triangulation will have distinguishing triangular features from image and drawing regions. For analysis, the Delaunay triangles are divided into four classes. The study reveals that specific triangles in text areas can be clustered together and identified as text body. Using this method, text regions in a document image containing fragments can also be recognized accurately. Experiments show the method is also very efficient.
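A minimal sketch of the triangulation idea using SciPy's Delaunay; a single edge-length threshold stands in for the paper's four-class triangle analysis:

```python
import numpy as np
from scipy.spatial import Delaunay

def text_triangles(centroids, max_edge=30.0):
    """Keep small, compact triangles, which cluster inside text areas.
    max_edge is a placeholder threshold, not a value from the paper."""
    tri = Delaunay(np.asarray(centroids, dtype=float))
    keep = []
    for simplex in tri.simplices:
        p = tri.points[simplex]
        edges = [np.linalg.norm(p[i] - p[(i + 1) % 3]) for i in range(3)]
        if max(edges) <= max_edge:   # short edges ~ dense text region
            keep.append(simplex)
    return np.array(keep)
```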
6.
Laurence Likforman-Sulem, Abderrazak Zahour, Bruno Taconet 《International Journal on Document Analysis and Recognition》2007,9(2-4):123-138
There are huge numbers of historical documents in libraries and in various national archives that have not been exploited electronically. Although automatic reading of complete pages remains, in most cases, a long-term objective, tasks such as word spotting, text/image alignment, authentication and extraction of specific fields are in use today. For all these tasks, a major step is document segmentation into text lines. Because of the low quality and the complexity of these documents (background noise, artifacts due to aging, interfering lines), automatic text line segmentation remains an open research field. The objective of this paper is to present a survey of existing methods, developed during the last decade and dedicated to documents of historical interest.
7.
Keechul Jung, Kwang In Kim, A.K. Jain 《Pattern recognition》2004,37(5):977-997
Text data present in images and video contain useful information for automatic annotation, indexing, and structuring of images. Extraction of this information involves detection, localization, tracking, extraction, enhancement, and recognition of the text from a given image. However, variations of text due to differences in size, style, orientation, and alignment, as well as low image contrast and complex backgrounds, make the problem of automatic text extraction extremely challenging. While comprehensive surveys of related problems such as face detection, document analysis, and image and video indexing can be found, the problem of text information extraction is not well surveyed. A large number of techniques have been proposed to address this problem, and the purpose of this paper is to classify and review these algorithms, discuss benchmark data and performance evaluation, and point out promising directions for future research.
8.
G. Louloudis, B. Gatos, I. Pratikakis 《Pattern recognition》2008,41(12):3758-3772
In this paper, we present a new text line detection method for handwritten documents. The proposed technique is based on a strategy that consists of three distinct steps. The first step includes image binarization and enhancement, connected component extraction, partitioning of the connected component domain into three spatial sub-domains and average character height estimation. In the second step, a block-based Hough transform is used for the detection of potential text lines while a third step is used to correct possible splitting, to detect text lines that the previous step did not reveal and, finally, to separate vertically connected characters and assign them to text lines. The performance evaluation of the proposed approach is based on a consistent and concrete evaluation methodology.
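The average character height estimation in the first step might look like the following sketch; taking the median of connected-component bounding-box heights is an assumed choice, not necessarily the paper's statistic:

```python
import numpy as np
from scipy import ndimage

def average_character_height(binary):
    """Estimate the dominant character height on a binary page as the
    median bounding-box height of its connected components."""
    labels, n = ndimage.label(binary)
    heights = [sl[0].stop - sl[0].start for sl in ndimage.find_objects(labels)]
    return float(np.median(heights)) if heights else 0.0
```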
9.
10.
Nikos Nikolaou, Michael Makridis, Basilis Gatos, Nikolaos Stamatopoulos, Nikos Papamarkos 《Image and vision computing》2010
In this paper, we strive towards the development of efficient techniques for segmenting document pages that result from the digitization of historical machine-printed sources. Such documents often suffer from low quality, local skew, and several degradations due to the quality of the old printing matrix or ink diffusion, and they exhibit complex and dense layouts. To face these problems, we introduce the following innovative aspects: (i) use of a novel Adaptive Run Length Smoothing Algorithm (ARLSA) to deal with complex and dense document layouts, (ii) detection of noisy areas and punctuation marks that are common in historical machine-printed documents, (iii) detection of possible obstacles formed by background areas in order to separate neighboring text columns or text lines, and (iv) use of skeleton segmentation paths to isolate possibly connected characters. Comparative experiments on several historical machine-printed documents demonstrate the efficiency of the proposed technique.
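For context, here is a sketch of the classic (non-adaptive) horizontal RLSA that ARLSA builds on; the adaptive variant would vary the threshold with local component size, which is omitted here, and the fixed threshold is a placeholder:

```python
import numpy as np

def rlsa_horizontal(binary, threshold=20):
    """Classic horizontal RLSA: background runs shorter than `threshold`
    between two ink pixels are filled, smearing characters into
    words and lines. binary: 2-D array, 1 = ink, 0 = background."""
    out = binary.copy().astype(np.uint8)
    for row in out:
        ink = np.flatnonzero(row)
        for a, b in zip(ink[:-1], ink[1:]):
            if 0 < b - a - 1 <= threshold:  # short background gap
                row[a + 1:b] = 1
    return out
```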
11.
Face detection in complicated backgrounds and different illumination conditions by using YCbCr color space and neural network
This investigation develops an efficient face detection scheme that can detect multiple faces in color images with complex environments and different illumination levels. The proposed scheme comprises two stages. The first stage adopts color and triangle-based segmentation to search potential face regions. The second stage involves face verification using a multilayer feedforward neural network. The system can handle various sizes of faces, different illumination conditions, diverse pose and changeable expression. In particular, the scheme significantly increases the execution speed of the face detection algorithm in the case of complex backgrounds. Results of this study demonstrate that the proposed method performs better than previous methods in terms of speed and ability to handle different illumination conditions.
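The first-stage color segmentation could be sketched as a fixed Cb/Cr box in YCbCr space. The conversion below uses the standard JPEG coefficients, and the skin bounds are common literature values, not necessarily the paper's:

```python
import numpy as np

def skin_mask_ycbcr(rgb):
    """rgb: H x W x 3 array in [0, 255]. Returns a boolean skin mask."""
    rgb = rgb.astype(float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Standard RGB -> Cb/Cr conversion (JPEG coefficients).
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Typical skin-tone box from the literature.
    return (cb >= 77) & (cb <= 127) & (cr >= 133) & (cr <= 173)
```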
12.
Hideaki Goto, Hirotomo Aso 《International Journal on Document Analysis and Recognition》2002,4(4):258-268
Recent remarkable progress in computer systems and printing devices has made it easier to produce printed documents with various designs. Text characters are often printed on colored backgrounds, and sometimes on complex backgrounds such as photographs, computer graphics, etc. Some methods have been developed for character pattern extraction from document images and scene images with complex backgrounds. However, the previous methods are suitable only for extracting rather large characters, and the processes often fail to extract small characters with thin strokes. This paper proposes a new method by which character patterns can be extracted from document images with complex backgrounds. The method is based on local multilevel thresholding, pixel labeling, and region growing. This framework is very useful for extracting character patterns from badly illuminated document images. The performance of extracting small character patterns has been improved by suppressing the influence of mixed-color pixels around character edges. Experimental results show that the method is capable of extracting very small character patterns from main text blocks in various documents, separating characters from complex backgrounds, as long as the thickness of the character strokes is more than about 1.5 pixels.
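A hypothetical sketch of the local multilevel thresholding idea: each block is quantized with its own thresholds so that characters survive uneven illumination. Quantile splitting here stands in for the paper's threshold selection, and the block size and level count are illustrative:

```python
import numpy as np

def local_multilevel_threshold(gray, block=32, levels=3):
    """gray: 2-D grayscale image. Returns a per-pixel level map in
    {0, ..., levels-1}, computed block by block."""
    h, w = gray.shape
    out = np.zeros_like(gray, dtype=np.uint8)
    qs = np.linspace(0, 1, levels + 1)[1:-1]   # interior quantiles
    for y in range(0, h, block):
        for x in range(0, w, block):
            tile = gray[y:y + block, x:x + block]
            cuts = np.quantile(tile, qs)       # local thresholds
            out[y:y + block, x:x + block] = np.digitize(tile, cuts)
    return out
```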
13.
14.
Datong Chen, Jean-Marc Odobez, Hervé Bourlard 《Pattern recognition》2004,37(3):595-608
This paper presents a new method for detecting and recognizing text in complex images and video frames. Text detection is performed in a two-step approach that combines the speed of a text localization step, enabling text size normalization, with the strength of a machine learning text verification step applied on background-independent features. Text recognition, applied on the detected text lines, is addressed by a text segmentation step followed by a traditional OCR algorithm within a multi-hypotheses framework relying on multiple segments, language modeling, and OCR statistics. Experiments conducted on large databases of real broadcast documents demonstrate the validity of our approach.
15.
16.
Text characters embedded in images are a rich source of information for content-based indexing and retrieval applications. However, such characters are difficult to detect and recognize because of their varying sizes and grayscale values and the complexity of the backgrounds. Existing methods do not handle well text of varying contrast, or text embedded in a complex image background. In this paper, a set of sequential algorithms for text extraction and image enhancement based on cellular automata is proposed. The enhancement covers gray-level and contrast manipulation, edge detection, and filtering. The method first applies edge detection and thresholding to filter out low-contrast text and to simplify the complex background around high-contrast text in the binary image. The proposed algorithm is simple and easy to use, requiring only a sample binary texture image as input, and it generates textures of better perceived quality than earlier published techniques. The performance of the method is demonstrated through experimental results on a set of text-based binary images, with the quality of thresholding assessed via precision and recall analysis of the resulting text.
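To illustrate the cellular-automaton ingredient, here is one majority-rule CA step used as a binary denoising filter; this shows the CA idea only, and the paper's rules for edge detection and contrast manipulation are more elaborate:

```python
import numpy as np

def ca_majority_step(binary):
    """One synchronous update of a majority-rule cellular automaton:
    each cell takes the majority value of its 3x3 neighbourhood."""
    padded = np.pad(binary.astype(np.uint8), 1)
    # Sum of the 3x3 neighbourhood (including the cell itself).
    s = sum(padded[dy:dy + binary.shape[0], dx:dx + binary.shape[1]]
            for dy in range(3) for dx in range(3))
    return (s >= 5).astype(np.uint8)   # majority of 9 cells
```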
17.
Francisco Del Puerto, Mounir Ben Ghalia 《Engineering Applications of Artificial Intelligence》2002,15(6):601-606
This paper discusses the application of neural networks to the white tracking adjustment of television receivers during production. High-quality tracking for the color temperature 8000 K was obtained with a four-layer (7-10-10-6) network. The network input set consists of the brightness level, high and low luminance levels, and the “x” and “y” coordinates on the chromaticity diagram for both high and low luminance. The network output set consists of recommended adjustments for brightness, red, green, and blue cutoffs, and green and blue gains. The network was trained using the back-propagation algorithm. The experimental study has shown that the application of neural networks reduced the testing time, which has led to an increase in the production rate.
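A sketch of the described 7-10-10-6 feedforward network (forward pass only); the sigmoid activation and the random weights below are assumptions standing in for the back-propagation-trained parameters:

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass: 7 measured inputs -> two hidden layers of 10
    sigmoid units -> 6 recommended adjustments."""
    a = np.asarray(x, dtype=float)
    for W, b in zip(weights, biases):
        a = 1.0 / (1.0 + np.exp(-(a @ W + b)))   # sigmoid activation
    return a

# Layer shapes matching the paper's 7-10-10-6 topology.
rng = np.random.default_rng(0)
sizes = [7, 10, 10, 6]
weights = [rng.standard_normal((m, n)) * 0.1 for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
print(mlp_forward(np.ones(7), weights, biases))
```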
18.
This paper proposes a new, efficient algorithm for extracting similar sections between two time sequence data sets. The algorithm, called Relay Continuous Dynamic Programming (Relay CDP), realizes fast matching between arbitrary sections in the reference pattern and the input pattern and enables the extraction of similar sections in a frame-synchronous manner. In addition, Relay CDP is extended to two types of applications that handle spoken documents. The first application is the extraction of repeated utterances in a presentation or a news speech, because repeated utterances are assumed to be important parts of the speech. These repeated utterances can be regarded as labels for information retrieval. The second application is flexible spoken document retrieval. A phonetic model is introduced to cope with the speech of different speakers. The new algorithm allows a user to query by natural utterance and searches spoken documents for any partial matches to the query utterance. We present herein a detailed explanation of Relay CDP and experimental results for the extraction of similar sections, and we report results for two applications using Relay CDP.
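As background for Relay CDP, here is a sketch of plain continuous DP spotting over 1-D feature sequences, where the query may match any section of the reference; Relay CDP extends this so that arbitrary sections of both sequences can match, which this sketch does not implement:

```python
import numpy as np

def continuous_dp(query, reference):
    """Continuous DP distance: the first row carries no start penalty,
    so the query may begin anywhere in the reference."""
    Q, R = len(query), len(reference)
    D = np.full((Q + 1, R + 1), np.inf)
    D[0, :] = 0.0                      # free starting point in the reference
    for i in range(1, Q + 1):
        for j in range(1, R + 1):
            cost = abs(query[i - 1] - reference[j - 1])
            D[i, j] = cost + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    return D[Q]   # D[Q][j]: score of the best match ending at frame j
```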
Yoshiaki Itoh has been an associate professor in the Faculty of Software and Information Science at Iwate Prefectural University, Iwate, Japan, since 2001. He received the B.E., M.E., and Dr. Eng. degrees from Tokyo University, Tokyo, in 1987, 1989, and 1999, respectively. From 1989 to 2001 he was a researcher and a staff member of Kawasaki Steel Corporation, Tokyo and Okayama. From 1992 to 1994 he was transferred as a researcher to the Real World Computing Partnership, Tsukuba, Japan. Dr. Itoh's research interests include spoken document processing without recognition, audio and video retrieval, and real-time human communication systems. He is a member of ISCA, the Acoustical Society of Japan, the Institute of Electronics, Information and Communication Engineers, the Information Processing Society of Japan, and the Japan Society of Artificial Intelligence.
Kazuyo Tanaka has been a professor at the University of Tsukuba, Tsukuba, Japan, since 2002. He received the B.E. degree from Yokohama National University, Yokohama, Japan, in 1970, and the Dr. Eng. degree from Tohoku University, Sendai, Japan, in 1984. From 1971 to 2002 he was a research officer at the Electrotechnical Laboratory (ETL), Tsukuba, Japan, and the National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan, where he worked on speech analysis, synthesis, recognition, and understanding, and also served as chief of the speech processing section. His current interests include digital signal processing, spoken document processing, and human information processing. He is a member of IEEE, ISCA, the Acoustical Society of Japan, the Institute of Electronics, Information and Communication Engineers, and the Japan Society of Artificial Intelligence.
Shi-Wook Lee received the B.E. and M.E. degrees from Yeungnam University, Korea, and the Ph.D. degree from the University of Tokyo in 1995, 1997, and 2001, respectively. Since 2001 he has been working in the Research Group of Speech and Auditory Signal Processing, the National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan, as a postdoctoral fellow. His research interests include spoken document processing, speech recognition, and understanding.
19.
20.
To address the difficulty of detecting defects in printed text, this paper proposes an improved image-thinning algorithm for text inspection. Characters are first segmented using a projection method; each segmented character image is then progressively thinned to obtain the text skeleton, and the inspection is completed using information from the skeleton.
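A minimal sketch of the projection-based segmentation step, assuming a binary image with 1 = ink; thinning each returned slice (e.g., with skimage.morphology.thin) would then yield the stroke skeletons used for inspection:

```python
import numpy as np

def segment_by_projection(binary):
    """Columns whose vertical projection is zero separate characters;
    each run of non-empty columns is returned as one character slice."""
    profile = binary.sum(axis=0)
    cols = np.flatnonzero(profile > 0)
    if cols.size == 0:
        return []
    # Split where consecutive non-empty columns are not adjacent.
    breaks = np.flatnonzero(np.diff(cols) > 1)
    groups = np.split(cols, breaks + 1)
    return [binary[:, g[0]:g[-1] + 1] for g in groups]
```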