首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 11 毫秒
1.
Goraine  H. Usher  M. Al-Emami  S. 《Computer》1992,25(7):71-74
A personal computer-based Arabic character recognition system that performs three preprocessing stages sequentially, thinning, stroke segmentation, and sampling, is described. The eight-direction code used for stroke representation and classification, the character classification done at primary and secondary levels, and the contextual postprocessor used for error detection and correction are described. Experimental results obtained using samples of handwritten and typewritten Arabic words are presented  相似文献   

2.
Printed Arabic character recognition using HMM   总被引:1,自引:0,他引:1       下载免费PDF全文
The Arabic Language has a very rich vocabulary. More than 200 million people speak this language as their native speaking, and over 1 billion people use it in several religion-related activities. In this paper a new technique is presented for recognizing printed Arabic characters. After a word is segmented, each character/word is entirely transformed into a feature vector. The features of printed Arabic characters include strokes and bays in various directions, endpoints, intersection points, loops, dots and zigzags. The word skeleton is decomposed into a number of links in orthographic order, and then it is transferred into a sequence of symbols using vector quantization. Single hidden Markov model has been used for recognizing the printed Arabic characters. Experimental results show that the high recognition rate depends on the number of states in each sample.  相似文献   

3.
The aim of our work is to present a new method based on structural characteristics and a fuzzy classifier for off-line recognition of handwritten Arabic characters in all their forms (beginning, end, middle and isolated). The proposed method can be integrated in any handwritten Arabic words recognition system based on an explicit segmentation process. First, three preprocessing operations are applied on character images: thinning, contour tracing and connected components detection. These operations extract structural characteristics used to divide the set of characters into five subsets. Next, features are extracted using invariant pseudo-Zernike moments. Classification was done using the Fuzzy ARTMAP neural network, which is very fast in training and supports incremental learning. Five Fuzzy ARTMAP neural networks were employed; each one is designed to recognize one subset of characters. The recognition process is achieved in two steps: in the first one, a clustering method affects characters to one of the five character subsets. In the second one, the pseudo-Zernike features are used by the appropriate Fuzzy ARTMAP classifier to identify the character. Training process and tests were performed on a set of character images manually extracted from the IFN/ENIT database. A height recognition rate was reported.  相似文献   

4.
Neural Computing and Applications - Optical character recognition for the English text may be considered one of the most important research topics, whether, printed or handwritten. Although...  相似文献   

5.
6.
7.
Two research strands, each identifying an area of markedly increasing importance in the current development of pattern analysis technology, underlie the review covered by this paper, and are drawn together to offer both a task-oriented and a fundamentally generic perspective on the discipline of pattern recognition. The first of these is the concept of decision fusion for high-performance pattern recognition, where (often very diverse) classification technologies, each providing complementary sources of information about class membership, can be integrated to provide more accurate, robust and reliable classification decisions. The second is the rapid expansion in technology for the automated analysis of (especially) handwritten data for OCR applications including document and form processing, pen-based computing, forensic analysis, biometrics and security, and many other areas, especially those which seek to provide online or offline processing of data which is available in a human-oriented medium. Classifier combination/multiple expert processing has a long history, but the sheer volume and diversity of possible strategies now available suggest that it is timely to consider a structured review of the field. Handwritten character processing provides an ideal context for such a review, both allowing engagement with a problem area which lends itself ideally to the performance enhancements offered by multi-classifier configurations, but also allowing a clearer focus to what otherwise, because of the unlimited application horizons, would be a task of unmanageable proportions. Hence, this paper explicitly reviews the field of multiple classifier decision combination strategies for character recognition, from some of its early roots to the present day. In order to give structure and a sense of direction to the review, a new taxonomy for categorising approaches is defined and explored, and this both imposes a discipline on the presentation of the material available and helps to clarify the mechanisms by which multi-classifier configurations deliver performance enhancements. The review incorporates a discussion both of processing structures themselves and a range of important related topics which are essential to maximise an understanding of the potential of such structures. Most importantly, the paper illustrates explicitly how the principles underlying the application of multi-classifier approaches to character recognition can easily generalise to a wide variety of different task domains.Received: 3 February 2002, Accepted: 9 October 2002, Published online: 6 June 2003  相似文献   

8.
9.
Multimedia Tools and Applications - Object classification, such as handwritten Arabic character recognition, is a computer vision application. Deep learning techniques such as convolutional neural...  相似文献   

10.
To successfully apply character recognition technology most of the forms currently hand-keyed will need to be redesigned. This paper presents results from a comprehensive study of three versions of a redesigned tax form. Analyses show that using separately spaced character boxes provides superior machine readability over fields containing combs and adjoining character boxes. It is shown that character boxes containing two vertically stacked ovals cause writers much more difficulty. Analyses provide proof that writer idiosyncratic responses on forms are the major source of errors, and proper form design can reduce these errors  相似文献   

11.
讨论了形态学处理在字符识别中的应用研究,用开运算和闭运算等形态学处理知识对图像进行增强处理,目的是消除字符间断裂、粘连等问题.实验结果表明,用形态学处理进行图像预处理能有效消除笔画断裂、粘连等问题,进一步增强目标字符特征,提高字符识别正确率.  相似文献   

12.
13.
14.
15.
Video text detection and segmentation for optical character recognition   总被引:1,自引:0,他引:1  
In this paper, we present approaches to detecting and segmenting text in videos. The proposed video-text-detection technique is capable of adaptively applying appropriate operators for video frames of different modalities by classifying the background complexities. Effective operators such as the repeated shifting operations are applied for the noise removal of images with high edge density. Meanwhile, a text-enhancement technique is used to highlight the text regions of low-contrast images. A coarse-to-fine projection technique is then employed to extract text lines from video frames. Experimental results indicate that the proposed text-detection approach is superior to the machine-learning-based (such as SVM and neural network), multiresolution-based, and DCT-based approaches in terms of detection and false-alarm rates. Besides text detection, a technique for text segmentation is also proposed based on adaptive thresholding. A commercial OCR package is then used to recognize the segmented foreground text. A satisfactory character-recognition rate is reported in our experiments.Published online: 14 December 2004  相似文献   

16.
17.
Considers the problem of evaluating character image generators that model distortions encountered in optical character recognition (OCR). While a number of such defect models have been proposed, the contention that they produce the desired result is typically argued in an ad hoc and informal way. The authors introduce a rigorous and more pragmatic definition of when a model is accurate: they say a defect model is validated if the OCR errors induced by the model are indistinguishable from the errors encountered when using real scanned documents. The authors describe four measures to quantify this similarity, and compare and contrast them using over ten million scanned and synthesized characters in three fonts. The measures differentiate effectively between different fonts and different scans of the same font regardless of the underlying text  相似文献   

18.
In this paper, we focus on information extraction from optical character recognition (OCR) output. Since the content from OCR inherently has many errors, we present robust algorithms for information extraction from OCR lattices instead of merely looking them up in the top-choice (1-best) OCR output. Specifically, we address the challenge of named entity detection in noisy OCR output and show that searching for named entities in the recognition lattice significantly improves detection accuracy over 1-best search. While lattice-based named entity (NE) detection improves NE recall from OCR output, there are two problems with this approach: (1) the number of false alarms can be prohibitive for certain applications and (2) lattice-based search is computationally more expensive than 1-best NE lookup. To mitigate the above challenges, we present techniques for reducing false alarms using confidence measures and for reducing the amount of computation involved in performing the NE search. Furthermore, to demonstrate that our techniques are applicable across multiple domains and languages, we experiment with optical character recognition systems for videotext in English and scanned handwritten text in Arabic.  相似文献   

19.
20.
In this paper, we fill a gap in the literature by studying the problem of Arabic handwritten digit recognition. The performances of different classification and feature extraction techniques on recognizing Arabic digits are going to be reported to serve as a benchmark for future work on the problem. The performance of well known classifiers and feature extraction techniques will be reported in addition to a novel feature extraction technique we present in this paper that gives a high accuracy and competes with the state-of-the-art techniques. A total of 54 different classifier/features combinations will be evaluated on Arabic digits in terms of accuracy and classification time. The results are analyzed and the problem of the digit ‘0’ is identified with a proposed method to solve it. Moreover, we propose a strategy to select and design an optimal two-stage system out of our study and, hence, we suggest a fast two-stage classification system for Arabic digits which achieves as high accuracy as the highest classifier/features combination but with much less recognition time.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号