Similar literature
1.
2.
This paper considers the development of a real-time Arabic handwritten character recognition system. The shape of an Arabic character depends on its position in a given word. The system assumes that characters result from a reliable segmentation stage, so the position of each character is known a priori. Four different sets of character shapes have therefore been considered independently. Each set is further divided into four subsets depending on the number of strokes in the character. The system has been heavily tested and the average recognition rate was found to be 99.6%; most of the misrecognized characters were written with little care. The system can thus be reliably used for the recognition of on-line handwritten characters entered via a graphic tablet.
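The two-level partition described above (character position first, then stroke count) can be sketched as a lookup table. This is a minimal illustration, not the authors' implementation: the position names, the bucket rule "1, 2, 3, or 4 and more strokes", and the template entries are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical position names; the abstract only says there are four sets.
POSITIONS = ("isolated", "initial", "medial", "final")


def stroke_bucket(n_strokes):
    """Map a stroke count to one of four buckets (assumed rule: 1, 2, 3, 4+)."""
    if n_strokes < 1:
        raise ValueError("a character needs at least one stroke")
    return min(n_strokes, 4)


def build_index(templates):
    """Group (label, position, n_strokes) templates by (position, bucket)."""
    index = defaultdict(list)
    for label, position, n_strokes in templates:
        if position not in POSITIONS:
            raise ValueError(f"unknown position: {position}")
        index[(position, stroke_bucket(n_strokes))].append(label)
    return index


def candidates(index, position, n_strokes):
    """Return only the labels that share the input's position and bucket."""
    return index.get((position, stroke_bucket(n_strokes)), [])


# Hypothetical templates, for illustration only.
templates = [
    ("alef", "isolated", 1),
    ("ba", "isolated", 2),
    ("ba", "initial", 2),
    ("ta", "isolated", 3),
]
index = build_index(templates)
print(candidates(index, "isolated", 2))
```

Because the position is known in advance and the stroke count is cheap to measure, a classifier only has to discriminate among the few labels in one subset.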

3.
4.
The aim of our work is to present a new method based on structural characteristics and a fuzzy classifier for off-line recognition of handwritten Arabic characters in all their forms (beginning, end, middle and isolated). The proposed method can be integrated into any handwritten Arabic word recognition system based on an explicit segmentation process. First, three preprocessing operations are applied to character images: thinning, contour tracing and connected components detection. These operations extract structural characteristics used to divide the set of characters into five subsets. Next, features are extracted using invariant pseudo-Zernike moments. Classification is done using the Fuzzy ARTMAP neural network, which is very fast in training and supports incremental learning. Five Fuzzy ARTMAP neural networks are employed, each designed to recognize one subset of characters. The recognition process has two steps: in the first, a clustering method assigns each character to one of the five character subsets; in the second, the pseudo-Zernike features are used by the appropriate Fuzzy ARTMAP classifier to identify the character. Training and tests were performed on a set of character images manually extracted from the IFN/ENIT database. A high recognition rate was reported.

5.
6.
7.
In optical character recognition, Arabic poses additional challenges because of its cursive nature. We propose a system for recognizing a document containing Arabic text, using a pipeline of three neural networks. The first model predicts the font size of an Arabic word; the word is then normalized to an 18pt font size, which is the size used to train the next two models. The second model segments a word into characters. Word segmentation in Arabic, as in many similar cursive languages, is a challenge for OCR systems. This paper presents a multichannel neural network to solve the offline segmentation of machine-printed Arabic documents. The segmented characters are then fed as input to a convolutional neural network for Arabic character recognition. The font size prediction model produced a test accuracy of 99.1%. The accuracy of the segmentation model using one font is 98.9%, while the four-font model showed 95.5% accuracy. The whole pipeline showed an accuracy of 94.38% on the Arabic Transparent font of size 18pt from the APTI data set.

8.
Optical Character Recognition (OCR) is the process of recognizing printed or handwritten text on paper documents. This paper proposes an OCR system for Arabic characters. In addition to the preprocessing phase, the proposed recognition system consists mainly of three phases. In the first phase, we employ word segmentation to extract characters. In the second phase, Histograms of Oriented Gradients (HOG) are used for feature extraction. The final phase employs a Support Vector Machine (SVM) for classifying characters. As a case study, we have applied the proposed method to the recognition of Jordanian city, town, and village names, together with many other words that supply the character shapes not covered by the Jordanian city names. The set has been carefully selected to include every Arabic character in all four of its forms. To this end, we have built our own dataset consisting of more than 43,000 handwritten Arabic words (30,000 used in the training stage and 13,000 used in the testing stage). Experimental results showed that our recognition method compares favorably with state-of-the-art techniques, achieving recognition rates exceeding 99%.
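To make the feature-extraction phase concrete, the following is a simplified sketch of a HOG-style descriptor for a single cell, written in plain Python. It is an illustration under stated assumptions, not the paper's configuration: the nine unsigned orientation bins, the central-difference gradients, and the single-cell L2 normalization are choices made here, and the full HOG scheme with overlapping block normalization is omitted. The resulting vector is the kind of input an SVM would then classify.

```python
import math


def hog_cell_histogram(img, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    img is a list of rows of numeric pixel values for one cell.
    Border pixels are skipped because central differences need neighbours.
    """
    height, width = len(img), len(img[0])
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = img[y][x + 1] - img[y][x - 1]  # horizontal gradient
            gy = img[y + 1][x] - img[y - 1][x]  # vertical gradient
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Unsigned orientation in [0, 180) degrees.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle / bin_width) % bins] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm else hist


# A vertical edge: the gradient points along x, so all mass lands in bin 0.
vertical_edge = [[0, 0, 1, 1] for _ in range(4)]
features = hog_cell_histogram(vertical_edge)
print(features)
```

In a full system, the histograms of many cells would be concatenated into one feature vector per character image before training the classifier.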

9.
Segmentation is the most challenging part of Arabic handwriting recognition due to the unique characteristics of Arabic writing that allow the same shape to denote different characters. An Arabic handwriting recognition system cannot be successful without using an appropriate segmentation method. In this paper, a very effective and efficient off-line Arabic handwriting recognition approach is proposed. The proposed approach has three stages. Firstly, all characters are simplified to single-pixel-thin images that preserve the fundamental writing characteristics. Secondly, the image pixels are normalized into horizontal and vertical lines only. Therefore, the different writing styles can be unified and the shapes of characters are standardized. Finally, these orthogonal lines are coded as unique vectors; each vector represents one letter of a word. To evaluate the proposed techniques, we have tested our approach on two different datasets. Our experimental results show that the proposed approach has superior performance over the state-of-the-art approaches.
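One way to picture the second and third stages is to snap each step of a thinned stroke to its dominant axis and then run-length encode the result. The abstract does not give the actual normalization or coding rules, so everything below is an assumption for illustration: the input is taken to be an ordered list of skeleton points, ties between axes go to the horizontal direction, and the code letters R, L, D, U are hypothetical.

```python
def orthogonal_code(points):
    """Encode a stroke path as runs of horizontal and vertical moves.

    points is an ordered list of (x, y) skeleton coordinates, with y
    growing downwards as in image coordinates. Each step is snapped to
    its dominant axis; consecutive steps in the same direction are merged.
    """
    runs = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # repeated point, no movement
        if abs(dx) >= abs(dy):  # assumed tie-break: prefer horizontal
            direction, length = ("R" if dx > 0 else "L"), abs(dx)
        else:
            direction, length = ("D" if dy > 0 else "U"), abs(dy)
        if runs and runs[-1][0] == direction:
            runs[-1] = (direction, runs[-1][1] + length)
        else:
            runs.append((direction, length))
    return runs


# A stroke that drifts right, then drops: it reduces to two orthogonal runs.
path = [(0, 0), (1, 0), (2, 1), (2, 2), (2, 3)]
print(orthogonal_code(path))
```

Two writers who draw the same letter with different slants would tend to produce similar run sequences under such a scheme, which is the unifying effect the abstract describes.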

10.
In this paper, we present a novel segmentation-free Arabic handwriting recognition system based on the hidden Markov model (HMM). Two main contributions are introduced: a new technique for dividing the image into nonuniform horizontal segments to extract the features, and a new technique for handling the skew of characters by fusing multiple HMMs. Moreover, two enhancements are introduced: the pre-processing method and feature extraction using concavity space. The proposed system first pre-processes the input image by setting the thickness of the input word to three pixels and fixing the spacing between the different parts of the word. The input image is divided into a constant number of nonuniform horizontal segments depending on the distribution of the foreground pixels. A set of robust features representing the gradient of the foreground pixels is extracted using sliding windows. The input image is decomposed into several images representing the vertical, horizontal, left diagonal and right diagonal edges in the image. A set of robust features representing the densities of the foreground pixels in the various edge images is extracted using sliding windows. The proposed system builds character HMM models and learns word HMM models using embedded training. Besides the vertical sliding window, two slanted sliding windows are used to extract the features. Three different HMMs are used: one for the vertical sliding window and two for the slanted windows. A fusion scheme is used to combine the three HMMs. The proposed system is very promising and is reported to outperform the other Arabic handwriting recognition systems in the literature.
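The sliding-window density features mentioned above can be sketched in a few lines. This is a minimal sketch of the vertical window only, assuming a binary image stored as rows of 0 and 1; the window width and step are illustrative parameters, and the nonuniform horizontal segments, slanted windows, and edge images from the abstract are not reproduced.

```python
def window_densities(img, width=3, step=1):
    """Slide a vertical window left to right over a binary image.

    Returns one value per window position: the fraction of pixels in
    the window that are foreground (1).
    """
    height, total_width = len(img), len(img[0])
    features = []
    for x0 in range(0, total_width - width + 1, step):
        foreground = sum(
            img[y][x] for y in range(height) for x in range(x0, x0 + width)
        )
        features.append(foreground / (height * width))
    return features


# Ink on the left half only: density falls as the window moves right.
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
print(window_densities(image, width=2, step=1))  # [1.0, 0.5, 0.0]
```

The sequence of per-window values is what makes the approach segmentation-free: an HMM consumes the sequence directly instead of requiring the word to be cut into characters first.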

11.
12.

Automated techniques for Arabic script recognition are at an early stage compared with their counterparts for Latin and Chinese script recognition. A large body of handwritten Arabic archives is available in libraries, data centers, historical centers, and workplaces. Digitizing these documents helps to (1) preserve and transfer the country's history electronically, (2) save physical storage space, (3) handle the documents properly, and (4) enhance the retrieval of information through the Internet and other media. Arabic handwritten character recognition (AHCR) systems face several challenges, including the unlimited variations in human handwriting and the lack of large public databases. In the current study, the segmentation and recognition phases are addressed. The text segmentation challenges and a set of solutions for each challenge are presented. The convolutional neural network (CNN), a deep learning approach, is used in the recognition phase. The use of CNNs leads to significant improvements over other machine learning classification algorithms, and it enables automatic feature extraction from images. 14 different native CNN architectures are proposed after a set of trial-and-error experiments. They are trained and tested on the HMBD database, which contains 54,115 handwritten Arabic characters. Experiments are performed on the native CNN architectures and the best reported testing accuracy is 91.96%. A transfer learning (TF) and genetic algorithm (GA) approach named "HMB-AHCR-DLGA" is suggested to optimize the training parameters and hyperparameters in the recognition phase. The pre-trained CNN models (VGG16, VGG19, and MobileNetV2) are used in the latter approach. Five optimization experiments are performed and the best combinations are reported. The highest reported testing accuracy is 92.88%.

13.
Recognition of Chinese characters has been an area of major interest for many years, and a large number of research papers and reports have already been published in this area. There are several major problems with Chinese character recognition: Chinese characters are distinct and ideographic, the character set is very large, and it contains many structurally similar characters. Thus, classification criteria are difficult to generate. This paper presents a new technique for the recognition of hand-printed Chinese characters using the C4.5 machine learning system. Conventional methods have relied on hand-constructed dictionaries, which are tedious to construct and difficult to make tolerant to variation in writing styles. The paper discusses Chinese character recognition using the Hough transform for feature extraction and the C4.5 system. The system was tested with 900 characters written by different writers, ranging from poor to acceptable quality (each character has 40 samples), and the recognition rate obtained was 84%.

14.
This paper presents a handwriting recognition system that deals with unconstrained handwriting and large vocabularies. The system is based on the segmentation-recognition paradigm, where words are first loosely segmented into characters or pseudocharacters and the final segmentation is obtained during the recognition process, which is carried out with a lexicon. Characters are modeled by multiple hidden Markov models (HMMs), which are concatenated to build up word models. The lexicon is organized as a tree structure, and during decoding, words with similar prefixes share the same computation steps. To avoid an explosion of the search space due to the presence of multiple character models, a lexicon-driven level building algorithm (LDLBA) is used to decode the lexical tree and to choose at each level the more likely models. Bigram probabilities related to the variation of writing styles within the words are inserted between the levels of the LDLBA to improve the recognition accuracy. To further speed up the recognition process, some constraints are added to limit the search effort to the more likely parts of the search space. Experimental results on a dataset of 4674 unconstrained words show that the proposed recognition system achieves recognition rates from 98% for a 10-word vocabulary to 71% for a 30,000-word vocabulary and recognition times from 9 ms to 18.4 s, respectively. Received: 8 July 2002, Accepted: 1 July 2003, Published online: 12 September 2003. Correspondence to: Alessandro L. Koerich
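The benefit of organizing the lexicon as a tree can be illustrated with a small prefix tree (trie). This sketch only shows the prefix-sharing data structure; the HMM decoding, the LDLBA, and the bigram probabilities are not modeled, and the class and function names as well as the example words are hypothetical.

```python
class TrieNode:
    """One character position in the lexicon tree."""

    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}
        self.is_word = False


def build_lexicon_tree(words):
    """Insert every word so that shared prefixes reuse the same nodes."""
    root = TrieNode()
    for word in words:
        node = root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True
    return root


def count_nodes(node):
    """Count character nodes below the given node (the root is excluded)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


lexicon = ["car", "cart", "care", "dog"]
tree = build_lexicon_tree(lexicon)
flat_cost = sum(len(word) for word in lexicon)  # one model per letter per word
tree_cost = count_nodes(tree)                   # one model per tree node
print(flat_cost, tree_cost)  # 14 8
```

Each tree node stands for a character model that has to be evaluated once, so in this toy lexicon the shared prefix "car" cuts the number of character evaluations from 14 to 8. The saving grows with vocabulary size, which is why the tree matters for a 30,000-word lexicon.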

15.
16.
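Handwritten Chinese character recognition is the foundation of handwritten Chinese character input. Current handwriting input methods on smart devices cannot dynamically adjust the recognition model to a user's writing habits in order to improve recognition accuracy. Based on a study of recent deep learning algorithms and training models, this paper proposes a design method for a personalized handwritten Chinese character input system based on real-time collection of the user's handwriting samples. The method collects the user's handwritten characters as incremental samples and retrains the handwritten character recognition model generated by server-side training, so that the model adapts better to the user's writing habits and the recognition rate of the input system improves. Finally, building on this method and a newly designed deep residual network, comparative experiments on handwritten Chinese character recognition were carried out. The results show that retraining with samples collected in real time substantially improves the recognition rate of the model, so that it more effectively meets users' needs for handwritten Chinese character input on smart devices.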

17.
This paper presents a novel framework for recognition of Ethiopic characters using structural and syntactic techniques. Graphically complex characters are represented by the spatial relationships of less complex primitives, which form a unique set of patterns for each character. The spatial relationship is represented by a special tree structure, which is also used to generate string patterns of primitives. Recognition is then achieved by matching the generated string pattern against each pattern in the alphabet knowledge base built for this purpose. The recognition system tolerates variations in character parameters such as font type, size and style. The direction field tensor is used as a tool to extract structural features.
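The matching step can be sketched as a nearest-pattern search over a small knowledge base. The abstract does not say how strings are compared, so the Levenshtein edit distance used here is a swapped-in assumption, and the primitive alphabet (V for a vertical primitive, H for a horizontal one) and the knowledge-base entries are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(
                min(
                    previous[j] + 1,               # deletion
                    current[j - 1] + 1,            # insertion
                    previous[j - 1] + (ca != cb),  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def recognize(pattern, knowledge_base):
    """Return the label whose stored pattern is closest to the input."""
    return min(
        knowledge_base,
        key=lambda label: edit_distance(pattern, knowledge_base[label]),
    )


# Hypothetical knowledge base: label -> string of primitives.
knowledge_base = {"A": "VHV", "B": "VVH", "C": "HHHH"}
print(recognize("VHV", knowledge_base))
```

An approximate match of this kind tolerates a missing or extra primitive, whereas an exact string comparison would reject such inputs outright.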

18.
19.
In this paper, a structural method of recognising Arabic handwritten characters is proposed. The major problem in cursive text recognition is the segmentation into characters or into representative strokes. When we segment the cursive portions of words, we take into account the contextual properties of the Arabic grammar and the junction segments connecting the characters to each other along the writing line. The problem of overlapping characters is resolved with a contour-following algorithm associated with the labelling of the detected contours. In the recognition phase, the characters are gathered into ten families of candidate characters with similar shapes. Then a heterarchical analysis follows that checks the pattern via goal-directed feedback control.

20.
Widely used PDAs, touch screens, and tablet PCs are alternatives to keyboards, with the advantages of being friendlier, easier, and more natural to use. A framework for Arabic online character recognition is developed. The framework integrates the different phases of online Arabic text recognition. The data used pose several challenges, such as the handling of delayed strokes, connectivity problems, variability, and changes in text style. We process the delayed strokes differently at the different phases to improve the overall performance. This work extracts many features, including several novel statistical features. Experiments on challenging online Arabic characters show encouraging results.
