1.

A survey on Arabic character segmentation

Yasser M. Alginahi 《International Journal on Document Analysis and Recognition》2013,16(2):105-126

2.

Straight line approximation and 1D representation of off-line handwritten text

ISI Abuhaiba MJJ Holt S Datta 《Image and vision computing》1994,12(10):649-659

Algorithms to process off-line Arabic handwriting before recognition are presented. First, an algorithm that converts smoothed and thinned images into straight line approximations is described. Second, an algorithm is developed to obtain a 1D representation of off-line Arabic handwriting. This is achieved by first finding the start-end pair of vertices of writing. Then a stroke is traversed from the start to the end vertex by solving the Chinese postman's problem for its graph. Special rules are applied to enforce temporal information on the stroke to obtain the most likely traversal that is consistent with Arabic handwriting. Finally, an algorithm is suggested to reduce straight line approximations to other approximations in which loops are represented by vertices with features. In testing, 2256 unconstrained handwritten strokes, written by six writers, were used. In 96.5% of the samples, the algorithm restored the actual temporal information.

3.

Development of an efficient neural-based segmentation technique for Arabic handwriting recognition

Husam A. Al Hamad Raed Abu Zitar 《Pattern recognition》2010,43(8):2773-2798

4.

Local features enhancement using deep auto-encoder scheme for the recognition of the proposed handwritten Arabic-Maghrebi characters database

Djaghbellou Soumia Attia Abdelouahab Bouziane Abderraouf Akhtar Zahid 《Multimedia Tools and Applications》2022,81(22):31553-31571

5.

Optimized leaky ReLU for handwritten Arabic character recognition using convolution neural networks

Nayef Bahera H. Abdullah Siti Norul Huda Sheikh Sulaiman Rossilawati Alyasseri Zaid Abdi Alkareem 《Multimedia Tools and Applications》2022,81(2):2065-2094

Object classification, such as handwritten Arabic character recognition, is a computer vision application. Deep learning techniques such as convolutional neural...

6.

Online versus offline Arabic script classification

Tanzila Saba Abdulaziz S. Almazyad Amjad Rehman 《Neural computing & applications》2016,27(7):1797-1804

7.

Recognition of handwritten cursive Arabic characters

Abuhaiba I.S.I. Mahmoud S.A. Green R.J. 《IEEE transactions on pattern analysis and machine intelligence》1994,16(6):664-672

An automatic off-line character recognition system for handwritten cursive Arabic characters is presented. A robust noise-independent algorithm is developed that yields skeletons that reflect the structural relationships of the character components. The character skeleton is converted to a tree structure suitable for recognition. A set of fuzzy constrained character graph models (FCCGMs), which tolerate large variability in writing, is designed. These models are graphs with fuzzily labeled arcs, used as prototypes for the characters. A set of rules is applied in sequence to match a character tree to an FCCGM. Arabic handwriting from four writers was used in the learning and testing stages. The system proved to be powerful in its tolerance to variable writing, its speed, and its recognition rate.

8.

Comparative evaluation of text classification techniques using a large diverse Arabic dataset

Mohammad S. Khorsheed Abdulmohsen O. Al-Thubaity 《Language Resources and Evaluation》2013,47(2):513-538

A vast amount of valuable human knowledge is recorded in documents. The rapid growth in the number of machine-readable documents for public or private access necessitates the use of automatic text classification. While a lot of effort has been put into Western languages (mostly English), minimal experimentation has been done with Arabic. This paper presents, first, an up-to-date review of the work done in the field of Arabic text classification and, second, a large and diverse dataset that can be used for benchmarking Arabic text classification algorithms. The different techniques derived from the literature review are illustrated by their application to the proposed dataset. The results of various feature selections, weighting methods, and classification algorithms show, on average, the superiority of the support vector machine, followed by the decision tree algorithm (C4.5) and Naïve Bayes. The best classification accuracy was 97% for the Islamic Topics dataset, and the lowest was 61% for the Arabic Poems dataset.

9.

Printing Arabic text using dot matrix printers

M. G. Khayat 《Software》1986,16(2):165-172

10.

Processing of Off-Line Handwritten Text: Polygonal Approximation and Enforcement of Temporal Information

《CVGIP: Graphical Models and Image Processing》1994,56(4):324-335

Algorithms to process off-line Arabic handwriting prior to recognition are presented. The first algorithm converts smoothed and thinned images into polygonal approximations. The second algorithm determines the start vertex of writing. The third algorithm enforces temporal information by traversing the graph of the stroke in an order consistent with Arabic handwriting. It implements the following heuristic rule: the minimum distance path that traverses the stroke's polygon from the start vertex to the end vertex has its vertices ordered as they were generated when the stroke was written. This third algorithm is developed from a standard solution of the Chinese postman's problem applied to the graph of the stroke. Special rules are applied to enforce temporal information on the stroke and obtain the most likely traversal that is consistent with Arabic handwriting. Unconstrained handwritten strokes written by five subjects (n = 4065) were used in testing. In 92.6% of the samples, the proposed algorithms restored the actual temporal information.

11.

Classification of Arabic script using multiple sources of information: State of the art and perspectives

Najoua Essoukri Ben Amara Faouzi Bouslama 《International Journal on Document Analysis and Recognition》2003,5(4):195-212

12.

An image-based automatic Arabic translation system

Yi Chang Datong Chen Jie Yang 《Pattern recognition》2009,42(9):2127-1138

In this paper, we present a system that automatically translates Arabic text embedded in images into English. The system consists of three components: text detection from images, character recognition, and machine translation. We formulate text detection as a binary classification problem and apply a gradient boosting tree (GBT), a support vector machine (SVM), and location-based prior knowledge to improve the F1 score of text detection from 78.95% to 87.05%. The detected text images are processed by off-the-shelf optical character recognition (OCR) software. We employ an error correction model to post-process the noisy OCR output, and apply a bigram language model to reduce word segmentation errors. The translation module is tailored with a compact data structure for hand-held devices. The experimental results show substantial improvements in both word recognition accuracy and translation quality. For instance, in the experiment with the Arabic Transparent font, the BLEU score increases from 18.70 to 33.47 with use of the error correction module.

13.

Off-line recognition of Chinese handwriting by multifeature and multilevel classification (total citations: 1; self-citations: 0; citations by others: 1)

Yuan Y. Tang Lo-Ting Tu Jiming Liu Seong-Whan Lee Win-Win Lin 《IEEE transactions on pattern analysis and machine intelligence》1998,20(5):556-561

In this paper, an off-line recognition system based on multifeature and multilevel classification is presented for handwritten Chinese characters. Ten classes of multifeatures, such as peripheral shape features, stroke density features, and stroke direction features, are used in this system. The multilevel classification scheme consists of a group classifier and a five-level character classifier, for which two new technologies, overlap clustering and a Gaussian distribution selector, are developed. Experiments have been conducted to recognize 5,401 daily-used Chinese characters. The recognition rate is about 90 percent for a unique candidate, and 98 percent for multichoice with 10 candidates.

14.

The use of Hartley transform in OCR with application to printed Arabic character recognition

Sabri A. Mahmoud Ashraf S. Mahmoud 《Pattern Analysis & Applications》2009,12(4):353-365

15.

Scene word recognition from pieces to whole

Anna ZHU Seiichi UCHIDA 《Frontiers of Computer Science》2019,13(2):292

Convolutional neural networks (CNNs) have had great success with regard to the object classification problem. For character classification, we found that training and testing using accurately segmented character regions with CNNs resulted in higher accuracy than when roughly segmented regions were used. Therefore, we expect to extract complete character regions from scene images. Text in natural scene images has an obvious contrast with its attachments. Many methods attempt to extract characters through different segmentation techniques. However, for blurred, occluded, and complex background cases, those methods may result in adjoined or over-segmented characters. In this paper, we propose a scene word recognition model that integrates words from small pieces to the whole after cluster-based segmentation. The segmented connected components are classified into four types: background, individual character proposals, adjoined characters, and stroke proposals. Individual character proposals are directly input to a CNN that is trained using accurately segmented character images. The sliding window strategy is applied to adjoined character regions. Stroke proposals are considered as fragments of entire characters whose locations are estimated by a stroke spatial distribution system. Then, the estimated characters from adjoined characters and stroke proposals are classified by a CNN that is trained on roughly segmented character images. Finally, a lexicon-driven integration method is performed to obtain the final word recognition results. Compared to other word recognition methods, our method achieves a comparable performance on Street View Text and the ICDAR 2003 and ICDAR 2013 benchmark databases. Moreover, our method can deal with recognizing occluded text images and improperly segmented text images.

16.

Recognizing arabic handwritten characters using deep learning and genetic algorithms

Balaha Hossam Magdy Ali Hesham Arafat Youssef Esraa Khaled Elsayed Asmaa Elsayed Samak Reem Adel Abdelhaleem Mohammed Samy Tolba Mohammed Mosa Shehata Mahmoud Ragab Mahmoud Mahmoud Refa’at Abdelhameed Mariam Mahmoud Mohammed Mostafa Mahmoud 《Multimedia Tools and Applications》2021,80(21-23):32473-32509

Automated techniques for Arabic script recognition are at an early stage compared with their counterparts for Latin and Chinese scripts. A large number of handwritten Arabic archives are available in libraries, data centers, historical centers, and workplaces. Digitizing these documents helps to (1) preserve and transfer the country's history electronically, (2) save physical storage space, (3) handle the documents properly, and (4) improve information retrieval through the Internet and other media. Arabic handwritten character recognition (AHCR) systems face several challenges, including the unlimited variations in human handwriting and the lack of large public databases. The current study addresses the segmentation and recognition phases. The text segmentation challenges and a set of solutions for each challenge are presented. A convolutional neural network (CNN), a deep learning approach, is used in the recognition phase. The usage of CNN leads to significant improvements across different machine learning classification algorithms. It facilitates the automatic feature extraction of images. Fourteen different native CNN architectures are proposed after a set of trial-and-error experiments. They are trained and tested on the HMBD database, which contains 54,115 handwritten Arabic characters. Experiments are performed on the native CNN architectures, and the best reported testing accuracy is 91.96%. A transfer learning (TF) and genetic algorithm (GA) approach named "HMB-AHCR-DLGA" is suggested to optimize the training parameters and hyperparameters in the recognition phase. The pre-trained CNN models (VGG16, VGG19, and MobileNetV2) are used in the latter approach. Five optimization experiments are performed and the best combinations are reported. The highest reported testing accuracy is 92.88%.

17.

Recognition of Arabic characters (total citations: 1; self-citations: 0; citations by others: 1)

Al-Yousefi H. Udpa S.S. 《IEEE transactions on pattern analysis and machine intelligence》1992,14(8):853-857

A statistical approach for the recognition of Arabic characters is introduced. As a first step, the character is segmented into primary and secondary parts (dots and zigzags). The secondary parts of the character are then isolated and identified separately, thereby reducing the number of classes from 28 to 18. The moments of the horizontal and vertical projections of the remaining primary characters are then calculated and normalized with respect to the zero-order moment. Simple measures of the shape are obtained from the normalized moments. A 9-D feature vector is obtained for each character. Classification is accomplished using quadratic discriminant functions. The approach was evaluated using isolated, handwritten, and printed characters from a database established for this purpose. The results indicate that the technique offers better classification rates in comparison with existing methods.
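
The projection-moment step in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `projection_moments`, the choice of raw moments up to order 3, and the convention that the "horizontal" projection means row sums are all hypothetical, and the abstract does not say which shape measures make up the 9-D feature vector.

```python
def projection_moments(image, max_order=3):
    """Return (row_moments, col_moments), each normalized by the zero-order moment.

    `image` is a list of equal-length rows of non-negative pixel values.
    Hypothetical helper: the abstract does not define the exact moments used.
    """
    row_proj = [sum(row) for row in image]        # one value per row (assumed "horizontal")
    col_proj = [sum(col) for col in zip(*image)]  # one value per column (assumed "vertical")

    def normalized(proj):
        m0 = sum(proj)  # zero-order moment: total pixel mass
        if m0 == 0:
            raise ValueError("empty image: zero-order moment is 0")
        # k-th raw moment of the projection, divided by the zero-order moment
        return [sum((i ** k) * p for i, p in enumerate(proj)) / m0
                for k in range(1, max_order + 1)]

    return normalized(row_proj), normalized(col_proj)
```

Dividing by the zero-order moment makes the values independent of stroke thickness or overall intensity, which is presumably why the abstract normalizes this way.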

18.

On-line recognition of handwritten Arabic characters

Al-Emami S. Usher M. 《IEEE transactions on pattern analysis and machine intelligence》1990,12(7):704-710

19.

Region growing based segmentation algorithm for typewritten and handwritten text recognition

Khalid Saeed Majida Albakoor 《Applied Soft Computing》2009,9(2):608-617

This paper presents a new high-accuracy technique to recognize both typewritten and handwritten English and Arabic texts without thinning. After segmenting the text into lines (horizontal segmentation) and the lines into words, it separates each word into its letters. Separating a text line (row) into words and a word into letters is performed by using the region growing technique (implicit segmentation) on the basis of three essential lines in a text row. This saves time, as there is no need to skeletonize or to physically isolate letters from the tested word, and the input data involves only the basic information: the scanned text. The baseline is detected, the word contour is defined, and the word is implicitly segmented into its letters according to a novel algorithm described in the paper. The extracted letter with its dots is used as one unit in the recognition system. It is resized into a 9 × 9 matrix using bilinear interpolation after applying a lowpass filter to reduce aliasing. Then the elements are scaled to the interval [0,1]. The resulting array is the input to the designed neural network. For typewritten texts, three Arabic fonts are used: Arial, Arabic Transparent, and Simplified Arabic. The results showed an average recognition success rate of 93% for Arabic typewriting. This segmentation approach has also been applied to handwritten text, where words are classified with a relatively high recognition rate for both Arabic and English. The experiments were performed in MATLAB and have shown promising results that can be a good base for further analysis of Arabic and other cursive text recognition as well as English handwritten texts. For English handwritten classification, a success rate of about 80% on average was achieved, while for Arabic handwritten text the algorithm was successful in about 90% of cases. The recent results have shown increasing success for both Arabic and English texts.
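
The letter normalization described in this abstract (lowpass filter, bilinear resize to 9 × 9, scaling to [0,1]) might look roughly like the sketch below. It is an assumption-laden illustration in Python rather than the authors' MATLAB code: the 3 × 3 mean filter, the min-max scaling, and the names `box_blur`, `bilinear_resize`, `scale_unit`, and `letter_to_input` are all hypothetical choices, since the abstract does not specify the filter or the scaling rule.

```python
def box_blur(img):
    """3x3 mean filter with edge replication (an assumed simple low-pass filter)."""
    h, w = len(img), len(img[0])

    def px(y, x):
        # Clamp coordinates so border pixels reuse their nearest neighbour.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    return [[sum(px(y + dy, x + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
             for x in range(w)] for y in range(h)]


def bilinear_resize(img, out_h, out_w):
    """Resize a 2D list of numbers to out_h x out_w by bilinear interpolation."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(out_h):
        y = r * (h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        wy = y - y0
        row = []
        for c in range(out_w):
            x = c * (w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            wx = x - x0
            top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
            bottom = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


def scale_unit(img):
    """Min-max scale all elements to [0, 1]; a constant image maps to zeros."""
    flat = [v for row in img for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in img]
    return [[(v - lo) / (hi - lo) for v in row] for row in img]


def letter_to_input(img, size=9):
    """Blur, resize to size x size, then scale to [0, 1]."""
    return scale_unit(bilinear_resize(box_blur(img), size, size))
```

Filtering before downsampling follows the order given in the abstract; the 81 resulting values would then be flattened into the network's input vector.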

20.

Spoken character classification using abductive network

Isah Abdullahi Lawal 《International Journal of Speech Technology》2017,20(4):881-890

In this paper, we address the problem of learning a classifier for spoken characters. We present a solution based on the Group Method of Data Handling (GMDH) learning paradigm for the development of a robust abductive network classifier. We improve the reliability of the classification process by introducing the concept of a multiple abductive network classifier system. We evaluate the performance of the proposed classifier using three different speech datasets: spoken Arabic digits, spoken English letters, and spoken Pashto digits. The performance of the proposed classifier surpasses that reported in the literature for other classification techniques on the same speech datasets.