Similar Documents
Found 20 similar documents.
1.
In recent years, deep learning models have become indispensable in several fields such as computer vision, automatic object recognition, and natural language processing. The implementation of a robust and efficient handwritten text recognition system remains a challenge for the research community, especially for the Arabic language, which, compared to other languages, has a dearth of published works. In this work, we present an efficient new system for offline Arabic handwritten text recognition. Our approach combines a Convolutional Neural Network (CNN) with a Bidirectional Long Short-Term Memory (BLSTM) network followed by a Connectionist Temporal Classification (CTC) layer. Moreover, during the training phase, we introduce a data augmentation algorithm to improve the quality of the data. The proposed approach recognizes Arabic handwritten text without the need to segment characters, thus avoiding several problems associated with segmentation. To train and evaluate our approach, we used two Arabic handwritten text recognition databases, IFN/ENIT and KHATT. The experimental results show that our approach outperforms other methods in the literature.
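The CNN–BLSTM front end in the system above emits a per-timestep label sequence that the CTC layer decodes without explicit character segmentation. As an illustration of the decoding rule only (not the paper's trained model), greedy CTC decoding collapses runs of repeated labels and then drops blanks:

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Greedy CTC decoding: collapse repeated labels, then drop blanks.

    frame_labels: per-timestep argmax label IDs from the BLSTM output.
    A blank between two identical labels keeps them as separate characters.
    """
    decoded, prev = [], None
    for label in frame_labels:
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded
```

For example, the frame sequence `[7, 7, 0, 7, 3, 3]` decodes to `[7, 7, 3]`: the blank (0) keeps the two 7s distinct, while repeated labels within a run collapse.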

2.
3.
Handwritten character recognition systems are used in every field of life nowadays, including shopping malls, banks, and educational institutes. Urdu is the national language of Pakistan and the fourth most spoken language in the world. However, recognizing Urdu handwritten characters remains challenging owing to their cursive nature. Our paper presents a Convolutional Neural Network (CNN) model for Urdu handwritten alphabet recognition (UHAR) of both offline and online characters. Our research contributes an Urdu handwritten dataset (UHDS) to empower future work in this field. For offline systems, optical readers are used for extracting the alphabets, while diagonal-based extraction methods are implemented in online systems. Moreover, our research tackles the lack of comprehensive, standard Urdu alphabet datasets for research on Urdu text recognition. To this end, we collected 1,000 handwritten samples per alphabet, 38,000 samples in total, from writers aged 12 to 25, to train our CNN model using online and offline mediums. Subsequently, we carried out detailed character recognition experiments, as presented in the results. The proposed CNN model outperformed previously published approaches.

4.
The recognition of Arabic characters is a crucial task in the computer vision and Natural Language Processing (NLP) fields. Major complications in recognizing handwritten text include distortion and pattern variability, so feature extraction is a significant task in NLP models. If features are selected automatically, there may not be adequate data for accurately forecasting the character classes, while a large number of features creates difficulties due to high dimensionality. Against this background, the current study develops a Sailfish Optimizer with Deep Transfer Learning-Enabled Arabic Handwriting Character Recognition (SFODTL-AHCR) model. The SFODTL-AHCR model focuses on identifying handwritten Arabic characters in the input image. To attain this objective, the model pre-processes the input image using histogram equalization. The Inception-ResNet-v2 model examines the pre-processed image to produce feature vectors, and a Deep Wavelet Neural Network (DWNN) recognizes the handwritten Arabic characters. Finally, the SFO algorithm fine-tunes the parameters of the DWNN model to attain better performance. The performance of the proposed model was validated using a series of images, and extensive comparative analyses were conducted. The proposed method achieved a maximum accuracy of 99.73%, confirming the superiority of the SFODTL-AHCR model over other approaches.
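The pre-processing step named in this abstract, histogram equalization, spreads the grey-level distribution to improve contrast before feature extraction. A self-contained sketch of the standard algorithm (generic textbook version, not code from the paper) for 8-bit images represented as nested lists:

```python
def equalize_histogram(img):
    """Histogram equalization for an 8-bit grayscale image (2-D list).

    Remaps each grey level through the normalized cumulative histogram,
    stretching the used intensity range toward the full 0..255 span.
    """
    flat = [p for row in img for p in row]
    n = len(flat)
    hist = [0] * 256
    for p in flat:
        hist[p] += 1
    # Cumulative distribution function of the grey levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)

    def remap(p):
        if n == cdf_min:          # flat image: nothing to equalize
            return 0
        return round((cdf[p] - cdf_min) / (n - cdf_min) * 255)

    return [[remap(p) for p in row] for row in img]
```

A low-contrast image whose pixels cluster in a narrow band comes out with its levels spread across the full range, which is what makes subsequent thresholding and feature extraction more stable.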

5.
This paper presents a handwritten document recognition system based on convolutional neural networks. Handwritten document recognition is rapidly attaining the attention of researchers due to its promise as an assistive technology for visually impaired users; it is also helpful for automatic data entry systems. For the proposed system, a dataset of English-language handwritten character images was prepared. The system was trained on a large set of sample data and tested on sample images of user-defined handwritten documents. The system first performs image pre-processing to prepare data for training with a convolutional neural network. The input document is then segmented into lines, words, and characters; the proposed system achieves up to 86% accuracy during character segmentation. The segmented characters are then passed to a convolutional neural network for recognition. The recognition and segmentation techniques proposed in this paper provide acceptable, accurate results on the given dataset: accuracy reaches up to 93% during convolutional neural network training and decreases slightly to 90.42% on validation.

6.
The paper discusses the segmentation of words into characters, an essential task in the development of character recognition systems, as poorly segmented characters will inevitably go unrecognized. The segmentation of offline handwritten Arabic text poses a particular challenge because of its cursive nature and varied writing styles. In this article, we propose a new approach to segmenting handwritten Arabic characters based on an efficient analysis of the vertical projection histogram. Our approach was tested on a set of handwritten Arabic words from the IFN/ENIT database, and promising results were obtained.
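The vertical projection histogram this abstract relies on is simply the per-column count of foreground pixels; candidate segmentation points fall in its valleys. A minimal sketch assuming a binary image stored as a 2-D list of 0/1 (the paper's actual analysis of the histogram is more elaborate):

```python
def vertical_projection(binary_img):
    """Per-column count of foreground (1) pixels in a binary image."""
    return [sum(col) for col in zip(*binary_img)]

def segment_columns(binary_img, thresh=0):
    """Return [start, end) column spans whose projection exceeds thresh.

    Columns where the projection falls to the threshold (valleys) are
    taken as candidate cut points between characters.
    """
    spans, start = [], None
    proj = vertical_projection(binary_img)
    for x, v in enumerate(proj):
        if v > thresh and start is None:
            start = x                      # entering a character region
        elif v <= thresh and start is not None:
            spans.append((start, x))       # leaving a character region
            start = None
    if start is not None:
        spans.append((start, len(proj)))
    return spans
```

For cursive Arabic, a plain zero-valley rule is insufficient on its own (connected characters share ink between columns), which is why the paper's analysis of the histogram goes further.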

7.
This paper presents a language-based, efficient post-processing algorithm for the recognition of online unconstrained handwritten Gurmukhi characters. A total of 93 stroke classes have been identified to recognize the Gurmukhi character set in this work, and a Support Vector Machine (SVM) classifier is employed for stroke classification. The main objective of this paper is to improve character-level recognition accuracy using an efficient Finite State Automata (FSA)-based Gurmukhi character formation algorithm. A database of 21,945 online handwritten Gurmukhi words was collected for this experiment. Analysis of the collected database shows that a character can be written using one or more strokes; therefore, a total of 65,946 strokes were annotated using the 93 identified stroke classes, of which 15,069 stroke samples were used to train the classifier. The proposed system achieved a promising recognition accuracy of 97.3% for Gurmukhi characters when tested on a new database of 8,200 characters written by 20 different writers.
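The FSA-based formation step maps recognized stroke sequences onto characters. A toy sketch of the idea — the stroke classes ("s1", "s2", "s3") and the two-character table below are invented for illustration and bear no relation to the paper's actual 93 stroke classes:

```python
# Transition table: (state, stroke-class) -> next state.
TRANSITIONS = {
    ("start", "s1"): "q1",
    ("q1", "s2"): "q2",
    ("start", "s3"): "q3",
}
# Accepting states mapped to the character they form.
ACCEPT = {"q2": "KA", "q3": "GA"}

def form_character(strokes):
    """Run a stroke sequence through the FSA.

    Returns the formed character, or None if the sequence is not
    accepted (invalid transition or non-accepting final state).
    """
    state = "start"
    for stroke in strokes:
        state = TRANSITIONS.get((state, stroke))
        if state is None:
            return None
    return ACCEPT.get(state)
```

This captures why the FSA improves character-level accuracy: stroke sequences that cannot form any valid character are rejected instead of being forced into the nearest class.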

8.
Named Entity Recognition (NER) is one of the fundamental tasks in Natural Language Processing (NLP); it aims to locate, extract, and classify named entities into predefined categories such as person, organization, and location. Most earlier research on identifying named entities relied on handcrafted features and very large knowledge resources, which is time-consuming and not adequate for resource-scarce languages such as Arabic. Recently, deep learning has achieved state-of-the-art performance on many NLP tasks, including NER, without requiring handcrafted features. In addition, transfer learning has proven its efficiency on several NLP tasks by exploiting pretrained language models to transfer knowledge learned from large-scale datasets to domain-specific tasks. Bidirectional Encoder Representations from Transformers (BERT) is a contextual language model that generates semantic vectors dynamically according to the context of the words; its architecture relies on multi-head attention, which allows it to capture global dependencies between words. In this paper, we propose a deep learning-based model that fine-tunes BERT to recognize and classify Arabic named entities. The pre-trained BERT context embeddings were used as input features to a Bidirectional Gated Recurrent Unit (BGRU) and fine-tuned on two annotated Arabic Named Entity Recognition (ANER) datasets. Experimental results demonstrate that the proposed model outperforms state-of-the-art ANER models, achieving F-measure values of 92.28% on the ANERCorp dataset and 90.68% on the merged ANERCorp and AQMAR dataset.
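A BERT+BGRU tagger like the one described emits one label per token; a standard post-processing step (generic BIO decoding, not the paper's code) then groups those labels into entity spans:

```python
def decode_bio(tokens, tags):
    """Group (token, BIO-tag) pairs into (entity text, entity type) spans.

    "B-X" starts an entity of type X, "I-X" continues it, and anything
    else (including a type mismatch) closes the current entity.
    """
    entities, current, current_type = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current.append(token)
        else:
            if current:
                entities.append((" ".join(current), current_type))
            current, current_type = [], None
    if current:
        entities.append((" ".join(current), current_type))
    return entities
```

The English tokens below are only for readability; the same decoding applies unchanged to Arabic token sequences.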

9.
We propose an image-based framework for electrical energy meter reading. Our aim is to extract the image region that depicts the digits and then recognize them to record the consumed units. By combining the readings of serial numbers and energy meter units, an automatic billing system using the Internet of Things and a graphical user interface is deployable in a real-time setup. However, such region extraction and character recognition become challenging due to image variations caused by several factors: partial occlusion due to dust on the meter display, orientation and scale variations caused by camera positioning, and non-uniform illumination caused by shade. To this end, our work evaluates and compares the state-of-the-art deep learning algorithm You Only Look Once (YOLO) against traditional handcrafted features for text extraction and recognition. Our image dataset contains 10,000 images of electrical energy meters and is further expanded by data augmentation, such as in-plane rotation and scaling, to make the deep learning algorithms robust to these image variations. For training and evaluation, the image dataset was annotated to produce ground truth for all images. YOLO achieves superior performance over the traditional handcrafted features, with an average recognition rate of 98% for all digits, and proves robust against the mentioned image variations. Our proposed method can be highly instrumental in reducing the time and effort involved in current meter reading, where workers visit door to door, take images of meters, and manually extract readings from those images.

10.
李颖  刘菊华  易尧华 《包装工程》2018,39(5):168-172
Objective: To segment images based on the Otsu algorithm and recognize English characters in natural scene images using optical character recognition. Methods: First, a block-wise Otsu algorithm performs an initial binarization of the image. By analysing the binarization result, the original input image is split into sub-images of single characters, each of which is re-binarized with the Otsu algorithm. The final binarized result is then recognized, and the recognition result is corrected using the previously obtained per-image character-count information together with dictionary information, yielding the final recognition result. Results: Tested on the ICDAR2013 dataset, the proposed algorithm achieves a word recognition rate of 46.03% with a total edit distance of 474.5. Conclusion: The proposed block-wise Otsu-based recognition algorithm better separates text from complex backgrounds, and the dictionary-based correction of recognition results improves recognition performance.
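Otsu's algorithm, the basis of the block-wise binarization above, picks the threshold that maximizes the between-class variance of the grey-level histogram. A self-contained sketch of the global version (the paper applies it per block):

```python
def otsu_threshold(gray):
    """Return the Otsu threshold for an 8-bit grayscale image (2-D list).

    Evaluates every threshold t, maximizing the between-class variance
    w0 * w1 * (m0 - m1)^2 of the two resulting pixel classes.
    """
    flat = [p for row in gray for p in row]
    n = len(flat)
    hist = [0] * 256
    for p in flat:
        hist[p] += 1
    total_sum = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]                 # pixels with level <= t
        if w0 == 0:
            continue
        w1 = n - w0                   # pixels with level > t
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0                # mean of the dark class
        m1 = (total_sum - sum0) / w1  # mean of the bright class
        between_var = w0 * w1 * (m0 - m1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t
```

A block-wise variant simply calls `otsu_threshold` on each sub-image, which is what lets the method cope with the non-uniform backgrounds of natural scene images.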

11.
许秦蓉 《包装工程》2014,35(21):80-85
Objective: In offline handwritten character recognition systems, freely written characters are inevitably affected by factors such as non-uniform image backgrounds, image skew, touching characters, and varying character sizes. To ensure correct character segmentation and recognition, this work investigates preprocessing methods for handwritten Chinese character images in EMS forms and presents the full EMS form image preprocessing pipeline. Methods: A least-squares line-fitting method is used to locate and segment the target image, and a block-wise thresholding algorithm based on the Otsu threshold handles the non-uniform background of the target image and reduces noise interference. Results: Tested on 1,020 real EMS images, the preprocessing method achieved a recognition accuracy of 86.3%. Conclusion: The method offers a degree of flexibility and robustness to interference, reducing the impact of image noise on Chinese character segmentation and recognition.
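The least-squares line fit used here for locating and deskewing the target region has a simple closed form. A minimal sketch (the generic formula, not the paper's implementation):

```python
def fit_line(points):
    """Least-squares fit of y = a*x + b through a list of (x, y) points.

    Minimizes the sum of squared vertical residuals; the fitted slope a
    gives the skew angle atan(a) used to deskew a form image.
    """
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx      # zero only if all x are equal
    a = (n * sxy - sx * sy) / denom
    b = (sy - a * sx) / n
    return a, b
```

Fitting this line through, say, the detected baseline points of a printed rule on the form gives the rotation needed before segmentation.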

12.
The text of the Quran is principally dependent on the Arabic language. Therefore, improving the security and reliability of the Quran's text when it is exchanged over internet networks has become one of the most difficult challenges that researchers face today. Consequently, the diacritical marks in the Holy Quran, which represent Arabic vowels, together with the kashida (or "extended letters"), must be protected from changes. In existing schemes, the cover text of the Quran and its watermarked text differ, with low values of the Peak Signal to Noise Ratio (PSNR) and Normalized Cross-Correlation (NCC), and thus low tamper-detection accuracy. The gap addressed in this paper is improving the security of the Arabic text of the Holy Quran by using vowels with kashida, enhancing a watermarking scheme based on hybrid techniques (XOR and queuing techniques). The proposed scheme consists of four phases. The first phase is pre-processing. In the second phase, an embedding process hides the data after the vowel letters: if the secret bit is "1", a kashida is inserted; if the bit is "0", no kashida is inserted. The third phase is an extraction process, and the last phase evaluates the performance of the proposed scheme using PSNR (for imperceptibility) and NCC (for the security of the watermarking). Experiments were performed on three datasets of varying lengths under multiple random insertion, reorder, and deletion attacks. The experimental results revealed improvements of 1.76% in NCC and 9.6% in PSNR compared with currently available schemes.
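The embedding rule the abstract describes — insert a kashida after a vowel letter for bit "1", nothing for bit "0" — can be sketched directly on strings. The vowel set below is an illustrative subset chosen for this sketch, and the XOR/queuing layers of the actual scheme are omitted:

```python
KASHIDA = "\u0640"        # Arabic tatweel ("extended letter")
VOWELS = set("اويى")      # illustrative subset of vowel letters

def embed(text, bits):
    """Hide bits after vowel letters: '1' -> insert kashida, '0' -> skip."""
    out, i = [], 0
    for ch in text:
        out.append(ch)
        if ch in VOWELS and i < len(bits):
            if bits[i] == "1":
                out.append(KASHIDA)
            i += 1
    return "".join(out)

def extract(watermarked):
    """Recover the bit string: is each vowel followed by a kashida?"""
    chars = list(watermarked)
    bits = []
    for j, ch in enumerate(chars):
        if ch in VOWELS:
            followed = j + 1 < len(chars) and chars[j + 1] == KASHIDA
            bits.append("1" if followed else "0")
    return "".join(bits)
```

The round trip `extract(embed(text, bits)) == bits` holds whenever the text contains at least `len(bits)` vowel letters; because the kashida only elongates the pen stroke, the watermarked text renders almost identically, which is what keeps the PSNR high.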

13.
14.
15.
N. Tripathy  U. Pal 《Sadhana》2006,31(6):755-769
Segmentation of handwritten text into lines, words and characters is one of the important steps in the handwritten text recognition process. In this paper we propose a water reservoir concept-based scheme for segmentation of unconstrained Oriya handwritten text into individual characters. Here, the text image is first segmented into lines, and the lines are then segmented into individual words. For line segmentation, the document is divided into vertical stripes; the stripe width is calculated by analysing the heights of the water reservoirs obtained from different components of the document. Stripe-wise horizontal histograms are then computed, and the relationship between the peak and valley points of the histograms is used for line segmentation. Based on vertical projection profiles and structural features of Oriya characters, text lines are segmented into words. For character segmentation, the isolated and connected (touching) characters in a word are first detected; the touching characters of the word are then segmented using structural, topological and water reservoir concept-based features. From experiments we have observed that the proposed touching-character segmentation module has 96.7% accuracy for two-character touching strings.

16.
《The Imaging Science Journal》2013,61(3):177-182
Abstract

In composite document images, handwritten and printed text is often found to overlap with printed lines. The problem becomes critical for obscured and broken lines at multiple positions. Consequently, line removal is an unavoidable pre-processing stage in the development of robust object recognisers. Moreover, the restoration of damaged characters after line removal persists as a problem of interest. This paper presents a new approach to detect and remove unwanted printed lines inherited in the text image at any position, without character distortion, so that a restoration stage is avoided. The proposed technique is based on connected component analysis. Experiments were conducted using single-line images scanned and extracted manually from several documents and forms. We demonstrate that our approach is equally suitable for line removal in printed and handwritten text written in any language, circumventing the restoration stage. Promising results are reported in comparison with other state-of-the-art approaches.
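The paper removes printed lines via connected component analysis; the simpler run-length heuristic below (an assumption for illustration, not the authors' method) conveys the core idea of erasing long horizontal foreground runs while keeping character strokes:

```python
def remove_horizontal_lines(binary_img, min_run=20):
    """Blank out horizontal runs of foreground pixels longer than min_run.

    binary_img: 2-D list of 0/1. Character strokes produce short runs and
    survive; a printed rule line spans many columns and is erased.
    """
    out = [row[:] for row in binary_img]
    for y, row in enumerate(binary_img):
        x = 0
        while x < len(row):
            if row[x] == 1:
                end = x
                while end < len(row) and row[end] == 1:
                    end += 1
                if end - x >= min_run:      # long run: part of a line
                    for k in range(x, end):
                        out[y][k] = 0
                x = end
            else:
                x += 1
    return out
```

Note that a run-length rule distorts characters wherever a stroke merges into the line; that is precisely the character-distortion problem the paper's connected-component approach is designed to avoid.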

17.
Human activity recognition is commonly used in several Internet of Things (IoT) applications to recognize different contexts and respond to them. Deep learning has gained momentum for identifying activities through sensors, smartphones, or even surveillance cameras. However, it is often difficult to train deep learning models on constrained IoT devices. The focus of this paper is to propose an alternative model by constructing a Deep Learning-based Human Activity Recognition framework for edge computing, which we call DL-HAR. The goal of this framework is to exploit the capabilities of cloud computing to train a deep learning model and deploy it on less powerful edge devices for recognition: the model is trained in the cloud and distributed to the edge nodes. We demonstrate how DL-HAR can perform human activity recognition at the edge while improving efficiency and accuracy. To evaluate the proposed framework, we conducted a comprehensive set of experiments to validate the applicability of DL-HAR. Experimental results on the benchmark dataset show a significant increase in performance compared with state-of-the-art models.

18.
Lip-reading technologies are progressing rapidly following the breakthrough of deep learning. They play a vital role in many applications, such as human-machine communication and security. In this paper, we develop an effective lip-reading recognition model for Arabic visual speech recognition by implementing deep learning algorithms. The Arabic visual datasets collected contain 2,400 records of Arabic digits and 960 records of Arabic phrases from 24 native speakers. The primary purpose is to provide a high-performance model by enhancing the preprocessing phase. First, we extract keyframes from our dataset. Second, we produce Concatenated Frame Images (CFIs) that represent each utterance sequence in a single image. Finally, VGG-19 is employed for visual feature extraction in our proposed model. We examined different keyframe counts (10, 15, and 20) to compare two variants of the proposed model: (1) the VGG-19 base model and (2) the VGG-19 base model with batch normalization. The results show that the second variant achieves greater accuracy on the test dataset: 94% for digit recognition, 97% for phrase recognition, and 93% for combined digit and phrase recognition. Our proposed model therefore outperforms prior models based on CFI input.
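The keyframe extraction step compared in this abstract (10, 15, or 20 frames per clip) can be done by uniform temporal sampling; this sketch is an assumption about the selection rule, which the abstract does not specify:

```python
def select_keyframes(n_frames, k):
    """Pick k frame indices spread uniformly over a clip of n_frames.

    The first and last frames are always included so the sampled
    sequence spans the whole utterance.
    """
    if k >= n_frames:
        return list(range(n_frames))
    if k == 1:
        return [0]
    step = (n_frames - 1) / (k - 1)
    return [round(i * step) for i in range(k)]
```

The selected frames would then be resized and tiled side by side into one Concatenated Frame Image (CFI) before being fed to VGG-19.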

19.
Internet of Things (IoT) devices generate large amounts of data in several fields, including medicine, business, and engineering. User authentication is paramount in the IoT era to assure the security of connected devices. However, traditional authentication methods and conventional biometrics-based approaches such as face recognition, fingerprints, and passwords are vulnerable to various attacks, including smudge attacks, heat attacks, and shoulder-surfing attacks. Behavioral biometrics, enabled by the powerful sensing capabilities of IoT devices such as smart wearables and smartphones, allows continuous authentication, and Artificial Intelligence (AI)-based approaches promise to refine large amounts of homogeneous biometric data into innovative user authentication solutions. This paper presents a new continuous passive authentication approach capable of learning the signatures of IoT users from smartphone sensors such as the gyroscope, magnetometer, and accelerometer, recognizing users by their physical activities. The approach integrates convolutional neural network (CNN) and recurrent neural network (RNN) models to learn signatures of human activities from different users. A series of experiments conducted on the MotionSense dataset validates the effectiveness of the proposed method: our technique offers a competitive verification accuracy of 98.4%. We compared the proposed method with several conventional machine learning and CNN models and found that it achieves higher identification accuracy than recently developed verification systems. The high accuracy achieved proves the method's effectiveness in passively recognizing IoT users through their physical activity patterns.

20.

Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号