首页 | 本学科首页   官方微博 | 高级检索  
     


Recognizing handwritten Arabic words using grapheme segmentation and recurrent neural networks
Authors:Gheith A Abandah  Fuad T Jamour  Esam A Qaralleh
Affiliation:1. Computer Engineering Department, The University of Jordan, Amman, 11942, Jordan
2. King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
3. Princess Sumaya University for Technology, Amman, Jordan
Abstract:The Arabic alphabet is used in around 27 languages, including Arabic, Persian, Kurdish, Urdu, and Jawi. Many researchers have developed systems for recognizing cursive handwritten Arabic words, using both holistic and segmentation-based approaches. This paper introduces a system that achieves high accuracy using efficient segmentation, feature extraction, and recurrent neural network (RNN). We describe a robust rule-based segmentation algorithm that uses special feature points identified in the word skeleton to segment the cursive words into graphemes. We show that careful selection from a wide range of features extracted during and after the segmentation stage produces a feature set that significantly reduces the label error. We demonstrate that using same RNN recognition engine, the segmentation approach with efficient feature extraction gives better results than a holistic approach that extracts features from raw pixels. We evaluated this segmentation approach against an improved version of the holistic system MDLSTM that won the ICDAR 2009 Arabic handwritten word recognition competition. On the IfN/ENIT database of handwritten Arabic words, the segmentation approach reduces the average label error by 18.5 %, the sequence error by 22.3 %, and the execution time by 31 %, relative to MDLSTM. This approach also has the best published accuracies on two IfN/ENIT test sets.
Keywords:
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号