A new Quranic Corpus rich in morphosyntactical information
Authors: Imad Zeroual, Abdelhak Lakhouaja
Affiliation: Computer Sciences Laboratory, Faculty of Sciences, Mohammed First University, Oujda, Morocco
Abstract: Few annotated Arabic corpora are widely available, which motivates our contribution to the enrichment of Arabic corpus resources. We chose to start with correct, carefully selected texts, and the Quranic Arabic text is the best starting point for such an effort. Moreover, annotated linguistic resources such as a Quranic corpus are important for researchers working in all fields of Arabic natural language processing. To the best of our knowledge, the only available Quranic Arabic corpora are those from the University of Leeds, the University of Jordan, and the University of Haifa. Unfortunately, these corpora have several problems and do not contain enough grammatical and syntactic information. To build a new corpus of the Quran, we used a semi-automatic technique: the text is first processed with “AlKhalil Morpho Sys”, a morphosyntactic analyzer of standard Arabic words, and the output is then treated manually. The result is a new Quranic corpus rich in morphosyntactic information.
This article is indexed in SpringerLink and other databases.