Extraction of terms and semantic relationships from Arabic texts for automatic construction of an ontology
Authors: Ali Benabdallah, Mohammed AlaEddine Abderrahim, Mohammed El-Amine Abderrahim
Affiliation: 1. Department of Computer Science, University Abou Bekr Belkaid-Tlemcen, Tlemcen, Algeria; 2. Department of Technology, University Abou Bekr Belkaid-Tlemcen, Tlemcen, Algeria
Abstract: Building an ontology from a textual corpus starts with the conceptualization phase, which extracts the ontology's concepts. These concepts are then linked by semantic relationships. In this paper, we describe an approach to constructing an ontology from an Arabic textual corpus. We first collect and prepare the corpus through normalization, stop-word removal and stemming. To extract the terms of our ontology, we then apply a statistical method for extracting simple and complex terms, called "the repeated segments method". To select segments with sufficient weight, we apply the term frequency–inverse document frequency (TF–IDF) weighting method. To link these terms by semantic relationships, we apply an automatic method of learning linguistic markers from text. This method requires a dataset of relationship pairs, which are extracted from two external resources: an Arabic dictionary of synonyms and antonyms, and the lexical database Arabic WordNet. Finally, we present the results of our experiments on our textual corpus. The evaluation of our approach shows encouraging results in terms of recall and precision.
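The term-extraction steps named in the abstract (repeated segments, then TF–IDF weighting) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the segment length, the repetition threshold or the TF–IDF variant, so `max_len`, `min_count`, the function names and the plain `tf * log(N / df)` formula are all assumptions. Each document is assumed to be a list of tokens that has already been normalized, stripped of stop words and stemmed.

```python
import math
from collections import Counter


def ngram_counts(tokens, max_len):
    """Count every contiguous segment of 1..max_len tokens in one document."""
    return Counter(
        tuple(tokens[i:i + n])
        for n in range(1, max_len + 1)
        for i in range(len(tokens) - n + 1)
    )


def score_segments(docs, max_len=3, min_count=2):
    """Return, per document, a TF-IDF score for each repeated segment.

    A segment is kept for a document only if it occurs at least
    min_count times in that document (assumed threshold).
    """
    counts = [ngram_counts(tokens, max_len) for tokens in docs]
    # Document frequency: number of documents containing the segment at all.
    df = Counter(seg for c in counts for seg in c)
    n_docs = len(docs)
    scores = []
    for tokens, c in zip(docs, counts):
        total = max(len(tokens), 1)  # avoid division by zero on empty documents
        scores.append({
            seg: (k / total) * math.log(n_docs / df[seg])
            for seg, k in c.items()
            if k >= min_count
        })
    return scores
```

Segments whose score exceeds a chosen cut-off would then be kept as candidate terms; the cut-off itself is another parameter the abstract does not specify.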
This article is indexed in SpringerLink and other databases.