ArA*summarizer: An Arabic text summarization system based on subtopic segmentation and using an A* algorithm for reduction
Authors: Belahcene Bahloul, Hassina Aliane, Mohamed Benmohammed
Affiliation: 1. Department of Mathematics and Computer Science, University Djilali Bounaama of Khemis Miliana, Ain Defla, Algeria; 2. Information Sciences R&D Laboratory, Research Center on Scientific and Technical Information, Algiers, Algeria; 3. Department of Software Technologies and Information Systems, University of Abdelhamid Mehri Constantine 2, Constantine, Algeria
Abstract: Automatic text summarization lies at the intersection of natural language processing and information retrieval. Its main objective is to automatically produce a condensed, representative form of a document. This paper presents ArA*summarizer, an automatic system for Arabic single-document summarization. The system is based on an unsupervised hybrid approach that combines statistical, cluster-based, and graph-based techniques. The main idea is to divide the text into subtopics and then select the most relevant sentences from the most relevant subtopics. Selection is performed by an A* algorithm run on a graph representing the lexical–semantic relationships between sentences. Experiments are conducted on the Essex Arabic Summaries Corpus using four evaluation metrics: recall-oriented understudy for gisting evaluation (ROUGE), automatic summarization engineering (AutoSummENG), merged model graphs (MeMoG), and n-gram graph powered evaluation via regression (NPowER). The evaluation results show that the system performs well compared with existing work.
Keywords: A* algorithm; graph theory; natural language processing; text summarization; topic identification
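
The abstract describes sentence selection as an A* search over a graph of sentence relationships but gives no formulation. The sketch below is only an illustration of how such a search could look, not the authors' method. Everything in it is an assumption: the function names `jaccard` and `astar_select`, the use of plain word overlap as a stand-in for lexical–semantic relatedness, the state definition (last picked sentence, number picked), and the edge cost of one minus similarity. The heuristic multiplies the number of remaining picks by the cheapest edge in the graph, so it never overestimates the remaining cost.

```python
import heapq
from itertools import count


def jaccard(a, b):
    """Word-overlap similarity in [0, 1] between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def astar_select(sentences, k):
    """Pick k sentence indices, in document order, with A* search.

    Hypothetical formulation (not taken from the paper): a state is
    (last_picked_index, picked_count), and moving from sentence i to a
    later sentence j costs 1 - jaccard(i, j), so cheap paths chain
    lexically related sentences.
    """
    tokens = [set(s.lower().split()) for s in sentences]
    n = len(tokens)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of sentences")

    def cost(i, j):
        return 1.0 - jaccard(tokens[i], tokens[j])

    # Cheapest forward edge; each remaining pick costs at least this much.
    min_edge = min(
        (cost(i, j) for i in range(n) for j in range(i + 1, n)), default=0.0
    )

    def h(picked):
        # Admissible: (remaining picks) * (cheapest possible edge).
        return (k - picked) * min_edge

    tie = count()  # tie-breaker so the heap never compares paths
    # Any sentence may start the summary at zero cost.
    frontier = [(h(1), next(tie), 0.0, (i,)) for i in range(n)]
    heapq.heapify(frontier)
    best_g = {(i, 1): 0.0 for i in range(n)}

    while frontier:
        _, _, g, path = heapq.heappop(frontier)
        last, picked = path[-1], len(path)
        if picked == k:
            return list(path)  # first goal popped is a cheapest path
        if g > best_g.get((last, picked), float("inf")):
            continue  # stale entry, a cheaper route to this state exists
        for j in range(last + 1, n):
            g2 = g + cost(last, j)
            state = (j, picked + 1)
            if g2 < best_g.get(state, float("inf")):
                best_g[state] = g2
                heapq.heappush(
                    frontier, (g2 + h(picked + 1), next(tie), g2, path + (j,))
                )
    return []  # unreachable for valid k, kept for completeness


if __name__ == "__main__":
    demo = [
        "the cat sat on the mat",
        "stock prices fell sharply today",
        "the cat sat on the sofa",
        "rain is expected tomorrow",
    ]
    print(astar_select(demo, 2))
```

A real system would replace `jaccard` with a similarity suited to Arabic (for example after stemming) and would run the search within the subtopics identified earlier in the pipeline; both steps are outside this sketch.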