Extractive single-document summarization based on genetic operators and guided local search
Affiliation: 1. Information Technology Research Group (GTI), Universidad del Cauca, Sector Tulcán Office 450, Popayán, Colombia; 2. Computer Science Department, Electronic and Telecommunications Engineering Faculty, Universidad del Cauca, Colombia; 3. Data Mining Research Group (MIDAS), Engineering Faculty, Universidad Nacional de Colombia, Bogotá, Colombia
Abstract: Because the volume of textual information available on the Web is growing exponentially, end users need to access information in summary form without losing the most important content of the original document. Automatic extractive summarization of a single document has traditionally been treated as the task of extracting the most relevant sentences from the original document. Existing methods generally assign a score to each sentence based on certain features, such as the position of the sentence in the document, its similarity to the title, its length, and the frequency of its terms, and then select the highest-scoring sentences. However, automatically generated summaries still do not match the quality of those written by humans, so new methods continue to be proposed to improve on the results. This paper addresses the generation of extractive summaries from a single document as a binary optimization problem in which the quality (fitness) of a solution is based on a weighting of individual statistical features of each sentence, such as position, sentence length, and the relationship of the summary to the title, combined with group features: the similarity between the candidate sentences in the summary and the original document, and the similarity among the candidate sentences themselves. The paper proposes a method for extractive single-document summarization based on genetic operators and guided local search, called MA-SingleDocSum. A memetic algorithm is used to integrate the population-based search of evolutionary algorithms with a guided local search strategy. The proposed method was compared with the state-of-the-art methods UnifiedRank, DE, FEOM, NetSum, CRF, QCS, SVM, and Manifold Ranking, using ROUGE measures on the DUC2001 and DUC2002 datasets. The results showed that MA-SingleDocSum outperforms these state-of-the-art methods.
Keywords: Extractive summarization; Single document; Memetic algorithm; Guided local search
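
The binary formulation described in the abstract can be illustrated with a small sketch. This is not the authors' MA-SingleDocSum implementation: the feature set is reduced to position, title similarity, and document coverage; the weights (0.4, 0.4, 0.2) are arbitrary; and a plain bit-flip hill climb stands in for the guided local search. All function names and parameters below are hypothetical, and the code assumes a document of at least two sentences whose first sentence fits within the word limit.

```python
import random
from collections import Counter
from math import sqrt

def tokens(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    return [w.strip(".,;:!?").lower() for w in text.split() if w.strip(".,;:!?")]

def cosine(a, b):
    """Cosine similarity between two bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def fitness(bits, sents, title, max_words, weights=(0.4, 0.4, 0.2)):
    """Weighted score of a candidate summary; bits[i] == 1 selects sentence i.
    The weights are illustrative assumptions, not values from the paper."""
    chosen = [i for i, b in enumerate(bits) if b]
    words = [w for i in chosen for w in sents[i]]
    if not chosen or len(words) > max_words:
        return 0.0  # empty or over-length summaries are treated as infeasible
    n = len(sents)
    position = sum(1 - i / n for i in chosen) / len(chosen)  # earlier is better
    title_sim = cosine(words, title)
    coverage = cosine(words, [w for s in sents for w in s])
    w_pos, w_title, w_cov = weights
    return w_pos * position + w_title * title_sim + w_cov * coverage

def local_search(bits, score):
    """Simplified stand-in for guided local search: one pass of bit flips,
    keeping each flip that improves the score."""
    best, best_fit = bits[:], score(bits)
    for i in range(len(bits)):
        cand = best[:]
        cand[i] ^= 1
        f = score(cand)
        if f > best_fit:
            best, best_fit = cand, f
    return best

def summarize(sentences, title, max_words=30, pop_size=12, generations=30, seed=0):
    """Memetic-style loop: crossover and mutation, then local refinement."""
    rng = random.Random(seed)
    sents = [tokens(s) for s in sentences]
    title_tokens = tokens(title)
    n = len(sents)

    def score(bits):
        return fitness(bits, sents, title_tokens, max_words)

    pop = [[int(rng.random() < 0.3) for _ in range(n)] for _ in range(pop_size)]
    pop[0] = [1] + [0] * (n - 1)  # lead-sentence baseline as a feasible start
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        p1, p2 = rng.sample(pop[: pop_size // 2], 2)  # parents from the better half
        cut = rng.randrange(1, n)                     # one-point crossover
        child = p1[:cut] + p2[cut:]
        child[rng.randrange(n)] ^= 1                  # single-bit mutation
        child = local_search(child, score)
        if score(child) > score(pop[-1]):             # replace the worst individual
            pop[-1] = child
    best = max(pop, key=score)
    return [sentences[i] for i, b in enumerate(best) if b]
```

Calling `summarize(list_of_sentences, title, max_words=20)` returns the selected sentences in document order. Because the worst individual is replaced only by a strictly better child, the best fitness in the population never decreases, so the result is at least as good as the lead-sentence baseline under this toy fitness function.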
This article is indexed in ScienceDirect and other databases.