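Multiple Document Summarization Method Based on Optimized Cuckoo Search Algorithm (基于布谷鸟搜索优化算法的多文档摘要方法)
Cite this article: ZHOU Shiyuan, WANG Yinglin. Multiple Document Summarization Method Based on Optimized Cuckoo Search Algorithm[J]. Computer Engineering (计算机工程), 2020, 46(7): 58-64, 71.
Authors: ZHOU Shiyuan (周诗源), WANG Yinglin (王英林)
Affiliation: School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433, China (both authors)
Abstract: To maximize the information content of generated summaries, this paper proposes a multi-document summarization method based on the Cuckoo Search (CS) algorithm and a multi-objective function. The multi-document data is first preprocessed: sentence segmentation, tokenization, stop-word removal, and stemming reduce the documents to basic word forms. An informativeness score is then computed for each preprocessed sentence and used as the input to the CS algorithm, and, based on the multi-objective function, the sentences that carry the important information of the original documents are generated to compose the final summary. Experimental results show that, compared with multi-document summarization methods based on Particle Swarm Optimization and on the double-layer K-nearest-neighbor algorithm, the proposed method maintains high readability and low redundancy while maximizing the information content of the summary, and its average accuracy on the DUC benchmark dataset reaches 0.99.

Keywords: multi-document summarization; Cuckoo Search algorithm; data preprocessing; multi-objective function; information content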
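
The preprocessing chain named in the abstract (sentence segmentation, tokenization, stop-word removal, stemming) followed by a per-sentence informativeness score can be sketched roughly as below. This is a minimal illustration and not the authors' implementation: the stop-word list, the `crude_stem` suffix stripper (a stand-in for a real stemmer such as Porter's), and the frequency-based score are all assumptions, since the abstract does not give the actual scoring formula.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "and", "is", "are", "in", "on", "to", "it"}

def split_sentences(text):
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic runs only."""
    return re.findall(r"[a-z]+", sentence.lower())

def crude_stem(word):
    """Toy suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Return (original sentence, stemmed content words) pairs."""
    result = []
    for sentence in split_sentences(text):
        terms = [crude_stem(t) for t in tokenize(sentence) if t not in STOP_WORDS]
        result.append((sentence, terms))
    return result

def informativeness_scores(processed):
    """Assumed score: mean corpus-wide frequency of a sentence's terms."""
    freq = Counter(t for _, terms in processed for t in terms)
    scores = []
    for _, terms in processed:
        scores.append(sum(freq[t] for t in terms) / len(terms) if terms else 0.0)
    return scores
```

Calling `informativeness_scores(preprocess(text))` on the concatenated input documents yields one score per sentence, which is the kind of input the abstract says is passed to the CS algorithm.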

Multiple Document Summarization Method Based on Optimized Cuckoo Search Algorithm
ZHOU Shiyuan, WANG Yinglin. Multiple Document Summarization Method Based on Optimized Cuckoo Search Algorithm[J]. Computer Engineering, 2020, 46(7): 58-64, 71.
Authors: ZHOU Shiyuan  WANG Yinglin
Affiliation: School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433, China
Abstract: To maximize the amount of information in the generated summary, this paper proposes a multi-document summarization method based on the Cuckoo Search (CS) algorithm and a multi-objective function. The method preprocesses the multi-document data using sentence segmentation, word segmentation, stop-word removal, and stemming to reduce the documents to a basic processed form of words. The information score of each preprocessed sentence is then calculated and used as the input of the CS algorithm. Based on the multi-objective function, the sentences containing the key information of the original texts are generated to form the final summary. Results show that, compared with multi-document summarization methods based on the Particle Swarm Optimization (PSO) algorithm and the Double-layer K Nearest Neighbor (DKNN) algorithm, the proposed method maximizes the amount of information in the generated summary while keeping high readability and low redundancy. Its average accuracy on the DUC benchmark dataset reaches 0.99.
Keywords: multiple document summarization; Cuckoo Search (CS) algorithm; data preprocessing; multiple objective function; amount of information
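
To make the optimization step more concrete, the following is a rough sketch of how a cuckoo-search-style procedure could pick sentences under a combined objective. It is not the paper's algorithm: the abstract does not specify the objective terms or the solution encoding, so the binary mask encoding, the `fitness` function (total sentence score minus pairwise Jaccard redundancy), the Pareto-distributed bit flips used in place of a true Lévy flight, and all parameter names and defaults are assumptions made for illustration.

```python
import random

def jaccard(a, b):
    """Term-set overlap between two sentences, in [0, 1]."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def fitness(mask, scores, terms, max_sentences, redundancy_weight=1.0):
    """Assumed objective: total score minus weighted pairwise redundancy."""
    chosen = [i for i, bit in enumerate(mask) if bit]
    if not chosen or len(chosen) > max_sentences:
        return float("-inf")  # empty or over-length summaries are infeasible
    coverage = sum(scores[i] for i in chosen)
    redundancy = sum(jaccard(terms[i], terms[j])
                     for k, i in enumerate(chosen) for j in chosen[k + 1:])
    return coverage - redundancy_weight * redundancy

def random_mask(n, max_sentences, rng):
    """Random feasible solution: between 1 and max_sentences sentences."""
    picked = set(rng.sample(range(n), rng.randint(1, min(max_sentences, n))))
    return [1 if i in picked else 0 for i in range(n)]

def heavy_tailed_flip(mask, rng):
    """Flip a Pareto-distributed number of bits (stand-in for a Levy flight)."""
    n_flips = min(len(mask), int(rng.paretovariate(1.5)))
    new = mask[:]
    for i in rng.sample(range(len(mask)), n_flips):
        new[i] = 1 - new[i]
    return new

def cuckoo_search(scores, terms, max_sentences=2, n_nests=10,
                  abandon_frac=0.25, iterations=200, seed=0):
    """Return the indices of the sentences in the best summary found."""
    rng = random.Random(seed)
    n = len(scores)

    def fit(mask):
        return fitness(mask, scores, terms, max_sentences)

    nests = [random_mask(n, max_sentences, rng) for _ in range(n_nests)]
    for _ in range(iterations):
        # A cuckoo derives a new solution from a random nest and tries to
        # replace another randomly chosen nest if the new solution is better.
        candidate = heavy_tailed_flip(rng.choice(nests), rng)
        target = rng.randrange(n_nests)
        if fit(candidate) > fit(nests[target]):
            nests[target] = candidate
        # Abandon the worst fraction of nests and rebuild them at random.
        nests.sort(key=fit, reverse=True)
        n_abandon = int(abandon_frac * n_nests)
        for k in range(n_nests - n_abandon, n_nests):
            nests[k] = random_mask(n, max_sentences, rng)
    best = max(nests, key=fit)
    return [i for i, bit in enumerate(best) if bit]
```

Here `scores` is a list of per-sentence information scores and `terms` the matching lists of preprocessed words. Because only the worst nests are abandoned, the best solution found so far always survives an iteration; the redundancy penalty is what keeps two near-duplicate high-scoring sentences from both being selected.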
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).