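Automatic Summarization of Web Pages Based on Topic Segmentation
Cite this article: CHEN Zhi-min, SHEN Jie, LIN Ying, ZHOU Feng. Automatic summarization of Web pages based on topic segmentation[J]. Journal of Computer Applications, 2006, 26(3): 641-644.
Authors: CHEN Zhi-min  SHEN Jie  LIN Ying  ZHOU Feng
Affiliation: College of Information Engineering, Yangzhou University, Yangzhou, Jiangsu 225009, China
Funding: Natural Science Foundation of the Higher Education Institutions of Jiangsu Province
Abstract: An automatic summarization method guided by Web page structure is proposed. When the page source is parsed, the structural information of the document is used to build a DOM tree, on which the document is divided into topics. The value of Web page tags for topic-word extraction and for computing sentence importance is also fully exploited. Finally, taking the topic block as the unit, sentence weights are adjusted according to inter-sentence similarity and the summary is generated dynamically. Experimental results show that the method effectively resolves the unbalanced distribution of summary content across the document and reduces redundancy in the summary.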
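
The final step described in the abstract (working one topic block at a time and adjusting sentence weights by inter-sentence similarity) can be illustrated with a minimal Python sketch. The record does not give the paper's weighting formula or similarity measure, so the Jaccard word-overlap similarity, the multiplicative penalty, and the names `summarize_block` and `summarize` are all assumptions made for illustration. Topic blocks and initial sentence weights are taken as given inputs.

```python
import re


def tokens(sentence):
    """Lowercased word set; a stand-in for real word segmentation."""
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def summarize_block(sentences, weights, k):
    """Greedily pick up to k sentences from one topic block.

    Assumed penalty rule (not taken from the paper): after each pick, every
    remaining sentence's weight is scaled by (1 - similarity to the picked
    sentence), so near-duplicates of chosen sentences sink in the ranking.
    """
    toks = [tokens(s) for s in sentences]
    current = list(weights)
    remaining = set(range(len(sentences)))
    chosen = []
    while remaining and len(chosen) < k:
        # Highest current weight wins; ties go to the earlier sentence.
        best = max(remaining, key=lambda i: (current[i], -i))
        chosen.append(best)
        remaining.remove(best)
        for i in remaining:
            current[i] *= 1.0 - jaccard(toks[i], toks[best])
    # Return picked sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]


def summarize(blocks, per_block=1):
    """Summarize each topic block separately so every topic is represented."""
    return [summarize_block(sents, w, per_block) for sents, w in blocks]


if __name__ == "__main__":
    sentences = [
        "DOM trees expose page structure.",
        "DOM trees expose the page structure.",
        "Tags hint at topic words.",
    ]
    weights = [0.9, 0.8, 0.5]
    print(summarize_block(sentences, weights, k=2))
```

In the sample data the second sentence starts with a higher weight than the third, but it is nearly identical to the first pick, so its weight is cut sharply and the third sentence is chosen instead. Summarizing per block rather than over the whole page is what keeps a single dominant topic from crowding out the others.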

Keywords: Web information retrieval  Document Object Model  topic segmentation  sentence importance
Article ID: 1001-9081(2006)03-0641-04
Received: 2005-09-18
Revised: 2005-12-02

Automatic summarization of Web document based on topic segmentation
CHEN Zhi-min, SHEN Jie, LIN Ying, ZHOU Feng. Automatic summarization of Web document based on topic segmentation[J]. Journal of Computer Applications, 2006, 26(3): 641-644.
Authors: CHEN Zhi-min  SHEN Jie  LIN Ying  ZHOU Feng
Affiliation: College of Information Engineering, Yangzhou University, Yangzhou, Jiangsu 225009, China
Abstract: A method of automatic summarization for Web information retrieval is proposed, based on the structure of the Web document. The document is partitioned into several topic blocks by parsing it into a DOM (Document Object Model) tree and comparing semantic similarity. Tag information is fully used to extract topic words and key sentences. Finally, the summary is generated dynamically by adjusting the weights of the sentences. Experimental results show that the new method can solve the problem of imbalanced summary coverage and effectively reduce redundancy in the content.
Keywords: Web information retrieval  DOM  topic segmentation  sentence significance
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.