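Research on a Deduplication Method for Breaking News Web Pages
Cite this article: Luo Yonglian, Luo Yongxiu, Zhang Yongkui. Research on a Deduplication Method for Breaking News Web Pages [J]. Computer Applications and Software, 2008, 25(8).
Authors: Luo Yonglian, Luo Yongxiu, Zhang Yongkui
Affiliations: 1. Department of Computer Science, Jinzhong University, Jinzhong, Shanxi 030600
2. School of Computer and Information Technology, Shanxi University, Taiyuan, Shanxi 030006
Funding: National Natural Science Foundation of China; Shanxi Provincial Higher Education Research and Development Fund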
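Abstract: As public attention to breaking news grows, such news needs to be classified, indexed, and processed effectively. Drawing on traditional text-processing techniques and combining web page structural features with domain-specific text features, this paper proposes extracting the main topical content of each page and then removing duplicate pages according to the repetition patterns characteristic of breaking events. Experimental results show that the method effectively improves the accuracy of web page deduplication.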

Keywords: breaking news; weight calculation; web page deduplication

ON DELETION OF DUPLICATED BREAKING NEWS' WEBPAGES
Luo Yonglian, Luo Yongxiu, Zhang Yongkui. ON DELETION OF DUPLICATED BREAKING NEWS' WEBPAGES [J]. Computer Applications and Software, 2008, 25(8).
Authors: Luo Yonglian, Luo Yongxiu, Zhang Yongkui
Affiliation: Luo Yonglian (1), Luo Yongxiu (2), Zhang Yongkui (3)
1. School of Computer Science & Technology, Jinzhong University, Jinzhong 030600, Shanxi, China
2. School of Jinhua, China
3. School of Computer & Information Technology, Shanxi University, Taiyuan 030006, China
Abstract: As public attention to breaking news grows, such news needs to be effectively classified, indexed, and processed. Drawing on traditional text-processing techniques and combining the characteristics of webpage structure with those of domain-specific text, this paper proposes extracting the subject content of each page and then deleting duplicated webpages according to the repetition patterns peculiar to sudden events. It is shown by the experimental result that t...
Keywords: Breaking news; Weight calculation; Duplicated webpage deletion
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.