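A Customized Focusing Crawler
Citation: Zou Hailiang, Sun Li. A Customized Focusing Crawler [J]. Electronic Science and Technology, 2009, 22(1).
Authors: Zou Hailiang, Sun Li
Affiliation: School of Computer Science and Technology, Donghua University, Shanghai 200051
Abstract: The explosive growth of online information and users' increasingly individual needs have made topic-specific search engines more and more popular. A focused web crawler is a key component of a topic-specific search engine: it downloads documents on a given topic from the Web. A customizable focused web crawler is a topical crawler whose topic can be selected and customized. This paper presents a more effective crawling algorithm that is efficient (resources with high topical relevance are downloaded first), uses few resources (the URL queue is kept short), and ports easily to new topics (the topic is customizable).

Keywords: information collection; search engine; web crawler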

A Customized Focusing Crawler
Zou Hailiang, Sun Li. A Customized Focusing Crawler [J]. Electronic Science and Technology, 2009, 22(1).
Authors: Zou Hailiang, Sun Li
Affiliation: Department of Computer Science and Technology, Donghua University, Shanghai 200051, China
Abstract: As the Internet grows explosively and user requirements become increasingly individualized, search engines dedicated to a specific topic are warmly received. A focusing crawler, an important part of such a search engine, fetches only the web pages on a specific topic. The customized focusing web crawler is a topic crawler whose topic is selectable and customizable. This paper designs a crawler with new algorithms, which possesses the advantages of high productivity (resources with better quality have h...
Keywords: information collection; search engine; web crawler
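
The abstract names two properties of the crawler: the most topic-relevant resources are downloaded first, and the URL queue is kept short. The paper's actual algorithm is not reproduced in this record, so the following is only a minimal sketch of one common way to get both properties, a size-capped, relevance-ordered URL frontier. The names `relevance` and `BoundedFrontier`, the keyword-overlap scorer, and the `max_size` parameter are all assumptions made for illustration, not the authors' design.

```python
import bisect
import re
from itertools import count


def relevance(text, topic_terms):
    """Hypothetical scorer: fraction of topic terms that occur in the text."""
    words = set(re.findall(r"\w+", text.lower()))
    terms = {t.lower() for t in topic_terms}
    return len(words & terms) / len(terms) if terms else 0.0


class BoundedFrontier:
    """URL frontier that serves the highest-scoring URL first and caps its length."""

    def __init__(self, max_size):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self.max_size = max_size
        self._items = []          # kept sorted ascending by (score, -arrival)
        self._seen = set()        # URLs already offered, to avoid re-queuing
        self._arrival = count()   # tie-breaker: earlier URLs win among equal scores

    def push(self, url, score):
        if url in self._seen:
            return
        self._seen.add(url)
        bisect.insort(self._items, (score, -next(self._arrival), url))
        if len(self._items) > self.max_size:
            del self._items[0]    # evict the lowest-scoring entry to bound memory

    def pop(self):
        """Return the URL with the highest relevance score."""
        return self._items.pop()[2]

    def __len__(self):
        return len(self._items)
```

In this sketch the topic is simply the set of terms passed to `relevance`, so changing the topic means changing that set. A real focused crawler would likely score links with a stronger model (for example, anchor text plus the relevance of the linking page) and would also need fetching, parsing, and politeness logic that is omitted here.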
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.