
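Research on Search Strategies of Focused Crawlers
Cite this article: LIU Han-xing, LIU Cai-xing. Research on search strategies of focused crawlers [J]. Computer Engineering and Design, 2008, 29(12).
Authors: LIU Han-xing, LIU Cai-xing
Affiliation: College of Informatics, South China Agricultural University, Guangzhou, Guangdong 510642
Funding: National High Technology Research and Development Program of China (863 Program)
Abstract: When a focused crawler collects topic-related information, it must evaluate the topical relevance of web pages and crawl the more relevant pages first; this choice determines both the search path and the search efficiency of the crawler. Based on the different page-evaluation algorithms in use, the search strategies of existing focused crawlers are classified, the characteristics, advantages, and disadvantages of each class are pointed out, and several aspects that can improve the search efficiency of focused crawlers are summarized.

Keywords: focused crawler; search strategy; page evaluation; search engine; optimization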

Survey on searching strategies of focused crawler
LIU Han-xing, LIU Cai-xing. Survey on searching strategies of focused crawler [J]. Computer Engineering and Design, 2008, 29(12).
Authors: LIU Han-xing, LIU Cai-xing
Affiliation: College of Informatics, South China Agricultural University, Guangzhou 510642, China
Abstract: When a focused crawler collects topic-related information, it must evaluate the relevance of web pages and process the more relevant pages first, which determines both the search path and the efficiency of the crawler. The searching strategies of existing focused crawlers are categorized by the way they evaluate web pages. The characteristics of each class of strategy are described, their advantages and disadvantages are discussed, and several ways of improving the efficiency of focused crawlers are summarized.
Keywords:focused crawler  searching strategy  page evaluating  search engine  optimization  
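
The abstract's central idea, scoring pages for topical relevance and crawling the most promising ones first, can be illustrated with a small best-first frontier. This is only a sketch under stated assumptions and is not taken from the paper: the keyword-ratio `relevance` function, the rule that an unvisited link inherits the score of the page it was found on, and the in-memory `pages` and `links` data are all hypothetical stand-ins for a real page evaluator and real fetching.

```python
import heapq

def relevance(text, topic_terms):
    """Toy page evaluator (assumption): share of words that are topic terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    return sum(w in topic_terms for w in words) / len(words)

def best_first_crawl(pages, links, seed, topic_terms, limit):
    """Visit up to `limit` pages, always taking the highest-priority URL next.

    A link's priority is the relevance of the page it was found on
    (an assumption; other strategies score anchor text or link structure).
    """
    frontier = [(-1.0, seed)]   # heapq is a min-heap, so priorities are negated
    seen = {seed}
    visited = []
    while frontier and len(visited) < limit:
        _, url = heapq.heappop(frontier)
        text = pages.get(url)
        if text is None:        # unknown page: nothing to "fetch"
            continue
        score = relevance(text, topic_terms)
        visited.append((url, score))
        for nxt in links.get(url, []):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (-score, nxt))
    return visited

# Hypothetical in-memory "web" standing in for fetched pages and their out-links.
pages = {
    "seed": "crawler search",
    "a": "focused crawler search strategy",
    "b": "cooking recipes and travel",
    "a1": "crawler page relevance",
    "b1": "holiday photos",
}
links = {"seed": ["a", "b"], "a": ["a1"], "b": ["b1"]}
topic = {"crawler", "search", "strategy", "relevance"}

crawl_order = [url for url, _ in best_first_crawl(pages, links, "seed", topic, 4)]
print(crawl_order)
```

With a budget of four pages, the link found on the relevant page `a` is visited before the link found on the off-topic page `b`, which is how the evaluation method shapes the search path in this sketch.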
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.