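Algorithm research and design of a forestry focused web crawler (林业主题爬虫的算法研究与设计)
Cite this article: YUAN Jin-sheng, GUO Yan-fen. Algorithm research and design of forestry focused web crawler [J]. Computer Engineering and Design (计算机工程与设计), 2011, 32(6): 2003-2006.
Authors: YUAN Jin-sheng (袁津生), GUO Yan-fen (郭艳芬)
Affiliation: College of Information, Beijing Forestry University, Beijing 100083
Abstract: General-purpose search engines currently offer low coverage and low precision for forestry-related information. To address this shortcoming, a design for a forestry focused crawler based on the Shark-Search algorithm is proposed. The crawler's crawling strategy, algorithm description, and implementation are discussed in detail, and a forestry search engine, "搜林" (Search Forestry), was built in practice. Experimental results show that, compared with general-purpose search engines, "搜林" reduces the volume of returned results and improves the accuracy of searches for forestry information.

Keywords: forestry; focused crawler; search engine; Shark-Search algorithm; relevance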

Algorithm research and design of forestry focused web crawler
YUAN Jin-sheng, GUO Yan-fen. Algorithm research and design of forestry focused web crawler [J]. Computer Engineering and Design, 2011, 32(6): 2003-2006.
Authors: YUAN Jin-sheng, GUO Yan-fen
Affiliation: College of Information, Beijing Forestry University, Beijing 100083, China
Abstract: When people search for forestry information, general search engines often return too many irrelevant results. To address this, a forestry focused web crawler based on Shark-Search is proposed. Its crawling strategy, algorithm, and implementation are discussed, and a forestry domain-specific search engine, Search Forestry, is constructed. The experimental results show that, compared with general search engines, Search Forestry reduces redundant information and greatly improves accuracy.
Keywords: forestry; focused web crawler; search engine; Shark-Search algorithm; relevance
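
The abstract names Shark-Search but does not reproduce its scoring rules. As a rough illustration only, the sketch below follows the commonly described form of Shark-Search link scoring: a link's priority combines a score inherited from the parent page with a score from its anchor text and surrounding context. It is not the authors' implementation. The function names, the decay and weight values (delta, beta, gamma), and the bag-of-words cosine similarity are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (assumes space-delimited text;
    Chinese pages would need a word segmenter instead)."""
    return re.findall(r"\w+", text.lower())


def cosine_sim(a, b):
    """Cosine similarity between two texts using raw term counts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def inherited_score(topic, parent_text, parent_inherited, delta=0.5):
    """A relevant parent passes on its own decayed relevance;
    an irrelevant parent passes on its decayed inherited score."""
    parent_rel = cosine_sim(topic, parent_text)
    if parent_rel > 0:
        return delta * parent_rel
    return delta * parent_inherited


def neighborhood_score(topic, anchor_text, anchor_context, beta=0.8):
    """Mix anchor-text relevance with the relevance of the text around it.
    If the anchor itself is relevant, the context counts as fully relevant."""
    anchor = cosine_sim(topic, anchor_text)
    context = 1.0 if anchor > 0 else cosine_sim(topic, anchor_context)
    return beta * anchor + (1 - beta) * context


def potential_score(topic, parent_text, parent_inherited,
                    anchor_text, anchor_context,
                    delta=0.5, beta=0.8, gamma=0.8):
    """Priority of an unvisited link: weighted sum of both parts."""
    inherited = inherited_score(topic, parent_text, parent_inherited, delta)
    neighborhood = neighborhood_score(topic, anchor_text, anchor_context, beta)
    return gamma * inherited + (1 - gamma) * neighborhood
```

In a crawler built along these lines, each extracted link would be pushed onto a priority queue keyed by `potential_score`, so that links found on or near forestry-related text are fetched first. The topic description (here a plain keyword string) stands in for whatever forestry vocabulary the real system uses.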
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.