Key Technologies of Topic-Focused Crawlers (主题爬虫的关键技术) |
| |
Cite this article: | ZHAO Qiang. 主题爬虫的关键技术 (Key Technologies of Topic-Focused Crawlers) [J]. 电脑与微电子技术, 2014(2): 19-22. |
| |
Author: | ZHAO Qiang (赵强) |
| |
Affiliation: | College of Computer Science, Sichuan University, Chengdu 610065 |
| |
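Abstract: | With the rapid growth of the Internet, more and more users issue queries tied to a specific topic or domain, a need that traditional general-purpose search engines can no longer satisfy. To overcome this shortcoming, researchers have proposed topic-focused crawlers. This paper first defines the topic-focused web crawler, then presents its three key technologies: the crawl target, the web page search strategy, and the web page topic-relevance algorithm. It closes with some directions for future research on topic-focused crawlers.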
|
Keywords: | 搜索引擎 (search engine); 主题爬虫 (topic-focused crawler); 网页分析 (web page analysis); 搜索策略 (search strategy) |
Topic-Focused Crawling Technology |
| |
Authors: | ZHAO Qiang |
| |
Abstract: | With the rapid development of the Internet, research on topic-focused crawling has emerged to meet users' new demands. This paper first introduces the basic concepts of topic-focused crawling, then lists its key technologies, such as the search strategy and the web page analysis algorithm, and finally indicates some directions for future research on topic-focused crawling. |
| |
Keywords: | Search Engine; Topic-Focused Crawler; Web Page Analysis; Search Strategy |
This article is indexed in 维普 (VIP) and other databases. |
|