Research Progress in Web Information Gathering

Cite this article:	LI Sheng-Tao, YU Zhi-Hua, et al. Research Progress in Web Information Gathering [J]. Computer Science, 2003, 30(2): 151-157.

Authors:	LI Sheng-Tao, YU Zhi-Hua

Affiliation:	Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080

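Abstract:	1. Introduction. With the rapid development of the Internet and intranets, the network is profoundly changing our lives. The WWW (World Wide Web), the fastest-growing technology on the network, has gradually become the most important means of publishing and transmitting information on the Internet, thanks to its intuitive, convenient mode of use and its rich expressive power. However, the rapid expansion of Web information, while giving people abundant resources, also poses a major challenge to using those resources effectively. In response, retrieval services centered on Web search engines have been developed, and, along with …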
Keywords:	Web; information gathering; information publishing; Internet; Intranet; computer networks
A Survey on Web Crawling

LI Sheng-Tao, YU Zhi-Hua, CHENG Xue-Qi, BAI Shuo. A Survey on Web Crawling [J]. Computer Science, 2003, 30(2): 151-157.

Authors:	LI Sheng-Tao, YU Zhi-Hua, CHENG Xue-Qi, BAI Shuo

Abstract:	As a basic component of search engines and of a range of other Web services, the Web crawler plays an important role. Roughly speaking, a Web crawler is a program that automatically traverses the Web by downloading documents and following links from page to page. This article explains in detail the principles and difficulties of Web crawling, comprehensively discusses several active research directions, and finally looks ahead to new directions for Web crawlers.

Keywords:	Web crawling; Web gathering; Search engine; WWW; Agent
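
The abstract's definition of a crawler (download a document, then follow its links from page to page) can be illustrated with a minimal sketch. This is not taken from the survey itself: the names `crawl`, `fetch`, and `LinkExtractor`, the breadth-first visiting order, and the `example.test` pages are all assumptions made for illustration. The downloader is passed in as a function, so the example runs against a small in-memory set of pages rather than the live Web.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first traversal starting from `seed`.

    `fetch` is a caller-supplied function (hypothetical here) that maps a
    URL to its HTML text, or returns None if the page cannot be downloaded.
    Returns the URLs that were downloaded, in visit order.
    """
    frontier = deque([seed])   # URLs waiting to be downloaded
    seen = {seed}              # URLs already queued, to avoid revisits
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return visited


# Toy in-memory "Web" (assumed data, for illustration only).
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> <a href="/c#top">C</a>',
    "http://example.test/b": '<a href="/">home</a>',
    "http://example.test/c": "no links here",
}

order = crawl("http://example.test/", PAGES.get)
print(order)
```

A real crawler would replace `PAGES.get` with an HTTP client and would also need politeness delays, robots.txt handling, and error recovery, none of which this sketch attempts.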
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.