Study on a Deep Web Crawler in Security Validation Mode

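Cite this article:	Xu Hexiang, Zhang Yongzhong, Hu Yunfa. Study on a Deep Web Crawler in Security Validation Mode [J]. Computer Applications and Software, 2010, 27(5): 9-11, 26.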

Authors:	Xu Hexiang, Zhang Yongzhong, Hu Yunfa

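Affiliations:	1. Shanghai Distance Education Group, Shanghai 200433; 2. Department of Computer and Information Technology, Fudan University, Shanghai 200433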

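Funding:	Major Program of the National Natural Science Foundation of China (60736016); Ministry of Education project under the National Education Science "Eleventh Five-Year" Plan (FCB070468); Scientific Research Innovation Project of the Shanghai Municipal Education Commission (09YZ462)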

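Abstract:	The Deep Web holds roughly 400 to 500 times as much information as the Surface Web, and this information is invisible to traditional search engines. Research on Deep Web crawlers, an important step toward making Deep Web information available to search engines, is still at an early stage. Existing crawler research has concentrated mainly on the Surface Web, with little work on Deep Web crawlers. This paper analyzes the access patterns of the Deep Web and, on that basis, proposes a crawler algorithm for the Deep Web under a security validation mode. Experiments show that the algorithm can effectively crawl Deep Web information under a specific security validation mode.
Keywords:	Deep Web; security mode; crawler; information extraction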
STUDY ON A DEEP WEB CRAWLER IN SECURITY VALIDATION MODE

Xu Hexiang, Zhang Yongzhong, Hu Yunfa. STUDY ON A DEEP WEB CRAWLER IN SECURITY VALIDATION MODE [J]. Computer Applications and Software, 2010, 27(5): 9-11, 26.

Authors:	Xu Hexiang, Zhang Yongzhong, Hu Yunfa

Affiliation:	Shanghai Distance Education Group, Shanghai 200433, China; Department of Computer and Information Technology, Fudan University, Shanghai 200433, China

Abstract:	Deep Web content is about 400 to 500 times larger than that of the Surface Web, yet it remains invisible to traditional search engines. The study of Deep Web crawlers, an important step for search engines in obtaining Deep Web information, is still at an early stage. To date, crawler research has focused mainly on Surface Web documents, with little work on the Deep Web. In this paper, we first analyze the access patterns of the Deep Web, and then present a novel crawler algorithm for D...

Keywords:	Deep Web; security mode; crawler; information extraction
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
