Design and Implementation of a Civil Aviation-Oriented Hidden-Web Crawler
Cite this article:Zhang Xiaohui, Xu Bin, Chen Guoqiang, Chen Shan. Design and Implementation of a Civil Aviation-Oriented Hidden-Web Crawler[J]. Computer Applications and Software, 2008, 25(7).
Authors:Zhang Xiaohui  Xu Bin  Chen Guoqiang  Chen Shan
Affiliation:College of Computer and Information Engineering, Henan University, Kaifeng 475004, Henan, China
Abstract:This paper analyzes the shortcomings of current search engine technology in acquiring civil aviation-oriented Hidden-Web content and, drawing on that analysis, designs and implements a civil aviation-oriented Hidden-Web crawler. The crawler uses topic classification and related techniques to discover and fetch the front-end Forms that lead to civil aviation Hidden-Web content, and builds a corresponding Form library. It then fills in the Forms in that library using heuristic rules and collects the set of pages that contain matching results. Experiments show that the crawler performs satisfactorily and that it offers a useful reference for applied research on other Hidden-Web domains.

Keywords:Hidden-Web  Form  Civil aviation  Crawler
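
The pipeline summarized in the abstract (find topic-relevant Forms, store them in a Form library, fill them with heuristic rules) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not describe their classifier or rules, so the keyword list `TOPIC_TERMS`, the rule table `FILL_RULES`, the sample values, and all function names below are hypothetical, and a simple keyword match stands in for the paper's topic classification.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Hypothetical topic vocabulary (assumption; stands in for a real classifier).
TOPIC_TERMS = {"flight", "airline", "airport", "departure", "arrival"}

# Hypothetical heuristic rules: (hint found in a field name, value to submit).
FILL_RULES = [
    ("origin", "PEK"),
    ("depart", "PEK"),
    ("dest", "SHA"),
    ("arriv", "SHA"),
    ("date", "2008-07-01"),
]


class FormExtractor(HTMLParser):
    """Collects each <form> with its action, named fields, and visible text."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "fields": [],
                "text": [],
            }
        elif self._current is not None and tag in ("input", "select"):
            if attrs.get("name"):
                self._current["fields"].append(attrs["name"])

    def handle_data(self, data):
        if self._current is not None and data.strip():
            self._current["text"].append(data.strip().lower())

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None


def extract_forms(html):
    """Return every form found in an HTML string."""
    parser = FormExtractor()
    parser.feed(html)
    return parser.forms


def is_on_topic(form, min_hits=1):
    """Keyword test: does the form's text or field names mention the topic?"""
    haystack = " ".join(form["text"] + [f.lower() for f in form["fields"]])
    return sum(term in haystack for term in TOPIC_TERMS) >= min_hits


def fill_form(form):
    """Assign a value to each field whose name contains a known hint."""
    values = {}
    for field in form["fields"]:
        for hint, value in FILL_RULES:
            if hint in field.lower():
                values[field] = value
                break
    return values


def build_query(form):
    """Build the GET URL that would submit the filled form."""
    return form["action"] + "?" + urlencode(fill_form(form))


if __name__ == "__main__":
    page = (
        '<form action="/search">Flight search'
        '<input name="origin"><input name="dest"><input name="date"></form>'
        '<form action="/login">Sign in<input name="user"></form>'
    )
    form_library = [f for f in extract_forms(page) if is_on_topic(f)]
    for form in form_library:
        print(build_query(form))
```

In this sketch the login form is discarded by the topic test and only the flight-search form enters the Form library. A real crawler would then fetch each generated URL and keep only the pages that actually contain matching results, a step omitted here.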
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.