Implementation of Harmful Websites Discovery and Identification System based on Web Crawler

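Cite this article:	WANG Qing-guang, HE Li, HAN Wei-hong. Implementation of Harmful Websites Discovery and Identification System based on Web Crawler[J]. Netinfo Security, 2012(8): 140-142.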

Authors:	WANG Qing-guang (王庆广), HE Li (何力), HAN Wei-hong (韩伟红)

Affiliation:	School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073, China

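Funding:	National 863 Program of China [2010AA012505, 2011AA010702, 2012AA01A401, 2012AA01A402]; National Natural Science Foundation of China [60933005]; National Key Technology R&D Program of China [2012BAH38B04]; National 242 Information Security Program [2011A010]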

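Abstract (translated from the Chinese):	The Internet is flooded with pornographic, violent, reactionary, and other harmful information. To proactively discover and identify websites that contain such information, this paper implements a harmful website discovery and identification system based on a web crawler. The system uses web crawling to discover harmful websites proactively and content security filtering to judge each website's legality, then reports the list of harmful websites to the user. It can also judge the legality of a website from a given URL and show the user the reasons for that judgment. The paper proposes an effective solution for rapid discovery, automatic recommendation, and expert confirmation of harmful websites; applying the system is expected to give network users a sound and trustworthy network environment.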
Keywords (translated from the Chinese):	web crawler; content security filtering technology
Implementation of Harmful Websites Discovery and Identification System based on Web Crawler

WANG Qing-guang, HE Li, HAN Wei-hong. Implementation of Harmful Websites Discovery and Identification System based on Web Crawler[J]. Netinfo Security, 2012(8): 140-142.

Authors:	WANG Qing-guang, HE Li, HAN Wei-hong

Affiliation:	School of Computer Science, National University of Defense Technology, Changsha, Hunan 410073, China

Abstract:	The Internet carries a huge amount of pornographic, violent, reactionary, and other harmful information. To proactively discover and identify websites containing such information, we have implemented a harmful website discovery and identification system based on a web crawler. On the one hand, the web crawler proactively discovers harmful websites, the content security filtering technology determines the legality of each website, and the websites most likely to be harmful are then recommended to the user. On the other hand, the identification technology can determine the legality of a given website and show the user the reasons for its judgment. The study provides an efficient method for rapid discovery, automatic recommendation, and expert confirmation of harmful websites. It is intended to give network users a sound and trusted network environment.

Keywords:	spider; content security filtering technology
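
The pipeline the abstract describes (crawl to discover sites, filter their content, and report each flagged site with the reasons) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the crawling strategy or the filtering algorithm, so the breadth-first crawl, the keyword-weight scoring, and every name here (`HARMFUL_TERMS`, `THRESHOLD`, `judge`, `discover`, `fake_fetch`, the `example` URLs) are assumptions made for illustration. Pages are served from an in-memory dictionary so the sketch runs without network access.

```python
from collections import deque

# Hypothetical term weights; a real content filter would use a curated lexicon.
HARMFUL_TERMS = {"casino": 2.0, "violence": 1.5, "gore": 3.0}
THRESHOLD = 3.0  # assumed score at which a site is recommended for expert review


def judge(text):
    """Score page text and return (is_suspect, score, reasons)."""
    words = text.lower().split()
    score = 0.0
    reasons = []
    for term, weight in HARMFUL_TERMS.items():
        hits = words.count(term)
        if hits:
            score += hits * weight
            reasons.append(f"{term} x{hits}")
    return score >= THRESHOLD, score, reasons


def discover(seed, fetch, max_pages=100):
    """Breadth-first crawl from seed; fetch(url) must return (text, links)."""
    queue = deque([seed])
    seen = {seed}
    suspects = []
    visited = 0
    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        text, links = fetch(url)
        flagged, score, reasons = judge(text)
        if flagged:
            # Keep the reasons so a human expert can confirm or reject the verdict.
            suspects.append((url, score, reasons))
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return suspects


# A tiny in-memory "web" standing in for real HTTP fetching.
PAGES = {
    "http://a.example": ("welcome to our news portal",
                         ["http://b.example", "http://c.example"]),
    "http://b.example": ("gore gore violence", ["http://a.example"]),
    "http://c.example": ("casino tips", []),
}


def fake_fetch(url):
    return PAGES.get(url, ("", []))


suspects = discover("http://a.example", fake_fetch)
for url, score, reasons in suspects:
    print(url, score, reasons)
```

Calling `judge` directly on a single page's text corresponds to the second mode in the abstract, where a given URL is checked and the reasons are shown to the user. Returning the reasons alongside the verdict is what lets the final expert-confirmation step review each recommendation.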
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
