
Depth Detection System for Phishing Web Pages Based on Ensemble Learning

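Cite this article:	FENG Qing, LIAN Yi-Feng, ZHANG Ying-Jun. Depth Detection System for Phishing Web Pages Based on Ensemble Learning[J]. Computer Systems & Applications, 2016, 25(10): 47-56.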

Authors:	FENG Qing, LIAN Yi-Feng, ZHANG Ying-Jun

Affiliation:	Trusted Computing and Information Assurance Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China (all three authors)

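Funding:	National High Technology Research and Development Program of China (863 Program) (2015AA016006); National Natural Science Foundation of China (61303248, U1536106); Beijing Natural Science Foundation (4144089); Open Project of the Key Laboratory of Information Network Security, Ministry of Public Security (C15604)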

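Abstract:	Phishing is a form of online fraud that uses phishing web pages to impersonate legitimate ones and steal users' sensitive information for illegal purposes. This paper proposes a depth detection method for phishing web pages based on ensemble learning. The method renders each page to counter common page-camouflage techniques, then extracts URL features, link features, and page-text features from the rendered page. Following the ensemble learning approach, it constructs and trains a different base classifier for each kind of feature, and finally applies a classification ensemble strategy that combines the base classifiers to produce the final result. Detection experiments on phishing pages from PhishTank show that the proposed method achieves good accuracy and recall.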
Keywords:	phishing web pages; ensemble learning; depth detection; feature extraction
Received:	2016/1/12
Revised:	2016/2/29
Depth Detection System for Phishing Web Pages Based on Ensemble Learning

FENG Qing, LIAN Yi-Feng and ZHANG Ying-Jun. Depth Detection System for Phishing Web Pages Based on Ensemble Learning[J]. Computer Systems & Applications, 2016, 25(10): 47-56.

Authors:	FENG Qing, LIAN Yi-Feng and ZHANG Ying-Jun

Affiliation:	Trusted Computing and Information Assurance Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China (all three authors)

Abstract:	Phishing is a kind of online fraud that combines social engineering techniques and sophisticated attack vectors to steal users' sensitive information for illegal purposes. To detect phishing web pages quickly and efficiently, this paper presents a model for depth detection of phishing web pages based on ensemble learning. The model renders each page to deal with common page camouflage and extracts several sensitive features from the rendered page, including URL and domain features, link and reference information, and text content. It then constructs and trains several base learning models on these features using an ensemble learning method, and finally combines the base models through a classification ensemble strategy to generate the final result. Experiments on PhishTank data indicate that the proposed detection model achieves good accuracy and recall.

Keywords:	phishing; detection; ensemble learning; feature extraction
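
The pipeline outlined in the abstract (per-view feature extraction, one base classifier per feature view, then a combining step) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the page is assumed to be already rendered and passed in as a dict, the feature definitions and the `SENSITIVE_WORDS` list are hypothetical, each base classifier is a small logistic regression, and the ensemble strategy is plain soft voting (averaging the three probabilities). The abstract does not specify any of these choices.

```python
import math
from urllib.parse import urlparse

# Hypothetical keyword list; the paper's actual text features are not given.
SENSITIVE_WORDS = {"password", "verify", "account", "login", "confirm"}

def url_features(page):
    """URL view: scaled length, dots in host, '@' present, all-numeric host."""
    url = page["url"]
    host = urlparse(url).hostname or ""
    return [len(url) / 100.0, host.count(".") / 5.0,
            1.0 if "@" in url else 0.0,
            1.0 if host.replace(".", "").isdigit() else 0.0]

def link_features(page):
    """Link view: fraction of external links and of empty ('#') links."""
    host = urlparse(page["url"]).hostname
    links = page["links"]
    if not links:
        return [0.0, 0.0]
    external = sum(1 for l in links if urlparse(l).hostname not in (None, host))
    empty = sum(1 for l in links if not l or l.startswith("#"))
    return [external / len(links), empty / len(links)]

def text_features(page):
    """Text view: share of words that are on the sensitive-word list."""
    words = page["text"].lower().split()
    if not words:
        return [0.0]
    return [sum(1 for w in words if w in SENSITIVE_WORDS) / len(words)]

VIEWS = [url_features, link_features, text_features]

def predict_proba(weights, x):
    """Logistic model; the last entry of `weights` is the bias term."""
    z = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, epochs=500, lr=0.5):
    """Train one base classifier with plain stochastic gradient descent."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = predict_proba(weights, x) - target
            for i, xi in enumerate(x):
                weights[i] -= lr * err * xi
            weights[-1] -= lr * err
    return weights

def train_ensemble(pages, labels):
    """Fit one base classifier per feature view."""
    return [train_logistic([view(p) for p in pages], labels) for view in VIEWS]

def phishing_score(models, page):
    """Soft voting: average the per-view phishing probabilities."""
    probs = [predict_proba(w, view(page)) for w, view in zip(models, VIEWS)]
    return sum(probs) / len(probs)

def is_phishing(models, page, threshold=0.5):
    return phishing_score(models, page) >= threshold

# Toy training data (invented for this sketch; label 1 = phishing).
TRAIN_PAGES = [
    {"url": "http://192.168.10.5/paypal/login.php?user=a@b.com",
     "links": ["http://evil.example/a", "#", "http://evil.example/b"],
     "text": "verify your account password now"},
    {"url": "http://secure-login.bank.example.account-update.example.net/signin",
     "links": ["#", "#", "http://other.example/x"],
     "text": "login to confirm your password"},
    {"url": "https://www.example.org/news",
     "links": ["https://www.example.org/a", "https://www.example.org/b"],
     "text": "latest news and weather"},
    {"url": "https://docs.example.com/guide",
     "links": ["https://docs.example.com/intro", "https://docs.example.com/api"],
     "text": "read the user guide for the library"},
]
TRAIN_LABELS = [1, 1, 0, 0]
MODELS = train_ensemble(TRAIN_PAGES, TRAIN_LABELS)
```

Calling `is_phishing(MODELS, page)` on a rendered-page dict returns the ensemble decision. Keeping one model per view means a page that disguises one signal (for example, a clean-looking URL) can still be flagged by the link or text view; a real system would use far more training data and could replace soft voting with weighted voting or stacking.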
