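Depth Detection System for Phishing Web Pages Based on Ensemble Learning
Cite this article: FENG Qing, LIAN Yi-Feng, ZHANG Ying-Jun. Depth Detection System for Phishing Web Pages Based on Ensemble Learning[J]. Computer Systems & Applications, 2016, 25(10): 47-56.
Authors: FENG Qing  LIAN Yi-Feng  ZHANG Ying-Jun
Affiliation (all three authors): Trusted Computing and Information Assurance Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing 100190
Funding: National High Technology Research and Development Program of China (863 Program) (2015AA016006); National Natural Science Foundation of China (61303248, U1536106); Beijing Natural Science Foundation (4144089); Open Project of the Key Laboratory of Information Network Security, Ministry of Public Security (C15604)
Abstract: Phishing is a form of online fraud in which phishing web pages imitate legitimate ones to steal users' sensitive information for illegal purposes. This paper proposes a depth detection method for phishing web pages based on ensemble learning. The method renders each page to counter common page-camouflage techniques and extracts URL features, link features, and page text features from the rendered page. Using ensemble learning, it constructs and trains a different base classifier for each kind of feature information, and finally applies a classification integration strategy that combines the base classifiers to produce the final result. Detection experiments on PhishTank phishing pages show that the proposed method achieves good accuracy and recall.

Keywords: phishing web pages  ensemble learning  depth detection  feature extraction
Received: 2016/1/12
Revised: 2016/2/29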

Depth Detection System for Phishing Web Pages Based on Ensemble Learning
FENG Qing, LIAN Yi-Feng and ZHANG Ying-Jun. Depth Detection System for Phishing Web Pages Based on Ensemble Learning[J]. Computer Systems & Applications, 2016, 25(10): 47-56.
Authors: FENG Qing, LIAN Yi-Feng and ZHANG Ying-Jun
Affiliation (all three authors): Trusted Computing and Information Assurance Laboratory, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China
Abstract: Phishing is a form of online fraud that combines social engineering with sophisticated attack vectors to steal users' sensitive information for illegal purposes. To detect phishing web pages quickly and efficiently, this paper presents an ensemble-learning model for depth detection of phishing web pages. The model renders each page to counter common page-camouflage techniques, then extracts several sensitive features from the rendered page: URL and domain features, link and reference information, and page text content. It constructs and trains several base learning models on these features, and finally combines the base models through a classification integration strategy to produce the final result. Experiments on PhishTank data indicate that the proposed detection model achieves good accuracy and recall.
Keywords: phishing detection  ensemble learning  feature extraction
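
The abstract describes the pipeline only at a high level, so the following is a minimal sketch of two of its steps under stated assumptions, not the authors' implementation. The feature names in `url_features` and the weighted-average rule in `weighted_vote` are hypothetical choices made for illustration; the abstract does not specify the actual URL feature set, the base classifiers, or the integration strategy. Page rendering and the link and text features are omitted.

```python
from typing import Dict, Sequence
from urllib.parse import urlparse


def url_features(url: str) -> Dict[str, float]:
    """Extract a few lexical URL features (hypothetical set, for illustration)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": float(len(url)),
        "num_dots_in_host": float(host.count(".")),
        "has_at_symbol": float("@" in url),
        # Crude check: the host consists only of digits and dots.
        "host_is_ip": float(host.replace(".", "").isdigit()),
        "uses_https": float(parsed.scheme == "https"),
    }


def weighted_vote(
    scores: Sequence[float],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> bool:
    """Combine per-classifier phishing scores in [0, 1] by weighted average.

    This is one possible integration rule (assumed here); it returns True
    when the combined score reaches the threshold.
    """
    if not scores or len(scores) != len(weights):
        raise ValueError("scores and weights must be non-empty and equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = sum(s * w for s, w in zip(scores, weights)) / total_weight
    return combined >= threshold


if __name__ == "__main__":
    print(url_features("http://192.168.0.1/login@x"))
    # Scores as they might come from URL, link, and text base classifiers.
    print(weighted_vote([0.9, 0.2, 0.8], [1.0, 1.0, 1.0]))
```

In this sketch each score would come from a base classifier trained on one feature group; the weights could, for example, reflect each classifier's validation accuracy.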