
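A Bidirectional-LSTM-Based Method for Detecting Typosquatting Domain Abuse
Cite this article:LÜ Pin, LI Quan-gang, LIU Ting-wen, NING Zhen-hu, WANG Yu-bin, SHI Jin-qiao, FANG Bin-xing. A Bidirectional-LSTM-Based Method for Detecting Typosquatting Domain Abuse[J]. Acta Electronica Sinica, 2018, 46(9): 2081-2086. DOI: 10.3969/j.issn.0372-2112.2018.09.006
Authors:LÜ Pin  LI Quan-gang  LIU Ting-wen  NING Zhen-hu  WANG Yu-bin  SHI Jin-qiao  FANG Bin-xing
Affiliation:1. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093; 2. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049; 3. Faculty of Information Technology, Beijing University of Technology, Beijing 100124; 4. Guangdong Institute of Electronic Information Engineering, University of Electronic Science and Technology of China, Dongguan, Guangdong 523808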
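Abstract:Current typosquatting detection relies mainly on computing the edit distance between pairs of domain names. It fails to fully exploit the contextual information in domain names and tends to produce many false positives for short domains. Collecting additional domain-related information for the decision helps improve detection but introduces substantial extra overhead. This paper adopts a lightweight detection strategy based on the domain name string alone and introduces a bidirectional long short-term memory (LSTM) model to make full use of domain name context and improve detection. It also designs a locality-sensitive hash function for domain names to speed up typosquatting detection over large-scale domain sets. Experimental results on a large number of real-world data sets show that this work addresses the shortcomings of edit-distance-based detection and can effectively detect typosquatting abuse.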

Keywords:typosquatting domain  edit distance  bidirectional LSTM  context information  locality-sensitive hashing
Received:2017-06-15
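
As a rough illustration of the edit-distance baseline that the abstract criticizes, the sketch below computes Levenshtein distance and flags every name within a threshold of a target. The function names, the threshold `max_dist`, and the sample labels are assumptions made for illustration and are not taken from the paper.

```python
from typing import Iterable, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling one-row DP table."""
    prev = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def flag_candidates(target: str, names: Iterable[str], max_dist: int = 1) -> List[str]:
    """Return names (other than the target itself) within max_dist edits of target."""
    return [n for n in names if n != target and edit_distance(n, target) <= max_dist]
```

With a two-character target such as `qq`, a threshold of 1 already matches unrelated labels like `qa` and `aq`, which is the short-domain false-positive problem the abstract describes. Note that plain Levenshtein distance counts a transposition such as `googel` for `google` as two edits.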

Towards Typosquatting Abuse Detection using Bi-directional LSTM
LÜ Pin, LI Quan-gang, LIU Ting-wen, NING Zhen-hu, WANG Yu-bin, SHI Jin-qiao, FANG Bin-xing. Towards Typosquatting Abuse Detection using Bi-directional LSTM[J]. Acta Electronica Sinica, 2018, 46(9): 2081-2086. DOI: 10.3969/j.issn.0372-2112.2018.09.006
Authors:LÜ Pin  LI Quan-gang  LIU Ting-wen  NING Zhen-hu  WANG Yu-bin  SHI Jin-qiao  FANG Bin-xing
Affiliation:1. School of Cyber Security, University of Chinese Academy of Sciences, Beijing 100049, China;2. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China;3. Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China;4. University of Electronic Science and Technology Guangdong Institute of Electronic Information Engineering, Dongguan, Guangdong 523808, China
Abstract:Prior work on detecting typosquatting abuse is based on computing the edit distance between domain names. Such methods do not fully use the context information of domains and usually produce many false positives for short domains. Actively crawling related information about the given domains can improve the results but introduces heavy overhead. Therefore, we design a lightweight detection strategy based on domain names and introduce the bi-directional long short-term memory (LSTM) model to make full use of domain context information. Furthermore, we give a locality sensitive hashing function for domain names to speed up typosquatting abuse detection over large-scale domain sets. Experimental results on a real data set show that the proposed method overcomes the shortcomings of edit-distance-based methods and detects typosquatting abuse efficiently.
Keywords:typosquatting domain  edit distance  bi-directional LSTM  context information  locality sensitive hashing  
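
The abstract does not specify the paper's domain-oriented hashing function. As one generic possibility, the sketch below uses MinHash over character bigrams with banding, a standard locality sensitive hashing scheme, so that only names sharing a bucket need a detailed comparison. All function names and parameters (`num_hashes`, `rows_per_band`) are hypothetical and are not the authors' design.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def bigrams(name: str) -> Set[str]:
    """Character bigrams of the name, padded with start/end markers."""
    padded = f"^{name}$"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def _hash(seed: int, token: str) -> int:
    """Deterministic 64-bit hash of a token under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{token}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(name: str, num_hashes: int = 16) -> Tuple[int, ...]:
    """MinHash signature: the minimum hash of the bigram set under each seed."""
    grams = bigrams(name)
    return tuple(min(_hash(seed, g) for g in grams) for seed in range(num_hashes))


def build_index(names: Iterable[str], num_hashes: int = 16,
                rows_per_band: int = 2) -> Dict[tuple, Set[str]]:
    """Bucket each name once per band of its signature."""
    index: Dict[tuple, Set[str]] = defaultdict(set)
    for name in names:
        sig = minhash_signature(name, num_hashes)
        for start in range(0, num_hashes, rows_per_band):
            index[(start, sig[start:start + rows_per_band])].add(name)
    return index


def candidates(index: Dict[tuple, Set[str]], query: str, num_hashes: int = 16,
               rows_per_band: int = 2) -> Set[str]:
    """Indexed names that share at least one band bucket with the query."""
    sig = minhash_signature(query, num_hashes)
    found: Set[str] = set()
    for start in range(0, num_hashes, rows_per_band):
        found |= index.get((start, sig[start:start + rows_per_band]), set())
    return found
```

Names with high bigram overlap are likely, though not guaranteed, to collide in at least one band, so the candidate set is typically much smaller than the full domain list. Fewer rows per band increases recall at the cost of more candidates to verify.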