
Hash learning based malicious SQL detection (基于哈希学习的异常SQL检测)
Cite this article: LI Mingwei, JIANG Qingyuan, XIE Yinpeng, HE Jindong, WU Dan. Hash learning based malicious SQL detection[J]. Journal of Computer Applications (计算机应用), 2021, 41(1): 121-126.
Authors: LI Mingwei (李明威), JIANG Qingyuan (蒋庆远), XIE Yinpeng (解银朋), HE Jindong (何金栋), WU Dan (吴丹)
Affiliation: 1. National Key Laboratory for Novel Software Technology (Nanjing University), Nanjing Jiangsu 210023, China; 2. Electric Power Science Research Institute, State Grid Fujian Electric Power Company Limited, Fuzhou Fujian 350007, China
Abstract: Nearest Neighbor (NN) methods for malicious Structured Query Language (SQL) detection suffer from high storage cost and slow retrieval. To address these problems, a Hash learning based Malicious SQL Detection (HMSD) method is proposed, which uses hash learning to learn binary code representations of SQL query statements. First, the SQL statements are cleaned and deduplicated so that they can be represented as real-valued features. Second, isotropic hashing is used to learn the binary codes of the statements. Finally, retrieval is performed on the binary codes, which speeds up malicious SQL detection. In experiments on the malicious SQL detection dataset Wafamole, the data were randomly split into a training set of 10 000 SQL statements and a test set of 30 000 SQL statements. With 128-bit codes, compared with the NN method, the proposed algorithm increased detection accuracy by 1.3%, reduced the False Positive Rate (FPR) by 0.19%, reduced the False Negative Rate (FNR) by 2.41%, cut retrieval time by 94%, and lowered storage cost by 97.5%. Compared with the support vector machine method, it increased detection accuracy by 0.17%. These results indicate that the proposed algorithm can address the problems of the NN method in malicious SQL detection.

Keywords: malicious SQL detection; Nearest Neighbor (NN); binary coding representation; hash learning; large-scale retrieval
Received: 2020-05-31
Revised: 2020-08-03
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).
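The three steps in the abstract (clean and deduplicate, encode as binary codes, retrieve by nearest neighbor on the codes) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the character-trigram feature map `featurize`, the cleaning rule in `clean`, the toy statements, and above all the use of random sign projections in `encode` as a stand-in for isotropic hashing, which learns its projections from data instead of drawing them at random.

```python
import random
import zlib

DIM = 64    # hypothetical feature dimension (not from the paper)
BITS = 128  # code length reported in the abstract's experiment


def clean(sql):
    """Assumed cleaning rule: lowercase and collapse whitespace."""
    return " ".join(sql.lower().split())


def featurize(sql, dim=DIM):
    """Hashed character-trigram counts: an assumed real-valued feature map."""
    vec = [0.0] * dim
    for i in range(len(sql) - 2):
        bucket = zlib.crc32(sql[i:i + 3].encode("utf-8")) % dim
        vec[bucket] += 1.0
    return vec


def make_projections(bits=BITS, dim=DIM, seed=0):
    """Random Gaussian projections (stand-in for learned hash functions)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def encode(vec, projections):
    """Pack the sign bit of each projection into one Python int."""
    code = 0
    for row in projections:
        dot = sum(w * x for w, x in zip(row, vec))
        code = (code << 1) | int(dot > 0)
    return code


def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def build_index(samples, projections):
    """Deduplicate cleaned statements, then store (code, label) pairs."""
    unique = {}
    for sql, label in samples:
        unique.setdefault(clean(sql), label)
    return [(encode(featurize(s), projections), y) for s, y in unique.items()]


def classify(sql, index, projections):
    """1-NN in Hamming space: return the label of the closest stored code."""
    query = encode(featurize(clean(sql)), projections)
    return min(index, key=lambda item: hamming(query, item[0]))[1]


# Toy labelled statements (0 = benign, 1 = malicious), invented for this sketch.
train = [
    ("SELECT name FROM users WHERE id = 7", 0),
    ("select  name from users where id = 7", 0),  # duplicate after cleaning
    ("SELECT * FROM orders WHERE total > 100", 0),
    ("SELECT name FROM users WHERE id = 7 OR 1=1 --", 1),
    ("SELECT * FROM users WHERE id = 1 UNION SELECT password FROM admin", 1),
]

projections = make_projections()
index = build_index(train, projections)
print(len(index), "unique statements indexed")
print(classify("SELECT name FROM users WHERE id = 7", index, projections))
```

Each stored statement occupies 128 bits (16 bytes) regardless of the feature dimension, and comparing two codes is a single XOR plus a bit count, which is where the storage and retrieval-time savings of binary codes come from.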