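Funding: Guangxi Natural Science Foundation; laboratory research project; graduate student innovation project; National Natural Science Foundation of China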

Anomaly Detection Method Based on Multi-resolution Grid
Cite this article: LIU Wenfen, MU Xiaodong, HUANG Yuehua. Anomaly Detection Method Based on Multi-resolution Grid[J]. Computer Engineering and Applications, 2020, 56(17): 78-85.
Authors: LIU Wenfen, MU Xiaodong, HUANG Yuehua
Affiliation: 1. Guangxi Key Laboratory of Cryptography and Information Security, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China; 2. College of Computer Science and Engineering, Guilin University of Aerospace Technology, Guilin, Guangxi 541004, China
Abstract: Anomaly detection is an important data mining technique with wide application in data analysis. However, existing anomaly detection algorithms often require different parameter settings for different data to achieve good detection results, and their running time on large datasets is unsatisfactory. Grid-based anomaly detection handles the time-efficiency problem well for low-dimensional data, but its accuracy depends heavily on the grid partition scale and the density threshold, parameters that are not robust and do not generalize well across different types of datasets. To address these problems, the proposed method first introduces a robust submatrix partition parameter that divides high-dimensional data into several low-dimensional subspaces and runs anomaly detection on each subspace, which keeps the method applicable to high-dimensional data. It then applies an anomaly detection algorithm based on multi-resolution grids: by partitioning the grid from sparse to dense, it weighs each data point's local anomaly factor across grids of different scales and outputs a ranking of global anomaly scores. Experimental results show that the new submatrix partition parameter is robust, that the method adapts well to high-dimensional data, and that it achieves good detection results on multiple public datasets, providing an efficient solution for anomaly detection in high-dimensional data.
Keywords: anomaly detection; multi-resolution grid; high-dimensional data; subspace; data mining
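
The pipeline outlined in the abstract (split the features into low-dimensional subspaces, grid each subspace at several resolutions from sparse to dense, then combine per-point scores into one ranking) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not define the submatrix partition or the local anomaly factor, so the sketch assumes consecutive feature blocks of size `sub_dim` as subspaces and uses "one minus the fraction of points sharing the cell" as a stand-in score. The names `grid_scores`, `multi_resolution_scores`, `sub_dim`, and `resolutions` are all hypothetical.

```python
import numpy as np


def grid_scores(X, bins):
    """Score each row by how sparsely populated its grid cell is.

    Assumed stand-in for a local anomaly factor: 1 - (cell count / n).
    """
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    # Map each coordinate to a cell index in [0, bins - 1].
    cells = np.minimum(((X - lo) / span * bins).astype(int), bins - 1)
    # Count how many points fall into each occupied cell.
    _, inverse, counts = np.unique(
        cells, axis=0, return_inverse=True, return_counts=True
    )
    occupancy = counts[inverse.ravel()] / len(X)
    return 1.0 - occupancy  # emptier cell -> higher score


def multi_resolution_scores(X, sub_dim=2, resolutions=(2, 4, 8, 16)):
    """Average grid scores over feature blocks and over grid resolutions."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    total = np.zeros(n)
    n_terms = 0
    for start in range(0, d, sub_dim):  # assumed subspaces: consecutive feature blocks
        block = X[:, start:start + sub_dim]
        for bins in resolutions:  # coarse (sparse) to fine (dense) grids
            total += grid_scores(block, bins)
            n_terms += 1
    return total / n_terms


# Usage: 200 points in the unit hypercube plus one far-away point (row 200).
rng = np.random.default_rng(0)
X = np.vstack([rng.random((200, 4)), np.full((1, 4), 10.0)])
scores = multi_resolution_scores(X)
ranking = np.argsort(-scores)  # row indices, most anomalous first
```

Averaging over resolutions means no single grid scale or density threshold decides the result, which is the motivation the abstract gives for the multi-resolution design. The equal weighting used here is only one possible choice.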
This article is indexed in Wanfang Data and other databases.