
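Extended isolation forest algorithm based on random subspace
Cite this article: XIE Yu, JIANG Yu, LONG Chaoqi. Extended isolation forest algorithm based on random subspace[J]. Journal of Computer Applications, 2021, 41(6): 1679-1685. DOI: 10.11772/j.issn.1001-9081.2020091436
Authors: XIE Yu  JIANG Yu  LONG Chaoqi
Affiliation: School of Software Engineering, Chengdu University of Information Technology, Chengdu 610225
Abstract: To address the excessive time overhead of the Extended Isolation Forest (EIF) algorithm, an Extended Isolation Forest algorithm based on Random Subspace (RS-EIF) is proposed. First, multiple random subspaces are determined in the original data space. Then, in each random subspace, extended isolated trees are built by computing the intercept vector and slope of each node, and multiple extended isolated trees are integrated into a subspace extended isolation forest. Finally, the average traversal depth of a data point in the extended isolation forest is computed to determine the data...

Keywords: anomaly detection  random subspace  Extended Isolation Forest (EIF) algorithm  extended isolated tree  average traversal depth
Received: 2020-09-15
Revised: 2020-11-27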

Extended isolation forest algorithm based on random subspace
XIE Yu,JIANG Yu,LONG Chaoqi. Extended isolation forest algorithm based on random subspace[J]. Journal of Computer Applications, 2021, 41(6): 1679-1685. DOI: 10.11772/j.issn.1001-9081.2020091436
Authors:XIE Yu  JIANG Yu  LONG Chaoqi
Affiliation:School of Software Engineering, Chengdu University of Information Technology, Chengdu, Sichuan 610225, China
Abstract:To address the excessive time overhead of the Extended Isolation Forest (EIF) algorithm, a new algorithm named Extended Isolation Forest based on Random Subspace (RS-EIF) was proposed. Firstly, multiple random subspaces were determined in the original data space. Then, in each random subspace, extended isolated trees were constructed by calculating the intercept vector and slope of each node, and multiple extended isolated trees were integrated into a subspace extended isolation forest. Finally, the average traversal depth of each data point in the extended isolation forest was calculated to determine whether the data point was abnormal. Experimental results on 9 real datasets from Outlier Detection DataSets (ODDS) and 7 synthetic datasets with multivariate distributions show that the RS-EIF algorithm is sensitive to local anomalies and reduces the time overhead by about 60% compared with the EIF algorithm. On the ODDS datasets with many samples, its recognition accuracy is 2 to 12 percentage points higher than those of the isolation Forest (iForest) algorithm, the Lightweight On-line Detector of Anomalies (LODA) algorithm and the COPula-based Outlier Detection (COPOD) algorithm. The RS-EIF algorithm achieves higher recognition efficiency on datasets with a large number of samples.
Keywords:anomaly detection  random subspace  Extended Isolation Forest (EIF) algorithm  extended isolated tree  average traversal depth  
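
The three steps in the abstract (pick random subspaces, grow extended isolated trees from a random slope and intercept at each node, then average the traversal depth) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the details, so the function names (`fit_rs_eif`, `anomaly_scores`), the choice of one random subspace per tree, the default subspace size of half the features, and the conversion of average depth into a score via the standard iForest normalisation are all assumptions made for illustration.

```python
import numpy as np

def c_factor(n):
    """Average path length of an unsuccessful BST search over n points
    (the normalisation term used by the standard iForest score)."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (np.log(n - 1) + np.euler_gamma) - 2.0 * (n - 1) / n

def build_tree(X, depth, max_depth, rng):
    """Grow one extended isolated tree on X (already projected to a subspace)."""
    n, d = X.shape
    if depth >= max_depth or n <= 1:
        return {"size": n}                      # external (leaf) node
    normal = rng.normal(size=d)                 # random slope of the hyperplane
    intercept = rng.uniform(X.min(axis=0), X.max(axis=0))  # random intercept point
    left = (X - intercept) @ normal <= 0        # side of the hyperplane
    return {
        "normal": normal,
        "intercept": intercept,
        "left": build_tree(X[left], depth + 1, max_depth, rng),
        "right": build_tree(X[~left], depth + 1, max_depth, rng),
    }

def path_length(x, node):
    """Traversal depth of point x, plus the usual leaf-size adjustment."""
    depth = 0
    while "normal" in node:
        go_left = (x - node["intercept"]) @ node["normal"] <= 0
        node = node["left"] if go_left else node["right"]
        depth += 1
    return depth + c_factor(node["size"])

def fit_rs_eif(X, n_trees=100, sample_size=256, subspace_dim=None, seed=0):
    """Assumed scheme: each tree gets its own random feature subspace."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = subspace_dim or max(1, d // 2)          # assumed default subspace size
    psi = min(sample_size, n)
    max_depth = int(np.ceil(np.log2(max(psi, 2))))
    forest = []
    for _ in range(n_trees):
        feats = rng.choice(d, size=k, replace=False)   # random subspace
        rows = rng.choice(n, size=psi, replace=False)  # subsample
        forest.append((feats, build_tree(X[np.ix_(rows, feats)], 0, max_depth, rng)))
    return forest, psi

def anomaly_scores(X, forest, psi):
    """Average traversal depth over all trees, mapped to a score in (0, 1]."""
    depths = np.array([[path_length(x[feats], tree) for feats, tree in forest]
                       for x in X])
    return 2.0 ** (-depths.mean(axis=1) / c_factor(psi))
```

Calling `fit_rs_eif(X)` on an `(n_samples, n_features)` array and then `anomaly_scores(X, forest, psi)` returns one score per row, with higher scores marking points that are isolated after fewer splits. Because each tree splits in only `k` dimensions instead of all `d`, the dot products at every node are cheaper, which is the intuition behind the reduced time overhead reported in the abstract.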
This article is indexed in Wanfang Data and other databases.