
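高维类别属性数据流离群点快速检测算法 (A Fast Outlier Detection Algorithm for High Dimensional Categorical Data Streams)
Cite as: 周晓云, 孙志挥, 张柏礼, 杨宜东. 高维类别属性数据流离群点快速检测算法[J]. 软件学报 (Journal of Software), 2007, 18(4): 933-942
Authors (Chinese names): 周晓云  孙志挥  张柏礼  杨宜东
Affiliation: Department of Computer Science and Engineering, Southeast University, Nanjing, Jiangsu 210096
Funding: National Natural Science Foundation of China; Ministry of Education Research Fund for the Doctoral Program of Higher Education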
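Abstract (translated from the Chinese): This paper proposes an outlier measure for categorical data streams, the weighted frequent pattern outlier factor (WFPOF), and on that basis presents a fast outlier detection algorithm for data streams, FODFP-Stream (fast outlier detection for high dimensional categorical data streams based on frequent pattern). The algorithm computes the degree of outlierness by dynamically discovering and maintaining frequent patterns, handles high-dimensional categorical data streams effectively, and can further …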

Keywords: data stream  outlier detection  frequent pattern  high dimension  concept drift
Received: 2005-11-02
Revised: 2006-02-23

A Fast Outlier Detection Algorithm for High Dimensional Categorical Data Streams
ZHOU Xiao-Yun,SUN Zhi-Hui,ZHANG Bai-Li and YANG Yi-Dong. A Fast Outlier Detection Algorithm for High Dimensional Categorical Data Streams[J]. Journal of Software, 2007, 18(4): 933-942
Authors:ZHOU Xiao-Yun  SUN Zhi-Hui  ZHANG Bai-Li  YANG Yi-Dong
Affiliation:Department of Computer Science and Engineering, Southeast University, Nanjing 210096, China
Abstract: This paper considers the problem of outlier detection in data streams. It proposes a new metric for categorical data streams, the weighted frequent pattern outlier factor (WFPOF), and presents a fast outlier detection algorithm named FODFP-Stream (fast outlier detection for high dimensional categorical data streams based on frequent pattern). FODFP-Stream computes the outlier measure by dynamically discovering and maintaining frequent patterns, and it handles high-dimensional categorical data streams effectively. It can also be extended to data streams with continuous or mixed attributes. Experimental results on synthetic and real data sets indicate that the approach is promising and applicable.
Keywords:data stream  outlier detection  frequent pattern  high dimension  concept drift
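
The abstract names the idea of scoring records by the frequent patterns they contain but gives no formula. The sketch below is only an illustration of that general idea, not the paper's WFPOF definition or the FODFP-Stream algorithm: it mines patterns in batch by brute force rather than maintaining them over a stream, and the function names, the `max_len` cap, and the length-based weighting are all assumptions made for the example. In this sketch a lower score means a record matches fewer frequent patterns and is therefore more outlier-like.

```python
from collections import Counter
from itertools import combinations


def frequent_patterns(records, min_support, max_len=2):
    """Brute-force mining of frequent (attribute, value) sets.

    Hypothetical helper: returns {pattern: support} for every pattern of
    at most max_len items whose support is >= min_support.
    """
    n = len(records)
    counts = Counter()
    for rec in records:
        items = sorted(rec.items())
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


def pattern_score(record, patterns, weight=len):
    """Weighted share of frequent-pattern support covered by the record.

    Assumed weighting: each pattern is weighted by its length (weight=len).
    This is a placeholder choice, not the weighting used in the paper.
    """
    total = sum(weight(p) for p in patterns)
    if total == 0:
        return 0.0
    items = set(record.items())
    covered = sum(weight(p) * s for p, s in patterns.items() if p <= items)
    return covered / total
```

A call might look like `pattern_score(rec, frequent_patterns(window, 0.5))`, where `window` is a list of dicts mapping attribute names to categorical values; records with the lowest scores would be the candidates to flag.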
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.