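Review of Concept Drift Data Streams Mining Techniques
Cite this article: DING Jian, HAN Meng, LI Juan. Review of Concept Drift Data Streams Mining Techniques[J]. Computer Science, 2016, 43(12): 24-29, 62.
Authors: DING Jian  HAN Meng  LI Juan
Affiliation: School of Computer Science and Engineering, Beifang University of Nationalities, Yinchuan 750021, China (all three authors)
Funding: This work was supported by the National Natural Science Foundation of China (61563001) and the Scientific Research Fund of Beifang University of Nationalities (2014XYZ13).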
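Abstract: A data stream is a new type of data model characterized by being dynamic, unbounded, high-dimensional, ordered, high-speed, and evolving. In real data stream environments, some data distributions change over time, that is, they exhibit concept drift; such streams are called evolving data streams or concept drift data streams. Methods for processing data streams must therefore cope with time and space constraints and adapt to concept changes. This paper surveys the concept drift problem and classification, clustering, and pattern mining on concept drift data streams. It first introduces the types of concept drift and the commonly used methods for detecting concept change. To handle concept drift, data stream mining often uses the sliding window model to process recent transactions. Models commonly used for data stream classification include single-classifier models and ensemble models, and common methods include decision trees and classification association rules. Data stream clustering approaches are usually divided into k-means-based and non-k-means-based methods. Pattern mining can provide useful information for classification, clustering, association rules, and so on. Patterns in concept drift data streams include frequent patterns, sequential patterns, episodes, pattern trees, pattern graphs, and high-utility patterns. Finally, frequent pattern mining algorithms and high-utility pattern mining algorithms are described in detail.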

Keywords: Data stream mining  Classification  Clustering  Pattern mining  Concept drift
Received: 2016/1/18
Revised: 2016/6/14

Review of Concept Drift Data Streams Mining Techniques
DING Jian, HAN Meng and LI Juan. Review of Concept Drift Data Streams Mining Techniques[J]. Computer Science, 2016, 43(12): 24-29, 62.
Authors: DING Jian  HAN Meng and LI Juan
Affiliation: School of Computer Science and Engineering, Beifang University of Nationalities, Yinchuan 750021, China (all three authors)
Abstract: The data stream is a new data model proposed in recent years. It is dynamic, infinite, high-dimensional, ordered, high-speed, and evolving. In some data stream applications, the information embedded in the data evolves over time, that is, it exhibits concept drift or change. Such data streams are known as evolving data streams or concept drift data streams. Algorithms that mine data streams are therefore subject to space and time restrictions and need to adapt to change automatically. In this paper, we survey concept drift and classification, clustering, and pattern mining on concept drift data streams. First, we introduce the types of concept drift and the methods for detecting it. To deal with concept drift, the sliding window model is used to mine data streams. Data stream classification models include single models and ensemble models, and common methods include decision trees, classification association rules, and so on. Data stream clustering methods can be divided into k-means-based and non-k-means-based methods. Pattern mining can provide useful patterns for classification, clustering, association rules, and so on. Patterns include frequent patterns, sequential patterns, episodes, sub-trees, sub-graphs, high-utility patterns, and so on. Finally, we introduce frequent pattern and high-utility pattern mining in detail.
Keywords: Data stream mining  Classification  Clustering  Frequent pattern mining  Concept drift