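Recurring Concept Drift Handling Algorithm Based on Main Feature Extraction
Cite this article: Feng Chao, Wen Yimin, Tang Lingbing. Recurring Concept Drift Handling Algorithm Based on Main Feature Extraction[J]. Journal of Data Acquisition & Processing (数据采集与处理), 2016, 31(2): 315-324
Authors: Feng Chao, Wen Yimin, Tang Lingbing
Affiliations: 1. School of Computer Science and Engineering, Guilin University of Electronic Technology, Guilin 541004; 2. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004; 3. School of Computer and Information Engineering, Hunan University of Commerce, Changsha 410205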
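Abstract: To address the problems of concept representation and classifier selection in recurring concept drift detection, this paper proposes an algorithm for classifying data streams that contain recurring concept drift: conceptual clustering and prediction through main feature extraction (MFCCP). MFCCP identifies recurring concepts by computing the degree of difference between the main features and impact factors of different batches of samples. It maintains and promptly updates one classifier for each concept, and uses the Hoeffding inequality to select the most suitable classifier for the current sample set, which improves its responsiveness to concept drift. Experiments on three datasets show that, on datasets containing recurring concept drift, MFCCP clearly outperforms four comparison algorithms in classification accuracy, responsiveness to concept drift, and accuracy of concept drift detection. MFCCP is also suitable for classifying data streams without recurring concept drift.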

Keywords: recurring concept drift; main feature; impact factor; data stream; Hoeffding inequality

Algorithm of Recurring Concept Drift Based on Main Feature Extraction
Feng Chao,Wen Yimin,Tang Lingbing. Algorithm of Recurring Concept Drift Based on Main Feature Extraction[J]. Journal of Data Acquisition & Processing, 2016, 31(2): 315-324
Authors: Feng Chao, Wen Yimin, Tang Lingbing
Abstract: Recurring concept drift is a sub-type of concept drift. Detecting it requires a way to represent concepts and a way to select the most appropriate classifier. We propose an algorithm, conceptual clustering and prediction through main feature extraction (MFCCP), for classifying data streams with recurring concept drift. MFCCP recognizes recurring concepts by computing the differences between the main features and impact factors of different batches of samples. It maintains a classifier for each concept and monitors classification accuracy to select a classifier according to the Hoeffding inequality, which improves its ability to adapt to concept drift. Experimental results on three datasets show that, on data streams with recurring concept drift, MFCCP achieves better classification accuracy, adapts faster to concept drift, and detects concept drift more accurately than four other algorithms. MFCCP is also suitable for classifying data streams without recurring concept drift.
Keywords: recurring concept drift; main feature; impact factor; data stream; Hoeffding inequality
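
The abstract does not say how MFCCP applies the Hoeffding inequality, so the following is only a generic sketch of Hoeffding-based classifier selection, not the authors' procedure. It assumes each stored classifier has been scored on the current batch of n samples, and it treats the leader as reliably better only when its accuracy lead exceeds the one-sided Hoeffding bound ε = sqrt(R² ln(1/δ) / (2n)). The function names, the default `delta`, and the decision rule are all hypothetical.

```python
import math
from typing import Dict, Tuple


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """One-sided Hoeffding bound for the mean of n bounded samples.

    With probability at least 1 - delta, the observed mean exceeds the
    true mean by no more than the returned epsilon.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def select_classifier(
    batch_accuracies: Dict[str, float], n: int, delta: float = 0.05
) -> Tuple[str, bool, float]:
    """Pick the classifier with the highest accuracy on the current batch.

    Hypothetical rule (not taken from the paper): the choice counts as
    confident only if the leader beats the runner-up by more than the
    Hoeffding bound. Accuracy lies in [0, 1], so the range is 1.
    """
    if not batch_accuracies:
        raise ValueError("at least one classifier is required")
    eps = hoeffding_bound(1.0, delta, n)
    ranked = sorted(batch_accuracies.items(), key=lambda kv: kv[1], reverse=True)
    best_name, best_acc = ranked[0]
    runner_up_acc = ranked[1][1] if len(ranked) > 1 else 0.0
    return best_name, (best_acc - runner_up_acc) > eps, eps
```

For a batch of 200 samples and `delta = 0.05`, the bound is about 0.087, so a lead of 0.20 in accuracy counts as confident while a lead of 0.05 does not. When the choice is not confident, a drift-handling system might keep the current classifier or train a new one; which of these MFCCP does is not stated in the abstract.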