Ensemble classification algorithm for data stream based on repeatability of concept
Original title: 一种基于概念重复性的数据流集成分类算法
Citation: YIN Shaohong, ZHANG Panpan. Ensemble classification algorithm for data stream based on repeatability of concept[J]. Computer Engineering and Applications (计算机工程与应用), 2016, 52(12): 80-84.
Authors: YIN Shaohong (尹绍宏)  ZHANG Panpan (张盼盼)
Affiliation: Tianjin Polytechnic University, Tianjin 300387, China
Abstract: Research on classifying data streams with concept drift has produced many results, but most methods do not adequately account for concepts that recur in the data stream. Ignoring recurrence consumes considerable computation and memory and increases the likelihood of classification errors. To address this, the paper proposes an ensemble classification algorithm for data streams based on the repeatability of concepts. The algorithm applies ensemble classification to handle concept drift. During learning, it does not delete a concept that has temporarily become invalid or its corresponding base classifier; instead, it stores their essential information for later reuse. It also predicts the upcoming concept from the transition relationships between concepts. The authors state that this improves both classification accuracy and time efficiency, and they report experimental results that confirm the effectiveness of the algorithm.

Keywords: data mining  data stream  ensemble classification  concept drift  repeatability
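
The abstract names two mechanisms: keeping the base classifiers of inactive concepts for later reuse, and predicting the next concept from observed transitions between concepts. The sketch below illustrates only that general idea and is not the authors' algorithm, whose details are not given on this page. The `ConceptPool` class, its method names, the use of plain callables as classifiers, and the frequency-count predictor are all assumptions made for illustration. Drift detection and concept identification are assumed to happen elsewhere.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Hashable, Optional

# Hypothetical stand-in for a base classifier: maps a feature tuple to a label.
Classifier = Callable[[tuple], int]


class ConceptPool:
    """Illustrative store of per-concept classifiers plus transition counts."""

    def __init__(self) -> None:
        # Classifiers are kept even after their concept stops being current.
        self.classifiers: Dict[Hashable, Classifier] = {}
        # transitions[a][b] = number of observed drifts from concept a to b.
        self.transitions: Dict[Hashable, Counter] = defaultdict(Counter)
        self.current: Optional[Hashable] = None

    def switch_to(self, concept_id: Hashable,
                  classifier: Optional[Classifier] = None) -> Classifier:
        """Record a drift to `concept_id` and return its classifier.

        A classifier must be supplied the first time a concept is seen;
        a recurring concept reuses the stored one.
        """
        if concept_id not in self.classifiers:
            if classifier is None:
                raise KeyError(f"no stored classifier for concept {concept_id!r}")
            self.classifiers[concept_id] = classifier
        if self.current is not None and self.current != concept_id:
            self.transitions[self.current][concept_id] += 1
        self.current = concept_id
        return self.classifiers[concept_id]

    def predict_next(self) -> Optional[Hashable]:
        """Return the most frequent successor of the current concept, if any."""
        if self.current is None:
            return None
        counts = self.transitions.get(self.current)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


# Usage: concept "A" drifts to "B" and then recurs.
pool = ConceptPool()
pool.switch_to("A", lambda x: 0)
pool.switch_to("B", lambda x: 1)
reused = pool.switch_to("A")   # recurrence: no retraining, stored classifier returned
print(pool.predict_next())     # "B", the only successor of "A" seen so far
```

Counting transitions is the simplest predictor that fits the description. The paper may use a different model of concept transitions, and it may store a summary of each concept rather than the classifier object itself.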