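A data stream classification algorithm based on adaptive random forest
Cite this article: ZHANG Xin-yu, AN Jian-cheng, CAO Rui. A data stream classification algorithm based on adaptive random forest[J]. Computer Engineering & Science, 2020, 42(3): 543-549
Authors: ZHANG Xin-yu  AN Jian-cheng  CAO Rui
Affiliation: (School of Software, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China)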
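Abstract: The adaptive random forest classifier places a warning detector and a drift detector on every base classifier. During training, a single instance often triggers several warning detectors at once, so multiple background trees are trained in parallel, which increases memory use and running time. To address this problem, an improved adaptive random forest ensemble classification algorithm is proposed: the concept drift detector is placed at the ensemble learner, the drift detectors on the individual base trees are removed, and the number of background trees to train is determined from the ensemble's prediction accuracy. When the improved algorithm is used to classify relatively balanced data streams, it maintains classification performance while reducing running time and memory consumption compared with the original algorithm, and it adapts more quickly to concept drift in the data stream.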

Keywords: data stream  concept drift  random forest  drift detector  ensemble classifier
Received: 2019-08-04
Revised: 2019-11-01

A data stream classification algorithm based on adaptive random forest ensemble model
ZHANG Xin-yu, AN Jian-cheng, CAO Rui. A data stream classification algorithm based on adaptive random forest ensemble model[J]. Computer Engineering & Science, 2020, 42(3): 543-549
Authors: ZHANG Xin-yu  AN Jian-cheng  CAO Rui
Affiliation: (School of Software, Taiyuan University of Technology, Jinzhong 030600, China)
Abstract: The adaptive random forest classifier sets a warning detector and a drift detector on each base classifier. While an instance is being trained, multiple warning detectors are often triggered at the same time, causing multiple background trees to be trained simultaneously, which requires a large amount of memory and a long running time. To address this problem, this paper proposes an improved adaptive random forest ensemble classification algorithm. It places the concept drift detector at the ensemble learner, removes the drift detectors from the individual base trees, and determines the number of background trees to train according to the ensemble's prediction accuracy. When the improved algorithm is used to classify relatively balanced data streams, classification performance is preserved while running time and memory consumption are reduced compared with the original algorithm, and the classifier adapts more quickly to concept drift appearing in the data stream.
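The two ideas in the abstract (a single detector at the ensemble level, and a background-tree count driven by ensemble accuracy) can be illustrated with a minimal Python sketch. The abstract does not specify the detector or the sizing rule, so everything here is an assumption made for illustration: the class `EnsembleDriftMonitor`, the sliding-window accuracy test, the thresholds `warn_drop` and `drift_drop`, and the linear rule in `num_background_trees` are hypothetical and are not the authors' method.

```python
from collections import deque
import math


class EnsembleDriftMonitor:
    """Hypothetical ensemble-level detector: one monitor for the whole forest.

    It compares sliding-window accuracy of the ensemble's predictions with the
    best window accuracy seen so far. This stands in for whatever detector the
    paper actually uses, which the abstract does not name.
    """

    def __init__(self, window=100, warn_drop=0.05, drift_drop=0.10):
        self.window = deque(maxlen=window)  # 1 = correct, 0 = wrong
        self.best_acc = 0.0
        self.warn_drop = warn_drop          # assumed warning threshold
        self.drift_drop = drift_drop        # assumed drift threshold

    def accuracy(self):
        return sum(self.window) / len(self.window) if self.window else 0.0

    def update(self, correct):
        """Record one ensemble prediction outcome; return (state, accuracy)."""
        self.window.append(1 if correct else 0)
        acc = self.accuracy()
        if len(self.window) < self.window.maxlen:
            return "stable", acc            # not enough evidence yet
        self.best_acc = max(self.best_acc, acc)
        drop = self.best_acc - acc
        if drop >= self.drift_drop:
            return "drift", acc
        if drop >= self.warn_drop:
            return "warning", acc
        return "stable", acc

    def reset(self):
        """Forget history, e.g. after background trees replace old ones."""
        self.window.clear()
        self.best_acc = 0.0


def num_background_trees(accuracy, n_trees):
    """Assumed rule: the lower the ensemble accuracy, the more background
    trees are trained, clamped to the range [1, n_trees]."""
    return max(1, min(n_trees, math.ceil((1.0 - accuracy) * n_trees)))
```

In a training loop under these assumptions, a `"warning"` state would start training `num_background_trees(acc, n_trees)` background trees, and a `"drift"` state would swap them in and call `reset()`. Because only one detector exists, a single instance can no longer trigger many per-tree warnings at once.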
Keywords: data stream  concept drift  random forest  drift detector  ensemble classifier
This article is indexed in Wanfang Data and other databases.