
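Data Stream Ensemble Classification Algorithm Based on Information Entropy Updating Weight (基于信息熵更新权重的数据流集成分类算法)
Cite this article: XIA Yuan (夏源), ZHAO Yun-long (赵蕴龙), FAN Qi-lin (范其林). Data Stream Ensemble Classification Algorithm Based on Information Entropy Updating Weight [J]. Computer Science (计算机科学), 2022, 49(3): 92-98.
Authors: XIA Yuan (夏源), ZHAO Yun-long (赵蕴龙), FAN Qi-lin (范其林)
Affiliations: School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023
Abstract: In a dynamic data stream, instability and concept drift mean that an ensemble classification model must be able to adapt promptly to a new environment. Base classifier weights are currently usually updated with supervision information, so that base classifiers suited to the current environment receive higher weights; however, supervision information cannot be obtained immediately in a real data stream environment. To solve this problem, this paper proposes a data stream ensemble classification algorithm that updates base classifier weights based on information entropy. First, using random...

Keywords: data stream; concept drift; information entropy; classification; ensemble algorithm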

Data Stream Ensemble Classification Algorithm Based on Information Entropy Updating Weight
XIA Yuan, ZHAO Yun-long, FAN Qi-lin. Data Stream Ensemble Classification Algorithm Based on Information Entropy Updating Weight [J]. Computer Science, 2022, 49(3): 92-98.
Authors: XIA Yuan, ZHAO Yun-long, FAN Qi-lin
Affiliation: School of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023, China
Abstract: In a dynamic data stream, instability and concept drift require an ensemble classification model to adapt to a new environment in time. At present, the weights of the base classifiers are usually updated using supervision information, so that base classifiers suited to the current environment receive higher weights. However, supervision information cannot be obtained immediately in a real data stream environment. To solve this problem, this paper presents a data stream ensemble classification algorithm that updates the weights of the base classifiers through information entropy. First, random feature subspaces are used to initialize each base classifier and construct the ensemble classifier. Second, a new base classifier is built from each new data block and replaces the base classifier with the lowest weight in the ensemble. Then, a weight update strategy based on information entropy updates the weights of the base classifiers in real time. Finally, the base classifiers that meet the requirements participate in weighted voting to produce the classification result. In comparisons with several other classic learning algorithms, the experimental results show that the proposed method has a clear advantage in classification accuracy and is suitable for various types of concept drift environments.
Keywords: Data stream; Concept drift; Information entropy; Classification; Ensemble algorithm
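
The abstract above names the steps of the method but gives no formulas, so the following is only a minimal sketch of how label-free, entropy-based weighting and weighted voting might look. Everything specific here is an assumption rather than the authors' method: the weight is taken as one minus the mean normalized Shannon entropy of a classifier's predicted class distributions over an unlabeled block, and the names `entropy_weight`, `weighted_vote`, `weakest_member`, and the `min_weight` threshold are hypothetical.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of one class distribution, scaled to [0, 1]."""
    k = len(probs)
    if k < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(k)


def entropy_weight(block_probs):
    """ASSUMED rule (not from the paper): confident predictions on an
    unlabeled block give low entropy and therefore a high weight."""
    mean_h = sum(normalized_entropy(p) for p in block_probs) / len(block_probs)
    return 1.0 - mean_h


def weighted_vote(member_probs, weights, min_weight=0.0):
    """Weighted soft vote for one sample.

    member_probs[i] is classifier i's class distribution for the sample.
    Classifiers whose weight is below the hypothetical min_weight are skipped.
    Returns the index of the winning class (0 if every member is skipped).
    """
    n_classes = len(member_probs[0])
    scores = [0.0] * n_classes
    for probs, w in zip(member_probs, weights):
        if w < min_weight:
            continue
        for c, p in enumerate(probs):
            scores[c] += w * p
    return max(range(n_classes), key=scores.__getitem__)


def weakest_member(weights):
    """Index of the lowest-weight classifier, the candidate for replacement
    when a new base classifier is trained on the latest data block."""
    return min(range(len(weights)), key=weights.__getitem__)
```

Because the weight depends only on predicted distributions, it can be recomputed as each unlabeled block arrives, which is the property the abstract emphasizes. Whether low prediction entropy actually tracks accuracy under drift is exactly what the paper's own update strategy and experiments address; this sketch does not reproduce them.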
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).