Learning accurate very fast decision trees from uncertain data streams
Authors:Chunquan Liang  Peng Shi  Zhengguo Hu
Affiliation:1. College of Mechanical and Electronic Engineering, Northwest A&F University, Shaanxi, China;2. College of Information Engineering, Northwest A&F University, Shaanxi, China;3. College of Automation, Harbin Engineering University, Harbin, China;4. College of Engineering and Science, Victoria University, Melbourne, Australia
Abstract:Most existing work on data stream classification assumes that the streaming data is precise and definite. This assumption, however, does not always hold in practice, since data uncertainty is ubiquitous in data stream applications due to imprecise measurement, missing values, privacy protection, etc. The goal of this paper is to learn accurate decision tree models from uncertain data streams for classification analysis. Building on very fast decision tree (VFDT) algorithms, we propose an algorithm for constructing an uncertain VFDT with classifiers at the tree leaves (uVFDTc). The uVFDTc algorithm exploits uncertain information effectively and efficiently in both the learning and the classification phases. In the learning phase, it uses Hoeffding bound theory to learn from uncertain data streams and to yield decision trees quickly and with reasonable accuracy. In the classification phase, it uses uncertain naive Bayes (UNB) classifiers at the tree leaves to improve classification performance. Experimental results on both synthetic and real-life datasets demonstrate the strong ability of uVFDTc to classify uncertain data streams. The use of UNB at the tree leaves improves the performance of uVFDTc, in particular its any-time property, the benefit gained from exploiting uncertain information, and its robustness against uncertainty.
Keywords:uncertain data streams  very fast decision tree  functional tree leaf  uncertain numerical attributes
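
Note: the abstract refers to Hoeffding bound theory without stating the bound. As a rough illustration only, the sketch below shows the standard Hoeffding bound and the split test used in the classical VFDT algorithm; it is not the uVFDTc procedure itself, whose handling of uncertain data is not described here. The function names, parameter names, and example values are assumptions made for illustration.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Standard Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range: range R of the split measure (hypothetical parameter name).
    delta: allowed probability that the true mean lies outside the bound.
    n: number of observations seen so far at the leaf.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Classical VFDT test: split once the observed gap between the two
    best attributes exceeds the Hoeffding bound (illustrative helper)."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Example values are invented for illustration.
    print(hoeffding_bound(1.0, 1e-7, 1000))
    print(should_split(0.30, 0.15, 1.0, 1e-7, 1000))
```

The bound shrinks as more examples arrive, which is why a leaf can commit to a split after seeing only part of the stream.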