A hybrid decision tree training method using data streams |
| |
Authors: | Michal Wozniak |
| |
Affiliation: | 1. Department of Systems and Computer Networks, Faculty of Electronics, Wroclaw University of Technology, Wroclaw, Poland |
| |
Abstract: | Classical classification methods usually assume that pattern recognition models do not depend on the timing of the data. However,
this assumption is not valid in cases where new data frequently become available. Such situations are common in practice,
for example in spam filtering or fraud detection, where the dependencies between feature values and class labels are continually
changing. Unfortunately, most classical machine learning methods (such as decision trees) do not take into account the
possibility that the model changes as a result of so-called concept drift, and so they cannot adapt to a new classification model. This paper focuses on the problem of concept drift, which is a very
important issue, especially in data mining methods that use complex structures (such as decision trees) for making decisions.
We propose an algorithm that is able to co-train decision trees using a modified NGE (Nested Generalized Exemplar) algorithm. The proposed algorithm's capacity for adaptation, and the quality of that adaptation, are evaluated through computer
experiments carried out on benchmark datasets from the UCI Machine Learning Repository. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|