The use of vicinal-risk minimization for training decision trees |
| |
Affiliation: | 1. Sidra Medical and Research Center, Translational Medicine, Qatar Foundation, P.O. Box 26999, Doha, Qatar; 2. Qatar Biomedical Research Institute, Hamad Bin Khalifa University, Qatar Foundation, P.O. Box 5825, Doha, Qatar; 3. Department of Biological and Environmental Sciences, College of Arts and Sciences, Qatar University, Doha, Qatar; 4. Department of Biomedical Sciences, College of Health Sciences, Qatar University, Doha, Qatar; 1. Computer Science, Jamia Hamdard, India; 2. Electrical Engineering Department, IIT Delhi, India; 1. Institute of Cardiology, Cardiothoracovascular Department, Ferrarotto Hospital, University of Catania, Catania, Italy; 2. Excellence Through Newest Advances Foundation, Catania, Italy; 1. Instituto Tecnológico de Hermosillo, Av. Tecnológico y Periférico Poniente s/n, 83170, Mexico; 2. Instituto Tecnológico de Tijuana, Calz. del Tecnológico s/n, Tomas Aquino, 22414, Mexico |
| |
Abstract: | We propose using Vapnik's vicinal risk minimization (VRM) to train decision trees so that decision margins are approximately maximized. We implement VRM by propagating uncertainties in the input attributes into the labeling decisions, which performs a global regularization over the decision tree structure. During training, a decision tree is constructed to minimize the total probability of misclassifying the labeled training examples, a process that approximately maximizes the margins of the resulting classifier. We carry out the minimization with a metaheuristic (genetic programming) and present results on a range of synthetic and real benchmark datasets. Across these datasets, VRM training is statistically superior to both conventional empirical risk minimization (ERM) and the well-known C4.5 algorithm, while we find no statistically significant difference between trees trained by ERM and those trained by C4.5. Training with VRM is also shown to be more stable and repeatable than training with ERM. |
| |
Keywords: | Decision trees; Vicinal-risk minimization; Classification |
This article is indexed in ScienceDirect and other databases. |
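The abstract's central idea, propagating attribute uncertainty into the labeling decision and minimizing the total misclassification probability, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the vicinity of each training point is modeled as independent Gaussian noise with standard deviation `sigma` per attribute, a tree is represented by its leaves as axis-aligned boxes, and the names `normal_cdf`, `leaf_probability`, `vicinal_risk`, `empirical_risk`, and `stump` are hypothetical. The genetic-programming search over tree structures is omitted; only the two objective functions are shown.

```python
import math

INF = float("inf")


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def leaf_probability(x, sigma, lower, upper):
    """Probability that x, perturbed by independent Gaussian noise
    (an assumed vicinity model), lands in the box (lower, upper]."""
    p = 1.0
    for xj, sj, lo, hi in zip(x, sigma, lower, upper):
        p *= normal_cdf((hi - xj) / sj) - normal_cdf((lo - xj) / sj)
    return p


def vicinal_risk(leaves, X, y, sigma):
    """Mean probability of misclassifying each labeled example."""
    total = 0.0
    for x, label in zip(X, y):
        # Probability mass that falls into leaves carrying the true label.
        p_correct = sum(
            leaf_probability(x, sigma, lower, upper)
            for lower, upper, leaf_label in leaves
            if leaf_label == label
        )
        total += 1.0 - p_correct
    return total / len(X)


def empirical_risk(leaves, X, y):
    """Conventional 0/1 training error of the same tree."""
    errors = 0
    for x, label in zip(X, y):
        for lower, upper, leaf_label in leaves:
            if all(lo < xj <= hi for xj, lo, hi in zip(x, lower, upper)):
                errors += leaf_label != label
                break
    return errors / len(X)


def stump(threshold):
    """One-attribute tree: class 0 at or below the threshold, class 1 above."""
    return [([-INF], [threshold], 0), ([threshold], [INF], 1)]


# Toy one-dimensional data: class 0 near 0-1, class 1 near 3-4.
X = [[0.0], [1.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
sigma = [0.5]  # assumed attribute uncertainty

risk_mid = vicinal_risk(stump(2.0), X, y, sigma)   # threshold mid-gap
risk_edge = vicinal_risk(stump(1.1), X, y, sigma)  # threshold hugging class 0

print("ERM:", empirical_risk(stump(2.0), X, y), empirical_risk(stump(1.1), X, y))
print("VRM:", risk_mid, risk_edge)
```

Both thresholds separate the toy data perfectly, so the empirical risk cannot distinguish them. The vicinal risk is lower for the threshold placed in the middle of the gap, which is the sense in which minimizing it approximately maximizes the margin.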
|