Implementation and Evaluation of Decision Trees with Range and Region Splitting
Authors:Yasuhiko Morimoto  Takeshi Fukuda  Shinichi Morishita  Takeshi Tokuyama
Affiliation:(1) IBM Tokyo Research Laboratory, 1623-14, Shimo-tsuruma, Yamato City, Kanagawa Pref, 242, JAPAN
Abstract:
We propose an extension of an entropy-based heuristic for constructing a decision tree from a large database with many numeric attributes. When it comes to handling numeric attributes, conventional methods are inefficient if any numeric attributes are strongly correlated. Our approach offers one solution to this problem. For each pair of numeric attributes with strong correlation, we compute a two-dimensional association rule with respect to these attributes and the objective attribute of the decision tree. In particular, we consider a family ℛ of grid-regions in the plane associated with the pair of attributes. For R ∈ ℛ, the data can be split into two classes: data inside R and data outside R. We compute the region R_opt ∈ ℛ that minimizes the entropy of the splitting, and add the splitting associated with R_opt (for each pair of strongly correlated attributes) to the set of candidate tests in an entropy-based heuristic. We give efficient algorithms for cases in which ℛ is the family of (1) x-monotone connected regions, (2) based-monotone regions, (3) rectangles, and (4) rectilinear convex regions. The algorithm has been implemented as a subsystem of SONAR (System for Optimized Numeric Association Rules) developed by the authors. We have confirmed that we can compute the optimal region efficiently. Diverse experiments show that our approach can create compact trees whose accuracy is comparable with or better than that of conventional trees. More importantly, we can grasp non-linear correlations among numeric attributes that could not be found without our region splitting.
Keywords:decision trees  multivariate tests  range splitting  region splitting
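
To make the idea of region splitting concrete, the following is a minimal illustrative sketch for the rectangle case only, assuming a binary objective attribute and a small grid of per-cell counts. It is a brute-force search over all rectangles and is not the authors' efficient algorithm or any part of SONAR; all names (`entropy`, `split_entropy`, `best_rectangle`, `pos_grid`, `neg_grid`) are hypothetical and chosen for this example.

```python
import math


def entropy(pos, neg):
    """Binary entropy (in bits) of a set with `pos` positive and `neg` negative records."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def split_entropy(in_pos, in_neg, all_pos, all_neg):
    """Weighted entropy of splitting the data into 'inside R' and 'outside R'."""
    n = all_pos + all_neg
    n_in = in_pos + in_neg
    out_pos, out_neg = all_pos - in_pos, all_neg - in_neg
    return (n_in / n) * entropy(in_pos, in_neg) + ((n - n_in) / n) * entropy(out_pos, out_neg)


def prefix_sums(grid):
    """2D prefix sums so that any rectangle's count is available in O(1)."""
    rows, cols = len(grid), len(grid[0])
    s = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            s[i + 1][j + 1] = grid[i][j] + s[i][j + 1] + s[i + 1][j] - s[i][j]
    return s


def rect_sum(s, r0, c0, r1, c1):
    """Sum of grid cells in rows r0..r1 and columns c0..c1 (inclusive)."""
    return s[r1 + 1][c1 + 1] - s[r0][c1 + 1] - s[r1 + 1][c0] + s[r0][c0]


def best_rectangle(pos_grid, neg_grid):
    """Exhaustively find the rectangle (r0, c0, r1, c1) minimizing the split entropy.

    Assumption: brute force, O(rows^2 * cols^2); intended only for tiny grids.
    Rectangles that leave one side of the split empty are skipped.
    """
    rows, cols = len(pos_grid), len(pos_grid[0])
    sp, sn = prefix_sums(pos_grid), prefix_sums(neg_grid)
    all_pos, all_neg = sp[rows][cols], sn[rows][cols]
    n = all_pos + all_neg
    best_value, best_region = math.inf, None
    for r0 in range(rows):
        for r1 in range(r0, rows):
            for c0 in range(cols):
                for c1 in range(c0, cols):
                    in_pos = rect_sum(sp, r0, c0, r1, c1)
                    in_neg = rect_sum(sn, r0, c0, r1, c1)
                    if in_pos + in_neg in (0, n):
                        continue  # not a proper two-way split
                    value = split_entropy(in_pos, in_neg, all_pos, all_neg)
                    if value < best_value:
                        best_value, best_region = value, (r0, c0, r1, c1)
    return best_region, best_value


# Toy data (invented for illustration): positives sit only in the centre cell.
pos_grid = [[0, 0, 0], [0, 5, 0], [0, 0, 0]]
neg_grid = [[2, 2, 2], [2, 0, 2], [2, 2, 2]]
region, value = best_rectangle(pos_grid, neg_grid)
```

In this toy grid no single-attribute threshold separates the two classes cleanly, whereas a rectangle over both attributes does, which is the kind of correlation a region split is meant to capture.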
This article is indexed in SpringerLink and other databases.