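Latent Attribute Space Tree Classifiers (潜在属性空间树分类器)
Cite this article: HE Ping, XU Xiao-Hua, CHEN Ling. Latent Attribute Space Tree Classifiers[J]. Journal of Software (软件学报), 2009, 20(7): 1735-1745.
Authors: HE Ping (何萍), XU Xiao-Hua (徐晓华), CHEN Ling (陈崚)
Affiliations: 1. Department of Computer Science and Engineering, College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210016
2. Department of Computer Science and Engineering, College of Information Engineering, Yangzhou University, Yangzhou, Jiangsu 225009
3. Department of Computer Science and Engineering, College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, Jiangsu 210016; Department of Computer Science and Engineering, College of Information Engineering, Yangzhou University, Yangzhou, Jiangsu 225009
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60673060; the Natural Science Foundation of Jiangsu Province of China under Grant No. BK2008206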
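Abstract (translated from the Chinese): A latent attribute space tree classifier (LAST) framework is proposed. By transforming the original attribute space into a latent attribute space in which the data are easier to separate, or which better suits the characteristics of decision tree classification, LAST overcomes the decision-boundary limitations of traditional decision tree algorithms and improves the generalization performance of tree classifiers. Within the LAST framework, two SVD (singular value decomposition) oblique decision tree (SODT) algorithms are proposed. They perform SVD on global or local data to construct an orthogonal latent attribute space, then build a traditional univariate decision tree or tree nodes in that space, and thereby indirectly obtain an approximately optimal oblique decision tree in the original space. SODT can handle datasets whose global and local data distributions are either the same or different, and it can make full use of the structural information in both labelled and unlabelled data. Its classification results are unaffected by random reordering of the samples, and its time complexity is the same as that of univariate decision tree algorithms. Experimental results on complex datasets show that, compared with traditional univariate decision tree algorithms and other oblique decision tree algorithms, SODT achieves higher classification accuracy, more stable decision tree size, and more robust overall classification performance; its tree-construction time is close to that of C4.5 and far shorter than that of the other oblique decision tree algorithms.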

Keywords: classification; decision tree; latent attribute space; singular value decomposition
Received: 2007/5/28
Revised: 3/6/2008

Latent Attribute Space Tree Classifiers
HE Ping, XU Xiao-Hua, CHEN Ling. Latent Attribute Space Tree Classifiers[J]. Journal of Software, 2009, 20(7): 1735-1745.
Authors: HE Ping, XU Xiao-Hua, CHEN Ling
Affiliation: Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China; Department of Computer Science and Engineering, Yangzhou University, Yangzhou 225009, China
Abstract: This paper proposes a framework for latent attribute space tree classifiers (LAST). LAST transforms data from the original attribute space into a latent attribute space in which the data are easier to separate or which better suits tree classifiers, thereby extending the decision boundaries available to a traditional decision tree and improving its generalization ability. Within the LAST framework, the paper presents two SVD (singular value decomposition) oblique decision tree (SODT) algorithms. SODT first performs SVD on global and/or local data to construct an orthogonal latent attribute space. It then builds a traditional univariate decision tree, or individual tree nodes, in that space. In this way SODT indirectly obtains an approximately optimal oblique decision tree in the original space. SODT can handle datasets whose global and local data distributions are similar or different, makes full use of the structural information in both labelled and unlabelled data, and produces the same classification results regardless of the order of the observations. In addition, its time complexity is identical to that of univariate decision tree algorithms. Experimental results show that, compared with the traditional univariate decision tree algorithm C4.5 and the oblique decision tree algorithms OC1 and CART-LC, SODT achieves higher classification accuracy and more stable decision tree size; its tree-construction time is comparable to that of C4.5 and much shorter than that of OC1 and CART-LC.
Keywords: classification; decision tree; latent attribute space; singular value decomposition
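
The core idea in the abstract (rotate the data into an orthogonal latent space obtained by SVD, then split on a single latent attribute) can be illustrated with a toy sketch. This is not the authors' SODT implementation. It makes several simplifying assumptions that the abstract does not specify: the data are two-dimensional, so the right singular vectors are computed in closed form from the 2 x 2 Gram matrix rather than with a general SVD routine such as `numpy.linalg.svd`; the data matrix is not centered; the "tree" is a single depth-1 split chosen by training accuracy rather than a C4.5-style criterion; and all function names are hypothetical.

```python
import math
from collections import Counter


def right_singular_vectors_2d(X):
    """Right singular vectors of an n x 2 data matrix (assumption: 2-D only).

    They are the eigenvectors of the 2 x 2 Gram matrix X^T X = [[a, b], [b, c]],
    whose dominant direction has angle 0.5 * atan2(2b, a - c).
    """
    a = sum(x * x for x, _ in X)
    b = sum(x * y for x, y in X)
    c = sum(y * y for _, y in X)
    theta = 0.5 * math.atan2(2.0 * b, a - c)
    v1 = (math.cos(theta), math.sin(theta))    # largest singular value
    v2 = (-math.sin(theta), math.cos(theta))   # orthogonal to v1
    return v1, v2


def to_latent(X, V):
    """Project every sample onto the latent axes in V."""
    return [tuple(x * v[0] + y * v[1] for v in V) for x, y in X]


def best_stump(Z, labels):
    """Best single univariate split by training accuracy (a simplification).

    Returns (accuracy, attribute index, threshold).
    """
    n = len(Z)
    best = (0.0, None, None)
    for j in range(len(Z[0])):
        values = sorted({z[j] for z in Z})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = Counter(l for z, l in zip(Z, labels) if z[j] <= t)
            right = Counter(l for z, l in zip(Z, labels) if z[j] > t)
            correct = max(left.values(), default=0) + max(right.values(), default=0)
            if correct / n > best[0]:
                best = (correct / n, j, t)
    return best


# Toy data: two classes lying on either side of the oblique line y = x.
X = [(t, t - 0.5) for t in range(1, 6)] + [(t, t + 0.5) for t in range(1, 6)]
y = [0] * 5 + [1] * 5

V = right_singular_vectors_2d(X)
Z = to_latent(X, V)

acc_orig, _, _ = best_stump(X, y)        # axis-parallel split in the original space
acc_latent, feat, thr = best_stump(Z, y)  # axis-parallel split in the latent space

# A split "z[feat] <= thr" in the latent space is the oblique split
# w[0]*x + w[1]*y <= thr in the original space, with w = V[feat].
w = V[feat]
print(f"original-space stump accuracy: {acc_orig:.2f}")
print(f"latent-space stump accuracy:   {acc_latent:.2f}")
print(f"equivalent oblique split: {w[0]:.3f}*x + {w[1]:.3f}*y <= {thr:.3f}")
```

On this toy set no single axis-parallel split in the original coordinates separates the classes, while one split on the second latent attribute does. Because the latent axes are linear combinations of the original attributes, that univariate split corresponds to an oblique hyperplane in the original space, which is the mechanism the abstract describes.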
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.