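A New Unsupervised Discretization Method for Continuous Attributes
Cite this article: HUA Haiyang, ZHAO Huaici. A new unsupervised discretization method for continuous attributes[J]. Computer Engineering and Applications, 2011, 47(6): 208-211.
Authors: HUA Haiyang  ZHAO Huaici
Affiliation: Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016
Abstract: This paper proposes an unsupervised discretization algorithm for continuous attributes based on clustering, called CAMNA (Clustering and Merging on Numerical Attribute). CAMNA first partitions the value range of a numerical attribute into multiple discrete intervals through a clustering process, then merges adjacent intervals under the guidance of class-distribution information to arrive at the desired discretization scheme. Experiments show that, while keeping execution efficiency high, the algorithm yields more reasonable discretization results, produces structurally simpler decision trees with fewer classification rules, and also improves classification accuracy.

Keywords: decision tree  numerical attributes  clustering intervals  classification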

New discretization method for numerical attributes based on clustering and merging
HUA Haiyang, ZHAO Huaici. New discretization method for numerical attributes based on clustering and merging[J]. Computer Engineering and Applications, 2011, 47(6): 208-211.
Authors: HUA Haiyang  ZHAO Huaici
Affiliation: Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
Abstract: This paper proposes CAMNA (Clustering and Merging on Numerical Attributes), a new unsupervised discretization algorithm based on clustering. In the first step, the method divides the values of a numerical attribute into many intervals by clustering. In the second step, it optimizes the cluster quality by examining the class labels of adjacent intervals, and it repeats this procedure until a satisfactory discretization scheme is reached. An experimental comparison with several discretization algorithms shows that the proposed algorithm is more efficient and generates a better discretization scheme. In the C4.5 output, the resulting tree is smaller, yields fewer classification rules, and attains higher classification accuracy.
Keywords: decision tree  numerical attributes  clustering intervals  classification
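
The two steps named in the abstract (cluster the attribute values into intervals, then merge adjacent intervals using class information) can be illustrated with a minimal sketch. This is not the authors' CAMNA implementation: the abstract does not specify the clustering method or the merge criterion, so the sketch assumes plain 1-D k-means for step one and "merge neighbours that share the same majority class" for step two. All function names (`kmeans_1d`, `cluster_intervals`, `merge_intervals`, `cut_points`, `discretize`) and the toy data are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def kmeans_1d(xs, k, iters=50):
    """Lloyd's k-means on sorted 1-D values; returns one cluster index per value."""
    lo, hi = xs[0], xs[-1]
    # Deterministic start: centers spread evenly over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    assign = []
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda c: abs(v - centers[c])) for v in xs]
        if new_assign == assign:  # converged
            break
        assign = new_assign
        for c in range(k):
            members = [v for v, a in zip(xs, assign) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assign


def cluster_intervals(values, labels, k):
    """Step 1: cluster the values and record each cluster as an interval."""
    pairs = sorted(zip(values, labels))
    assign = kmeans_1d([v for v, _ in pairs], k)
    intervals = []
    for (v, y), a in zip(pairs, assign):
        if intervals and intervals[-1]["cluster"] == a:
            intervals[-1]["high"] = v
            intervals[-1]["counts"][y] += 1
        else:
            intervals.append({"cluster": a, "low": v, "high": v,
                              "counts": Counter([y])})
    return intervals


def majority(interval):
    """Most frequent class label inside an interval."""
    return interval["counts"].most_common(1)[0][0]


def merge_intervals(intervals):
    """Step 2 (assumed criterion): merge neighbours with the same majority class."""
    merged = []
    for iv in intervals:
        if merged and majority(merged[-1]) == majority(iv):
            merged[-1]["high"] = iv["high"]
            merged[-1]["counts"] += iv["counts"]
        else:
            merged.append({"low": iv["low"], "high": iv["high"],
                           "counts": Counter(iv["counts"])})
    return merged


def cut_points(merged):
    """Place each cut halfway between two neighbouring intervals."""
    return [(a["high"] + b["low"]) / 2 for a, b in zip(merged, merged[1:])]


def discretize(value, cuts):
    """Map a numeric value to the index of its discrete interval."""
    return bisect_right(cuts, value)


# Toy data (made up for illustration): four value groups, two classes.
values = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0, 21.0, 22.0, 23.0, 31.0, 32.0, 33.0]
labels = list("aaaaabbbbbba")
merged = merge_intervals(cluster_intervals(values, labels, k=4))
cuts = cut_points(merged)
print(cuts)  # -> [17.0]
```

On this toy data the four clusters collapse into two intervals, because the first two clusters are dominated by class `a` and the last two by class `b`. Fewer intervals of this kind are what would give a downstream decision tree such as C4.5 fewer branches to split on.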
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.