Efficient categorical data clustering method

Cite this article:	LUO Ke, HONG Liang-liang, TONG Xiao-jiao. Efficient categorical data clustering method[J]. Control and Decision, 2011, 26(10): 1542-1544.

Authors:	LUO Ke, HONG Liang-liang, TONG Xiao-jiao

Affiliation:	School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114

Funding:	National Natural Science Foundation of China (10926189, 10871031); Hunan Provincial Natural Science Hengyang Joint Fund (10JJ8008)

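Abstract:	Because the traditional K-means clustering algorithm can only handle numerical data, this paper extends K-means to the categorical data domain and proposes a clustering method for categorical data. Using the data distribution information associated with each value of each categorical attribute, and combining the vertical and horizontal distributions of the data to evaluate the dissimilarity between a data object and a cluster, a new distance measure is defined. The method can discover the intrinsic relationships between different values of the same attribute and can effectively measure the dissimilarity between objects. The proposed algorithm is validated on data sets from the UCI repository, and the experimental results show that it achieves good clustering performance.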
Keywords:	cluster analysis; categorical data; dissimilarity; domain value; co-occurrence
Received:	2010/6/21
Revised:	2010/9/2
Efficient categorical data clustering method

LUO Ke, HONG Liang-liang, TONG Xiao-jiao. Efficient categorical data clustering method[J]. Control and Decision, 2011, 26(10): 1542-1544.

Authors:	LUO Ke, HONG Liang-liang, TONG Xiao-jiao

Affiliation:	LUO Ke, HONG Liang-liang, TONG Xiao-jiao (Institute of Computer and Communication Engineering, Changsha University of Sciences and Technology, Changsha 410114, China)

Abstract:	The traditional K-means clustering algorithm handles only numerical data. Therefore, a categorical data clustering method is proposed by extending the K-means algorithm to the categorical data domain. Based on the data distribution information associated with each value of each categorical attribute, and combined with the vertical and horizontal distribution of the data to measure the difference between a data object and a class, a new distance metric is defined. This method can find t...

Keywords:	cluster analysis; categorical data; dissimilarity; domain value; co-occurrence
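
The abstract describes a K-means-style procedure in which the distance between an object and a cluster is derived from how attribute values are distributed within that cluster. The exact metric is not given on this page, so the sketch below is only an illustration of the general idea under a stated assumption: it uses a plain frequency-based dissimilarity (one minus the share of cluster members that carry the object's value, summed over attributes) and does not model the value co-occurrence mentioned in the keywords. The names `cluster_profile`, `dissimilarity`, and `categorical_kmeans` are hypothetical and do not come from the paper.

```python
import random
from collections import Counter


def cluster_profile(rows):
    """Per-attribute relative frequency of each value within one cluster."""
    n = len(rows)
    n_attrs = len(rows[0])
    return [
        {v: c / n for v, c in Counter(r[j] for r in rows).items()}
        for j in range(n_attrs)
    ]


def dissimilarity(obj, profile):
    """Assumed stand-in metric, not the paper's: for each attribute,
    1 minus the share of cluster members holding the object's value."""
    return sum(1.0 - profile[j].get(v, 0.0) for j, v in enumerate(obj))


def categorical_kmeans(data, k, max_iter=20, seed=0):
    """K-means-style loop over tuples of categorical values."""
    rng = random.Random(seed)
    # Start from k sampled objects, each treated as a one-member cluster.
    profiles = [cluster_profile([s]) for s in rng.sample(data, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest cluster under the frequency dissimilarity.
        new_labels = [
            min(range(k), key=lambda c: dissimilarity(x, profiles[c]))
            for x in data
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute value frequencies for each cluster.
        for c in range(k):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:  # an emptied cluster keeps its previous profile
                profiles[c] = cluster_profile(members)
    return labels, profiles
```

A cluster is represented here by its full value-frequency table rather than by a single mode, so an object's distance reflects how common its values are among all cluster members. Results depend on the random starting objects, so several seeds would normally be tried.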
This article is indexed in CNKI, Wanfang Data, and other databases.