
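An Efficient Dynamic Conceptual Clustering Algorithm for Data Mining
Cite this article: GUO Jian-sheng, ZHAO Yi, SHI Peng-fei. An Efficient Dynamic Conceptual Clustering Algorithm for Data Mining[J]. Journal of Software (软件学报), 2001, 12(4): 582-591
Authors: GUO Jian-sheng, ZHAO Yi, SHI Peng-fei
Affiliation: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University
Funding: National Natural Science Foundation of China (grant No. 69835010)
Abstract: Conceptual clustering is suited to data mining tasks in which domain knowledge is incomplete or lacking. This paper defines a semantics-based distance criterion function and uses domain knowledge to conceptualize continuous attribute values. For data objects described by a mixture of categorical and numerical attributes, it proposes a dynamic conceptual clustering algorithm, DDCA (domain-based dynamic clustering algorithm). The algorithm determines the number of clusters automatically, revises each cluster center according to how frequently attribute values occur within the cluster, and, through concept induction, explains the clustering output with conjunctive concept expressions. The study shows that DDCA, which combines the semantic distance criterion function with domain-knowledge-based dynamic conceptual clustering, is effective.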

Keywords: data mining; dynamic conceptual clustering; semantic distance; domain knowledge
Received: 1999-07-27
Revised: 2000-02-01

An Efficient Dynamic Conceptual Clustering Algorithm for Data Mining
GUO Jian-sheng, ZHAO Yi and SHI Peng-fei. An Efficient Dynamic Conceptual Clustering Algorithm for Data Mining[J]. Journal of Software, 2001, 12(4): 582-591
Authors:GUO Jian-sheng  ZHAO Yi  SHI Peng-fei
Abstract: Conceptual clustering is well suited to discovering knowledge in databases when domain background information is incomplete or absent. The original conceptual clustering methods have difficulty handling data objects described by numerical attribute values. This paper proposes a new criterion function based on semantic distance and presents a novel domain-based dynamic conceptual clustering algorithm (DDCA). By discretizing continuous attribute values, DDCA works well on datasets described by a mix of numerical and categorical attributes. The algorithm automatically determines the number of clusters, updates each cluster center according to the frequency of attribute values within the cluster, and interprets the clustering output with conjunctive concept expressions. Experiments demonstrate that the semantic-distance criterion function and the dynamic conceptual clustering algorithm are effective and efficient.
Keywords: data mining; dynamic conceptual clustering; semantic distance; domain knowledge
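
The abstract names four ingredients: conceptualizing numeric values with domain knowledge, a distance over mixed attributes, an automatically determined number of clusters, and cluster centers revised by attribute-value frequency. The Python sketch below illustrates how such pieces could fit together. It is not the authors' DDCA: the abstract does not define the semantic distance, so a plain attribute-mismatch ratio stands in for it, a single-pass threshold rule stands in for the dynamic cluster creation, and every name here (`AGE_CONCEPTS`, `conceptualize`, `mismatch_distance`, `modal_center`, `dynamic_cluster`, `describe`, the `threshold` parameter) is a hypothetical chosen for illustration.

```python
from collections import Counter

# Hypothetical domain knowledge: upper bounds of numeric ranges and the
# concept label assigned to each range (an assumption, not from the paper).
AGE_CONCEPTS = [(30, "young"), (55, "middle-aged"), (float("inf"), "senior")]


def conceptualize(value, bins):
    """Map a numeric value to the first concept whose upper bound exceeds it."""
    for upper, label in bins:
        if value < upper:
            return label
    raise ValueError(f"no concept covers {value!r}")


def mismatch_distance(obj, center):
    """Stand-in distance: fraction of attributes whose values differ."""
    return sum(a != b for a, b in zip(obj, center)) / len(center)


def modal_center(members):
    """Cluster center built from the most frequent value of each attribute."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*members))


def dynamic_cluster(objects, threshold):
    """Single pass: join the nearest cluster if close enough, else open a new one.

    The number of clusters is not given in advance; it follows from the data
    and the threshold.
    """
    centers, clusters = [], []
    for obj in objects:
        best = None
        if centers:
            dists = [mismatch_distance(obj, c) for c in centers]
            best = min(range(len(centers)), key=dists.__getitem__)
        if best is not None and dists[best] <= threshold:
            clusters[best].append(obj)
            centers[best] = modal_center(clusters[best])  # frequency-based update
        else:
            clusters.append([obj])
            centers.append(obj)
    return centers, clusters


def describe(center, names):
    """Render a cluster center as a conjunction of attribute-value conditions."""
    return " AND ".join(f"{n} = {v}" for n, v in zip(names, center))


raw = [(25, "red"), (28, "red"), (60, "blue"), (70, "blue"), (27, "blue")]
objects = [(conceptualize(age, AGE_CONCEPTS), color) for age, color in raw]
centers, clusters = dynamic_cluster(objects, threshold=0.5)
for center in centers:
    print(describe(center, ["age", "color"]))
```

Because the result of a single pass depends on the order of the input and on the threshold, a fuller implementation would likely iterate or merge clusters afterwards; the sketch omits that step.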
This article is indexed in the VIP (维普) and Wanfang Data (万方数据) databases, among others.