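A Fast Density-Based Clustering Algorithm
Cite this article: Zhou Shui-Geng, Zhou Ao-Ying, Cao Jing, Hu Yun-Fa. A fast density-based clustering algorithm [J]. Journal of Computer Research and Development, 2000, 37(11): 1287-1292.
Authors: Zhou Shui-Geng, Zhou Ao-Ying, Cao Jing, Hu Yun-Fa
Affiliation: Department of Computer Science, Fudan University, Shanghai 200433, China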
Funding: National Natural Science Foundation of China (grant No. 69743001); Doctoral Program Education Fund of the State Education Commission
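Abstract: Clustering is an important research direction in data mining, and clustering techniques are widely used in statistical data analysis, pattern recognition, image processing, and other fields. Many clustering algorithms for large-scale databases have been proposed to date, and the density-based algorithm DBSCAN is a typical representative. Building on DBSCAN, this paper proposes a fast density-based clustering algorithm. The new algorithm expands a cluster using, as seed objects, only representative objects chosen from among all objects in a core object's neighborhood, which reduces the number of region queries, lowers I/O cost, and yields fast clustering. Tests on two-dimensional spatial data show that the fast algorithm clusters large-scale databases effectively and runs several times faster than the existing DBSCAN algorithm.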

Keywords: data mining; clustering; density; fast algorithm; database

A FAST DENSITY-BASED CLUSTERING ALGORITHM
ZHOU Shui-Geng,ZHOU Ao-Ying,CAO Jing,HU Yun-Fa.A FAST DENSITY-BASED CLUSTERING ALGORITHM[J].Journal of Computer Research and Development,2000,37(11):1287-1292.
Authors:ZHOU Shui-Geng  ZHOU Ao-Ying  CAO Jing  HU Yun-Fa
Abstract: Clustering has promising applications in many fields, including data mining, statistical data analysis, pattern recognition, and image processing. In this paper, a fast density-based clustering algorithm is developed that considerably speeds up the original DBSCAN algorithm. Unlike DBSCAN, the new algorithm uses only a small number of representative objects in a core object's neighborhood as seeds to expand the cluster, so that fewer region queries are executed and the I/O cost is consequently reduced. Experimental results show that the new algorithm is effective and efficient in clustering large-scale databases, and that it is several times faster than the original DBSCAN.
Keywords:spatial database  data mining  clustering  density  fast algorithm  representative objects  
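
The seed-selection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how representative objects are chosen, so the rule used here (the extreme neighbors along each coordinate axis) is an assumption, as are the names `region_query`, `representatives`, and `fast_dbscan`, the union-find step that joins provisional clusters, and the linear scan standing in for an index-backed region query. Because only a few neighbors are queried, the result may differ from exact DBSCAN on some inputs.

```python
from math import dist

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (plain linear scan)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def representatives(points, neighbors):
    """ASSUMED rule (not from the paper): extreme neighbors along each axis."""
    reps = []
    for axis in range(len(points[neighbors[0]])):
        for pick in (min, max):
            j = pick(neighbors, key=lambda k: points[k][axis])
            if j not in reps:
                reps.append(j)
    return reps

def fast_dbscan(points, eps, min_pts):
    """Return (labels, number_of_region_queries); label -1 marks noise."""
    labels = [None] * len(points)   # None = not yet assigned
    is_core = {}                    # index -> bool, one entry per region query
    parent = []                     # union-find over provisional cluster ids

    def find(c):
        while parent[c] != c:
            c = parent[c]
        return c

    for start in range(len(points)):
        if labels[start] is not None:
            continue
        cid = len(parent)           # provisional id for this expansion
        parent.append(cid)
        seeds = [start]
        while seeds:
            s = seeds.pop()
            if s not in is_core:
                neighbors = region_query(points, s, eps)
                is_core[s] = len(neighbors) >= min_pts
                if is_core[s]:
                    # Absorb the whole neighborhood, but query only a few of it.
                    for j in neighbors:
                        if labels[j] is None or labels[j] == NOISE:
                            labels[j] = cid
                    seeds.extend(representatives(points, neighbors))
                elif labels[s] is None:
                    labels[s] = NOISE
            # A core seed carrying another id links the two provisional clusters.
            if is_core[s] and find(labels[s]) != find(cid):
                parent[find(labels[s])] = find(cid)
    labels = [NOISE if c == NOISE else find(c) for c in labels]
    return labels, len(is_core)

# Usage: two 5x5 grids far apart plus one isolated point.
blob_a = [(x, y) for x in range(5) for y in range(5)]
blob_b = [(x + 100, y + 100) for x, y in blob_a]
points = blob_a + blob_b + [(50, 50)]
labels, queries = fast_dbscan(points, eps=1.5, min_pts=4)
print(len(set(labels) - {NOISE}), "clusters,", queries, "region queries for", len(points), "points")
```

Plain DBSCAN issues one region query per object, so the query count returned here gives a rough sense of the saving; on a tiny grid with small neighborhoods the saving is slight, and it should grow as neighborhoods hold more objects per representative.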
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.