Approaches for scaling DBSCAN algorithm to large spatial databases

Authors:	Aoying Zhou, Shuigeng Zhou, Jing Cao, Ye Fan, Yunfa Hu

Affiliation:	(1) Department of Computer Science, Fudan University, 200433 Shanghai, P.R. China

Abstract:	The huge amount of information stored in databases owned by corporations (e.g., retail, financial, telecom) has spurred tremendous interest in knowledge discovery and data mining. In data mining, clustering is a useful technique for discovering interesting data distributions and patterns in the underlying data, with applications in statistical data analysis, pattern recognition, image processing, and business. Although researchers have worked on clustering for decades and many clustering algorithms have been developed, there is still no efficient algorithm for clustering very large databases and high-dimensional data. The DBSCAN algorithm, an outstanding representative of clustering algorithms, shows good performance in spatial data clustering. However, for large spatial databases, DBSCAN requires a large amount of memory and can incur substantial I/O costs because it operates directly on the entire database. In this paper, several approaches are proposed to scale the DBSCAN algorithm to large spatial databases. First, a fast DBSCAN algorithm is developed, which considerably speeds up the original DBSCAN algorithm. Then a sampling-based DBSCAN algorithm, a partitioning-based DBSCAN algorithm, and a parallel DBSCAN algorithm are introduced in turn. Following that, a synthetic algorithm based on the above-proposed algorithms is given. Finally, experimental results are presented to demonstrate the effectiveness and efficiency of these algorithms.
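
For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard DBSCAN procedure with a brute-force neighborhood scan, not the paper's fast, sampling-based, partitioning-based, or parallel variants. The function names (`region_query`, `dbscan`) and parameter names (`eps`, `min_pts`) are assumptions made for this sketch. Because every neighborhood query scans all points held in memory, it hints at why the original algorithm becomes costly on large spatial databases.

```python
import math

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet processed


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (full scan)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        # points[i] is a core point: start a new cluster and expand it.
        labels[i] = cluster
        seeds = list(neighbors)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also a core point
        cluster += 1
    return labels
```

Called on two tight groups of three points plus one distant point, with `eps=1.5` and `min_pts=3`, the sketch assigns the groups to two clusters and marks the distant point as noise.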

Keywords:	spatial database; clustering; fast DBSCAN algorithm; data sampling; data partitioning; parallel
Journal:	Journal of Computer Science and Technology (《计算机科学技术学报》)
Indexed in:	CNKI, VIP, Wanfang Data, SpringerLink, and other databases
