Rough-DBSCAN: A fast hybrid density based clustering method for large data sets
Authors:P Viswanath  V Suresh Babu  
Affiliation:(a) Pattern Recognition Research Lab, Department of Computer Science and Engineering, NRI Institute of Technology, Guntur 522 009, Andhra Pradesh, India; (b) Institute for Research in Applicable Computing, Department of Computing and Information Systems, University of Bedfordshire, Luton Campus, Park Square, Luton, LU1 3JU, UK
Abstract:Density based clustering techniques like DBSCAN are attractive because they can find arbitrarily shaped clusters along with noisy outliers. However, DBSCAN's time requirement is O(n^2), where n is the size of the dataset, which makes it unsuitable for large datasets. The solution proposed in the paper is to first apply the leaders clustering method to derive prototypes, called leaders, from the dataset in a way that also preserves density information, and then to use these leaders to derive the density based clusters. The proposed hybrid clustering technique, called rough-DBSCAN, has a time complexity of only O(n) and is analyzed using rough set theory. Experimental studies on both synthetic and real world datasets compare rough-DBSCAN with DBSCAN. They show that for large datasets rough-DBSCAN finds a clustering similar to that found by DBSCAN but is consistently faster. Some properties of the leaders as prototypes are also formally established.
Keywords:Clustering  Density based clustering  DBSCAN  Leaders  Rough sets
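
The first stage mentioned in the abstract, deriving leaders that also carry density information, can be illustrated with a minimal sketch of the standard single-scan leaders method. This is not the authors' implementation: the function name `leaders`, the threshold parameter `tau`, the use of Euclidean distance, and the choice to store density information as a follower count per leader are all assumptions made for illustration. The abstract does not spell out how the second, density based stage uses the leaders, so that stage is not shown.

```python
import math


def leaders(points, tau):
    """Single-scan leaders clustering (illustrative sketch).

    Each point joins the first existing leader within distance tau
    (Euclidean distance, an assumption); otherwise it becomes a new leader.
    Returns (leader, count) pairs, where count is the number of points
    represented by that leader, including the leader itself.
    """
    found = []  # each entry is [leader_point, follower_count]
    for p in points:
        for entry in found:
            if math.dist(p, entry[0]) <= tau:
                entry[1] += 1  # p is absorbed by this leader
                break
        else:
            found.append([p, 1])  # no leader close enough: p leads a new group
    return [(tuple(leader), count) for leader, count in found]


# Hypothetical toy data: two well separated groups of points.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0)]
result = leaders(data, tau=0.5)
print(result)
```

Each point is compared only against the current leaders, so the dataset is scanned once; the counts are one simple way to retain how many original points each prototype stands for.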
This article is indexed in ScienceDirect and other databases.