A distance-relatedness dynamic model for clustering high dimensional data of arbitrary shapes and densities
Authors: Noha A Yousri, Mohamed S Kamel
Affiliation: a Computers and System Engineering, University of Alexandria, Egypt
b Electrical and Computer Engineering, University of Waterloo, Ontario, Canada
Abstract: It is important to find the natural clusters in high dimensional data, where visualization becomes difficult. A natural cluster is a cluster of any shape and density; it should not be restricted to a globular shape, as many algorithms assume, or to a specific user-defined density, as some density-based algorithms require. This work proposes to solve the problem by maximizing the relatedness of distances between patterns in the same cluster, which makes it possible to distinguish clusters by their distance-based densities. A novel dynamic model is proposed, based on new distance-relatedness measures and clustering criteria. The proposed algorithm, “Mitosis”, is able to discover clusters of arbitrary shapes and arbitrary densities in high dimensional data. Its computational complexity compares favorably with that of related algorithms. It performs very well on high dimensional data, discovering clusters that cannot be found by known algorithms. It also identifies outliers in the data as a by-product of the cluster formation process. A validity measure that depends on the main clustering criterion is also proposed for tuning the algorithm's parameters. The theoretical basis of the algorithm and its steps are presented, and its performance is illustrated by comparing it with related algorithms on several data sets.
Keywords: Clustering; Dynamic model; Arbitrary shaped clusters; Arbitrary density clusters; High dimensional data; Distance-relatedness
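
The abstract does not give the distance-relatedness measures or the steps of Mitosis, so the following is only a toy illustration of the general idea it describes: linking two patterns when the distance between them is consistent with the distances each sees in its own neighbourhood. It is not the published algorithm. The function names, the neighbourhood size `m`, the factor `k`, and the linking rule itself are all assumptions made for this sketch.

```python
import math
from itertools import combinations


def local_scale(points, i, m):
    """Mean distance from point i to its m nearest neighbours (hypothetical measure)."""
    dists = sorted(math.dist(points[i], q) for j, q in enumerate(points) if j != i)
    nearest = dists[:m]
    return sum(nearest) / len(nearest)


def relatedness_clusters(points, m=2, k=2.0):
    """Toy clustering: link i and j when d(i, j) <= k * min(scale_i, scale_j).

    Assumed rule for illustration only, not the criterion used by Mitosis.
    Returns (clusters, outliers) as lists of point indices.
    """
    n = len(points)
    scales = [local_scale(points, i, m) for i in range(n)]
    parent = list(range(n))  # union-find forest over point indices

    def find(i):
        # Find the root of i, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        d = math.dist(points[i], points[j])
        # The smaller of the two local scales decides, so a dense point is
        # not absorbed by a sparse neighbour whose own scale is large.
        if d <= k * min(scales[i], scales[j]):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    clusters = [g for g in groups.values() if len(g) > 1]
    outliers = [g[0] for g in groups.values() if len(g) == 1]
    return clusters, outliers


# A dense square, a sparse square, and one isolated point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (10.0, 10.0), (12.0, 10.0), (10.0, 12.0), (12.0, 12.0),
    (50.0, 50.0),
]
clusters, outliers = relatedness_clusters(points, m=2, k=2.0)
print(clusters, outliers)
```

Because the threshold is relative to each point's local scale rather than a single global radius, the dense and sparse squares in this example are each recovered as one cluster, and the isolated point is left unlinked, which loosely mirrors the abstract's remark that outliers fall out of cluster formation as a by-product.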
This article is indexed in ScienceDirect and other databases.