Robust projected clustering |
| |
Authors: | Gabriela Moise, Jörg Sander, Martin Ester |
| |
Affiliation: | (1) Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E8, Canada; (2) School of Computing Science, Simon Fraser University, Burnaby, BC, Canada |
| |
Abstract: | Projected clustering partitions a data set into several disjoint clusters, plus outliers, so that each cluster exists in a
subspace. Subspace clustering enumerates clusters of objects in all subspaces of a data set, and it tends to produce many
overlapping clusters. Such algorithms have been extensively studied for numerical data, but only a few have been proposed
for categorical data. Typical drawbacks of existing projected and subspace clustering algorithms for numerical or categorical
data are that they rely on parameters whose appropriate values are difficult to determine or that they are unable
to identify projected clusters with few relevant attributes. We present P3C, a robust algorithm for projected clustering that
can effectively discover projected clusters in the data while minimizing the number of required parameters. P3C does not need
the number of projected clusters as input, and can discover, under very general conditions, the true number of projected clusters.
P3C is effective in detecting very low-dimensional projected clusters embedded in high-dimensional spaces. P3C positions itself
between projected and subspace clustering in that it can compute either disjoint or overlapping clusters. P3C is the first projected
clustering algorithm for both numerical and categorical data. |
| |
Keywords: | Projected clustering; Subspace clustering; Clustering numerical and categorical data |
This article is indexed in SpringerLink and other databases. |
|