Rough subspace-based clustering ensemble for categorical data |
| |
Authors: | Can Gao, Witold Pedrycz, Duoqian Miao |
| |
Affiliation: | 1. Department of Computer Science and Technology, Tongji University, Shanghai, 201804, People’s Republic of China; 2. Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, T6G 2G7, Canada; 3. System Research Institute, Polish Academy of Sciences, Warsaw, Poland |
| |
Abstract: | Clustering categorical data, an important problem in data mining, has recently attracted much attention. In this paper, the problem of unsupervised dimensionality reduction for categorical data is first studied. Based on the theory of rough sets, the attributes of categorical data are decomposed into a number of rough subspaces. A novel clustering ensemble algorithm based on rough subspaces is then proposed to deal with categorical data. The algorithm uses a selection of high-quality rough subspaces to cluster the data and yields a robust and stable solution by exploiting the resulting partitions. We also introduce a cluster index to evaluate the solutions of clustering algorithms for categorical data. Experimental results on selected UCI data sets show that the proposed method produces better results than other methods when evaluated in terms of cluster validity indexes. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|