James-Stein shrinkage to improve k-means cluster analysis |
| |
Authors: | Jinxin Gao |
| |
Affiliation: | a Eli Lilly and Company, Indianapolis, IN, United States; b University of South Carolina, Department of Statistics, United States |
| |
Abstract: | We study a general algorithm that improves the accuracy of cluster analysis by applying James-Stein shrinkage within k-means clustering. At each iteration, the cluster centroids are shrunk toward the overall mean of the data using a James-Stein-type adjustment, and the resulting shrinkage estimators serve as the centroids for the next iteration; this repeats until convergence. We compare the shrinkage results with those of the traditional k-means method. A Monte Carlo simulation shows that the magnitude of the improvement depends on the within-cluster variance and especially on the effective dimension of the covariance matrix. Using the Rand index, we demonstrate that accuracy increases significantly in simulated data and in a real data example. |
| |
Keywords: | Centroids; Effective dimension; k-means clustering; Stein estimation |
This article is indexed in ScienceDirect and other databases. |
|
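The iteration described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no formulas, so the code assumes a textbook positive-part James-Stein factor with the ambient dimension `p` (the paper refers to an effective dimension of the covariance matrix, which is not reproduced here) and a single pooled, spherical within-cluster variance. The names `js_shrink`, `js_kmeans`, and `rand_index` are hypothetical and chosen only for illustration.

```python
import numpy as np

def js_shrink(centroid, grand_mean, sigma2, n_j):
    """Positive-part James-Stein-type shrinkage of one centroid toward grand_mean.

    Assumption: factor = max(0, 1 - (p - 2) * (sigma2 / n_j) / ||centroid - grand_mean||^2).
    """
    p = centroid.shape[0]
    diff = centroid - grand_mean
    dist2 = float(diff @ diff)
    if dist2 == 0.0:
        return grand_mean.copy()
    factor = max(0.0, 1.0 - max(p - 2, 0) * (sigma2 / n_j) / dist2)
    return grand_mean + factor * diff

def js_kmeans(X, k, n_iter=100, seed=0):
    """k-means in which the shrunken centroids drive the next assignment step."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    grand_mean = X.mean(axis=0)
    # Start from k distinct data points (an assumed initialisation).
    centroids = X[rng.choice(n, size=k, replace=False)].copy()
    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Assignment step: nearest centroid in squared Euclidean distance.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments unchanged, so the centroids are stable
        labels = new_labels
        # Update step: raw cluster means (an empty cluster keeps its old centroid).
        counts = np.bincount(labels, minlength=k)
        means = centroids.copy()
        for j in range(k):
            if counts[j] > 0:
                means[j] = X[labels == j].mean(axis=0)
        # Pooled per-coordinate within-cluster variance (spherical assumption).
        resid = X - means[labels]
        sigma2 = float((resid ** 2).sum() / (max(n - k, 1) * p))
        # Shrink each non-empty cluster mean toward the overall mean.
        for j in range(k):
            if counts[j] > 0:
                centroids[j] = js_shrink(means[j], grand_mean, sigma2, counts[j])
    return labels, centroids

def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same vs. different cluster)."""
    a, b = np.asarray(a), np.asarray(b)
    iu = np.triu_indices(a.shape[0], k=1)
    same_a = (a[:, None] == a[None, :])[iu]
    same_b = (b[:, None] == b[None, :])[iu]
    return float((same_a == same_b).mean())
```

To compare against plain k-means under the same assumptions, one could replace the call to `js_shrink` with the raw mean `means[j]` and score both labelings against known labels with `rand_index`.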