Scale-invariant clustering with minimum volume ellipsoids
Authors: Mahesh Kumar, James B. Orlin
Affiliation: 1. Rutgers Business School, Rutgers University, 180 University Avenue, Newark, NJ 07102, USA; 2. Operations Research Center, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Bldg. E40-149, Cambridge, MA 02139, USA
Abstract: This paper develops theory and algorithms concerning a new metric for clustering data. The metric minimizes the total volume of clusters, where the volume of a cluster is defined as the volume of the minimum volume ellipsoid (MVE) enclosing all data points in the cluster. This metric is scale-invariant, that is, the optimal clusters are invariant under an affine transformation of the data space. We introduce the concept of outliers in the new metric and show that the proposed method of treating outliers asymptotically recovers the data distribution when the data come from a single multivariate Gaussian distribution. Two heuristic algorithms are presented that attempt to optimize the new metric. In a series of empirical studies with Gaussian-distributed simulated data, we show that volume-based clustering outperforms well-known clustering methods such as k-means, Ward's method, SOM, and model-based clustering.
Keywords: Minimum volume ellipsoid; Outliers; Scale-invariant clustering; Robust clustering
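
The metric described in the abstract, the sum over clusters of the volume of each cluster's minimum volume ellipsoid, can be illustrated with a minimal sketch. The abstract does not say how the MVE is computed or how the two heuristic algorithms work, so everything below is an assumption made for illustration: the MVE is approximated with Khachiyan's iterative algorithm (a standard technique, not necessarily the one used in the paper), and the names `mve_volume` and `total_volume` are hypothetical. The sketch only evaluates the objective for a given labeling; it does not search for the optimal clusters.

```python
import math
import numpy as np


def mve_volume(points, tol=1e-6, max_iter=1000):
    """Approximate the volume of the minimum volume enclosing ellipsoid.

    Uses Khachiyan's iterative algorithm (an assumption for this sketch).
    Requires at least d+1 affinely independent points in d dimensions,
    otherwise the matrices below are singular.
    """
    P = np.asarray(points, dtype=float)
    n, d = P.shape
    Q = np.vstack([P.T, np.ones(n)])      # lifted points, shape (d+1, n)
    u = np.full(n, 1.0 / n)               # weights over the points
    for _ in range(max_iter):
        X = Q @ np.diag(u) @ Q.T
        # M[i] = q_i^T X^{-1} q_i for every lifted point q_i
        M = np.einsum("ij,ji->i", Q.T @ np.linalg.inv(X), Q)
        j = int(np.argmax(M))
        step = (M[j] - d - 1.0) / ((d + 1.0) * (M[j] - 1.0))
        new_u = (1.0 - step) * u
        new_u[j] += step
        converged = np.linalg.norm(new_u - u) < tol
        u = new_u
        if converged:
            break
    c = P.T @ u                            # ellipsoid centre
    # Ellipsoid is {x : (x - c)^T A (x - c) <= 1}
    A = np.linalg.inv(P.T @ np.diag(u) @ P - np.outer(c, c)) / d
    unit_ball = math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)
    return unit_ball / math.sqrt(np.linalg.det(A))


def total_volume(points, labels):
    """Sum of MVE volumes over clusters: the quantity the metric minimizes."""
    P = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    return sum(mve_volume(P[labels == k]) for k in np.unique(labels))


# Four points on the unit circle: their MVE is the unit disc (area pi).
diamond = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
# Two such clusters, the second shifted by (5, 0).
data = np.vstack([diamond, diamond + np.array([5.0, 0.0])])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
objective = total_volume(data, labels)
```

Under an invertible affine map x -> Tx + b, the volume of every ellipsoid is multiplied by |det T|, so the total volume of every candidate partition is rescaled by the same factor and the ordering of partitions is unchanged. This is the sense in which the optimal clusters are scale-invariant.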
This article is indexed in ScienceDirect and other databases.