首页 | 本学科首页   官方微博 | 高级检索  
     


New and efficient DCA based algorithms for minimum sum-of-squares clustering
Authors:Le Thi Hoai An  Le Hoai Minh  Pham Dinh Tao
Affiliation:1. Laboratory of Theoretical and Applied Computer Science – LITA EA 3097, University of Lorraine, Ile de Saulcy, 57045 Metz, France;2. Laboratory of Mathematics, National Institute for Applied Sciences – Rouen, Avenue de l''Université, 76801 Saint-Etienne-du-Rouvray cedex, France
Abstract:The purpose of this paper is to develop new efficient approaches based on DC (Difference of Convex functions) programming and DCA (DC Algorithm) to perform clustering via minimum sum-of-squares Euclidean distance. We consider the two most widely used models for the so-called Minimum Sum-of-Squares Clustering (MSSC in short) that are a bilevel programming problem and a mixed integer program. Firstly, the mixed integer formulation of MSSC is carefully studied and is reformulated as a continuous optimization problem via a new result on exact penalty technique in DC programming. DCA is then investigated to the resulting problem. Secondly, we introduce a Gaussian kernel version of the bilevel programming formulation of MSSC, named GKMSSC. The GKMSSC problem is formulated as a DC program for which a simple and efficient DCA scheme is developed. A regularization technique is investigated for exploiting the nice effect of DC decomposition and a simple procedure for finding good starting points of DCA is developed. The proposed DCA schemes are original and very inexpensive because they amount to computing, at each iteration, the projection of points onto a simplex and/or onto a ball, and/or onto a box, which are all determined in the explicit form. Numerical results on real word datasets show the efficiency, the scalability of DCA and its great superiority with respect to k-means and kernel k-means, standard methods for clustering.
Keywords:Clustering  MSSC  Gaussian kernel  Combinatorial optimization  Nonsmooth nonconvex programming  DC programming  DCA  Exact penalty
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号