首页 | 本学科首页   官方微博 | 高级检索  
     


Spectral clustering with eigenvector selection
Authors:Tao Xiang  Shaogang Gong
Affiliation:1. Department of Computing, The Hong Kong Polytechnic University, Hung Hom, Kowloon, Hong Kong;2. School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin, China;3. School of Mathematical Sciences, Nankai University, Tianjin, China;1. School of Digital Media, Jiangnan University, Wuxi, Jiangsu, PR China;2. Department of Biomedical Engineering, University of California, Davis, USA\n;3. The Centre for Smart Health, School of Nursing, The Hong Kong Polytechnic University, Hong Kong;1. Center for Future Media at University of Electronic Science and Technology of China, Chengdu 611731, China;2. School of Natural and Computational Sciences at Massey University Auckland Campus, Auckland 0745, New Zealand;1. School of Computer Science and Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi’an 710072, Shaanxi, P.R. China;2. School of Cybersecurity and Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi’an 710072, Shaanxi, P.R. China.;1. Department of Computer Science, University of Massachusetts Lowell, MA, USA;2. Department of Applied Mathematics, University of Colorado Boulder, CO, USA
Abstract:The task of discovering natural groupings of input patterns, or clustering, is an important aspect of machine learning and pattern analysis. In this paper, we study the widely used spectral clustering algorithm which clusters data using eigenvectors of a similarity/affinity matrix derived from a data set. In particular, we aim to solve two critical issues in spectral clustering: (1) how to automatically determine the number of clusters, and (2) how to perform effective clustering given noisy and sparse data. An analysis of the characteristics of eigenspace is carried out which shows that (a) not every eigenvectors of a data affinity matrix is informative and relevant for clustering; (b) eigenvector selection is critical because using uninformative/irrelevant eigenvectors could lead to poor clustering results; and (c) the corresponding eigenvalues cannot be used for relevant eigenvector selection given a realistic data set. Motivated by the analysis, a novel spectral clustering algorithm is proposed which differs from previous approaches in that only informative/relevant eigenvectors are employed for determining the number of clusters and performing clustering. The key element of the proposed algorithm is a simple but effective relevance learning method which measures the relevance of an eigenvector according to how well it can separate the data set into different clusters. Our algorithm was evaluated using synthetic data sets as well as real-world data sets generated from two challenging visual learning problems. The results demonstrated that our algorithm is able to estimate the cluster number correctly and reveal natural grouping of the input data/patterns even given sparse and noisy data.
Keywords:
本文献已被 ScienceDirect 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号