On using prototype reduction schemes to optimize kernel-based nonlinear subspace methods |
| |
Authors: | Sang-Woon Kim; B. John Oommen |
| |
Affiliation: | (a) Department of Computer Science and Engineering, Myongji University, Yongin 449-728, South Korea; (b) School of Computer Science-IEEE, Carleton University, 1125 Colonel By Dr., Ottawa, Ont., Canada K1S 5B6 |
| |
Abstract: | The subspace method of pattern recognition is a classification technique in which pattern classes are specified in terms of linear subspaces spanned by their respective class-based basis vectors. To overcome the limitations of the linear methods, kernel-based nonlinear subspace (KNS) methods have recently been proposed in the literature. In KNS, kernel principal component analysis (kPCA) is employed to obtain principal components, not in the input space, but in a high-dimensional space whose components are nonlinearly related to the input variables. The lengths of the projections onto the basis vectors in kPCA are computed using a kernel matrix K, whose dimension equals the number of sample data points. This is clearly problematic, especially for large data sets. In this paper, we suggest a computationally superior mechanism to solve the problem. Rather than define the matrix K with the whole data set and compute the principal components, we propose that the data be reduced to a smaller representative subset using a prototype reduction scheme (PRS). Since a PRS is capable of extracting vectors that satisfactorily represent the global distribution structure, we demonstrate that data points which are ineffective in the classification can be eliminated to obtain a reduced kernel matrix, K, without degrading the performance. Our experimental results demonstrate that the proposed mechanism dramatically reduces the computation time without sacrificing classification accuracy, for real-life as well as artificial data sets. The results especially demonstrate the computational advantage for large data sets, such as those involved in data mining and text categorization applications. |
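The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the RBF kernel, the uncentered form of kPCA, the value of `gamma`, and all function names are assumptions made for illustration. In particular, `reduce_prototypes` is a hypothetical stand-in that keeps every `step`-th sample; an actual PRS would select or create prototypes far more carefully. The sketch only shows where the reduction enters: the kernel matrix K is built from the prototypes alone, so its size follows the prototype count rather than the full sample count.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise squared Euclidean distances between rows of A and rows of B.
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def reduce_prototypes(X, step):
    # ASSUMPTION: naive stride subsampling, a placeholder for a real PRS.
    return X[::step]

def fit_class_subspace(X, n_components, gamma=1.0):
    # K is n x n, where n is the number of points kept for this class.
    K = rbf_kernel(X, X, gamma)
    eigvals, eigvecs = np.linalg.eigh(K)           # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[top], 1e-12)
    # Rescale so each basis vector has unit norm in the feature space.
    alphas = eigvecs[:, top] / np.sqrt(lam)
    return X, alphas

def projection_length_sq(model, Z, gamma=1.0):
    # Squared length of the projection of each row of Z onto the class subspace.
    X, alphas = model
    proj = rbf_kernel(Z, X, gamma) @ alphas
    return (proj ** 2).sum(1)

def classify(models, Z, gamma=1.0):
    # Assign each point to the class whose subspace gives the longest projection.
    scores = np.stack([projection_length_sq(m, Z, gamma) for m in models], axis=1)
    return scores.argmax(axis=1)
```

Fitting one model per class on `reduce_prototypes(X_c, step)` instead of `X_c` shrinks each eigenproblem from n x n to roughly (n / step) x (n / step), which is the source of the computational saving the abstract describes. Whether accuracy is preserved depends on how well the chosen prototypes represent each class.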
| |
Keywords: | Principal component analysis (PCA); Linear subspace method (LSM); Kernel principal component analysis (kPCA); Kernel-based nonlinear subspace (KNS) method; Prototype reduction schemes (PRS) |
This article is indexed in ScienceDirect and other databases. |
|