On classification of signals represented with data-dependent overcomplete dictionaries |
| |
Authors: | Rosalia Maglietta |
| |
Affiliation: | Istituto di Studi sui Sistemi Intelligenti per l'Automazione, CNR, National Research Council, Via Amendola 122/D-I, 70126, Bari, Italy |
| |
Abstract: | This paper studies how data representation influences the generalization error of kernel-based learning machines such as support vector machines (SVMs). We analyse the effects of sparse and dense data representations on the generalization error of SVMs. We show that, with sparse representations, the performance of classifiers belonging to hypothesis spaces induced by polynomial or Gaussian kernel functions reduces to the performance of linear classifiers. Sparse representations reduce the generalization error as long as the representation is not too sparse, as happens with very large dictionaries. Dense data representations reduce the generalization error even with very large dictionaries. We use two schemes for representing data in data-independent overcomplete Haar and Gabor dictionaries, and measure the generalization error of SVMs on benchmark datasets. We also study sparse and dense representations in the case of data-dependent overcomplete dictionaries and show how this leads to principal component analysis. |
| |
Keywords: | supervised learning; classification; support vector machines; generalization; leave-one-out error; sparse and dense data representation |
|
|