A second order cone programming approach for semi-supervised learning |
| |
Authors: | Gao Huang, Shiji Song, Jatinder N.D. Gupta, Cheng Wu |
| |
Affiliation: | 1. Department of Automation, Tsinghua University, Beijing 100084, China; 2. College of Business Administration, The University of Alabama in Huntsville, Huntsville, AL 35899, USA |
| |
Abstract: | Semi-supervised learning (SSL) involves training a decision rule from both labeled and unlabeled data. In this paper, we propose a novel SSL algorithm based on the multiple-clusters-per-class assumption. The proposed algorithm consists of two stages. In the first stage, we aim to capture the local cluster structure of the training data by using the k-nearest-neighbor (kNN) algorithm to split the data into a number of disjoint subsets. In the second stage, a maximal margin classifier based on second order cone programming (SOCP) is introduced to learn an inductive decision function globally from the obtained subsets. For linear classification problems, once the kNN algorithm has been performed, the proposed algorithm trains a classifier using only the first and second order moments of the subsets, without considering individual data points. Since the number of subsets is usually much smaller than the number of training points, the proposed algorithm is efficient for handling big data sets with a large amount of unlabeled data. Despite its simplicity, the classification performance of the proposed algorithm is guaranteed by the maximal margin classifier. We demonstrate the efficiency and effectiveness of the proposed algorithm on both synthetic and real-world data sets. |
| |
Keywords: | Semi-supervised learning; K-nearest-neighbor; Support vector machine; Second order cone programming |
This article is indexed in ScienceDirect and other databases. |
|
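The abstract notes that, for linear problems, the classifier is trained from the first and second order moments of each subset rather than from individual points. The following is a minimal sketch of that idea, not the paper's implementation. The helper names `subset_moments` and `cone_margin`, the toy data, and the parameter `kappa` are hypothetical. The constraint form `y * (w.mu + b) - kappa * sqrt(w' Sigma w)` is a pattern commonly used in moment-based SOCP classifiers and is assumed here; the record above does not state the paper's exact formulation. The subsets are taken as given, so the kNN splitting stage is not shown.

```python
import math


def subset_moments(points):
    """Return (mean, covariance) for a list of equal-length feature vectors."""
    n = len(points)
    d = len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    # Population covariance (divides by n); an assumption made for simplicity.
    cov = [
        [sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
         for j in range(d)]
        for i in range(d)
    ]
    return mean, cov


def cone_margin(w, b, label, mean, cov, kappa):
    """Evaluate label * (w.mean + b) - kappa * sqrt(w' cov w).

    Assumed second order cone constraint form: a value of at least 1 would
    mean the subset satisfies the margin constraint for this (w, b).
    """
    d = len(w)
    score = sum(w[j] * mean[j] for j in range(d)) + b
    quad = sum(w[i] * cov[i][j] * w[j] for i in range(d) for j in range(d))
    return label * score - kappa * math.sqrt(max(quad, 0.0))


# Two hypothetical subsets (toy data, not from the paper), one per class.
subsets = [
    (+1, [[2.0, 2.0], [3.0, 2.0], [2.0, 3.0], [3.0, 3.0]]),
    (-1, [[-2.0, -2.0], [-3.0, -2.0], [-2.0, -3.0], [-3.0, -3.0]]),
]

w, b, kappa = [0.5, 0.5], 0.0, 1.0  # hypothetical candidate classifier

margins = []
for label, pts in subsets:
    mean, cov = subset_moments(pts)
    margins.append(cone_margin(w, b, label, mean, cov, kappa))
```

Each subset contributes a single constraint regardless of how many points it contains, which is consistent with the abstract's remark that cost depends on the number of subsets rather than the number of training points. Solving for `w` and `b` would require an SOCP solver, which this sketch deliberately leaves out.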