Ensemble methods for multi-label classification

Affiliation:	1. Department of Information Systems Engineering, Ben-Gurion University of the Negev, P.O.B. 653, Beer-Sheva 84105, Israel; 2. School of Computer Science, Academic College of Tel-Aviv Yafo, P.O.B. 8401, Tel Aviv 61083, Israel

Abstract:	Ensemble methods have been shown to be an effective tool for solving multi-label classification tasks. In the RAndom k-labELsets (RAKEL) algorithm, each member of the ensemble is associated with a small, randomly selected subset of k labels. A single-label classifier is then trained according to each combination of elements in the subset. In this paper we adopt a similar approach; however, instead of choosing subsets at random, we select the minimum required subsets of k labels that cover all labels and meet additional constraints, such as coverage of inter-label correlations. The cover is constructed by formulating subset selection as a minimum set covering problem (SCP) and solving it with approximation algorithms. Each cover needs to be prepared only once, by offline algorithms. Once prepared, a cover may be applied to the classification of any multi-label dataset whose properties conform with those of the cover. The contribution of this paper is two-fold. First, we introduce SCP as a general framework for constructing label covers while allowing the user to incorporate cover construction constraints. We demonstrate the effectiveness of this framework by proposing two construction constraints whose enforcement produces covers that improve prediction performance over random selection by achieving better coverage of labels and inter-label correlations. Second, we provide theoretical bounds that quantify the probability that random selection produces covers meeting the proposed construction criteria. The experimental results indicate that the proposed methods improve multi-label classification accuracy and stability compared to the RAKEL algorithm and to other state-of-the-art algorithms.
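
The cover construction described in the abstract can be illustrated with a minimal sketch. It uses the standard greedy heuristic for set cover, which is one common approximation algorithm; the abstract does not say which approximation algorithms the authors use. The function name `greedy_label_cover`, its parameters, and the choice to model "inter-label correlations" as coverage of every pair of labels are assumptions made for illustration only.

```python
from itertools import combinations


def greedy_label_cover(num_labels, k, cover_pairs=False):
    """Greedily pick k-label subsets until every label is covered.

    Hypothetical helper, not the authors' implementation. If cover_pairs
    is True, every pair of labels must also appear together in at least
    one chosen subset (an assumed stand-in for covering inter-label
    correlations).
    """
    if not 1 <= k <= num_labels:
        raise ValueError("k must satisfy 1 <= k <= num_labels")
    if cover_pairs and k < 2:
        raise ValueError("covering pairs requires k >= 2")

    labels = range(num_labels)

    def elements(subset):
        # Items of the universe that a given k-labelset covers.
        elems = {frozenset([label]) for label in subset}
        if cover_pairs:
            elems |= {frozenset(pair) for pair in combinations(subset, 2)}
        return elems

    # Universe to cover: all single labels, optionally all label pairs.
    uncovered = elements(labels)
    # Candidate sets: every k-subset of the labels (fine for small inputs).
    candidates = list(combinations(labels, k))

    cover = []
    while uncovered:
        # Greedy step: take the subset covering the most uncovered items.
        best = max(candidates, key=lambda s: len(elements(s) & uncovered))
        cover.append(best)
        uncovered -= elements(best)
    return cover


if __name__ == "__main__":
    print(greedy_label_cover(6, 3))
    print(greedy_label_cover(6, 3, cover_pairs=True))
```

Enumerating all k-subsets is only practical for small label sets; the sketch is meant to show how a cover, once built offline, depends only on the number of labels and k rather than on any particular dataset.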

Keywords:	Multi-label classification; Ensemble learning
This article is indexed in ScienceDirect and other databases.
