Solving multi-instance problems with classifier ensemble based on constructive clustering |
| |
Authors: | Zhi-Hua Zhou, Min-Ling Zhang |
| |
Affiliation: | (1) National Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210093, China |
| |
Abstract: | In multi-instance learning, the training set is composed of labeled bags, each of which consists of many unlabeled instances; that is, an object is represented by a set of feature vectors instead of a single
feature vector. Most current multi-instance learning algorithms work by adapting single-instance learning algorithms
to the multi-instance representation. This paper proposes a new solution that takes the opposite approach: adapting
the multi-instance representation to single-instance learning algorithms. In detail, the instances of all the bags are first collected
together and clustered into d groups. Each bag is then re-represented by d binary features, where the value of the ith feature is set to one if the bag has instances falling into the ith group and zero otherwise. Each bag is thus represented by a single feature vector, so that single-instance classifiers can be
used to distinguish different classes of bags. By repeating this process with different values of d, many classifiers can be generated and then combined into an ensemble for prediction. Experiments show that the
proposed method works well on standard as well as generalized multi-instance problems.
Zhi-Hua Zhou is currently a Professor in the Department of Computer Science & Technology and head of the LAMDA group at Nanjing University.
His main research interests include machine learning, data mining, information retrieval, and pattern recognition. He is an associate
editor of Knowledge and Information Systems and serves on the editorial boards of Artificial Intelligence in Medicine, International Journal of Data Warehousing and Mining, Journal of Computer Science & Technology, and Journal of Software. He has also been involved in various conferences.
Min-Ling Zhang received his B.Sc. and M.Sc. degrees in computer science from Nanjing University, China, in 2001 and 2004, respectively.
He is currently a Ph.D. candidate in the Department of Computer Science & Technology at Nanjing University and a member of
the LAMDA group. His main research interests include machine learning and data mining, especially multi-instance learning
and multi-label learning. |
| |
Keywords: | Machine learning; Multi-instance learning; Classification; Clustering; Ensemble learning; Knowledge representation; Constructive induction |
This article is indexed in SpringerLink and other databases. |
|
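
The procedure outlined in the abstract (cluster all instances into d groups, re-represent each bag by d binary features, train a single-instance classifier, and repeat for several values of d) might be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the record does not say which clustering algorithm, base classifier, or combination rule the paper uses, so the plain k-means, the 1-nearest-neighbour rule under Hamming distance, the majority vote, the default values of d, and all function names here are hypothetical choices made for the example.

```python
import random
from collections import Counter


def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))


def kmeans(points, d, iters=20, seed=0):
    """Plain k-means (an assumed choice of clusterer); returns d centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, d)
    for _ in range(iters):
        groups = [[] for _ in range(d)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as its group mean; keep the old one if the group is empty.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def bag_to_vector(bag, centroids):
    """Feature i is 1 if some instance of the bag falls into group i, else 0."""
    hit = {nearest(p, centroids) for p in bag}
    return tuple(1 if i in hit else 0 for i in range(len(centroids)))


def fit_member(bags, labels, d, seed=0):
    """Cluster all instances of all bags into d groups and re-represent each bag."""
    points = [p for bag in bags for p in bag]
    centroids = kmeans(points, d, seed=seed)
    vectors = [bag_to_vector(bag, centroids) for bag in bags]
    return centroids, vectors, list(labels)


def predict_member(member, bag):
    """1-NN under Hamming distance (an assumed single-instance base classifier)."""
    centroids, vectors, labels = member
    v = bag_to_vector(bag, centroids)
    best = min(range(len(vectors)),
               key=lambda i: sum(a != b for a, b in zip(v, vectors[i])))
    return labels[best]


def fit_ensemble(bags, labels, ds=(2, 3, 4)):
    """One ensemble member per value of d."""
    return [fit_member(bags, labels, d, seed=d) for d in ds]


def predict_ensemble(ensemble, bag):
    """Combine the members by majority vote (an assumed combination rule)."""
    votes = Counter(predict_member(m, bag) for m in ensemble)
    return votes.most_common(1)[0][0]
```

Here a bag is a list of equal-length numeric tuples and `labels` holds one label per bag; `fit_ensemble(bags, labels)` followed by `predict_ensemble(ensemble, new_bag)` returns a predicted bag label. Any off-the-shelf clusterer or classifier could be substituted for the two assumed components without changing the re-representation step.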