Learning Instance Weighted Naive Bayes from labeled and unlabeled data |
| |
Authors: | Liangxiao Jiang |
| |
Affiliation: | (1) Department of Computer Science, China University of Geosciences, Wuhan, Hubei, China, 430074 |
| |
Abstract: | In real-world data mining applications, unlabeled instances are often abundant while labeled instances are scarce.
Semi-supervised learning, which attempts to benefit from large amounts of unlabeled data together with labeled data,
has therefore attracted much attention from researchers. In this paper, we propose a very fast yet highly
effective semi-supervised learning algorithm, which we call Instance Weighted Naive Bayes (IWNB).
IWNB first trains a naive Bayes classifier using only the labeled instances. The trained classifier is then used to estimate
the class membership probabilities of the unlabeled instances, and these estimated probabilities are used to label
and weight the unlabeled instances. Finally, a naive Bayes classifier is trained again using both the originally labeled data
and the newly labeled and weighted unlabeled data. Our experimental results on a large number of UCI data sets show that IWNB often
improves the classification accuracy of the original naive Bayes when the available labeled data are very limited. |
| |
Keywords: | |
This document is indexed in SpringerLink and other databases. |
|
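The three steps described in the abstract (train on labeled data, label and weight the unlabeled instances, retrain on everything) can be sketched roughly as follows. This is a minimal illustration, not the author's implementation. The abstract does not say how the class membership probabilities become labels and weights, so the sketch assumes the simplest reading: each unlabeled instance gets its most probable class as its label and that class's estimated probability as its weight. The names `WeightedNaiveBayes` and `iwnb`, the restriction to categorical attributes, and the Laplace smoothing are likewise assumptions made for the example.

```python
from collections import defaultdict
import math


class WeightedNaiveBayes:
    """Categorical naive Bayes with Laplace smoothing and per-instance weights."""

    def fit(self, X, y, weights=None):
        weights = weights if weights is not None else [1.0] * len(X)
        self.classes = sorted(set(y))
        n_attrs = len(X[0])
        # Distinct values observed per attribute, used in the smoothing denominator.
        self.values = [sorted({row[j] for row in X}) for j in range(n_attrs)]
        self.class_weight = defaultdict(float)  # class -> summed instance weight
        self.counts = defaultdict(float)  # (attribute index, value, class) -> summed weight
        for row, label, w in zip(X, y, weights):
            self.class_weight[label] += w
            for j, v in enumerate(row):
                self.counts[(j, v, label)] += w
        self.total = sum(weights)
        return self

    def predict_proba(self, row):
        log_scores = {}
        for c in self.classes:
            # Smoothed class prior.
            s = math.log((self.class_weight[c] + 1.0) / (self.total + len(self.classes)))
            # Smoothed conditional probability of each attribute value given the class.
            for j, v in enumerate(row):
                num = self.counts.get((j, v, c), 0.0) + 1.0
                den = self.class_weight[c] + len(self.values[j])
                s += math.log(num / den)
            log_scores[c] = s
        # Normalize in a numerically stable way.
        m = max(log_scores.values())
        exp = {c: math.exp(s - m) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: e / z for c, e in exp.items()}

    def predict(self, row):
        probs = self.predict_proba(row)
        return max(probs, key=probs.get)


def iwnb(X_lab, y_lab, X_unlab):
    """Sketch of the IWNB procedure under the assumptions stated above."""
    # Step 1: train naive Bayes on the labeled instances only.
    base = WeightedNaiveBayes().fit(X_lab, y_lab)

    # Step 2: estimate class membership probabilities for the unlabeled
    # instances, then label and weight them.
    # ASSUMPTION: label = most probable class, weight = its probability.
    pseudo_labels, pseudo_weights = [], []
    for row in X_unlab:
        probs = base.predict_proba(row)
        label = max(probs, key=probs.get)
        pseudo_labels.append(label)
        pseudo_weights.append(probs[label])

    # Step 3: retrain on the labeled data (weight 1.0 each) plus the newly
    # labeled and weighted unlabeled data.
    X_all = list(X_lab) + list(X_unlab)
    y_all = list(y_lab) + pseudo_labels
    w_all = [1.0] * len(X_lab) + pseudo_weights
    return WeightedNaiveBayes().fit(X_all, y_all, w_all)
```

Calling `iwnb(X_lab, y_lab, X_unlab)` with lists of attribute tuples returns a retrained classifier whose `predict` and `predict_proba` methods take a single instance. The procedure makes one pass over the unlabeled data and fits the classifier twice, which is consistent with the abstract's description of the method as very fast.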