Weighted samples based semi-supervised classification
Abstract: Graph-based semi-supervised classification (GSSC) treats labeled and unlabeled samples as vertices of a graph and uses edge weights to encode the similarity between samples. Most GSSC methods treat all labeled samples as equally important and focus mainly on optimizing the graph to improve performance. In practice, samples are not always evenly distributed: labeled samples close to the decision boundary between classes are generally more important than labeled samples far from it. To account for these differences in importance, we propose an approach called Weighted Samples based Semi-Supervised Classification (WS3C for short). First, WS3C runs multiple clusterings on the dataset to explore the structure of the samples and summarizes the clustering results. Second, it quantifies a hard-to-cluster index for each labeled sample with respect to the other samples, based on the summarized results, and uses the index to weight that sample. Third, it constructs a graph in which each edge weight equals the frequency with which the two samples are grouped into the same cluster across the multiple clusterings. Finally, it performs semi-supervised classification based on the constructed graph and the weighted samples. An empirical study on synthetic and real datasets demonstrates that assigning different weights to labeled samples yields significantly higher accuracy than treating them equally. WS3C not only outperforms the related methods it is compared against, but is also robust to its input parameters.
Keywords: Semi-supervised classification; Graph optimization; Weighted samples; Hard-to-cluster index
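
The pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not define the hard-to-cluster index or the classifier, so both are assumptions here. The index is approximated by how often a sample's co-clustering frequency with other samples is ambiguous (near 0.5), and the classification step is a generic weighted graph-regularized least-squares solve. All function names, the toy clusterings, and the `1 + index` weighting are hypothetical.

```python
import numpy as np

def co_association(labelings):
    """Edge weights: fraction of clusterings in which two samples share a cluster."""
    labelings = np.asarray(labelings)              # shape (n_clusterings, n_samples)
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)                       # shape (n_samples, n_samples)

def hard_to_cluster_index(C):
    """ASSUMED proxy, not the paper's definition: mean ambiguity 4*c*(1-c)
    of a sample's co-association frequencies with all other samples."""
    ambiguity = 4.0 * C * (1.0 - C)                # 0 when c is 0 or 1, 1 when c is 0.5
    np.fill_diagonal(ambiguity, 0.0)
    return ambiguity.sum(axis=1) / (C.shape[0] - 1)

def weighted_propagation(W, labeled_idx, labels, sample_weights, n_classes):
    """ASSUMED classifier: minimize sum_i w_i * ||f_i - y_i||^2 + tr(F^T L F)."""
    n = W.shape[0]
    W = W.copy()
    np.fill_diagonal(W, 0.0)                       # drop self-loops
    L = np.diag(W.sum(axis=1)) - W                 # unnormalized graph Laplacian
    lam = np.zeros(n)
    lam[labeled_idx] = sample_weights              # per-sample fitting weights
    Y = np.zeros((n, n_classes))
    Y[labeled_idx, labels] = 1.0                   # one-hot labels for labeled samples
    A = np.diag(lam) + L + 1e-9 * np.eye(n)        # tiny ridge keeps the system solvable
    F = np.linalg.solve(A, lam[:, None] * Y)
    return F.argmax(axis=1)

# Toy input: three hypothetical clusterings of six samples.
labelings = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
C = co_association(labelings)
h = hard_to_cluster_index(C)

labeled_idx = [0, 5]                               # sample 0 is class 0, sample 5 is class 1
labels = [0, 1]
weights = 1.0 + h[labeled_idx]                     # assumed weighting: harder samples count more
pred = weighted_propagation(C, labeled_idx, labels, weights, n_classes=2)
print(pred)
```

In this sketch the co-association matrix serves both as the graph and as the source of the sample weights, mirroring the abstract's description that the clustering results are summarized once and reused. Any clustering algorithm could supply `labelings`; hand-written assignments are used only to keep the example deterministic.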
This article is indexed in ScienceDirect and other databases.