A nonnegative matrix factorization framework for semi-supervised document clustering with dual constraints |
| |
Authors: | Huifang Ma Weizhong Zhao Zhongzhi Shi |
| |
Affiliation: | 1. College of Computer Science and Engineering, Northwest Normal University, Lanzhou, 730070, Gansu, China; 2. College of Information Engineering, Xiangtan University, Xiangtan, 411105, China; 3. The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China |
| |
Abstract: | In this paper, we propose a new semi-supervised co-clustering algorithm, Orthogonal Semi-Supervised Nonnegative Matrix Factorization (OSS-NMF), for document clustering. In this approach, clustering incorporates two kinds of prior knowledge into the NMF co-clustering framework: domain knowledge about data points (documents), given as pairwise constraints, and category knowledge about features (words). Under this framework, clustering is formulated as finding a local minimizer of an objective function that takes this dual prior knowledge into account. We derive the update rules and design an iterative algorithm for the co-clustering process. We prove the correctness and convergence of the algorithm, demonstrating its mathematical rigor. Our experimental evaluations show that incorporating these constraints yields marked improvements in document clustering performance. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|
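
The abstract describes an NMF co-clustering framework solved by iterative update rules. As a rough illustration of that general setting only, the sketch below runs plain nonnegative tri-factorization X ≈ F S Gᵀ with standard multiplicative updates. It is not the OSS-NMF algorithm: the orthogonality conditions, the pairwise document constraints, and the word category knowledge are all omitted, since the record does not give their update rules. The function name `tri_factorize`, the words-by-documents orientation of `X`, and the toy matrix are assumptions made for this example.

```python
import numpy as np


def tri_factorize(X, n_word_clusters, n_doc_clusters, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix X (words x documents) as F @ S @ G.T.

    Plain multiplicative updates for the squared Frobenius error.
    Hypothetical baseline: no orthogonality and no prior-knowledge terms.
    """
    rng = np.random.default_rng(seed)
    n_words, n_docs = X.shape
    F = rng.random((n_words, n_word_clusters)) + eps         # word-cluster indicators
    S = rng.random((n_word_clusters, n_doc_clusters)) + eps  # cluster association matrix
    G = rng.random((n_docs, n_doc_clusters)) + eps           # document-cluster indicators
    errors = []
    for _ in range(n_iter):
        # Each factor is rescaled by (negative gradient part) / (positive gradient part),
        # which keeps every entry nonnegative.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        errors.append(np.linalg.norm(X - F @ S @ G.T) ** 2)
    return F, S, G, errors


# Toy word-by-document count matrix with two visible blocks (made up for illustration).
X = np.array([
    [3, 2, 3, 0, 0, 0],
    [2, 3, 2, 0, 0, 0],
    [3, 3, 2, 0, 1, 0],
    [0, 0, 0, 2, 3, 3],
    [0, 1, 0, 3, 2, 3],
    [0, 0, 0, 3, 3, 2],
], dtype=float)

F, S, G, errors = tri_factorize(X, n_word_clusters=2, n_doc_clusters=2)
doc_labels = G.argmax(axis=1)   # cluster assignment for each document
word_labels = F.argmax(axis=1)  # cluster assignment for each word
print("document clusters:", doc_labels)
print("word clusters:", word_labels)
```

Reading cluster labels off the largest entry in each row of `G` and `F` is a common convention for this kind of factorization. A semi-supervised variant in the spirit of the abstract would add terms to the objective that reward satisfying the document constraints and the word category knowledge, and the update rules would change accordingly.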