An effective and interpretable method for document classification

Authors:	Ngo Van Linh, Nguyen Kim Anh, Khoat Than, Chien Nguyen Dang

Affiliation:	1. KDE Lab and Department of Information Systems, School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam; 2. Vietnam-Japan International Institute of Science and Technology, Hanoi University of Science and Technology, Hanoi, Vietnam; 3. KDE Lab, School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam

Abstract:	As the number of documents has grown rapidly in recent years, automatic text categorization has become an increasingly important and fundamental task in information retrieval and text mining. Accuracy and interpretability are two important aspects of a text classifier. While accuracy measures a classifier's ability to correctly classify unseen data, interpretability is its ability to be understood by humans and to provide reasons why each data instance is assigned to a label. This paper proposes an interpretable classification method that exploits the Dirichlet process mixture model of von Mises–Fisher distributions for directional data. By using the label information of the training data explicitly and automatically determining the number of topics for each class, the method learns topics that are coherent, relevant and discriminative. These topics help both to interpret and to distinguish classes. Our experimental results show the advantages of our approach in terms of separability, interpretability and effectiveness in classification tasks on datasets with high dimensionality and complex distributions. Our method is highly competitive with state-of-the-art approaches.
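
The abstract gives no algorithmic detail, so the following is only a rough illustration of the directional-data idea behind von Mises–Fisher (vMF) distributions, not the authors' method. It assumes one vMF component per class with a shared concentration parameter `kappa`, whereas the paper lets a Dirichlet process determine the number of topics per class. Under that assumption the vMF normalizing constant is the same for every class, so classification reduces to comparing `kappa * mu^T x` on unit-normalized document vectors. The function names, the toy data, and the value of `kappa` are all hypothetical.

```python
import numpy as np

def unit_rows(X):
    """L2-normalize each row so that documents lie on the unit hypersphere."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.clip(norms, 1e-12, None)

def fit_mean_directions(X, y):
    """Estimate one vMF mean direction per class.

    Assumption: a single topic per class. The maximum-likelihood mean
    direction of a vMF component is the normalized sum of its unit vectors.
    """
    X = unit_rows(X)
    y = np.asarray(y)
    labels = np.unique(y)
    sums = np.stack([X[y == c].sum(axis=0) for c in labels])
    return labels, unit_rows(sums)

def predict(X, labels, mus, kappa=10.0):
    """Pick the class whose mean direction gives the largest kappa * mu^T x.

    With a shared kappa, the vMF normalizer cancels across classes, so this
    is equivalent to choosing the highest cosine similarity.
    """
    scores = kappa * unit_rows(X) @ mus.T
    return labels[np.argmax(scores, axis=1)], scores

# Toy term-count vectors (hypothetical data, four vocabulary terms).
X_train = np.array([[3, 1, 0, 0], [2, 2, 0, 0], [0, 0, 4, 1], [0, 1, 2, 3]])
y_train = np.array(["sports", "sports", "tech", "tech"])

labels, mus = fit_mean_directions(X_train, y_train)
pred, scores = predict(np.array([[1, 1, 0, 0], [0, 0, 1, 1]]), labels, mus)
print(pred)
```

Each row of `mus` can be read as a topic: the terms with the largest weights in a mean direction indicate what characterizes that class, which is the kind of interpretability the abstract refers to.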

Keywords:
This article is indexed in SpringerLink and other databases.
