An effective and interpretable method for document classification
Authors: Ngo Van Linh, Nguyen Kim Anh, Khoat Than, Chien Nguyen Dang
Affiliation: 1. KDE Lab and Department of Information Systems, School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam; 2. Vietnam-Japan International Institute of Science and Technology, Hanoi University of Science and Technology, Hanoi, Vietnam; 3. KDE Lab, School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam
Abstract: As the number of documents has grown rapidly in recent years, automatic text categorization has become an increasingly important and fundamental task in information retrieval and text mining. Accuracy and interpretability are two important aspects of a text classifier. While accuracy measures a classifier's ability to correctly classify unseen data, interpretability is its ability to be understood by humans and to provide reasons why each data instance is assigned a given label. This paper proposes an interpretable classification method that exploits the Dirichlet process mixture model of von Mises–Fisher distributions for directional data. By using the label information of the training data explicitly and automatically determining the number of topics for each class, the method learns topics that are coherent, relevant and discriminative. These topics help both to interpret and to distinguish classes. Our experimental results show the advantages of our approach in terms of separability, interpretability and effectiveness in classifying high-dimensional datasets with complex distributions. Our method is highly competitive with state-of-the-art approaches.
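To make the idea of directional data concrete, the following is a minimal sketch and not the authors' method. It assumes a drastic simplification: each class is modeled by a single von Mises–Fisher component (rather than a Dirichlet process mixture with an automatically chosen number of topics), all classes share the same concentration parameter, and class priors are equal. Under those assumptions the most likely class is simply the one whose mean direction has the highest cosine similarity with the normalized document vector. The function names, the toy vocabulary and the term counts are all hypothetical.

```python
import math


def normalize(vec):
    """Project a vector onto the unit sphere (L2 normalization)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def mean_direction(vectors):
    """Maximum-likelihood vMF mean direction: the normalized sum of unit vectors."""
    total = [0.0] * len(vectors[0])
    for vec in vectors:
        for i, x in enumerate(normalize(vec)):
            total[i] += x
    return normalize(total)


def fit(docs, labels):
    """Estimate one mean direction per class (assumption: one component per class)."""
    by_class = {}
    for doc, label in zip(docs, labels):
        by_class.setdefault(label, []).append(doc)
    return {label: mean_direction(vecs) for label, vecs in by_class.items()}


def predict(model, doc):
    """Pick the class whose mean direction is closest in cosine similarity.

    With a shared concentration and equal priors, the vMF log-density of a
    unit vector x is kappa * dot(mu, x) plus a constant, so comparing dot
    products is enough.
    """
    unit = normalize(doc)
    return max(model, key=lambda label: sum(m * u for m, u in zip(model[label], unit)))


# Hypothetical term counts over the toy vocabulary: goal, match, vote, election.
docs = [[3, 2, 0, 0], [1, 4, 0, 1], [0, 0, 5, 2], [0, 1, 2, 3]]
labels = ["sports", "sports", "politics", "politics"]
model = fit(docs, labels)

print(predict(model, [2, 1, 0, 0]))
print(predict(model, [0, 0, 1, 1]))
```

In this simplified setting, the largest coordinates of each mean direction point to the terms most characteristic of a class, which gives a rough flavor of the interpretability the abstract describes; the paper's per-class topics would refine this with several components per class.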
This article is indexed in SpringerLink and other databases.