

Combining rough decisions for intelligent text mining using Dempster’s rule
Authors: Yaxin Bi, Sally McClean, Terry Anderson
Affiliation:(1) School of Computing and Mathematics, University of Ulster, Newtownabbey, Antrim, BT37 0QB, Northern Ireland, UK;(2) School of Computing and Information Engineering, University of Ulster, Coleraine, Londonderry, BT52 1SA, Northern Ireland, UK
Abstract: An important issue in text mining is how to use multiple pieces of discovered knowledge to improve future decisions. In this paper, we propose a new approach to combining multiple sets of rules for text categorization using Dempster’s rule of combination. We develop a boosting-like technique for generating multiple sets of rules based on rough set theory, and we model the classification decisions from these rule sets as pieces of evidence that can be combined by Dempster’s rule of combination. We apply these methods, individually and in combination, to 10 of the 20 groups in the 20-newsgroups benchmark data collection (Baker and McCallum 1998). Our experimental results show that the best combination of multiple rule sets performs better on the 10 groups than the best single rule set, and that the difference is statistically significant. A comparative analysis of the Dempster–Shafer and majority voting (MV) methods, together with an overfitting study, confirms the advantage and robustness of our approach.
Keywords: Rule induction, Text mining, Rough set, Dempster’s rule of combination
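
The abstract treats each rule set's classification decision as a piece of evidence and merges the pieces with Dempster’s rule of combination. The following is a minimal sketch of that rule for two mass functions, not the authors' implementation. The function name `combine`, the three class labels, and all mass values are hypothetical and chosen only for illustration; how the paper derives masses from rough-set rules is not stated in this record.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of class labels (a focal element)
    to a mass in [0, 1]; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0  # total mass assigned to pairs with empty intersection
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical frame of discernment: three newsgroup labels.
FRAME = frozenset({"sci.space", "rec.autos", "comp.graphics"})

# Hypothetical evidence from two rule sets (values are made up).
# Mass on FRAME expresses ignorance: the rule set commits to no class.
m1 = {frozenset({"sci.space"}): 0.6, FRAME: 0.4}
m2 = {
    frozenset({"sci.space"}): 0.5,
    frozenset({"rec.autos"}): 0.3,
    FRAME: 0.2,
}

m12 = combine(m1, m2)
best = max(m12, key=m12.get)  # focal element with the largest combined mass
for focal, mass in sorted(m12.items(), key=lambda kv: -kv[1]):
    print(sorted(focal), round(mass, 4))
```

With these made-up inputs the conflicting mass is 0.18, so the remaining products are divided by 0.82; the agreeing evidence for `sci.space` is reinforced while the residual ignorance on the full frame shrinks. A majority-voting baseline, by contrast, would count only each rule set's top label and ignore these degrees of belief.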
This article is indexed in SpringerLink and other databases.