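B-KNN Text Categorization Method Based on Posterior Probability Guidance
Cite this article: ZHOU Hong-juan, ZU Yong-liang. B-KNN Text Categorization Method Based on Posterior Probability Guidance[J]. Computer Engineering, 2011, 37(21): 114-116. DOI: 10.3969/j.issn.1000-3428.2011.21.039
Authors: ZHOU Hong-juan  ZU Yong-liang
Affiliation: School of Computer and Information, Hefei University of Technology, Hefei 230009, China
Funding: Supported by the National Natural Science Foundation of China
Abstract: The K-nearest-neighbor (KNN) method achieves high classification accuracy but low classification efficiency. To address this, the paper proposes a Bayesian K-nearest-neighbor (B-KNN) method guided by posterior probability. The posterior probability information of the test text is used to prune the multi-way static search tree of the training set, and the K nearest neighbors of a sample are then searched within the compressed candidate class space, which improves the efficiency of KNN while preserving its classification accuracy. Experimental results show that B-KNN performs considerably better than KNN and is better suited to text classification applications with deep class hierarchies.

Keywords: text categorization  posterior probability  Bayesian classifier  K-nearest-neighbor method  Bayesian K-nearest-neighbor method
Received: 2011-04-20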

B-KNN Text Categorization Method Based on Posterior Probability Guidance
ZHOU Hong-juan, ZU Yong-liang. B-KNN Text Categorization Method Based on Posterior Probability Guidance[J]. Computer Engineering, 2011, 37(21): 114-116. DOI: 10.3969/j.issn.1000-3428.2011.21.039
Authors:ZHOU Hong-juan  ZU Yong-liang
Affiliation:(School of Computer & Information,Hefei University of Technology,Hefei 230009,China)
Abstract: The K Nearest Neighbor (KNN) method has high accuracy but poor efficiency. This paper therefore proposes B-KNN, a text categorization method guided by posterior probability. Using the posterior probabilities computed for the test text, B-KNN prunes the multi-branch static search tree of the training dataset and shrinks the candidate class set in which the K nearest neighbors are sought, so that the efficiency of KNN improves while its classification accuracy is preserved. Experimental results show that B-KNN substantially outperforms KNN and is better suited to classification tasks with a deep class hierarchy.
Keywords:text categorization  posterior probability  Bayesian classifier  K Nearest Neighbor(KNN) method  B-KNN method
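
The idea summarized in the abstract (a Bayesian posterior narrows the candidate classes, then KNN runs only inside that reduced set) can be illustrated with a minimal sketch. This is not the authors' implementation: the multi-branch static search tree is replaced by a flat class list, and the names `train_nb`, `log_posteriors`, `bknn_classify`, the `top_m` cut-off, the multinomial naive Bayes model, and the cosine similarity measure are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Collect naive Bayes statistics from docs, a list of (tokens, label)."""
    class_docs = Counter(label for _, label in docs)
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, token_counts, vocab


def log_posteriors(tokens, model):
    """Unnormalized log P(class | tokens), multinomial model, Laplace smoothing."""
    class_docs, token_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label, n_docs in class_docs.items():
        total = sum(token_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in tokens:
            score += math.log((token_counts[label][tok] + 1) / (total + len(vocab)))
        scores[label] = score
    return scores


def cosine(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bknn_classify(tokens, docs, model, k=3, top_m=2):
    """Keep the top_m classes by posterior (assumed pruning rule), then vote by KNN."""
    scores = log_posteriors(tokens, model)
    candidates = set(sorted(scores, key=scores.get, reverse=True)[:top_m])
    # Only training documents from the surviving classes are compared.
    pool = [(cosine(tokens, d), label) for d, label in docs if label in candidates]
    pool.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(label for _, label in pool[:k])
    return votes.most_common(1)[0][0], candidates


if __name__ == "__main__":
    toy_docs = [
        ("goal match team score".split(), "sports"),
        ("team win league match".split(), "sports"),
        ("stock market price fall".split(), "finance"),
        ("bank rate market rise".split(), "finance"),
        ("cpu chip memory speed".split(), "tech"),
        ("software chip code release".split(), "tech"),
    ]
    toy_model = train_nb(toy_docs)
    print(bknn_classify("team match score win".split(), toy_docs, toy_model))
```

With `top_m` smaller than the number of classes, the similarity computation touches only a fraction of the training set, which is where the efficiency gain described in the abstract would come from. How aggressively the real method prunes, and how it walks the class hierarchy, is not stated in this record.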
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.