
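A Bayesian Classification Algorithm Based on Bootstrap Averaging
Cite this article: Bai Liyuan, Xiao Le, Huang Hui, Ding Wei. A Bayesian Classification Algorithm Based on Bootstrap Averaging [J]. Computer Applications and Software, 2007, 24(9): 189-190, 199.
Authors: Bai Liyuan, Xiao Le, Huang Hui, Ding Wei
Affiliations: 1. School of Information Science and Engineering, Henan University of Technology, Zhengzhou, Henan 450052
2. School of Science, Henan University of Technology, Zhengzhou, Henan 450052
Abstract: A naive Bayes text classifier trained on word clusters suffers from heavily biased probability estimates, which lowers its classification accuracy. To address this, the method starts from word clusters produced by a distributional clustering algorithm and builds an ordered word subsequence according to the mutual information between words and clusters. It then samples the word sequence randomly with replacement to construct sample sets of comparable size, and takes the average of the parameters estimated from them as the final trained parameters for classifying unseen text. Experiments on public text datasets show that the proposed training method achieves higher classification accuracy than the conventional naive Bayes training method, and that the procedure is relatively simple.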

Keywords: distributional clustering; text classification; naive Bayes classifier; bootstrap averaging
Revised: 2006-09-25

A BAYES CLASSIFICATION ALGORITHM BASED ON BOOTSTRAP AVERAGING
Bai Liyuan, Xiao Le, Huang Hui, Ding Wei. A BAYES CLASSIFICATION ALGORITHM BASED ON BOOTSTRAP AVERAGING [J]. Computer Applications and Software, 2007, 24(9): 189-190, 199.
Authors:Bai Liyuan  Xiao Le  Huang Hui  Ding Wei
Affiliation: 1. School of Information Science and Engineering, Henan Institute of Technology, Zhengzhou 450052, Henan, China; 2. School of Science, Henan Institute of Technology, Zhengzhou 450052, Henan, China
Abstract: To address the low classification accuracy caused by poor probability estimates when a naive Bayes text classifier is trained on word clusters, we build an ordered word list based on the mutual information between words and clusters, construct sample sets of the same size with the bootstrap method, and use the average of the parameters estimated from them as the final parameters for classifying unknown text. Experimental results on a benchmark text dataset show that the method achieves higher classification accuracy than the conventional naive Bayes classifier.
Keywords: Distributional clustering; Text classification; Naive Bayes classifier; Bootstrap averaging
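
The parameter-averaging step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes each document has already been mapped to a list of word-cluster IDs, it omits the distributional clustering and the mutual-information ordering, and it uses Laplace smoothing, which the abstract does not specify. The names `estimate`, `bootstrap_average`, `train`, `classify`, and the toy cluster IDs are all hypothetical.

```python
import math
import random
from collections import Counter


def estimate(tokens, vocab, alpha=1.0):
    """Laplace-smoothed multinomial estimate of P(cluster | class).

    The smoothing constant alpha is an assumption of this sketch.
    """
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {v: (counts[v] + alpha) / total for v in vocab}


def bootstrap_average(tokens, vocab, n_rounds, rng):
    """Average the estimates obtained from n_rounds bootstrap resamples."""
    avg = {v: 0.0 for v in vocab}
    for _ in range(n_rounds):
        # Draw a sample of the same size, with replacement.
        sample = rng.choices(tokens, k=len(tokens))
        est = estimate(sample, vocab)
        for v in vocab:
            avg[v] += est[v] / n_rounds
    return avg


def train(docs_by_class, n_rounds=50, seed=0):
    """Return {label: (log prior, averaged cluster probabilities)}."""
    rng = random.Random(seed)
    vocab = sorted({t for docs in docs_by_class.values() for d in docs for t in d})
    n_docs = sum(len(docs) for docs in docs_by_class.values())
    model = {}
    for label, docs in docs_by_class.items():
        tokens = [t for d in docs for t in d]
        log_prior = math.log(len(docs) / n_docs)
        model[label] = (log_prior, bootstrap_average(tokens, vocab, n_rounds, rng))
    return model


def classify(model, doc):
    """Pick the class with the highest naive Bayes log score."""
    def score(label):
        log_prior, probs = model[label]
        # Cluster IDs never seen in training are ignored.
        return log_prior + sum(math.log(probs[t]) for t in doc if t in probs)
    return max(model, key=score)


# Toy data: each document is a list of hypothetical word-cluster IDs.
toy = {
    "sports": [["c_game", "c_team", "c_game"], ["c_team", "c_score"]],
    "finance": [["c_market", "c_stock"], ["c_stock", "c_price", "c_market"]],
}
model = train(toy)
print(classify(model, ["c_game", "c_team"]))
```

Because every bootstrap estimate is a valid probability distribution, their average is one as well, so the averaged parameters can be used directly in the usual naive Bayes scoring rule.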
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.