
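A Bayesian Classification Algorithm for Imbalanced Data
Cite this article: WANG Chun-liang, FU Yu-chen. A Bayesian Classification Algorithm for Imbalanced Data [J]. Computer Engineering & Science, 2010, 32(7): 95-98.
Authors: WANG Chun-liang  FU Yu-chen
Affiliations: 1. The Second Affiliated Hospital of Suzhou University, Suzhou, Jiangsu 215004; School of Computer Science and Technology, Suzhou University, Suzhou, Jiangsu 215006
2. School of Computer Science and Technology, Suzhou University, Suzhou, Jiangsu 215006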
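Abstract: Drawing on the idea of semi-supervised classification, this paper proposes a Bayesian classification model based on an improved EM algorithm to classify imbalanced data with large amounts of randomly missing values in mobile communication networks. First, prior probabilities that reflect the states of the variables to some extent are obtained from a preliminary statistical analysis of the actual data and are used as the initial values of the Bayesian classification model for iterative EM training. This reduces the number of EM iterations and mitigates the EM algorithm's sensitivity to initial values and its tendency to converge to local optima. Then, the Bayesian network classification model obtained by training on historical mobile communication data is used to predict the classes of the test data. Experimental results show that the method greatly improves the prediction success rate for negative-class samples in mobile communication data and performs better than traditional mathematical-statistical analysis methods.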

Keywords: semi-supervised learning; Bayesian network; EM algorithm; imbalanced data
Received: 2009-03-13
Revised: 2009-08-26

A New Bayesian Classification Algorithm for Non-Balance Datasets
WANG Chun-liang, FU Yu-chen. A New Bayesian Classification Algorithm for Non-Balance Datasets [J]. Computer Engineering & Science, 2010, 32(7): 95-98.
Authors: WANG Chun-liang  FU Yu-chen
Affiliation: (1. No.2 Hospital Affiliated to Suzhou University, Suzhou 215004; 2. School of Computer Science and Technology, Suzhou University, Suzhou 215006, China)
Abstract: Based on the idea of semi-supervised learning, a new Bayesian classifier model using an improved EM (Expectation-Maximization) algorithm is proposed to classify and predict imbalanced data gathered from mobile communication networks. First, a statistical analysis of the actual data is performed to estimate prior probabilities. Using these prior probabilities as the initial values of the Bayesian model speeds up the convergence of the EM algorithm. Second, a classifier based on a Bayesian network is trained with the improved EM steps to learn the class characteristics of the historical communication data. Third, this classifier is used to predict the label of the current data sample. The experimental results demonstrate that the proposed method substantially increases the prediction accuracy for the negative class and performs better than traditional statistical methods.
Keywords:EM
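
The training procedure outlined in the abstract (initialize from statistics of the observed data, refine with EM, then predict) can be illustrated with a minimal sketch. The abstract does not give the network structure, the form of the improved EM steps, or how missing attribute values are handled, so everything below is an assumption: a naive Bayes model over binary features is swapped in for the Bayesian network, the missing values are taken to be class labels (marked `None`), and the names `m_step`, `posterior`, `fit_em`, `predict`, and the toy data are hypothetical.

```python
import math

def m_step(X, resp, alpha=1.0):
    """Re-estimate class priors and Bernoulli feature parameters from soft
    class weights `resp`, with Laplace smoothing `alpha` (an assumption)."""
    n_classes, n_feat = len(resp[0]), len(X[0])
    weight = [sum(r[k] for r in resp) for k in range(n_classes)]
    total = sum(weight)
    prior = [(weight[k] + alpha) / (total + alpha * n_classes)
             for k in range(n_classes)]
    theta = [[(sum(r[k] * row[j] for r, row in zip(resp, X)) + alpha)
              / (weight[k] + 2 * alpha)
              for j in range(n_feat)] for k in range(n_classes)]
    return prior, theta

def posterior(row, prior, theta):
    """P(class | row) under naive Bayes, computed in log space for stability."""
    logp = []
    for k, pk in enumerate(prior):
        lp = math.log(pk)
        for j, v in enumerate(row):
            lp += math.log(theta[k][j] if v else 1.0 - theta[k][j])
        logp.append(lp)
    m = max(logp)
    w = [math.exp(lp - m) for lp in logp]
    s = sum(w)
    return [x / s for x in w]

def one_hot(label, n_classes):
    return [1.0 if k == label else 0.0 for k in range(n_classes)]

def fit_em(X, y, n_classes=2, max_iter=50, tol=1e-6):
    # Initialization: statistics of the labeled rows only, standing in for
    # the "prior probabilities from preliminary statistical analysis".
    labeled = [(row, lab) for row, lab in zip(X, y) if lab is not None]
    prior, theta = m_step([r for r, _ in labeled],
                          [one_hot(l, n_classes) for _, l in labeled])
    for _ in range(max_iter):
        # E-step: known labels stay fixed; missing labels get posteriors.
        resp = [one_hot(lab, n_classes) if lab is not None
                else posterior(row, prior, theta)
                for row, lab in zip(X, y)]
        # M-step: re-estimate parameters from all rows.
        new_prior, new_theta = m_step(X, resp)
        delta = max(abs(a - b) for a, b in zip(new_prior, prior))
        prior, theta = new_prior, new_theta
        if delta < tol:
            break
    return prior, theta

def predict(row, prior, theta):
    post = posterior(row, prior, theta)
    return max(range(len(post)), key=post.__getitem__)

# Toy imbalanced data: class 1 is the minority; None marks a missing label.
X = [[1, 0], [1, 0], [1, 1], [1, 0], [0, 1], [0, 1], [1, 0], [0, 1], [1, 0]]
y = [0, 0, 0, 0, 1, None, None, None, None]
prior, theta = fit_em(X, y)
print(predict([0, 1], prior, theta), predict([1, 0], prior, theta))
```

Starting EM from estimates derived from the observed data, rather than from random values, is the part of this sketch that mirrors the abstract's stated motivation: fewer iterations and less dependence on the starting point. The smoothing constant and the convergence test on the class priors are illustrative choices only.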
This article is indexed in Wanfang Data and other databases.