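Naive Bayes algorithm based on data imputation and continuous attributes (基于数据填补和连续属性的朴素贝叶斯算法)
Cite this article: 李忠波,杨建华,刘文琦.基于数据填补和连续属性的朴素贝叶斯算法[J].计算机工程与应用,2016,52(1):133-140.
Authors: 李忠波 (LI Zhongbo)  杨建华 (YANG Jianhua)  刘文琦 (LIU Wenqi)
Affiliation: School of Control Science and Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
Abstract: When handling classification problems, the naive Bayes (NB) algorithm usually assumes that the numerical continuous attributes of the training samples follow a normal distribution, and its classification accuracy also depends on the completeness of the training data; real sampled data rarely satisfy either requirement. To address missing data, the naive Bayes classifier learns its parameters from the available incomplete data using the expectation-maximization (EM) algorithm. For numerical continuous attributes that are not normally distributed, the distribution density obtained by kernel density estimation is combined with a new analytical calculation method to find the maximum posterior probability, and classification experiments on standard data sets verify that the modification is effective. The improved algorithm, EM-DNB, is applied to predicting protein purification processes in bioengineering, and the experimental results show that prediction accuracy improves.

Keywords: naive Bayes (NB)  expectation-maximization (EM) algorithm  continuous attributes  kernel density estimation  protein purification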
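
The abstract does not spell out how EM is used for parameter learning, so the following is only a minimal sketch under stated assumptions: a single class-conditional Gaussian attribute, class labels fully observed, and missing entries encoded as NaN. The function name `em_gaussian_with_missing` and the sample values are hypothetical and are not taken from the paper. Under these assumptions the iteration settles on the mean and variance of the observed values alone, which is a handy check that the updates are correct.

```python
import math


def em_gaussian_with_missing(values, n_iter=200):
    """Estimate mean and variance of one attribute when some entries are NaN.

    Assumption (not from the paper): the attribute is Gaussian within a class
    and values are missing at random.
    """
    observed = [v for v in values if not math.isnan(v)]
    n = len(values)
    n_missing = n - len(observed)
    sum_obs = sum(observed)
    sum_sq_obs = sum(v * v for v in observed)

    mu, var = 0.0, 1.0  # arbitrary starting point
    for _ in range(n_iter):
        # E-step: expected sufficient statistics; a missing entry contributes
        # E[x] = mu and E[x^2] = mu^2 + var under the current parameters.
        exp_sum = sum_obs + n_missing * mu
        exp_sum_sq = sum_sq_obs + n_missing * (mu * mu + var)
        # M-step: maximum-likelihood update from the completed statistics.
        mu = exp_sum / n
        var = exp_sum_sq / n - mu * mu
    return mu, var


# Hypothetical attribute values for one class, with two missing entries.
sample = [1.0, float("nan"), 3.0, 5.0, float("nan"), 7.0]
mu_hat, var_hat = em_gaussian_with_missing(sample)
```

Here `var_hat` is the maximum-likelihood (divide-by-n) variance of the observed values, not the unbiased sample variance.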

Naive Bayes based on data filling and continuous attribute
LI Zhongbo,YANG Jianhua,LIU Wenqi.Naive Bayes based on data filling and continuous attribute[J].Computer Engineering and Applications,2016,52(1):133-140.
Authors: LI Zhongbo  YANG Jianhua  LIU Wenqi
Affiliation: School of Control Science and Engineering, Dalian University of Technology, Dalian, Liaoning 116024, China
Abstract: In classification problems, Naive Bayes (NB) usually assumes that numerical continuous attributes follow a normal distribution, and its classification accuracy is also affected by the completeness of the training data. Actual sampled data rarely meet these requirements. For missing data, the Naive Bayes classifier learns its parameters from the existing incomplete data using the Expectation-Maximization (EM) algorithm. For non-normal numerical continuous attributes, the distribution density obtained by kernel density estimation and a new calculation method are used to compute the maximum posterior probability; classification experiments on standard data sets verify the effectiveness of the improvement. Finally, the improved algorithm (EM-DNB) is applied to the prediction of protein purification processes in biological engineering, and the experimental results show that the prediction accuracy is improved.
Keywords: Naive Bayes (NB)  Expectation-Maximization (EM) algorithm  continuous attributes  kernel density estimation  protein purification
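
To illustrate the kernel-density idea mentioned in the abstract, here is a minimal sketch of a naive Bayes classifier whose class-conditional densities come from a Gaussian kernel density estimate instead of a fitted normal distribution. It is not the paper's EM-DNB algorithm, whose "new calculation method" is not described in the abstract. The class `KDENaiveBayes`, the helper names, the normal-reference bandwidth rule, and the toy data are all assumptions made for illustration.

```python
import math
import statistics


def rule_of_thumb_bandwidth(samples):
    """Normal-reference bandwidth (an assumed choice), floored above zero."""
    n = len(samples)
    spread = statistics.stdev(samples) if n > 1 else 0.0
    return max(1.06 * spread * n ** (-0.2), 1e-3)


def kde_log_density(x, samples, bandwidth):
    """Log of a Gaussian kernel density estimate evaluated at x."""
    log_norm = math.log(bandwidth * math.sqrt(2.0 * math.pi))
    log_kernels = [-0.5 * ((x - s) / bandwidth) ** 2 - log_norm for s in samples]
    # Log-mean-exp keeps the sum stable when x is far from every sample.
    peak = max(log_kernels)
    mean_exp = sum(math.exp(v - peak) for v in log_kernels) / len(log_kernels)
    return peak + math.log(mean_exp)


class KDENaiveBayes:
    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.log_priors = {}
        self.columns = {}
        for c in self.classes:
            members = [r for r, lab in zip(rows, labels) if lab == c]
            self.log_priors[c] = math.log(len(members) / len(rows))
            # Store each attribute's samples and bandwidth per class.
            self.columns[c] = [
                (col, rule_of_thumb_bandwidth(col))
                for col in (list(t) for t in zip(*members))
            ]
        return self

    def log_score(self, row, c):
        """Unnormalized log posterior: log prior plus summed log densities."""
        return self.log_priors[c] + sum(
            kde_log_density(x, col, h) for x, (col, h) in zip(row, self.columns[c])
        )

    def predict(self, rows):
        return [max(self.classes, key=lambda c: self.log_score(row, c)) for row in rows]


# Toy data: class 0 is bimodal (far from normal), class 1 sits in between.
train_rows = [[-3.2], [-3.0], [-2.8], [2.8], [3.0], [3.2],
              [-0.4], [-0.2], [0.0], [0.2], [0.4], [0.1]]
train_labels = [0] * 6 + [1] * 6
model = KDENaiveBayes().fit(train_rows, train_labels)
pred = model.predict([[-3.0], [0.0], [3.0]])
```

The bimodal class is the case a single fitted normal handles poorly, since its mean falls where the class has no data. The kernel estimate keeps both modes, so points near either mode can still be assigned to that class.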
This article is indexed in Wanfang Data (万方数据) and other databases.