Technical Note: Naive Bayes for Regression
Authors: Eibe Frank, Leonard Trigg, Geoffrey Holmes, Ian H. Witten
Affiliation: Department of Computer Science, University of Waikato, Hamilton, New Zealand (all four authors)
Abstract: Despite its simplicity, the naive Bayes learning scheme performs well on most classification tasks, and is often significantly more accurate than more sophisticated methods. Although the probability estimates that it produces can be inaccurate, it often assigns maximum probability to the correct class. This suggests that its good performance might be restricted to situations where the output is categorical. It is therefore interesting to see how it performs in domains where the predicted value is numeric, because in this case, predictions are more sensitive to inaccurate probability estimates. This paper shows how to apply the naive Bayes methodology to numeric prediction (i.e., regression) tasks by modeling the probability distribution of the target value with kernel density estimators, and compares it to linear regression, locally weighted linear regression, and a method that produces "model trees" (decision trees with linear regression functions at the leaves). Although we exhibit an artificial dataset for which naive Bayes is the method of choice, on real-world datasets it is almost uniformly worse than locally weighted linear regression and model trees. The comparison with linear regression depends on the error measure: for one measure naive Bayes performs similarly, while for another it is worse. We also show that standard naive Bayes applied to regression problems by discretizing the target value performs similarly badly. We then present empirical evidence that isolates naive Bayes' independence assumption as the culprit for its poor performance in the regression setting. These results indicate that the simplistic statistical assumption that naive Bayes makes is indeed more restrictive for regression than for classification.
Keywords: naive Bayes; regression; model trees; linear regression; locally weighted regression
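
The approach named in the abstract, naive Bayes for a numeric target with kernel density estimators, can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `nb_regress`, the Gaussian kernel, the fixed bandwidths `h_x` and `h_y`, the evaluation grid over the target, and the use of the posterior mean as the prediction are all choices made here for illustration. It scores each candidate target value y by p(y) multiplied by the product over attributes of p(x_j | y), with p(x_j | y) estimated as p(x_j, y) / p(y).

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, used as the smoothing kernel (an assumption)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def nb_regress(X_train, y_train, x_query, h_x=0.5, h_y=0.5, grid_size=200):
    """Predict a numeric target for one query point.

    Illustrative sketch: posterior over y is proportional to
    p(y) * prod_j p(x_j | y), each density estimated with Gaussian kernels.
    Bandwidths are fixed by hand here; a real system would tune them.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)

    # Candidate target values spanning the observed range.
    grid = np.linspace(y_train.min(), y_train.max(), grid_size)

    # Kernel weight of every training target at every grid value:
    # shape (grid_size, n_train).
    ky = gaussian_kernel((grid[:, None] - y_train[None, :]) / h_y) / h_y
    log_p_y = np.log(ky.mean(axis=1) + 1e-300)  # log p(y) on the grid

    log_post = log_p_y.copy()
    for j in range(X_train.shape[1]):
        # Kernel weight of every training value of attribute j at the query.
        kx = gaussian_kernel((x_query[j] - X_train[:, j]) / h_x) / h_x
        # Joint density p(x_j, y) on the grid via a product kernel.
        log_p_xy = np.log((ky * kx[None, :]).mean(axis=1) + 1e-300)
        # Naive Bayes step: add log p(x_j | y) = log p(x_j, y) - log p(y).
        log_post += log_p_xy - log_p_y

    # Normalize on the grid and return the posterior mean.
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    return float(np.sum(grid * post))
```

Calling `nb_regress(X, y, x)` with an `(n, d)` array `X`, a length-`n` array `y`, and a length-`d` query `x` returns a single predicted value. Because the per-attribute terms are simply multiplied together, correlated attributes are counted more than once, which is the independence assumption the abstract identifies as the source of the poor regression performance.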
This article is indexed in SpringerLink and other databases.