Learning from Batched Data: Model Combination Versus Data Combination
Authors: Kai Ming Ting, Boon Toh Low, Ian H. Witten
Affiliations: 1. School of Computing and Mathematics, Deakin University, Australia
2. Department of Systems Engineering and Engineering Management, Chinese University of Hong Kong, Hong Kong
3. Department of Computer Science, University of Waikato, New Zealand
Abstract: Combining models learned from multiple batches of data provides an alternative to the common practice of learning one model from all the available data (i.e. the data combination approach). This paper empirically examines the baseline behavior of the model combination approach in this multiple-data-batches scenario. We find that model combination can lead to better performance even if the disjoint batches of data are drawn randomly from a larger sample, and we relate the relative performance of the two approaches to the learning curve of the classifier used. At the beginning of the curve, model combination has higher bias and variance than data combination and thus a higher error rate. As the amount of training data increases, model combination achieves an error rate lower than or comparable to that of data combination, because it obtains a larger variance reduction. We also show that this result is not sensitive to the method of model combination employed. Finally, we show empirically that the near-asymptotic performance of a single model in some classification tasks can be significantly improved by combining multiple models (derived from the same algorithm) in the multiple-data-batches scenario.
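To make the contrast concrete, the sketch below compares the two approaches on synthetic data: "data combination" trains a single classifier on all batches pooled together, while "model combination" trains one classifier per disjoint batch and merges their predictions by unweighted majority vote. The decision-tree learner, synthetic dataset, and voting scheme here are illustrative assumptions, not the paper's exact experimental setup.

```python
# Minimal sketch (assumptions: decision trees, synthetic data, majority vote)
# contrasting data combination with model combination over disjoint batches.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=6000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Disjoint batches drawn randomly from the same training sample.
n_batches = 5
batches = list(zip(np.array_split(X_train, n_batches),
                   np.array_split(y_train, n_batches)))

# Data combination: one model learned from all the available data.
single = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
acc_data = accuracy_score(y_test, single.predict(X_test))

# Model combination: one model per batch, combined by majority vote.
models = [DecisionTreeClassifier(random_state=0).fit(Xb, yb)
          for Xb, yb in batches]
votes = np.mean([m.predict(X_test) for m in models], axis=0)
acc_model = accuracy_score(y_test, (votes >= 0.5).astype(int))

print(f"data combination accuracy:  {acc_data:.3f}")
print(f"model combination accuracy: {acc_model:.3f}")
```

With enough training data per batch, the voted ensemble often matches or exceeds the single model because averaging the batch-wise models reduces variance, mirroring the learning-curve effect described in the abstract.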
Keywords:
This document is indexed in SpringerLink and other databases.