Out-of-bag estimation of the optimal sample size in bagging
Authors:Gonzalo Martínez-Muñoz, Alberto Suárez
Affiliation:C/Francisco Tomás y Valiente, 11 Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid 28049, Spain
Abstract:The performance of m-out-of-n bagging with and without replacement is analyzed as a function of the sampling ratio (m/n). Standard bagging uses resampling with replacement to generate bootstrap samples of the same size as the original training set (m_wr = n). Without-replacement methods typically use half samples (m_wor = n/2). These choices of sample size are arbitrary and need not be optimal in terms of the classification performance of the ensemble. We propose to use out-of-bag estimates of the generalization accuracy to select a near-optimal value for the sampling ratio. Ensembles of classifiers trained on independent samples whose size minimizes the out-of-bag error of the ensemble generally improve on the performance of standard bagging and can be built efficiently.
Keywords:Bagging   Subagging   Bootstrap sampling   Subsampling   Optimal sampling ratio   Ensembles of classifiers   Decision trees
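
The selection procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the base learner here is a decision stump standing in for the decision trees named in the keywords, the data are synthetic, and all function names (make_data, train_stump, oob_error, select_ratio) and the grid of candidate ratios are assumptions made for illustration. For each candidate ratio m/n, an ensemble is trained on samples of size m, each training point is classified only by the members that did not see it, and the ratio with the lowest out-of-bag error is kept.

```python
import random
from collections import Counter


def make_data(n=100, seed=1):
    """Synthetic two-class data (hypothetical): label is 1 when x0 + x1 > 0."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    y = [1 if x[0] + x[1] > 0 else 0 for x in X]
    return X, y


def train_stump(X, y):
    """Fit a decision stump by exhaustive search over features and thresholds."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum((left if x[j] <= t else right) != label
                             for x, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, left, right)
    _, j, t, left, right = best
    return lambda x: left if x[j] <= t else right


def oob_error(X, y, ratio, n_estimators=25, replace=True, seed=0):
    """Out-of-bag error of an ensemble trained on samples of size m = ratio * n.

    Returns None when no training point is ever out of bag
    (for example, sampling without replacement with ratio 1.0).
    """
    rng = random.Random(seed)
    n = len(X)
    m = max(1, round(ratio * n))
    votes = [Counter() for _ in range(n)]
    for _ in range(n_estimators):
        if replace:
            idx = [rng.randrange(n) for _ in range(m)]
        else:
            idx = rng.sample(range(n), min(m, n))
        model = train_stump([X[i] for i in idx], [y[i] for i in idx])
        in_bag = set(idx)
        # Each member votes only on the points it was not trained on.
        for i in range(n):
            if i not in in_bag:
                votes[i][model(X[i])] += 1
    scored = [(v.most_common(1)[0][0], y[i]) for i, v in enumerate(votes) if v]
    if not scored:
        return None
    return sum(pred != label for pred, label in scored) / len(scored)


def select_ratio(X, y, ratios=(0.1, 0.2, 0.4, 0.6, 0.8, 1.0), **kwargs):
    """Return the candidate ratio with the lowest out-of-bag error, plus all errors."""
    errors = {r: oob_error(X, y, r, **kwargs) for r in ratios}
    valid = {r: e for r, e in errors.items() if e is not None}
    best = min(valid, key=valid.get)
    return best, errors


if __name__ == "__main__":
    X, y = make_data()
    best, errors = select_ratio(X, y)
    print("selected sampling ratio:", best)
    for r, e in errors.items():
        print(f"  m/n = {r}: out-of-bag error = {e}")
```

Passing replace=False switches the sketch to sampling without replacement. In that setting a ratio of 1.0 leaves no out-of-bag points, so its error is reported as None and it is skipped during selection.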
This article is indexed in ScienceDirect and other databases.