Out-of-bag estimation of the optimal sample size in bagging

Authors:	Gonzalo Martínez-Muñoz, Alberto Suárez

Affiliation:	C/Francisco Tomás y Valiente, 11 Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid 28049, Spain

Abstract:	The performance of m-out-of-n bagging with and without replacement is analyzed as a function of the sampling ratio (m/n). Standard bagging uses resampling with replacement to generate bootstrap samples of the same size as the original training set, m_wr = n. Without-replacement methods typically use half samples, m_wor = n/2. These choices of sample size are arbitrary and need not be optimal in terms of the classification performance of the ensemble. We propose to use out-of-bag estimates of the generalization accuracy to select a near-optimal value for the sampling ratio. Ensembles of classifiers trained on independent samples whose size minimizes the out-of-bag error of the ensemble generally improve on the performance of standard bagging and can be built efficiently.
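The selection procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the decision-stump base learner, the synthetic data, the grid of candidate ratios, and the function names (train_stump, oob_error, select_ratio) are all assumptions made for illustration, where the keywords suggest the paper itself uses decision trees. For each candidate ratio m/n, the sketch builds a bagging ensemble, lets every training example be voted on only by the classifiers whose sample did not contain it, and keeps the ratio with the lowest out-of-bag error.

```python
import random

def train_stump(X, y):
    """Fit a one-feature threshold classifier (assumed stand-in for a decision tree)."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for sign in (1, -1):
                errors = sum((sign if x[j] >= t else -sign) != label
                             for x, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, sign)
    _, j, t, sign = best
    return lambda x: sign if x[j] >= t else -sign

def oob_error(X, y, ratio, n_estimators=25, replace=True, seed=0):
    """Out-of-bag error of an m-out-of-n bagging ensemble with m = ratio * n."""
    rng = random.Random(seed)
    n = len(X)
    m = max(1, round(ratio * n))
    if not replace:
        m = min(m, n)  # cannot draw more than n distinct examples
    votes = [0] * n    # sum of +1/-1 votes from classifiers that did not see example i
    counts = [0] * n   # number of such classifiers
    for _ in range(n_estimators):
        if replace:
            idx = [rng.randrange(n) for _ in range(m)]
        else:
            idx = rng.sample(range(n), m)
        h = train_stump([X[i] for i in idx], [y[i] for i in idx])
        in_bag = set(idx)
        for i in range(n):
            if i not in in_bag:
                votes[i] += h(X[i])
                counts[i] += 1
    scored = [i for i in range(n) if counts[i] > 0]
    if not scored:
        return float("inf")  # no out-of-bag examples, e.g. ratio 1 without replacement
    # Majority vote; ties are broken in favour of class +1.
    wrong = sum((1 if votes[i] >= 0 else -1) != y[i] for i in scored)
    return wrong / len(scored)

def select_ratio(X, y, ratios, **kwargs):
    """Return the candidate ratio with the lowest out-of-bag error, plus all errors."""
    errors = {r: oob_error(X, y, r, **kwargs) for r in ratios}
    return min(errors, key=errors.get), errors

# Synthetic two-class data with labels in {-1, +1} (illustrative only).
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(60)]
y = [1 if x[0] + x[1] > 0 else -1 for x in X]

best_ratio, errors = select_ratio(X, y, [0.1, 0.2, 0.5, 1.0], replace=True)
print("OOB error per ratio:", errors)
print("selected ratio:", best_ratio)
```

Because the out-of-bag votes come from examples each classifier never saw, no separate validation set is held out in this sketch. When sampling without replacement, candidate ratios must stay below 1, since a full-size sample leaves no out-of-bag examples to score.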

Keywords:	Bagging; Subagging; Bootstrap sampling; Subsampling; Optimal sampling ratio; Ensembles of classifiers; Decision trees
This article is indexed in ScienceDirect and other databases.