The Coupled Bootstrap Framework for Risk and Error Estimation
Test error estimation is a fundamental problem in statistical learning. Its goal is to evaluate accurately how an algorithm fitted to training data will perform on new, unseen data. With the development of complex machine learning models, hyperparameter tuning and model selection have become critical to obtaining good performance from learning algorithms, and both tasks rely strongly on a good test error estimator. Existing estimators in the literature either require smoothness assumptions on the data-generating distribution and/or the fitting algorithm, require resampling schemes that rely on symmetry of the data, or can be very computationally expensive.
We propose a new test error and risk estimator called the coupled bootstrap (CB), which is easily computable, model agnostic, and does not rely on sample splitting. By exploiting the distributional properties of certain noise classes, we create a pair of perturbed datasets that are independent of each other, so that one acts as a training set and the other as a test set. We show that the CB estimator is unbiased for a slightly perturbed version of the original problem, and that its target converges to the original test error as the magnitude of the added perturbation decreases.
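To make the construction concrete, here is a minimal sketch of one way such a coupled pair can be built when the response has Gaussian noise. It is an illustration under stated assumptions, not necessarily the exact estimator of the dissertation: the noise standard deviation `sigma` is taken as known, the loss is squared error, and the scalings `sqrt(alpha)` and `1/sqrt(alpha)` as well as the names `gaussian_coupled_pair`, `cb_test_error`, and `fit` are hypothetical choices made for this example. With these scalings the two perturbed vectors are jointly Gaussian with zero covariance, hence independent.

```python
import numpy as np

def gaussian_coupled_pair(y, sigma, alpha, rng):
    """Return (y_train, y_test, omega) for one bootstrap draw.

    Assumes y has independent N(theta, sigma^2) entries with known sigma.
    Cov(y_train, y_test) = sigma^2 - sqrt(alpha) * sigma^2 / sqrt(alpha) = 0,
    so the jointly Gaussian pair is independent.
    """
    omega = rng.normal(0.0, sigma, size=y.shape)
    y_train = y + np.sqrt(alpha) * omega   # noise variance (1 + alpha) * sigma^2
    y_test = y - omega / np.sqrt(alpha)    # noise variance (1 + 1/alpha) * sigma^2
    return y_train, y_test, omega

def cb_test_error(y, fit, sigma, alpha, n_boot, rng):
    """Average over n_boot draws of the squared error on the test copy,
    minus ||omega||^2 / alpha, which removes the extra variance that the
    perturbation added to the test copy."""
    total = 0.0
    for _ in range(n_boot):
        y_train, y_test, omega = gaussian_coupled_pair(y, sigma, alpha, rng)
        pred = fit(y_train)  # fit maps a response vector to fitted values
        total += np.sum((y_test - pred) ** 2) - np.sum(omega ** 2) / alpha
    return total / n_boot

# Usage with a nondifferentiable fit (soft thresholding) on simulated data.
rng = np.random.default_rng(0)
theta = np.linspace(-1.0, 1.0, 200)
y = theta + rng.normal(0.0, 1.0, size=theta.shape)
soft = lambda v: np.sign(v) * np.maximum(np.abs(v) - 0.5, 0.0)
estimate = cb_test_error(y, soft, sigma=1.0, alpha=0.1, n_boot=200, rng=rng)
```

In this sketch the estimate targets the test error of the fit when it is trained on data with noise variance `(1 + alpha) * sigma**2`, which is the "slightly perturbed version of the original problem". Smaller `alpha` brings the target closer to the original test error but inflates the variance of each draw, which is why the number of bootstrap samples matters.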
We focus on two important noise classes for the response variable: Gaussian and Poisson. For both cases, we study the behavior of the bias as a function of the perturbation magnitude, control the variability of the error estimator as a function of the perturbation size and the number of bootstrap samples, and derive limiting results. We compare CB to existing estimators in the literature, in both simulated and real-data settings. In the Gaussian scenario, we also provide new findings for existing methods in the literature. In the Poisson case, we propose a new estimator based on the computationally costly gold-standard method in the literature and compare it against the CB approach. In general, CB performs favorably when compared to other estimators, in particular when the algorithm is highly variable, when the response is Gaussian and the algorithm is nondifferentiable, and when the response is Poisson and the sample size is large. CB also performs well in two real-data applications: image denoising and density estimation.
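For count responses, one standard way to obtain an independent pair from a single Poisson observation is binomial thinning. The sketch below illustrates that idea only; whether it matches the dissertation's exact Poisson construction is an assumption, and the names `poisson_thinning_pair` and `eps` are hypothetical. If `y` is Poisson with mean `mu` and `y_test` given `y` is Binomial(`y`, `eps`), then `y_test` and `y - y_test` are independent Poisson variables with means `eps * mu` and `(1 - eps) * mu`.

```python
import numpy as np

def poisson_thinning_pair(y, eps, rng):
    """Split integer counts y into two independent parts by binomial thinning.

    Assumes y has independent Poisson entries. Returns (y_train, y_test) with
    y_train ~ Poisson((1 - eps) * mu) and y_test ~ Poisson(eps * mu).
    """
    y_test = rng.binomial(y, eps)  # each unit of count goes to the test part w.p. eps
    y_train = y - y_test           # the remaining counts form the training part
    return y_train, y_test

# Usage on simulated counts; y_test / eps is an unbiased proxy for the mean.
rng = np.random.default_rng(0)
y = rng.poisson(5.0, size=100_000)
y_train, y_test = poisson_thinning_pair(y, eps=0.2, rng=rng)
mean_proxy = y_test.mean() / 0.2
```

Here a small `eps` plays the role of the perturbation magnitude: the training part stays close to the original counts, while the rescaled test part becomes noisier.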
Funding
Amazon
History
Date
2022-07-22
Degree Type
- Dissertation
Department
- Statistics and Data Science
Degree Name
- Doctor of Philosophy (PhD)