V-fold selection of kernel estimators
V-fold cross-validation is a simple and efficient method for estimator selection when the goal is to minimize the final prediction error. It is widely used, in particular, for choosing from data the bandwidth of Parzen-Rosenblatt kernel density estimators.
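To fix ideas, here is a minimal sketch of this procedure for a Gaussian kernel: a V-fold estimate of the least-squares risk of a kernel density estimator (up to an additive constant not depending on the bandwidth), minimized over a grid of candidate bandwidths. The function names, the bandwidth grid, and the simulated data are illustrative choices, not part of the talk's material.

```python
import numpy as np

def gauss(u, h):
    # Gaussian kernel with bandwidth h, evaluated at u
    return np.exp(-0.5 * (u / h) ** 2) / (h * np.sqrt(2 * np.pi))

def kde(x_eval, x_train, h):
    # Parzen-Rosenblatt estimator built on x_train, evaluated at x_eval
    return gauss(x_eval[:, None] - x_train[None, :], h).mean(axis=1)

def vfold_ls_risk(x, h, V=10, seed=None):
    # V-fold estimate of the least-squares risk
    # ||f_hat||^2 - 2 E[f_hat(X)]  (equal to the L2 loss up to ||f||^2)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    folds = np.array_split(idx, V)
    risk = 0.0
    for fold in folds:
        xt = x[np.setdiff1d(idx, fold)]   # training part
        # closed-form ||f_hat||^2 for a Gaussian kernel:
        # the convolution of two N(0, h^2) kernels is N(0, 2 h^2)
        sq_norm = gauss(xt[:, None] - xt[None, :], h * np.sqrt(2)).mean()
        # held-out fold estimates E[f_hat(X)]
        risk += sq_norm - 2 * kde(x[fold], xt, h).mean()
    return risk / V

rng = np.random.default_rng(0)
x = rng.normal(size=200)
grid = np.linspace(0.05, 1.0, 20)
h_best = min(grid, key=lambda h: vfold_ls_risk(x, h, V=10, seed=1))
```

The selected `h_best` minimizes the empirical V-fold criterion over the grid; the bias-corrected variant discussed in the talk (V-fold penalization) replaces this criterion with a penalized empirical risk.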
This talk will present two main results on the problem of choosing among a family of kernel density estimators, when the goal is to minimize the least-squares loss of the final estimator.
First, a non-asymptotic oracle inequality holds for V-fold cross-validation (and its bias-corrected version, V-fold penalization), with a leading constant 1+o(1) in the case of V-fold penalization.
Second, an exact variance computation allows us to quantify the improvement that can be expected as V increases. Simulation experiments illustrate that this improvement is typically larger when V goes from 2 to 10 than when V goes from 10 to 100, for instance.
This talk is based on joint work with Matthieu Lerasle (CNRS - Université de Nice, France) and Nelo Magalhaes (Université Paris-Sud).