Consistent change-point detection with kernels
We tackle the change-point problem with data belonging to a general set. We propose a penalty for choosing the number of change-points in the kernel-based change-point method (KCP) of Harchaoui and Cappé (2007). This penalty generalizes the one proposed for one-dimensional signals by Lebarbier (2005).
By showing a new concentration result in Hilbert spaces, we prove that this penalty satisfies a non-asymptotic oracle inequality. Furthermore, provided the penalty is well chosen, our procedure retrieves the correct number of change-points with high probability and estimates the change-point locations at the optimal rate. As a consequence, when used with a characteristic kernel, KCP detects all kinds of changes in the distribution (not only changes in the mean or the variance), and it can do so for complex structured data (not necessarily in R^d). Most of the analysis is conducted assuming that the kernel is bounded; part of the results extends to kernels with only a finite second-order moment.
Experiments on synthetic and real data illustrate the accuracy of our method, showing it can detect changes in the whole distribution of data, even when the mean and variance are constant.
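The core computation described above can be illustrated with a short sketch: a within-segment kernel cost is minimized by dynamic programming, and the number of segments is then selected by penalizing segmentation complexity. This is a minimal illustration with a plain linear penalty and a Gaussian kernel, not the exact penalty form analyzed in the preprints; all function names and parameters here are illustrative.

```python
import numpy as np

def gaussian_gram(X, bandwidth=1.0):
    """Gram matrix of the Gaussian kernel (a characteristic kernel)."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def kernel_change_points(X, max_segments=5, penalty=1.0, bandwidth=1.0):
    """Sketch of kernel change-point detection: dynamic programming over
    within-segment kernel costs, then a penalized choice of the number of
    segments (here a simple linear penalty, for illustration only)."""
    n = len(X)
    K = gaussian_gram(np.asarray(X, dtype=float).reshape(n, -1), bandwidth)
    # 2-D cumulative sums give every block sum K[s:e, s:e] in O(1).
    S = np.zeros((n + 1, n + 1))
    S[1:, 1:] = np.cumsum(np.cumsum(K, axis=0), axis=1)
    diag = np.concatenate(([0.0], np.cumsum(np.diag(K))))

    def cost(s, e):
        # Kernel variance of the segment X[s:e] mapped into the RKHS.
        block = S[e, e] - S[s, e] - S[e, s] + S[s, s]
        return diag[e] - diag[s] - block / (e - s)

    INF = float("inf")
    # best[d][t]: minimal cost of splitting X[:t] into d segments.
    best = [[INF] * (n + 1) for _ in range(max_segments + 1)]
    argmin = [[0] * (n + 1) for _ in range(max_segments + 1)]
    best[0][0] = 0.0
    for d in range(1, max_segments + 1):
        for t in range(d, n + 1):
            for s in range(d - 1, t):
                c = best[d - 1][s] + cost(s, t)
                if c < best[d][t]:
                    best[d][t], argmin[d][t] = c, s
    # Model selection: penalize the number of segments.
    d_hat = min(range(1, max_segments + 1),
                key=lambda d: best[d][n] + penalty * d)
    # Backtrack the change-point locations.
    cps, t = [], n
    for d in range(d_hat, 0, -1):
        t = argmin[d][t]
        cps.append(t)
    return sorted(cps)[1:]  # drop the leading boundary at 0
```

On a toy signal with a single mean shift, e.g. `[0.0]*10 + [5.0]*10`, the sketch recovers the change-point at index 10 with two segments.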
Based upon joint works with Alain Celisse, Damien Garreau and Zaid Harchaoui.
Preprints: http://arxiv.org/abs/1612.04740 and http://arxiv.org/abs/1202.3878