Séminaire Probabilités et Statistiques
Computation-information gap in high-dimensional clustering
13
Feb. 2025
logo_team
Intervenant : Nicolas Verzelen
Institution : INRAE Montpellier
Heure : 15h30 - 16h30
Lieu : 3L15

We investigate the existence of a fundamental computation-information gap for the problem of clustering a mixture of isotropic Gaussian in the high-dimensional regime, where the ambient dimension p is larger than the number n of points. The existence of a computation-information gap in a specific Bayesian high-dimensional asymptotic regime has been conjectured by Lesieur et al. (2016)  based on the replica heuristic from statistical physics. We provide evidence of the existence of such a gap generically in the high-dimensional regime p >n, by (i)  proving a non-asymptotic low-degree polynomials computational barrier for clustering in high-dimension, matching the performance of the best known polynomial time algorithms, and by (ii) establishing that the information barrier for clustering is smaller than the computational barrier, when the number K of clusters is large enough.  These results are in contrast with the (moderately) low-dimensional regime n> poly(p,K) where there is no computation-information gap for clustering a mixture of isotropic Gaussian. This is based on a joint work with Bertrand Even and Christophe Giraud (Paris-Saclay).

All (past and future) events