Karl Hajjar: Symmetries in the dynamics of wide two-layer neural networks

GT Celeste

13 Feb. 2023

Speaker:	Karl Hajjar
Time:	14:00 - 15:00
Room:	3L8

We consider the idealized setting of gradient flow on the population risk for infinitely wide two-layer ReLU neural networks (without bias), and study the effect of symmetries on the learned parameters and predictors. We first describe a general class of symmetries which, when satisfied by the target function $f^*$ and the input distribution, are preserved by the dynamics. We then study more specific cases. When $f^*$ is odd, we show that the dynamics of the predictor reduces to that of a (non-linearly parameterized) linear predictor, and its exponential convergence can be guaranteed. When $f^*$ has a low-dimensional structure, we prove that the gradient flow PDE reduces to a lower-dimensional PDE. Furthermore, we present informal and numerical arguments that suggest that the input neurons align with the lower-dimensional structure of the problem.
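As a rough illustration of why the odd case leads to a linear predictor, the identity relu(z) - relu(-z) = z implies that the odd part of any bias-free two-layer ReLU network is a linear function of the input. The sketch below checks this numerically at finite width. It is not the argument presented in the talk: the width `m`, the `1/m` scaling, the Gaussian initialization, and all function names are assumptions made for this example.

```python
import random

def relu(z):
    return max(z, 0.0)

def predictor(a, W, x):
    """Bias-free two-layer ReLU network: f(x) = (1/m) * sum_j a_j * relu(<w_j, x>)."""
    m = len(a)
    total = 0.0
    for a_j, w_j in zip(a, W):
        pre_activation = sum(w_i * x_i for w_i, x_i in zip(w_j, x))
        total += a_j * relu(pre_activation)
    return total / m

def odd_part(a, W, x):
    """Odd part of the predictor: (f(x) - f(-x)) / 2."""
    neg_x = [-x_i for x_i in x]
    return 0.5 * (predictor(a, W, x) - predictor(a, W, neg_x))

def linear_form(a, W, x):
    """Linear map <beta, x> with beta = (1 / 2m) * sum_j a_j * w_j."""
    m, d = len(a), len(x)
    beta = [sum(a[j] * W[j][i] for j in range(m)) / (2 * m) for i in range(d)]
    return sum(b_i * x_i for b_i, x_i in zip(beta, x))

# Hypothetical finite-width setup (assumed for illustration only).
rng = random.Random(0)
m, d = 50, 3
a = [rng.gauss(0.0, 1.0) for _ in range(m)]
W = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
x = [rng.gauss(0.0, 1.0) for _ in range(d)]

# The two quantities agree up to floating-point error.
gap = abs(odd_part(a, W, x) - linear_form(a, W, x))
print(gap)
```

The vector `beta` depends on the weights through products `a_j * w_j`, which is one way to read the phrase "non-linearly parameterized linear predictor" in the abstract.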

This talk may be of interest to other researchers at the LMO in other domains. All welcome!
