GT Celeste
Ludovic Stephan
14 Dec. 2023
Speaker: | Ludovic Stephan |
Institution: | École Polytechnique Fédérale de Lausanne |
Time: | 15:45 - 16:45 |
Location: | LMO Orsay, room 3L15 |
Feature learning in two-layer neural networks with large gradient
steps
Abstract: Feature learning is an important mechanism of neural networks,
and an integral part of their advantages over simpler (e.g. kernel)
learning methods. In this talk, I will present how this phenomenon
occurs in two-layer networks trained with large gradient steps, in which
both the batch size and the learning rate grow polynomially with the
dimension. In particular, we uncover an occurrence of the so-called
"staircase" property of learning, where important directions are learned
sequentially at each new step.