EAISI lecture by Visiting Professor Andrea Agazzi

Nonlinear policy optimization in deep reinforcement learning: policy gradients for wide neural networks

Date

Tuesday, 21 May 2024, from 12:00 PM to 1:30 PM

Location

GZ 0.05

Organizer

Eindhoven Artificial Intelligence Systems Institute

Co-organizer

Mechanical Engineering

Price

Free

Andrea Agazzi, Assistant Professor in the Mathematics Department at the University of Pisa, is a guest of Mauro Salazar, Assistant Professor in the Control Systems Technology group of the Department of Mechanical Engineering.

Title | Nonlinear policy optimization in deep reinforcement learning: policy gradients for wide neural networks

In recent years, we have witnessed multiple groundbreaking results obtained using neural networks as flexible (nonlinear) parameterizations of large policy classes to solve difficult reinforcement learning tasks, e.g., AlphaGo, Dota 2, and self-driving cars. Despite these successes, there remains a notable gap in the theoretical explanation of why neural networks trained with (deep) reinforcement learning algorithms are so effective. In this presentation, I will first give a brief overview of the policy optimization problem in reinforcement learning, along with an introduction to the policy gradient algorithm, a prototypical solution approach. Then, I will discuss some limitations of this algorithm when paired with general nonlinear policy classes. Finally, I will discuss how these limitations are bypassed by wide neural networks under an appropriate scaling of parameters at initialization, resulting in the convergence of the policy gradient training dynamics towards a so-called “mean-field” limit. In particular, in this setting one can prove global optimality of the dynamics' fixed points despite the nonlinear and nonconvex characteristics of the risk function.

Program
12:00 - 12:45 Lecture in Gemini South 0.05 (doors open at 11:45)
12:45 - 13:00 Q&A
13:00 Pizza lunch

Andrea Agazzi

Andrea Agazzi, Assistant Professor in the Mathematics Department at the University of Pisa, received his PhD in Theoretical Physics at the University of Geneva, and was then hired as a Griffith Research Assistant Professor at Duke University. Before that, he obtained his BSc degree in physics at ETH Zurich and his MSc in theoretical physics at Imperial College London. His main research focus is applied probability theory, using techniques from statistical mechanics and stochastic analysis to gain insight into the (stochastic) behavior of complex dynamical models emerging in real-world applications. For example, he has worked on scaling limits of machine learning models seen as interacting particle systems; on the behavior of large networks of chemical reactions, focusing on the relations between their stochastic dynamics and their structure; and on stochastic approximations of complex fluid models.
