#### About Event

**11:00h - INTRODUCTION TO REINFORCEMENT LEARNING FOR FINANCE**

This lecture introduces reinforcement learning (RL) and its application to finance. RL may be viewed as a model-agnostic approach to solving optimal control problems, in which no explicit underlying model is assumed. Such problems arise in many contexts in finance, such as optimal portfolio allocation, option hedging, and statistical arbitrage strategies. This course will introduce the basic aspects of RL, including Q-learning and deep Q-learning, and show how they may be applied in a few prototypical cases.
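To make the Q-learning idea concrete, here is a minimal, self-contained sketch on a toy problem (not taken from the lecture): a five-state chain where the agent earns a reward of 1 for reaching the rightmost state. All names, states, and hyperparameters are illustrative.

```python
import random

# Toy chain MDP: states 0..4; action 0 moves left, action 1 moves right.
# Reaching state 4 yields reward 1 and ends the episode.
N_STATES, ACTIONS = 5, (0, 1)
GAMMA, ALPHA, EPS = 0.9, 0.5, 0.1

def step(s, a):
    """Deterministic dynamics: return (next_state, reward, done)."""
    s2 = max(0, s - 1) if a == 0 else min(N_STATES - 1, s + 1)
    done = s2 == N_STATES - 1
    return s2, (1.0 if done else 0.0), done

def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # epsilon-greedy behaviour policy
            a = rng.choice(ACTIONS) if rng.random() < EPS else max(ACTIONS, key=lambda a: Q[s][a])
            s2, r, done = step(s, a)
            # Q-learning update: bootstrap off the greedy next-state value
            target = r + (0.0 if done else GAMMA * max(Q[s2]))
            Q[s][a] += ALPHA * (target - Q[s][a])
            s = s2
    return Q

Q = q_learning()
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES)]
```

The learned greedy policy moves right in every non-terminal state; deep Q-learning replaces the table `Q` with a neural network trained on the same bootstrapped targets.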

**14:00h - ROBUST RISK-AWARE AND DYNAMICALLY TIME CONSISTENT REINFORCEMENT LEARNING**

Reinforcement learning (RL) provides an approach to solving stochastic optimal control problems in a model-agnostic manner. Traditionally, however, RL aims to find actions that optimise the expected total discounted reward and hence does not account for risk. There are new directions aiming to incorporate risk, and in this talk I will introduce two approaches.

In the first approach, we assess the value of a policy using rank-dependent expected utility (RDEU). RDEU allows the agent to seek gains while simultaneously protecting themselves against downside risk. To robustify optimal policies against model uncertainty, we assess a policy not by its distribution, but rather by the worst possible distribution that lies within a Wasserstein ball around it. Thus, our problem formulation may be viewed as an actor/agent choosing a policy (the outer problem), and an adversary then acting to worsen the performance of that strategy (the inner problem).
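As a rough illustration of these ingredients (not the paper's algorithm), the sketch below computes an empirical RDEU of a sample of policy returns, and a crude stand-in for the inner problem: shifting the whole empirical law down by the Wasserstein-1 radius `eps`. That shifted law lies in the W1 ball and is the exact worst case for the mean, but only a heuristic lower bound for general RDEU. The utility `u`, distortion `g`, and sample values are all illustrative assumptions.

```python
def rdeu(samples, u=lambda x: x, g=lambda p: 1.0 - (1.0 - p) ** 2):
    """Empirical rank-dependent expected utility.
    Sort outcomes ascending; the distortion g reweights the empirical CDF,
    and this concave g puts more weight on bad outcomes (downside aversion)."""
    xs = sorted(samples)
    n = len(xs)
    return sum(u(x) * (g((i + 1) / n) - g(i / n)) for i, x in enumerate(xs))

def robust_rdeu(samples, eps, **kw):
    """Heuristic inner problem: evaluate RDEU under the empirical law
    shifted down by the Wasserstein-1 radius eps (an adversarial move
    that stays inside the W1 ball; exact worst case only for the mean)."""
    return rdeu([x - eps for x in samples], **kw)

returns = [0.3, -0.1, 0.5, 0.0, 0.2]   # illustrative policy returns
plain = rdeu(returns)
robust = robust_rdeu(returns, eps=0.05)
```

With the concave distortion above, the RDEU sits below the sample mean, and the robust value sits below the RDEU by the ball radius, mimicking the actor/adversary split described in the abstract.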

In the second approach, we investigate time-consistent risk-sensitive stochastic optimization problems. Specifically, we assume agents assess the risk of a sequence of random variables using dynamic convex risk measures. We employ a time-consistent dynamic programming principle to determine the value of a particular policy, and develop policy gradient update rules that aid in obtaining optimal policies.
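To illustrate the time-consistent recursion (omitting the policy-gradient machinery), the sketch below evaluates a fixed reward sequence with the entropic risk measure, a standard example of a dynamic convex risk measure: at each step the continuation value is folded into a one-step certainty equivalent, which is exactly the dynamic programming principle mentioned above. The two-period 50/50 rewards are illustrative numbers, not from the talk.

```python
import math

def entropic(values, probs, lam):
    """One-step entropic certainty equivalent of a random reward:
    rho(X) = -(1/lam) * log E[exp(-lam * X)].  For lam > 0 this is a
    concave, risk-averse valuation with rho(X) <= E[X]."""
    return -math.log(sum(p * math.exp(-lam * v) for v, p in zip(values, probs))) / lam

def dynamic_value(step_outcomes, lam):
    """Time-consistent backward recursion: the value of the tail is folded
    into the one-step risk measure at each earlier step, so assessing the
    whole sequence agrees with assessing it stage by stage."""
    v = 0.0
    for values, probs in reversed(step_outcomes):
        v = entropic([r + v for r in values], probs, lam)
    return v

# Two periods, each paying +1 or -1 with probability 1/2 (illustrative).
steps = [([1.0, -1.0], [0.5, 0.5])] * 2
risk_averse = dynamic_value(steps, lam=1.0)
risk_neutral = sum(sum(v * p for v, p in zip(vals, ps)) for vals, ps in steps)
```

A policy-gradient method in this setting would tune the policy's parameters to maximise `dynamic_value` rather than the risk-neutral expectation.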

We will illustrate the efficacy of the approaches through several examples.

This is based on joint works:

a) Jaimungal, Pesenti, Wang, & Tatsat (2022), "Robust Risk-Aware Reinforcement Learning", *SIAM Journal on Financial Mathematics*, 13(1), https://epubs.siam.org/doi/10.1137/21M144640X; and

b) Coache & Jaimungal (2021), "Reinforcement Learning with Dynamic Convex Risk Measures", https://arxiv.org/abs/2112.13414.

Text provided by the author.

**Email: emap@fgv.br**

**Tel.: 21 3799-5917**

#### Supporters / Partners / Sponsors

## Speakers

#### Sebastian Jaimungal

Prof. Jaimungal is the director of the professional *Master of Financial Insurance* program. He is a Fields Institute Fellow and the former chair of the SIAM Activity Group on Financial Mathematics and Engineering (SIAG/FM&E). He serves on several editorial boards, including those of Quantitative Finance and the SIAM Journal on Financial Mathematics. His research interests lie in mathematical finance and range over a variety of topics, including machine learning, reinforcement learning, mean field games, stochastic control, algorithmic trading, and commodity and energy markets.