Mean Field Game and Mean Field Control Q-Learning

This was part of Applications to Financial Engineering

Jean-Pierre Fouque, University of California, Santa Barbara

Wednesday, December 8, 2021

Abstract: We extend the Markov decision process (MDP) setup to mean field game (MFG) and mean field control (MFC) problems, and we generalize the Bellman optimality equation for Q-learning. By introducing two learning rates, one for the Q-matrix and one for the population distribution, we design a single algorithm that learns the optimal policy for either the MFG or the MFC problem, depending on the ratio of these two rates. Applications to problems in finance are also discussed. Joint work with Andrea Angiuli and Mathieu Laurière.
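
The two-rate idea in the abstract can be illustrated with a minimal sketch. This is not the speakers' algorithm. It is a toy tabular Q-learning loop in which a Q-table and an estimate of the population distribution are updated at separate rates. The function name `two_rate_q_learning`, the parameter names `rho_q` and `rho_mu`, the cyclic toy dynamics, and the crowding cost are all assumptions made up for illustration.

```python
import numpy as np


def two_rate_q_learning(n_states=3, n_actions=2, rho_q=0.1, rho_mu=0.01,
                        gamma=0.9, eps=0.1, n_steps=5000, seed=0):
    """Toy two-rate tabular Q-learning (hypothetical environment).

    rho_q  : learning rate for the Q-table.
    rho_mu : learning rate for the population-distribution estimate.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))      # cost-to-go table (minimized)
    mu = np.full(n_states, 1.0 / n_states)   # start from the uniform distribution
    x = 0
    for _ in range(n_steps):
        # Epsilon-greedy action selection on costs (argmin, not argmax).
        if rng.random() < eps:
            a = int(rng.integers(n_actions))
        else:
            a = int(np.argmin(Q[x]))

        # Hypothetical dynamics: action 0 stays, any other action moves one
        # state to the right (cyclically); 10% of the time the move is random.
        step = min(a, 1) if rng.random() > 0.1 else int(rng.integers(2))
        x_next = (x + step) % n_states

        # Hypothetical mean-field cost: crowding at the current state
        # plus a small cost for moving.
        cost = mu[x] + 0.1 * min(a, 1)

        # Distribution update at rate rho_mu toward the observed state.
        one_hot = np.zeros(n_states)
        one_hot[x_next] = 1.0
        mu = mu + rho_mu * (one_hot - mu)

        # Q update at rate rho_q using a Bellman-type target.
        target = cost + gamma * Q[x_next].min()
        Q[x, a] += rho_q * (target - Q[x, a])

        x = x_next
    return Q, mu


if __name__ == "__main__":
    # Same toy problem, two orderings of the rates.
    Q_slow_mu, mu_slow = two_rate_q_learning(rho_q=0.1, rho_mu=0.01)
    Q_fast_mu, mu_fast = two_rate_q_learning(rho_q=0.01, rho_mu=0.1)
    print(mu_slow, mu_fast)
```

In a two-timescale reading (an interpretation, not something stated in the abstract), a small `rho_mu` relative to `rho_q` keeps the distribution nearly frozen while the Q-table adapts, which resembles the MFG fixed-point problem. The opposite ordering lets the distribution track the current policy, which resembles the MFC problem.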