Sampling with kernelized Wasserstein gradient flows

This was part of Applied Optimal Transport

Anna Korba, ENSAE

Friday, May 20, 2022

Abstract:

Sampling from a probability distribution whose density is known only up to a normalisation constant is a fundamental problem in statistics and machine learning. Recently, several algorithms based on interacting particle systems have been proposed for this task, as an alternative to Markov Chain Monte Carlo methods or Variational Inference. These particle systems can be designed by adopting an optimisation point of view on the sampling problem: an optimisation objective is chosen (typically one that measures the dissimilarity to the target distribution), and its Wasserstein gradient flow is approximated by an interacting particle system. The stationary states of these particle systems define an empirical measure approximating the target distribution. In this talk I will present recent work on two such algorithms, Stein Variational Gradient Descent [1] and Kernel Stein Discrepancy Descent [2], both based on Wasserstein gradient flows and reproducing kernels. I will discuss recent results showing that these particle systems can provide a good approximation of the target distribution, as well as current issues and open questions on the empirical and theoretical sides.
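To make the particle-system idea concrete, here is a minimal one-dimensional sketch of the standard Stein Variational Gradient Descent update, written with the Python standard library only. It is an illustration under stated assumptions, not the implementation analysed in [1]: the Gaussian target, the fixed-bandwidth Gaussian kernel, the step size, the particle count, and the names `score`, `svgd_step`, and `run_svgd` are all choices made here for illustration. Note that only the gradient of the log-density is used, so the normalisation constant is never needed.

```python
import math
import random


def score(x, mean=2.0, std=1.0):
    """Gradient of the log-density of N(mean, std^2) (assumed toy target).

    The normalisation constant drops out when differentiating the log-density.
    """
    return -(x - mean) / std ** 2


def svgd_step(particles, step_size=0.1, bandwidth=1.0):
    """Apply one SVGD update to a list of 1D particles.

    Each particle x_i moves along
        phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) * score(x_j) + d/dx_j k(x_j, x_i) ],
    with the Gaussian kernel k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2)).
    """
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / (2.0 * bandwidth ** 2))
            grad_k = -(xj - xi) / bandwidth ** 2 * k  # derivative of k in x_j
            # Attraction towards high-density regions plus repulsion between particles.
            phi += k * score(xj) + grad_k
        updated.append(xi + step_size * phi / n)
    return updated


def run_svgd(n_particles=30, n_steps=300, seed=0):
    """Run SVGD from particles initialised far from the target."""
    rng = random.Random(seed)
    particles = [rng.gauss(-3.0, 0.5) for _ in range(n_particles)]
    for _ in range(n_steps):
        particles = svgd_step(particles)
    return particles


particles = run_svgd()
mean_estimate = sum(particles) / len(particles)
print("empirical mean of the particles:", mean_estimate)
```

The first term in the update pulls particles towards regions where the target density is high, while the kernel-gradient term pushes them apart, which keeps the empirical measure from collapsing onto a single mode.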

[1] A Non-Asymptotic Analysis for Stein Variational Gradient Descent. Korba, A., Salim, A., Arbel, M., Luise, G., Gretton, A. Neural Information Processing Systems (NeurIPS), 2020.

[2] Kernel Stein Discrepancy Descent. Korba, A., Aubin-Frankowski, P.C., Majewski, S., Ablin, P. International Conference on Machine Learning (ICML), 2021.