This talk was part of Statistical and Computational Challenges in Probabilistic Scientific Machine Learning (SciML).

All You Need is a Classifier

Assad Oberai, University of Southern California (USC)

Monday, June 9, 2025



Abstract:

We propose a data-driven method to learn the time-dependent probability densities of a multivariate stochastic process from sample paths, assuming that the initial probability density is analytically known. Our method uses a novel time-dependent binary classifier to approximate the partial time derivative of the logarithm of the probability density. We propose a contrastive learning-based objective to train the classifier. Significantly, the proposed method explicitly models the time-dependent probability distribution. We demonstrate that the proposed method can approximate the time-dependent probability density functions for systems driven by white noise. We also employ the proposed method for synthesizing new samples of a random vector from a given set of observations. In such applications, we generate sample paths for training using stochastic interpolants. Subsequently, new samples are generated using gradient-based Markov chain Monte Carlo methods because automatic differentiation can efficiently provide the necessary gradient. Further, we demonstrate the utility of an explicit approximation to the time-dependent probability density function through applications in unsupervised outlier detection. Our method accurately reconstructs complex time-dependent,  multi-modal, and near-degenerate densities, scales effectively to moderately high-dimensional problems, and reliably detects outliers among real-world data.