Clustering of discrete measures via mean measure quantization: application to unsupervised vectorization of persistence diagrams

This was part of Randomness in Topology and its Applications

Fred Chazal, INRIA

Thursday, March 23, 2023

Abstract:

Robust topological information commonly comes in the form of a set of persistence diagrams. These can be seen as discrete measures, and they are difficult to use in generic machine learning frameworks.

In this talk we will introduce a fast, learnt, unsupervised vectorization method for measures in Euclidean spaces and use it for clustering of distributions of discrete measures. The algorithm is simple and efficiently picks out the important regions of space where meaningful differences from the "mean" measure arise. We will show that, when applied to persistence diagrams, the method provably separates clusters of diagrams. We will illustrate the strength and robustness of our approach on a few synthetic and real data sets.
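As an illustration only, the following sketch shows one plausible reading of "mean measure quantization" followed by vectorization. It is not the speaker's implementation. The assumptions are: each discrete measure is a finite set of unit-mass points stored as a NumPy array; the mean measure is approximated by pooling all points; quantization is done with plain Lloyd (k-means) iterations; and each measure is mapped to a vector by summing an exponential kernel around each center. The function names `quantize_mean_measure` and `vectorize`, the kernel, and the `scale` parameter are all hypothetical choices made for this example.

```python
import numpy as np

def quantize_mean_measure(measures, k, n_iter=50, seed=0):
    """Return k centers from Lloyd iterations on the pooled points.

    Assumption: pooling unit-mass points stands in for the mean measure.
    """
    rng = np.random.default_rng(seed)
    points = np.vstack(measures).astype(float)
    # Initialise centers with k distinct pooled points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every pooled point to every center.
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # leave an empty cell's center unchanged
                centers[j] = members.mean(axis=0)
    return centers

def vectorize(measure, centers, scale=1.0):
    """Map one measure to a length-k vector (hypothetical kernel choice).

    Coordinate j sums exp(-distance / scale) over the points of the measure,
    so mass close to center j contributes most to coordinate j.
    """
    dist = np.linalg.norm(measure[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-dist / scale).sum(axis=0)

# Synthetic example: two families of planar point clouds.
rng = np.random.default_rng(1)
family_a = [rng.normal(0.0, 0.3, size=(20, 2)) for _ in range(5)]
family_b = [
    np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),
               rng.normal(3.0, 0.3, size=(10, 2))])
    for _ in range(5)
]
measures = family_a + family_b

centers = quantize_mean_measure(measures, k=4)
vectors = np.array([vectorize(m, centers) for m in measures])
```

The resulting rows of `vectors` are fixed-length and can be passed to any standard clustering routine. With this kernel the map is additive in the measure, so duplicating every point of a measure doubles its vector.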