This was part of Permutation and Causal Inference

Measuring association on topological spaces using kernels and geometric graphs

Bodhi Sen, Columbia University

Thursday, August 24, 2023



Abstract: We propose and study a class of simple, nonparametric, yet interpretable measures of association between two random variables X and Y taking values in general topological spaces. These nonparametric measures -- defined using the theory of reproducing kernel Hilbert spaces -- capture the strength of dependence between X and Y and have the property that they are 0 if and only if the variables are independent and 1 if and only if one variable is a measurable function of the other. Further, these population measures can be consistently estimated using the general framework of graph functionals, which includes k-nearest neighbor graphs and minimum spanning trees. Moreover, a subclass of these estimators is shown to adapt to the intrinsic dimensionality of the underlying distribution. Some of these empirical measures can also be computed in near-linear time. Under the hypothesis of independence between X and Y, these empirical measures (properly normalized) have a standard normal limiting distribution. Thus, these measures can be readily used to test the hypothesis of mutual independence between X and Y. We also point out how these ideas can be used for (i) developing a measure of conditional dependence and (ii) defining a nonparametric measure of multi-sample dissimilarity.
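To make the graph-based estimation idea concrete, here is a minimal sketch of one possible estimator of this type. It is an illustration under stated assumptions, not the speaker's implementation: it assumes Euclidean data, a Gaussian kernel on Y with a hypothetical `bandwidth` parameter, and a 1-nearest-neighbor graph on X. The specific ratio used (average kernel similarity of Y across graph neighbors in X, centered and scaled by the average over all pairs) and the function names are assumptions, and the brute-force distance computation is quadratic in the sample size rather than near-linear.

```python
import numpy as np


def gaussian_kernel_matrix(y, bandwidth=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-||y_i - y_j||^2 / (2 * bandwidth^2))."""
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    sq_dists = np.sum((y[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def nn_association(x, y, bandwidth=1.0):
    """Illustrative graph-based association estimate (assumed form, see lead-in).

    Uses the 1-nearest-neighbor graph on x and a Gaussian kernel on y.
    Values near 0 suggest independence; values near 1 suggest that y is
    close to a function of x.
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    n = x.shape[0]
    K = gaussian_kernel_matrix(y, bandwidth)

    # Brute-force nearest neighbor of each x_i (excluding itself): O(n^2).
    dx = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(dx, np.inf)
    nearest = np.argmin(dx, axis=1)

    # Average kernel similarity between y_i and the y of x_i's nearest neighbor.
    neighbor_term = np.mean(K[np.arange(n), nearest])
    # Average kernel similarity over all distinct pairs (i != j).
    off_diag_term = (K.sum() - np.trace(K)) / (n * (n - 1))
    # Average of K(y_i, y_i); equals 1 for the Gaussian kernel.
    diag_term = np.mean(np.diag(K))

    return (neighbor_term - off_diag_term) / (diag_term - off_diag_term)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-3.0, 3.0, size=500)
    print("y = sin(x):     ", nn_association(x, np.sin(x)))
    print("y independent:  ", nn_association(x, rng.normal(size=500)))
```

In this toy run the first value should be close to 1 (Y is a function of X) and the second close to 0 (Y is independent of X), mirroring the two extremes described in the abstract. A tree-based nearest-neighbor search could replace the brute-force step when sample sizes are large.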