This was part of Algebraic Statistics for Ecological and Biological Systems

Blessing of Dependence: Identifiability and Geometry of Discrete Models with Multiple Binary Latent Variables

Yuqi Gu, Columbia University

Tuesday, October 10, 2023

Abstract: Identifiability of discrete statistical models with latent variables is known to be challenging to study, yet crucial to a model’s interpretability and reliability. This work presents a general algebraic technique to investigate identifiability of discrete models with latent and graphical components. Specifically, motivated by diagnostic tests collecting multivariate categorical data, we focus on discrete models with multiple binary latent variables. In the considered model, the latent variables can have arbitrary dependencies among themselves while the latent-to-observed measurement graph takes a “star-forest” shape. We establish necessary and sufficient graphical criteria for identifiability, and reveal an interesting and perhaps surprising geometry of blessing-of-dependence: under the minimal conditions for generic identifiability, the parameters are identifiable if and only if the latent variables are not statistically independent. Thanks to this theory, we can perform formal hypothesis tests of identifiability in the boundary case by testing marginal independence of the observed variables. Our results give a new understanding of the statistical properties of graphical models with latent variables. They also have useful implications for designing diagnostic tests or surveys that measure binary latent traits.
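
The link between dependence among the latent variables and marginal dependence among the observed variables can be illustrated with a small numerical sketch. This is not the speaker's procedure; it is a toy example under stated assumptions: two binary latent variables A1 and A2, one binary observed child each (Y1 of A1, Y2 of A2, a minimal star-forest), and made-up parameter values. All function and variable names are hypothetical. In this setup the covariance of Y1 and Y2 is zero when A1 and A2 are independent and nonzero otherwise, which is the kind of marginal independence a Pearson chi-square test on observed counts would probe.

```python
from itertools import product


def observed_joint(latent_joint, theta1, theta2):
    """Joint pmf of (Y1, Y2) in a toy star-forest model.

    latent_joint[(a1, a2)] = P(A1=a1, A2=a2)
    theta_j[a]             = P(Yj=1 | Aj=a)
    Y1 depends only on A1 and Y2 only on A2 (assumed structure).
    """
    joint = {}
    for y1, y2 in product((0, 1), repeat=2):
        total = 0.0
        for (a1, a2), p in latent_joint.items():
            p1 = theta1[a1] if y1 else 1.0 - theta1[a1]
            p2 = theta2[a2] if y2 else 1.0 - theta2[a2]
            total += p * p1 * p2
        joint[(y1, y2)] = total
    return joint


def covariance(joint):
    """Cov(Y1, Y2) for binary Y1, Y2; zero iff they are independent."""
    p_y1 = joint[(1, 0)] + joint[(1, 1)]
    p_y2 = joint[(0, 1)] + joint[(1, 1)]
    return joint[(1, 1)] - p_y1 * p_y2


def chi_square_2x2(counts):
    """Pearson chi-square statistic for a 2x2 table of observed counts.

    Assumes every row and column total is positive.
    """
    n = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [counts[0][j] + counts[1][j] for j in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (counts[i][j] - expected) ** 2 / expected
    return stat


# 5% critical value of the chi-square distribution with 1 degree of freedom.
CRITICAL_5PCT_DF1 = 3.841

# Hypothetical conditional probabilities P(Yj=1 | Aj=a).
theta1 = {0: 0.2, 1: 0.8}
theta2 = {0: 0.3, 1: 0.9}

# Independent latents: P(A1=1)=0.5, P(A2=1)=0.4, joint is the product.
independent = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.2}
# Dependent latents: mass concentrated on A1 == A2.
dependent = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

cov_indep = covariance(observed_joint(independent, theta1, theta2))
cov_dep = covariance(observed_joint(dependent, theta1, theta2))
```

With real data one would tabulate the counts of (Y1, Y2) and compare `chi_square_2x2(counts)` against `CRITICAL_5PCT_DF1`; a large statistic is evidence against marginal independence of the observed variables. How this maps onto a formal test of identifiability in the boundary case is the subject of the talk and is not reproduced here.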