Algebraic statistics is a branch of mathematical statistics that focuses on the use of algebraic, geometric, and combinatorial methods in statistics. The workshop will focus on three themes: (1) modeling environmental and ecological systems so that we can better understand the effects of climate change on these systems; (2) reimagining urban development and economic systems to address persistent inequity in daily living activities; and (3) providing theoretical underpinnings for statistical learning techniques, both to understand the implications of their widespread use and to ease their adaptation to novel applications.
These themes emerged in the Fall 2023 long program “Algebraic Statistics and our Changing World” hosted at IMSI. As the next chapter in the interdisciplinary field of algebraic statistics, this workshop is an exciting opportunity to expand on these themes. All participants interested in algebraic statistics are welcome.
Funding
Priority funding consideration will be given to those who register by May 20, 2025. Funding is limited.
Yulia Alexandr
University of California, Berkeley (UC Berkeley)
Carlos Amendola
Technische Universität Berlin
Hector Baños
California State University, San Bernardino
Shelby Cox
University of Michigan
Danai Deligeorgaki
KTH Royal Institute of Technology
Max Hill
University of Wisconsin, Madison
Serkan Hosten
San Francisco State University
Kaie Kubjas
Aalto University
Julia Lindberg
University of Texas at Austin
Hanbaek Lyu
University of Wisconsin, Madison
Aida Maraj
Max Planck Institute of Molecular Cell Biology and Genetics (MPI-CBG)
Debdeep Pati
University of Wisconsin, Madison
Álvaro Ribot
Harvard University
Elina Robeva
University of British Columbia
Henry Schenck
Auburn University
Seth Sullivant
North Carolina State University (NCSU)
Schedule
Monday, July 21, 2025
8:30-9:00 CDT
Welcome/Breakfast
9:00-9:45 CDT
Tropical Toric Maximum Likelihood Estimation
Speaker: Serkan Hosten (San Francisco State University)
9:45-10:00 CDT
Q&A
10:00-10:30 CDT
Coffee Break
10:30-11:15 CDT
TBA
Speaker: Hanbaek Lyu (University of Wisconsin, Madison)
11:15-11:30 CDT
Q&A
11:30-13:00 CDT
Lunch Break
13:00-13:45 CDT
Constraining the outputs of ReLU neural networks
Speaker: Yulia Alexandr (University of California, Los Angeles (UCLA))
Preliminary abstract: I will highlight the role of algebraic geometry in enhancing our understanding of machine learning models, offering a fresh perspective on the problem of neural network verification. I will discuss an approach for ensuring the behavior of a ReLU network at test time by establishing systematic algebraic relations that are satisfied by the outputs produced at various data points. I will emphasize the combinatorial and geometric properties of these relations and explain how they constrain the possible output values at test points of interest. This is joint work with Guido Montúfar.
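The piecewise-linear structure that underlies this approach can be illustrated with a toy network. The weights, inputs, and midpoint relation below are purely illustrative assumptions, not the verification method from the talk: on any fixed activation pattern a ReLU network is affine, so its outputs satisfy linear relations.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small random ReLU network R^2 -> R with one hidden layer of width 4.
W1, b1 = rng.standard_normal((4, 2)), rng.standard_normal(4)
W2, b2 = rng.standard_normal((1, 4)), rng.standard_normal(1)

def f(x):
    return W2 @ np.maximum(W1 @ x + b1, 0.0) + b2

def pattern(x):
    # Which hidden units are active at x (the "activation pattern").
    return tuple(W1 @ x + b1 > 0)

# Inputs sharing an activation pattern lie in one linear region, where the
# network is affine, so for example the midpoint relation holds exactly:
x1, x2 = np.array([0.10, 0.20]), np.array([0.12, 0.21])
xm = (x1 + x2) / 2
if pattern(x1) == pattern(x2) == pattern(xm):
    assert np.allclose(f(xm), (f(x1) + f(x2)) / 2)
```

The talk's algebraic relations constrain outputs across data points in a systematic way; the midpoint identity above is only the simplest instance of such a constraint on a single linear region.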
13:45-14:00 CDT
Q&A
14:00-15:00 CDT
Lightning Talks
15:00-15:30 CDT
Discussion and Follow up
15:30-17:00 CDT
Welcome Reception
Tuesday, July 22, 2025
8:30-9:00 CDT
Sign-in/Breakfast
9:00-9:45 CDT
Tropical Phylogenetics
Speaker: Shelby Cox (University of Michigan)
9:45-10:00 CDT
Q&A
10:00-10:30 CDT
Coffee Break
10:30-11:15 CDT
Orthogonal eigenvectors and singular vectors of tensors
Speaker: Álvaro Ribot (Harvard University)
The spectral theorem says that a real symmetric matrix has an orthogonal basis of eigenvectors and that, for a matrix with distinct eigenvalues, the basis is unique (up to signs). In this talk, we study the symmetric tensors with an orthogonal basis of eigenvectors and show that, for a generic such tensor, the orthogonal basis is unique. We also study the non-symmetric setting. The singular value decomposition says that a real matrix has an orthogonal set of singular vector pairs and that, for a matrix with distinct singular values, the basis is unique (up to signs). We describe the tensors with an orthogonal basis of singular vectors and show that a generic such tensor has a unique orthogonal basis, with one exceptional format: order-four binary tensors. Motivated by our uniqueness results, we propose a new tensor decomposition that generalizes an orthogonally decomposable decomposition and specializes the higher-order singular value decomposition. We apply our results to component analysis under pairwise mean independence of the source variables, settling a conjecture of Mesters and Zwiernik. This is joint work with Anna Seigal and Piotr Zwiernik.
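The classical matrix facts the abstract starts from can be checked numerically. This sketch (with arbitrary random matrices) only illustrates the spectral theorem and the SVD, not the tensor results of the talk:

```python
import numpy as np

rng = np.random.default_rng(1)

# Spectral theorem: a real symmetric matrix has an orthonormal basis of
# eigenvectors (unique up to signs when the eigenvalues are distinct).
A = rng.standard_normal((4, 4))
S = (A + A.T) / 2                     # symmetrize
eigvals, Q = np.linalg.eigh(S)        # columns of Q are eigenvectors
assert np.allclose(Q.T @ Q, np.eye(4))    # orthonormal basis
assert np.allclose(S @ Q, Q * eigvals)    # S q_i = lambda_i q_i

# SVD: the left and right singular vectors of a real matrix likewise
# form orthonormal bases.
M = rng.standard_normal((3, 5))
U, s, Vt = np.linalg.svd(M)
assert np.allclose(U.T @ U, np.eye(3))
assert np.allclose(Vt @ Vt.T, np.eye(5))
```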
11:15-11:30 CDT
Q&A
11:30-13:00 CDT
Lunch Break
13:00-13:45 CDT
Identifiability, indistinguishability, and other problems in biological modeling
Speaker: Nicolette Meshkat (Santa Clara University)
An important question that arises when modeling is whether the unknown parameters of a model can be determined from real (and sometimes noisy) data, the so-called parameter estimation problem. A key first step is to ask which parameters can be determined given perfect data, i.e. noise-free and of any time duration required. This is called the structural identifiability problem. If all of the parameters can be determined from data, we say the model is identifiable. However, if there is some subset of parameters that can take on an infinite number of values yet yield the same data, we say the model is unidentifiable. If a model is unidentifiable assuming perfect data, then it is almost certainly unidentifiable with real, noisy data; knowing this a priori helps with experimental design. We examine this question for an important class of models called linear compartmental models, used in many areas such as pharmacokinetics, physiology, cell biology, toxicology, and ecology. We also examine a somewhat related question called indistinguishability, which asks whether two distinct models can yield the same dynamics. In this case, two models with completely different structures can be indistinguishable from an input-output perspective. On top of this, there are questions of observability, controllability, and model selection/rejection. For all of these questions, we will consider the underlying graphs corresponding to our models and use tools from graph theory and computational algebra to describe and analyze our models.
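A minimal toy illustration of structural unidentifiability (the model below is hypothetical, not a compartmental model from the talk): when only the product of two parameters enters the observed output, distinct parameter values produce identical perfect data.

```python
import numpy as np

# Hypothetical model with output y(t) = a * b * exp(-c * t).
# Only the product a*b is identifiable from y: the parameter pairs
# (a, b) = (2, 3) and (1, 6) yield exactly the same data.
t = np.linspace(0.0, 5.0, 50)

def y(a, b, c):
    return a * b * np.exp(-c * t)

assert np.allclose(y(2.0, 3.0, 0.5), y(1.0, 6.0, 0.5))
```

Here infinitely many pairs (a, b) with the same product fit perfect data, so no amount of noise-free observation can recover a and b individually; this is the situation the structural identifiability analysis is designed to detect a priori.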
13:45-14:00 CDT
Q&A
14:00-15:30 CDT
Work Session
15:30-16:15 CDT
Advances in variational inference for singular models
Speaker: Debdeep Pati (University of Wisconsin-Madison)
The marginal likelihood or evidence in Bayesian statistics contains an intrinsic penalty for larger model sizes and is a fundamental quantity in Bayesian model comparison. Unlike regular models, where the Bayesian information criterion (BIC) encapsulates a first-order expansion of the logarithm of the marginal likelihood, parameter counting gets trickier in singular models, where a quantity called the real log canonical threshold (RLCT) summarizes the effective model dimensionality. For complex singular models where the marginal likelihood is intractable, variational inference is often used to approximate it. We show that mean-field variational inference correctly recovers the RLCT for any singular model in its canonical or normal form. We additionally exhibit sharpness of our bound by analyzing the dynamics of a general-purpose coordinate ascent algorithm (CAVI) popularly employed in variational inference. If a singular model is not in the normal form, we demonstrate that one can use a more flexible variational family based on normalizing flows to recover the RLCT.
16:15-16:30 CDT
Q&A
Wednesday, July 23, 2025
8:30-9:00 CDT
Sign-in/Breakfast
9:00-9:45 CDT
When is an ideal toric under a linear change of variables?
Speaker: Aida Maraj (Max Planck Institute of Molecular Cell Biology and Genetics)
The motivation for this talk is to detect when an irreducible projective variety V is not toric. We do this by introducing and analyzing a symmetry Lie group and a Lie algebra associated with the ideal I(V). If the dimension of V is strictly less than the dimension of the above-mentioned objects, then there is no linear transformation that turns I(V) into a toric ideal. We use it to provide examples of non-toric statistical models in algebraic statistics.
9:45-10:00 CDT
Q&A
10:00-10:30 CDT
Coffee Break
10:30-11:15 CDT
Gaussian Voronoi Cells
Speaker: Julia Lindberg (University of Texas at Austin)
The expectation maximization (EM) algorithm is a popular method of density estimation for Gaussian mixture models. Fundamental to understanding the performance of this algorithm is to understand the set of points closest to a given Gaussian, where “closest" is defined in terms of the maximum likelihood function. We call this set of points a Gaussian Voronoi cell. In this talk, I will outline new results regarding the geometry and combinatorics of Gaussian Voronoi cells. This is joint work with Joe Kileel.
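As a toy sketch (an assumed two-component example, not a construction from the talk), membership in a Gaussian Voronoi cell can be computed by comparing log-likelihoods across components:

```python
import numpy as np

def log_likelihood(x, mean, cov):
    # Log-density of a multivariate Gaussian N(mean, cov) at x.
    d = len(mean)
    diff = x - mean
    return -0.5 * (d * np.log(2 * np.pi) + np.log(np.linalg.det(cov))
                   + diff @ np.linalg.solve(cov, diff))

means = [np.array([0.0, 0.0]), np.array([4.0, 0.0])]
covs = [np.eye(2), 4.0 * np.eye(2)]  # unequal covariances: cells need not be half-planes

def cell(x):
    # Index of the Gaussian whose likelihood at x is largest, i.e. the
    # Gaussian Voronoi cell containing x.
    return int(np.argmax([log_likelihood(x, m, c) for m, c in zip(means, covs)]))

assert cell(np.array([0.5, 0.0])) == 0
assert cell(np.array([3.8, 0.0])) == 1
```

With equal spherical covariances this assignment reduces to ordinary Euclidean Voronoi cells; the determinant and quadratic-form terms are what make the cells of unequal Gaussians geometrically richer.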
11:15-11:30 CDT
Q&A
11:30-13:00 CDT
Lunch
13:00-13:45 CDT
Identifiability in Phylogenetic Networks under the Coalescent: From Simple to Broader Classes
Speaker: Hector Baños (California State University, San Bernardino (CSU San Bernardino))
Phylogenetic networks provide a framework for representing complex evolutionary histories involving processes such as hybridization and horizontal gene transfer. An important question in their study is whether the structure of a network (or which features of it) can, in principle, be recovered from genetic data. This talk discusses the identifiability of phylogenetic networks under the multispecies coalescent model. First, I review foundational results for a 'simple' class of networks, then present recent work that establishes identifiability for a much broader network class.
13:45-14:00 CDT
Q&A
14:00-15:30 CDT
Discussion
15:30-16:15 CDT
TBA
Speaker: Elina Robeva (University of British Columbia)
16:15-16:30 CDT
Q&A
Thursday, July 24, 2025
8:30-9:00 CDT
Sign-in/Breakfast
9:00-9:45 CDT
Geometry of Polynomial Neural Networks
Speaker: Kaie Kubjas (Aalto University)
We study the characterization, expressivity and learning of polynomial neural networks with monomial activation functions. In special cases, we describe neuromanifolds as semialgebraic sets and neurovarieties as algebraic varieties. We study the expressivity of polynomial neural networks by exploring their dimension. Finally, we define and investigate the learning degree, which is a characteristic of the optimization landscape of a polynomial neural network. This talk is based on joint work with Jiayi Li and Maximilian Wiesmann. It was done as part of the apprenticeship program “Varieties from Statistics” during the IMSI long program “Algebraic Statistics and Our Changing World”.
9:45-10:00 CDT
Q&A
10:00-10:30 CDT
Coffee Break
10:30-11:15 CDT
TBA
Speaker: Danai Deligeorgaki (KTH)
11:15-11:30 CDT
Q&A
11:30-13:00 CDT
Lunch Break
13:00-13:45 CDT
TBA
Speaker: TBA (TBA)
13:45-14:00 CDT
Q&A
14:00-15:30 CDT
Discussion
15:30-16:15 CDT
Phylogenetic Network Models and Graphical Models
Speaker: Seth Sullivant (North Carolina State University)
This talk will describe recent results on the algebraic structure of phylogenetic network models under the displayed tree model, and connections to graphical models. In particular, we show how the connection to graphical models yields new (non)identifiability results for the phylogenetic network models.
16:15-16:30 CDT
Q&A
Friday, July 25, 2025
8:30-9:00 CDT
Sign-in/Breakfast
9:00-9:45 CDT
TBA
Speaker: Carlos Amendola (Technische Universität Berlin)
IMSI is committed to making all of our programs and events inclusive and accessible. Contact [email protected] to request disability-related accommodations.