This was part of Algebraic Statistics for Ecological and Biological Systems

Inferring the tree-like parts of a species network under the coalescent

John Rhodes, University of Alaska Fairbanks

Monday, October 9, 2023

Abstract: If the evolutionary history of a collection of organisms involves hybridization or gene flow, a network is needed for its graphical depiction. Inferring such a network from biological data, however, is a demanding problem, both practically and theoretically. The size of network space, the size of genomic data sets, the confounding signal arising from incomplete lineage sorting, and the poorly understood identifiability properties of the Network Multispecies Coalescent (NMSC) model all cause difficulties. Bayesian inference has been effective only on tiny problems. Current tractable methods have adopted either pseudolikelihood applied to subnetwork summary statistics, or inference of small subnetworks combined with combinatorial network building. Generally these methods also assume a simple network structure, such as being level-1, that may be hard to justify biologically. This talk describes recent results on identifiability of the "tree of blobs" of a species network (in which all biconnected components of the network are collapsed to nodes), as well as an algorithm for its inference. No assumption on network structure is needed, and the estimate of the tree of blobs is consistent under the NMSC regardless of whether the internal structure of the blobs is identifiable.
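
To make the collapsing operation concrete, here is a minimal Python sketch that builds the tree of blobs of a network that is already known. It is not the inference algorithm described in the talk, which works from data. The sketch assumes the network is given as an undirected edge list without parallel edges, and it finds blobs by deleting cut edges, which for binary networks should agree with collapsing biconnected components. The toy network `EDGES` and the function names are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def connected_components(nodes, edges):
    """Map each node to a representative of its connected component."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    label = {}
    for start in nodes:
        if start in label:
            continue
        label[start] = start
        stack = [start]
        while stack:
            n = stack.pop()
            for m in adj[n]:
                if m not in label:
                    label[m] = start
                    stack.append(m)
    return label


def tree_of_blobs(edges):
    """Collapse every blob of an undirected network to a single node.

    Assumes no parallel edges. Returns (blob_nodes, tree_edges), where each
    blob node is a frozenset of the original nodes it contains.
    """
    nodes = sorted({n for e in edges for n in e})

    def n_components(es):
        return len(set(connected_components(nodes, es).values()))

    base = n_components(edges)
    # A cut edge is one whose removal disconnects the network.
    # This brute-force check is quadratic but easy to verify by eye.
    cut_edges = [
        e for e in edges
        if n_components([f for f in edges if f != e]) > base
    ]
    kept = [e for e in edges if e not in cut_edges]

    # Components that remain after deleting all cut edges are the blobs
    # (trivial blobs are single nodes).
    label = connected_components(nodes, kept)
    groups = defaultdict(set)
    for n, rep in label.items():
        groups[rep].add(n)
    blob_of = {n: frozenset(groups[rep]) for n, rep in label.items()}

    tree_edges = {frozenset((blob_of[u], blob_of[v])) for u, v in cut_edges}
    return set(blob_of.values()), tree_edges


# Hypothetical toy network: one 4-cycle (x, u, h, v) created by a
# hybridization at h, with leaves a, b, c, d and root r.
EDGES = [
    ("r", "x"), ("r", "d"),
    ("x", "u"), ("x", "v"), ("u", "h"), ("v", "h"),
    ("u", "a"), ("h", "b"), ("v", "c"),
]

blobs, tree = tree_of_blobs(EDGES)
for blob in sorted(blobs, key=sorted):
    print(sorted(blob))
```

In this toy case the cycle through `x`, `u`, `h`, and `v` collapses to one node, and the five cut edges become the edges of the resulting tree. The degree-2 root `r` is left in place here rather than suppressed.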