Using Disjoint Tree Mergers for Large Tree Estimation

This was part of Contemporary Challenges in Large-Scale Sequence Alignments and Phylogenies

Tandy Warnow, University of Illinois at Urbana-Champaign

Tuesday, August 12, 2025

Abstract: Computing large phylogenetic trees typically involves attempting to solve NP-hard optimization problems, such as maximum likelihood or maximum quartet support. In this talk, I will present a new class of divide-and-conquer methods, called “Disjoint Tree Mergers (DTMs)”, for this problem. At a high level, these methods operate by (a) dividing the input sequence dataset into disjoint subsets, (b) constructing a tree on each subset, and then (c) combining the subset trees (using auxiliary information) into a tree on the full dataset. If appropriately designed, pipelines using DTMs have strong statistical guarantees (e.g., they are statistically consistent). Most excitingly, DTMs used with methods like ASTRAL can improve accuracy and reduce runtime for species tree estimation on very large datasets, and some research suggests that DTMs can also be used to improve maximum likelihood gene tree estimation. I will describe these methods, the theory behind them, and some open problems in this area.
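
The three-step structure described in the abstract can be illustrated with a toy sketch. This is not the speaker's method or any published DTM: every function name here (divide, build_subset_tree, merge, restrict, run_pipeline) is hypothetical, the subset tree builder is a placeholder caterpillar tree, and the merge step simply joins the subset trees under new internal nodes without using auxiliary information. The only property it is meant to show is that the merged tree, restricted to any one subset, gives back that subset's tree unchanged.

```python
# Toy divide-and-conquer pipeline. Rooted trees are nested tuples; leaves are strings.
# All names and design choices below are illustrative assumptions.

def divide(taxa, max_size):
    """(a) Split the taxa into disjoint subsets (assumption: consecutive chunks)."""
    return [taxa[i:i + max_size] for i in range(0, len(taxa), max_size)]

def build_subset_tree(subset):
    """(b) Placeholder tree estimator: returns a caterpillar tree on the subset.
    A real pipeline would run a phylogeny method on each subset instead."""
    tree = subset[0]
    for leaf in subset[1:]:
        tree = (tree, leaf)
    return tree

def merge(subset_trees):
    """(c) Toy merger: join the subset trees under new internal nodes.
    Each subset tree is left intact; no auxiliary information is used."""
    merged = subset_trees[0]
    for tree in subset_trees[1:]:
        merged = (merged, tree)
    return merged

def leaves(tree):
    """List the leaf labels of a tree in left-to-right order."""
    if isinstance(tree, tuple):
        return [x for child in tree for x in leaves(child)]
    return [tree]

def restrict(tree, keep):
    """Restrict a tree to the leaves in `keep`, suppressing unary nodes."""
    if not isinstance(tree, tuple):
        return tree if tree in keep else None
    kids = [r for r in (restrict(c, keep) for c in tree) if r is not None]
    if not kids:
        return None
    if len(kids) == 1:
        return kids[0]
    return tuple(kids)

def run_pipeline(taxa, max_size):
    """Run steps (a)-(c) and return the subsets, subset trees, and merged tree."""
    subsets = divide(taxa, max_size)
    subset_trees = [build_subset_tree(s) for s in subsets]
    return subsets, subset_trees, merge(subset_trees)

if __name__ == "__main__":
    subsets, subset_trees, merged = run_pipeline(list("ABCDEFG"), 3)
    print(merged)
    # Check that the merged tree induces each subset tree on its own leaf set.
    for subset, tree in zip(subsets, subset_trees):
        assert restrict(merged, set(subset)) == tree
```

Keeping the subset trees intact inside the output is what makes the merge step in this sketch trivially compatible with its inputs. The merge methods described in the abstract additionally use auxiliary information when combining the subset trees, which this sketch deliberately leaves out.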