This was part of Eliciting Structure in Genomics Data

Theory and Practice for Large-scale Phylogeny Estimation

Tandy Warnow, University of Illinois at Urbana-Champaign

Thursday, September 2, 2021


The estimation of phylogenetic trees for individual genes or multi-locus datasets is a basic part of much biological research, and is approached through methods based on stochastic models of sequence and gene evolution. Statistical properties of these methods, including statistical consistency and sample complexity, are important, and theoretical advances on these properties have resulted in new methods with outstanding theoretical properties. Yet the empirical performance of these methods has been mixed. In this talk, I will discuss the algorithm design issues that arise, and present new results that shed light on strategies that maintain theoretical advantages while providing excellent empirical performance on large simulated datasets. I will also present some open questions of interest to mathematicians whose resolution would be important for further method development.