Eliciting Structure in Genomics Data
Bridging the Gap between Theory, Algorithms, Implementations, and Applications
August 30-September 3, 2021
- Mihai Anitescu (Argonne and Statistics, Chicago)
- Anna Gilbert (Mathematics, Michigan)
- Dan Nicolae (Human Genetics, Medicine and Statistics)
- Matthew Stephens (Statistics, Chicago)
Methods for dimension reduction play a critical role in a wide variety of genomic applications. Indeed, as technology develops and datasets grow in both size and complexity, the need for effective dimension reduction methods that help visualize and distill the primary structures in the data remains as essential as ever. Examples of the many practical applications in genomics include: (a) understanding (i) the structure of wild populations (particularly endangered species) from population genetic variation, (ii) human evolutionary history, also from population genetic variation, (iii) the 3-D structure of DNA from Hi-C data, and (iv) genetic factors that influence risk for different human diseases; (b) identifying (i) substructure among cell populations based on single-cell transcription patterns, and (ii) distinctive signatures of somatic mutations distinguishing different cancer subtypes; (c) estimating confounding factors and other sources of unwanted variation in gene expression studies; and (d) segmenting and annotating genomic regions based on chromatin marks and other molecular features.
The development and provision of effective methods for dimension reduction involves connecting several areas of expertise: theory, algorithms, implementations, and applications. Theory is required to help decide which methods and algorithms to focus on; algorithms are required to turn theoretical ideas into practical tools; and implementation of these algorithms is an often-overlooked step, where decisions are sometimes made that can greatly influence results. All of these steps need to be carried out with at least one eye on the details of the practical applications and the data types to which they will be applied. Unfortunately, there are relatively few opportunities for experts in these different areas to come together and learn from one another. This workshop will address this problem by bringing together mathematicians and computer scientists who have a deep understanding of the theoretical, algorithmic, and implementation issues with applied statistical geneticists who have invaluable experience in implementing these methods, applying them to data, and interpreting the results. The goal will be to start new conversations across disciplinary barriers. The workshop will expose theoretical experts to the many ways that these methods are used in practice and the ongoing challenges that arise, and it will expose those familiar with applications to recent developments on the theoretical side.