Data & Information

The surging data generation capabilities of modern sensors and networked systems and the vastly increased data processing power of computers and storage media have led to the accumulation of enormous volumes of disparate data. The nascent field of data science focuses on developing scalable and robust algorithms for extracting knowledge from these stores of information. The growing need for powerful and novel methods to extract information from data, in a form that is useful to individuals, society, researchers, and industry, has led to a groundswell in machine learning. Recent progress has been remarkable. This success has been in large part driven by the increasing availability of large-scale training data sets, more powerful computers, and sophisticated algorithms for analyzing extremely large data sets. There is now intense interest in leveraging machine learning in many fields: automatic recognition of image content, identification of best practices in health care, improvement of agricultural yields, understanding how the human brain encodes information, and more.

However, many modern machine-learning algorithms lack interpretability and can be surprisingly fragile. Furthermore, training data can be skewed, producing biased algorithms whose results are unexpectedly “unfair.” Although the developments to date, driven primarily by phenomenological considerations, have been remarkably successful, substantial work remains to be done in order to reach a fundamental understanding of why these methodologies actually succeed.

Progress can only come from the development of new and sophisticated mathematics and statistics. The study of data and information cuts across a myriad of disciplines, including computer science, statistics, optimization, and signal processing, and reaches into classical areas of mathematics. Furthermore, application-specific models and constraints in fields such as astrophysics, particle physics, biology, economics, and sociology present additional exciting opportunities for the mathematical and statistical analysis of data.

Upcoming Activities

Long Program

Connectomics

September 14 — December 11, 2026

Workshop

Connectomics, Non-Euclidean data, and Statistics: Tutorial and Hands-on

September 14 — 18, 2026

Workshop

Unpacking AI Weather Emulators

September 28 — October 2, 2026

Past Activities

New Horizons on Model Transportability and Data Integration

June 22 — 26, 2026

Workshop

Mathematical Aspects of 2D Quantum Materials and Meta-materials

June 8 — 12, 2026

Explore Our Themes

Data & Information

Health Care & Medicine

Historically, medicine has seen many applications of mathematics and statistics, with examples including the validation of the effectiveness of new drugs, estimation of survival rates for patients undergoing treatments, and medical imaging (CT scans and MRIs).

Materials Science

Materials science is about the discovery, design, and development of new materials in areas such as nanotechnology, biomedicine, metallurgy, forensic science, quantum computing, and the development of more efficient energy resources.

Quantum Computing & Information

There are many challenges, both practical and theoretical, in the emerging and exciting areas of quantum information and computing, which seek to make effective use of the information embedded in the state of a quantum system, and promise to solve previously intractable computational problems and revolutionize simulation.

Uncertainty Quantification

Uncertainty is ubiquitous in the modern world. This raises profound challenges in any effort to model massively complex phenomena.

Scientific activity at IMSI is organized around a set of themes chosen as focal points for sustained engagement over many years.
