As part of IMSI’s long program on Theoretical Advances in Reinforcement Learning and Control, the Reinforcement Learning Bootcamp is an intensive, foundational tutorial series. Its goal is to equip participants with the core theoretical tools and conceptual framework needed to engage fully with the subsequent workshops in the program.
This bootcamp is targeted especially at early-career researchers, graduate students, postdocs, and others who plan to contribute to the frontier of RL theory or control theory. Participants will benefit most if they arrive with basic familiarity with probability, linear algebra, optimization, and machine learning, although the tutorials aim to bring everyone up to speed.
In-Person Registration
Seats at the venue are limited, so in-person registration may be capped before the workshop start date. If capacity is reached, the registration form will switch to a waitlist. Early registration is strongly encouraged.
All in-person registrants must wait to receive an invitation from IMSI before traveling; invitations are generally sent out 4-6 weeks in advance.
All registrants (online and in-person) will receive Zoom links and are welcome to attend online.
R. Srikant
University of Illinois at Urbana-Champaign
Max Simchowitz
Carnegie Mellon University
Schedule
Monday, March 9, 2026
8:30-8:55 CDT
Breakfast/Check-in
8:55-9:00 CDT
Welcome
9:00-10:30 CDT
Tutorial on Offline RL Theory
Speaker: Nan Jiang (University of Illinois at Urbana-Champaign)
This tutorial will provide an introduction to the core ideas and results in offline RL theory, focusing on the setting of large state spaces and function approximation. Tentatively, the first part of the tutorial will establish the analyses of classic algorithms under the key assumptions on function approximation (such as Bellman-completeness) and the data distribution (i.e., coverage). The second part considers more advanced algorithms and analyses that rely on weaker or alternative assumptions, and extensions to novel settings beyond the standard single-agent MDPs. Participants are expected to be familiar with the theoretical foundation of MDPs (e.g., classic convergence analysis for tabular value iteration/policy iteration) and the basics of learning theory (e.g., concentration inequalities and union bound).
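As a concrete reference point for the classic algorithms mentioned above, here is a minimal, illustrative sketch of fitted Q-iteration on an offline dataset of transitions. The tabular "function class" (so the regression step is a per-state-action average), the toy two-state dataset, and all names are our own assumptions, not part of the tutorial materials.

```python
from collections import defaultdict

def fitted_q_iteration(dataset, actions, gamma=0.9, iters=50):
    """Tabular fitted Q-iteration on an offline dataset of
    (s, a, r, s') transitions. Each iteration regresses Q onto the
    empirical Bellman backup of the previous iterate; with a tabular
    function class, the regression is a per-(s, a) average."""
    Q = defaultdict(float)
    for _ in range(iters):
        targets = defaultdict(list)
        for s, a, r, s_next in dataset:
            backup = r + gamma * max(Q[(s_next, b)] for b in actions)
            targets[(s, a)].append(backup)
        Q = defaultdict(float,
                        {sa: sum(v) / len(v) for sa, v in targets.items()})
    return Q

# Toy two-state chain: action 1 in state 0 leads to the rewarding state 1,
# where action 0 collects reward 1 and stays put.
data = [(0, 0, 0.0, 0), (0, 1, 0.0, 1), (1, 0, 1.0, 1), (1, 1, 0.0, 0)]
Q = fitted_q_iteration(data, actions=[0, 1])
policy = {s: max([0, 1], key=lambda a: Q[(s, a)]) for s in [0, 1]}
```

Note that the dataset here covers every state-action pair; coverage assumptions of exactly this kind are what the tutorial's first part formalizes.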
10:30-11:00 CDT
Coffee Break
11:00-12:30 CDT
Tutorial on Offline RL Theory
Speaker: Nan Jiang (University of Illinois at Urbana-Champaign)
Continuation of the morning session; see the abstract above.
12:30-13:30 CDT
Lunch Break
13:30-14:00 CDT
Break
14:00-15:30 CDT
Foundations of Behavior Cloning
Speaker: Max Simchowitz (Carnegie Mellon University)
This talk will introduce the foundations of behavior cloning, a setting in which a sequential decision-making policy is trained via supervision from expert demonstrations. The tutorial will focus on the role of problem horizon (the number of decision-making steps) and the possibility of error being amplified as the horizon increases. We will review classical results for mitigating error amplification, such as the DAgger algorithm, as well as more modern contributions that study the role of compounding error in contemporary applications such as large language models and robotics.
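As an illustration of the interaction pattern behind DAgger (query the expert on the states the learner actually visits, then retrain on the aggregated dataset), here is a minimal self-contained sketch. The one-dimensional toy environment, the memorizing "trainer", and all names are illustrative assumptions, not the algorithm's original presentation.

```python
def dagger(expert, train, rollout, rounds=5):
    """Sketch of the DAgger loop: roll out the current policy, ask the
    expert to relabel the states the learner itself visited, aggregate
    those corrections, and retrain on the growing dataset."""
    dataset = []
    policy = train(dataset)            # initial policy from no data
    for _ in range(rounds):
        states = rollout(policy)       # states visited by the *learner*
        dataset += [(s, expert(s)) for s in states]  # expert relabels them
        policy = train(dataset)        # retrain on aggregated data
    return policy

# Toy instance: states are integers on a line; the expert moves toward 0.
expert = lambda s: -1 if s > 0 else 1

def train(data):
    # "Training" = memorize expert labels; default to +1 on unseen states.
    table = dict(data)
    return lambda s: table.get(s, 1)

def rollout(policy, start=3, steps=6):
    s, visited = start, []
    for _ in range(steps):
        visited.append(s)
        s += policy(s)
    return visited

policy = dagger(expert, train, rollout)
```

Because the expert is queried on the learner's own state distribution, the corrections land exactly where the learner drifts, which is the mechanism behind DAgger's improved horizon dependence over naive behavior cloning.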
15:30-16:30 CDT
Social Hour
Tuesday, March 10, 2026
8:30-9:00 CDT
Breakfast/Check-in
9:00-10:30 CDT
Elements of Interactive Decision Making
Speaker: Sasha Rakhlin (Massachusetts Institute of Technology (MIT))
Machine learning methods are increasingly deployed in interactive environments, ranging from dynamic treatment strategies in medicine to fine-tuning of LLMs using reinforcement learning. In these settings, the learning agent interacts with the environment to collect data and necessarily faces an exploration-exploitation dilemma. In these lectures, we’ll begin with multi-armed bandits, progressing through structured and contextual bandits. We’ll then move on to reinforcement learning and broader decision-making frameworks, outlining the key algorithmic approaches and statistical principles that underpin each setting. Our goal is to develop both a rigorous understanding of the learning guarantees and a toolbox of fundamental algorithms.
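To make the exploration-exploitation tradeoff concrete, here is a minimal sketch of the classical UCB1 index policy on Bernoulli arms. The arm means, horizon, and function names are illustrative assumptions of ours, not material from the lectures.

```python
import math
import random

def ucb1(arm_means, horizon=20000, seed=0):
    """UCB1 on Bernoulli arms: after pulling each arm once, pull the
    arm maximizing empirical mean + sqrt(2 ln t / n), a confidence
    bonus that shrinks as an arm is sampled more often."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts, sums = [0] * k, [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            a = t - 1                  # initialization: pull each arm once
        else:
            a = max(range(k), key=lambda i:
                    sums[i] / counts[i]
                    + math.sqrt(2 * math.log(t) / counts[i]))
        r = 1.0 if rng.random() < arm_means[a] else 0.0
        counts[a] += 1
        sums[a] += r
    return counts

counts = ucb1([0.3, 0.5, 0.7])   # pull counts concentrate on the best arm
```

The logarithmic bonus is exactly what drives the O(log T) regret guarantees that the lectures' bandit segment develops rigorously.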
10:30-11:00 CDT
Coffee Break
11:00-12:30 CDT
Elements of Interactive Decision Making
Speaker: Sasha Rakhlin (Massachusetts Institute of Technology (MIT))
Continuation of the morning session; see the abstract above.
12:30-13:30 CDT
Lunch Break
13:30-14:30 CDT
Break
14:30-16:00 CDT
Introduction to Online Nonstochastic Control
Speaker: Karan Singh (Carnegie Mellon University)
This talk will present a first-principles approach to online nonstochastic control, an emerging paradigm in the control of dynamical systems that gives provable, learning-theory-inspired performance guarantees without making any distributional assumptions. Traditional approaches to control present a dichotomy between planning for the average case (stochastic control) and the worst case (robust control). Instead, in this framework, the learner's objective is to ensure small regret in comparison to the best controller in hindsight from a suitably chosen benchmark class, thus delivering near-optimal performance on both worst- and average-case instances in a unified way. This talk will introduce the basic theory of nonstochastic control, including extensions to partially observed and unknown systems. Toward the end, we will survey more recent developments, encompassing faster algorithms and extensions to nonlinear systems.
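The regret benchmark at the heart of this framework can be written, in one common notation (our choice of symbols, not necessarily the speaker's), as:

```latex
\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} c_t(x_t, u_t)
  \;-\; \min_{\pi \in \Pi} \sum_{t=1}^{T} c_t\bigl(x_t^{\pi}, u_t^{\pi}\bigr)
```

where the $c_t$ are adversarially chosen convex costs, $(x_t, u_t)$ is the learner's state-input trajectory, and $(x_t^{\pi}, u_t^{\pi})$ is the counterfactual trajectory that the benchmark policy $\pi \in \Pi$ (for example, a class of disturbance-action controllers) would have produced under the same disturbance sequence.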
Wednesday, March 11, 2026
8:30-9:00 CDT
Breakfast/Check-in
9:00-10:30 CDT
Elements of Interactive Decision Making
Speaker: Sasha Rakhlin (Massachusetts Institute of Technology (MIT))
Machine learning methods are increasingly deployed in interactive environments, ranging from dynamic treatment strategies in medicine to fine-tuning of LLMs using reinforcement learning. In these settings, the learning agent interacts with the environment to collect data and necessarily faces an exploration-exploitation dilemma.In these lectures, we’ll begin with multi-armed bandits, progressing through structured and contextual bandits. We’ll then move on to reinforcement learning and broader decision-making frameworks, outlining the key algorithmic approaches and statistical principles that underpin each setting. Our goal is to develop both a rigorous understanding of the learning guarantees and a toolbox of fundamental algorithms.
10:30-11:00 CDT
Coffee Break
11:00-12:30 CDT
Stochastic Optimal Control of LQG Systems
Speaker: R. Srikant (University of Illinois at Urbana-Champaign)
We will provide a concise introduction to the problem of optimally controlling a linear system driven by white Gaussian noise when the cost function is quadratic. We will comment on similarities and differences with discrete state- and action-space MDPs, which are the most common model in the reinforcement learning literature. Unlike general MDPs, LQG problems with both full information and partial information admit closed-form optimal solutions. The tutorial will present a derivation of these results.
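The fully observed side of this closed-form story can be illustrated numerically. Below is a minimal sketch, assuming NumPy, that computes the infinite-horizon discrete-time LQR gain by iterating the Riccati recursion to a fixed point; the double-integrator system and all names are illustrative choices of ours.

```python
import numpy as np

def lqr_gain(A, B, Q, R, iters=500):
    """Infinite-horizon discrete-time LQR: iterate the Riccati recursion
        P <- Q + A' P A - A' P B (R + B' P B)^{-1} B' P A
    to a fixed point, then return the optimal feedback gain K for the
    linear control law u = -K x, along with the cost-to-go matrix P."""
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        K = np.linalg.solve(R + BtP @ B, BtP @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K, P

# Illustrative double-integrator system with identity costs.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
B = np.array([[0.0],
              [1.0]])
Qc = np.eye(2)
Rc = np.eye(1)
K, P = lqr_gain(A, B, Qc, Rc)   # A - B @ K is stable at the fixed point
```

For the partially observed (LQG) case, the separation principle pairs this gain with a Kalman filter state estimate, which is the structure the tutorial derives.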
12:30-13:30 CDT
Lunch Break
13:30-14:30 CDT
Break
14:30-16:00 CDT
Stochastic Optimal Control of LQG Systems
Speaker: R. Srikant (University of Illinois at Urbana-Champaign)
Continuation of the morning session; see the abstract above.
Thursday, March 12, 2026
8:30-9:00 CDT
Breakfast/Check-in
9:00-10:30 CDT
Introduction to Reinforcement Learning Methods for Mean Field Control and Mean Field Games
We will review reinforcement learning methods for mean field models, both in the cooperative (mean field control) and non-cooperative (mean field game) settings. After presenting general heuristics, we will focus on a few specific algorithms for which convergence has been rigorously proved. We will then provide numerical illustrations on several examples borrowed from the literature.
10:30-11:00 CDT
Coffee Break
11:00-12:30 CDT
Introduction to Reinforcement Learning Methods for Mean Field Control and Mean Field Games
Continuation of this tutorial; see the abstract above.
12:30-13:30 CDT
Lunch Break
13:30-14:30 CDT
Break
14:30-16:00 CDT
Introduction to Reinforcement Learning Methods for Mean Field Control and Mean Field Games
Continuation of this tutorial; see the abstract above.
Friday, March 13, 2026
8:30-9:00 CDT
Breakfast/Check-in
9:00-10:30 CDT
Introduction to Reinforcement Learning Methods for Mean Field Control and Mean Field Games
Continuation of this tutorial; see the abstract under Thursday's first session.
10:30-11:00 CDT
Coffee Break
11:00-12:00 CDT
Open Discussion
Registration
IMSI is committed to making all of our programs and events inclusive and accessible.
Contact [email protected] to request disability-related accommodations.
In order to register for this workshop, you must have an IMSI account and be logged in.