This workshop highlights recent advances in online reinforcement learning (RL), with a focus on its connections to emerging technologies like large language models (LLMs). As machine learning systems grow more capable, online RL can further enhance their task-specific capabilities.
Participants will explore the evolving RL landscape, discuss its integration with large-scale models, and examine challenges and opportunities at this intersection. Join us to engage with cutting-edge ideas shaping the future of online reinforcement learning.
Poster Session and Lightning Talks
This workshop will include a poster session and lightning talks for early career researchers (including graduate students). To propose a poster or a lightning talk, you must first register for the workshop and then submit a proposal using the form that will become available on this page after you register. You may propose one or both. The registration form should not be used to propose a poster or a lightning talk.
The deadline for proposals is Wednesday, March 4, 2026. If your proposal is accepted, you should plan to attend the event in person.
In-Person Registration
Seats at the venue are limited, so in-person registration may be capped before the workshop start date. If capacity is reached, a waitlist will be opened, which the registration form will reflect. Early registration is strongly encouraged.
All in-person registrants must wait to receive an invitation to attend in person from IMSI before traveling; invitations are generally sent out 4-6 weeks in advance.
All registrants (online and in-person) will receive Zoom links and are welcome to attend online.
Andrew Wagenmaker
University of California, Berkeley
Masatoshi Uehara
Evolutionary Scale
Zhuoran Yang
Yale University
Xuezhou Zhang
Boston University
Banghua Zhu
University of Washington and NVIDIA
Schedule
Monday, March 30, 2026
8:30-8:50 CDT
Breakfast/Check-in
8:50-9:00 CDT
Welcome
9:00-9:45 CDT
Open Discussion
9:45-10:00 CDT
Q&A
10:00-10:05 CDT
Tech break
10:05-10:50 CDT
Towards Practical Online Improvement of Pretrained Policies for Robotic Manipulation
Speaker: Andrew Wagenmaker (University of California, Berkeley)
Robotic control policies learned from human demonstrations have achieved impressive results in many real-world applications. However, in scenarios where initial performance is not satisfactory, as is often the case in novel open-world settings, such behavioral cloning (BC)-learned policies typically require collecting additional human demonstrations to further improve their behavior—an expensive and time-consuming process. In contrast, reinforcement learning (RL) holds the promise of enabling autonomous online policy improvement, but often falls short of achieving this due to the large number of samples it typically requires. In this talk we take steps towards enabling fast autonomous adaptation of BC-trained policies via efficient real-world RL. We consider two angles on this problem. First, focusing in particular on diffusion policies, we propose diffusion steering via reinforcement learning (DSRL): adapting the BC policy by running RL over its latent-noise space. We show that DSRL is highly sample efficient, requires only black-box access to the BC policy, and enables effective real-world autonomous policy improvement. Second, we consider the role of the pretrained policy itself in RL improvement, and ask how we might pretrain policies that are amenable to downstream improvement. We show that standard BC pretraining can produce policies which fail to meet minimal conditions necessary for effective finetuning—coverage over the demonstrator’s actions—but that, if we instead fit a policy to the posterior of the demonstrator’s behaviors, we can achieve action coverage while ensuring the performance of the pretrained policy is no worse than that of the BC policy. We show experimentally that such posterior BC-pretrained policies enable much more efficient online improvement than standard BC-pretrained policies.
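The latent-noise steering idea in the abstract can be illustrated with a toy sketch: treat a frozen policy as a black box mapping a latent noise vector to an action, and search over the noise space without ever touching the policy's weights. The policy, task, and (1+1) evolution-strategy search below are illustrative stand-ins chosen for brevity, not the actual DSRL algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a frozen BC-trained diffusion policy: all we
# assume is black-box access mapping a latent noise vector z to an action.
W = rng.normal(size=(4, 4))
def frozen_bc_policy(z):
    return np.tanh(W @ z)

# Toy task: reward is higher the closer the action is to a fixed target.
target = np.array([0.5, -0.2, 0.1, 0.0])
def reward(action):
    return -float(np.sum((action - target) ** 2))

# "Steering" = search over the latent-noise space with a simple
# (1+1) evolution strategy; the BC policy's weights are never updated.
z = rng.normal(size=4)
r_init = reward(frozen_bc_policy(z))
best = r_init
for _ in range(500):
    cand = z + 0.1 * rng.normal(size=4)
    r = reward(frozen_bc_policy(cand))
    if r > best:
        z, best = cand, r
```

Because only evaluations of the policy are needed, this kind of search works even when the underlying generator is non-differentiable or proprietary, which is the black-box property the abstract emphasizes.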
10:50-11:05 CDT
Q&A
11:05-11:35 CDT
Coffee break
11:35-12:20 CDT
Self-Supervised Reinforcement Learning and Patterns in Time
Speaker: Benjamin Eysenbach (Princeton University)
In the same way that computer vision models find structures and patterns in images, how might reinforcement learning models find structures and patterns in solutions to control problems? This talk will focus on learning temporal representations, which map high-dimensional observations to compact representations where distances reflect shortest paths. Once learned, these temporal representations encode the value function for certain tasks – learning temporal representations is itself an RL algorithm. In both robotics and reasoning problems, such representations capture temporal patterns. Temporal representations also facilitate a form of (temporal) generalization: navigating between pairs of states that are more distant than those seen during training. I will show evidence that agents trained via temporal representations exhibit surprising exploration strategies, in both single-agent and multi-agent settings.
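To make the "distances reflect shortest paths" idea concrete, here is a small self-contained sketch (not the speaker's method): it embeds the states of a toy corridor MDP so that Euclidean distances match shortest-path distances, using classical MDS in place of a learned encoder, and then uses negative embedded distance-to-goal as a value function for a greedy policy.

```python
import numpy as np
from collections import deque

# Toy MDP: a corridor of 8 states; actions move one step left or right.
n = 8
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}

# Ground-truth shortest-path distances via BFS.
def bfs(src):
    d = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in d:
                d[v] = d[u] + 1
                q.append(v)
    return [d[i] for i in range(n)]

D = np.array([bfs(i) for i in range(n)], dtype=float)

# Classical MDS: embed states so Euclidean distances match shortest paths.
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
vals, vecs = np.linalg.eigh(B)           # eigenvalues in ascending order
phi = vecs[:, -2:] * np.sqrt(np.maximum(vals[-2:], 0))  # top-2 components

# Negative embedded distance to a goal acts like a value function...
goal = n - 1
def value(s):
    return -np.linalg.norm(phi[s] - phi[goal])

# ...so a greedy policy on the representation walks straight to the goal.
s, steps = 0, 0
while s != goal and steps < 2 * n:
    s = max(adj[s], key=value)
    steps += 1
```

For this corridor the shortest-path metric is exactly line-embeddable, so the greedy policy reaches the goal in the minimal n-1 steps; the abstract's point is that such representations can be learned self-supervised in high-dimensional settings where no distance matrix is available.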
12:20-12:35 CDT
Q&A
12:35-13:35 CDT
Lunch Break
13:35-14:20 CDT
Multi-turn and Multi-agent Reinforcement Learning Fine-Tuning of LLMs
Speaker: Natasha Jaques (University of Washington)
Although Reinforcement Learning (RL) training has contributed to massive gains in Large Language Model (LLM) abilities, it is still largely limited to optimizing a single response to a user query, rather than learning how to plan the course of a conversation or interaction, or how to interact with other agents that may change their behavior during training and deployment. This talk will discuss recent work that enables both multi-agent and multi-turn RL post-training. On the multi-turn side, we address critical challenges in long-horizon interaction: for example, we introduce a curiosity-based intrinsic reward that enables LLMs to learn how to learn about the user, significantly improving both personalization and online generalization to new users. I will also discuss Generative Adversarial Post Training (GAPT), an adversarial RL framework which draws from GANs and is designed to mitigate reward hacking and output collapse in creative, adaptive tasks where preserving diversity and realism is paramount. Finally, I will discuss our group's work on multi-agent interactive training, which can provide both safety guarantees and the emergence of complex skills. Together, these methods demonstrate novel approaches to instilling complex, user-aware planning capabilities and safeguarding output quality over extended multi-agent interactions.
14:20-14:35 CDT
Q&A
14:35-15:35 CDT
Lightning Talks
15:35-16:30 CDT
Poster Session and Social Hour
Tuesday, March 31, 2026
8:30-9:00 CDT
Breakfast/Check-in
9:00-9:45 CDT
Building Deep Research Agents via Reinforcement Learning
Speaker: Wen Sun (Cornell University)
9:45-10:00 CDT
Q&A
10:00-10:05 CDT
Tech break
10:05-10:50 CDT
On the Mechanism and Dynamics of Modular Addition: Fourier Features, Lottery Ticket, and Grokking
Speaker: Zhuoran Yang (Yale University)
We present a comprehensive analysis of how two-layer neural networks learn features to solve the modular addition task. Our work provides a full mechanistic interpretation of the learned model and a theoretical explanation of its training dynamics. While prior work has identified that individual neurons learn single-frequency Fourier features and phase alignment, it does not fully explain how these features combine into a global solution. We bridge this gap by formalizing a diversification condition that emerges during training when the network is overparametrized, consisting of two parts: phase symmetry and frequency diversification. We prove that these properties allow the network to collectively approximate a flawed indicator function on the correct logit for the modular addition task. While individual neurons produce noisy signals, the phase symmetry enables a majority-voting scheme that cancels out noise, allowing the network to robustly identify the correct sum. Furthermore, we explain the emergence of these features under random initialization via a lottery ticket mechanism. Our gradient flow analysis proves that frequencies compete within each neuron, with the “winner” determined by its initial spectral magnitude and phase alignment. From a technical standpoint, we provide a rigorous characterization of the layer-wise phase coupling dynamics and formalize the competitive landscape using the ODE comparison lemma. Finally, we use these insights to demystify grokking, characterizing it as a three-stage process involving memorization followed by two generalization phases, driven by the competition between loss minimization and weight decay.
10:50-11:05 CDT
Q&A
11:05-11:35 CDT
Coffee break
11:35-12:20 CDT
Toward a Statistical Perspective on LLM Post-training: Preference Sampling and Gradient Reweighting
Speaker: Yaqi Duan (New York University)
Post-training—the process of fine-tuning large language models (LLMs) with human preference or verifiable feedback—has rapidly evolved into a central problem in LLM development, yet its principles remain largely heuristic. This talk explores how statistical thinking can provide structure and rigor to this stage of learning. I will present two studies as steps toward formulating a statistical perspective on post-training. The first, PILAF, views human preference collection as an optimal experimental design problem, deriving sampling strategies that maximize reward-model information under budget constraints. The second, LENS, reinterprets reinforcement learning with verifiable rewards (RLVR) as a likelihood-based estimation problem, showing how confidence-weighted corrections on negative responses recover gradients otherwise lost in standard policy optimization. Together, these results illustrate how classical statistical reasoning can strengthen the foundations of post-training data collection and policy-update procedures, advancing efficiency, stability, and theoretical clarity.
12:20-12:35 CDT
Q&A
12:35-13:35 CDT
Lunch Break
13:35-14:20 CDT
Regression as Policy Optimization: Advantages In, Policies Out
Speaker: Kianté Brantley (Harvard University)
Reinforcement learning (RL) is an essential method for training large language models (LLMs), enabling better alignment with human preferences and enhanced reasoning capabilities. Nevertheless, RL post-training remains computationally demanding due to repeated rollouts, high-variance credit assignment, and the complexities of distributed systems that can introduce policy lag. A promising direction is to view KL-regularized policy optimization as a KL-prox (mirror-descent–style) step and solve it with a simple least-squares regression loss. This regression-based perspective addresses key RL challenges and enables more efficient post-training procedures. In this talk, I will introduce a unified perspective encompassing three complementary approaches. A⋆-PO minimizes reliance on online sampling by utilizing offline value computation and optimal-advantage regression. OAPL supports scalable, fully off-policy regression, even when using stale rollouts and lagged policies in distributed settings. RDA2C offers a regularized dual-averaging approach that employs cumulative gradient information to stabilize updates and reduce variance, rather than relying on local, per-round mirror-descent steps. Although RDA2C has been primarily assessed on standard RL benchmarks rather than comprehensive LLM post-training, its focus on variance reduction and data reuse aligns closely with the stability challenges encountered in large-scale LLM alignment. Collectively, these methods provide an efficient toolkit for RL-based LLM post-training and present opportunities for further research and scalable deployment.
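The KL-prox step mentioned in the abstract has a closed form, and the regression view can be illustrated in a few lines: for a tabular policy, fitting logits by least squares to the targets log π_old + A/β recovers exactly the mirror-descent update, since the softmax absorbs the normalizing constant. This is a toy illustration of the general idea only, not the A⋆-PO, OAPL, or RDA2C algorithms themselves.

```python
import numpy as np

rng = np.random.default_rng(0)
S, K, beta = 3, 4, 0.5                    # states, actions, KL strength

logp_old = np.log(rng.dirichlet(np.ones(K), size=S))  # current policy
adv = rng.normal(size=(S, K))                          # advantage estimates

# Closed-form KL-prox (mirror-descent) step:
#   pi_new(a|s) ∝ pi_old(a|s) * exp(A(s, a) / beta)
unnorm = np.exp(logp_old + adv / beta)
pi_closed = unnorm / unnorm.sum(axis=1, keepdims=True)

# Regression view: fit logits to the targets  log pi_old + A / beta
# with plain least squares; the softmax absorbs the normalizing constant.
targets = logp_old + adv / beta
theta = np.zeros((S, K))
for _ in range(200):                      # GD on 0.5 * ||theta - targets||^2
    theta -= 0.5 * (theta - targets)

pi_reg = np.exp(theta) / np.exp(theta).sum(axis=1, keepdims=True)
```

In the tabular case the regression is trivial, but the same recipe carries over to function approximation: the squared-error loss over sampled (state, action, advantage) tuples replaces rollout-heavy policy-gradient updates, which is the efficiency argument the abstract makes.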
14:20-14:35 CDT
Q&A
14:35-15:00 CDT
Coffee break
15:00-15:45 CDT
TBA
Speaker: Ayush Sekhari (MIT)
15:45-16:00 CDT
Q&A
Wednesday, April 1, 2026
8:30-9:00 CDT
Breakfast/Check-in
9:00-9:45 CDT
AI that Learns How to Act: Toward Data-Driven Autonomous Scientific Discovery
Speaker: Aldo Pacchiano (Boston University)
Modern machine learning systems excel at pattern recognition but remain limited in their ability to autonomously discover strategies for planning, exploration, and adaptation: core components of sequential decision making. In this talk, I present recent advances that take a learning-to-learn perspective on this challenge, showing how decision-making algorithms themselves can emerge from data.
9:45-10:00 CDT
Q&A
10:00-10:05 CDT
Tech break
10:05-10:50 CDT
Reinforcement Learning beyond Reward Maximization
Speaker: Yuda Song (Carnegie Mellon University)
Reinforcement learning has become a core ingredient of LLM post-training, but much of today’s pipeline is built around an unusually narrow learning signal: the model is optimized to maximize scalar reward, often derived from little more than whether an output succeeds. In this talk, I will present two recent directions for going beyond this paradigm.
The first part introduces Maximum Likelihood Reinforcement Learning (MaxRL), which starts from the observation that in correctness-based domains the model implicitly defines a likelihood over successful rollouts, while standard RL optimizes only a lower-order approximation to that likelihood. Empirically, MaxRL consistently outperforms standard RL baselines across the models and tasks studied, delivers up to 20× gains in test-time scaling efficiency, and shows stronger scaling with additional data and compute.
The second part introduces Reinforcement Learning from Text Feedback (RLTF), a learning paradigm in which natural-language critiques are available during training but not at inference, requiring the model to internalize richer feedback rather than merely condition on it. Together, these works suggest that the next frontier of LLM reinforcement learning lies not only in scaling optimization, but in rethinking both the objectives we optimize and the forms of supervision we learn from.
10:50-11:05 CDT
Q&A
11:05-11:35 CDT
Coffee break
11:35-12:20 CDT
The Statistical Cost of Hyperparameter Tuning in Reinforcement Learning
Speaker: Xuezhou Zhang (Boston University)
The performance of reinforcement learning (RL) algorithms is often benchmarked without accounting for the cost of hyperparameter tuning, despite its significant practical impact. In this paper, we show that such practices distort the perceived efficiency of RL methods and impede meaningful algorithmic progress. We formalize this concern by proving a lower bound showing that tuning m hyperparameters in RL can induce an exponential exp(m) blow-up in the sample complexity or regret, in stark contrast to the linear O(m) overhead observed in supervised learning. This highlights a fundamental inefficiency unique to RL. In light of this, we propose evaluation protocols that account for the number and cost of tuned hyperparameters, enabling fairer comparisons across algorithms. Surprisingly, we find that once tuning cost is accounted for, elementary algorithms can outperform their successors with more sophisticated design. These findings call for a shift in how RL algorithms are benchmarked and compared, especially in settings where efficiency and scalability are critical.
12:20-12:35 CDT
Q&A
12:35-13:35 CDT
Lunch Break
13:35-14:20 CDT
Failure Patterns of LLM Agentic Reinforcement Learning
Speaker: Manling Li (Northwestern University)
Reinforcement learning has driven strong gains in LLM reasoning on static tasks. However, when applied to agents, these models consistently fail in unfamiliar environments, where effective exploration is required. In this talk, we identify a failure pattern: as agents move out of distribution, their reasoning trajectories become progressively shorter, and exploration collapses. We show that standard entropy-based metrics are insufficient, and introduce mutual information as a more reliable signal of input-dependent behavior. Through an SNR perspective, I explain why low reward variance causes input-agnostic regularization to dominate, driving this collapse. We then ask what enables robust exploration. Through VAGEN, we demonstrate that learning a structured world model, decomposed into state estimation and transition dynamics, provides the necessary inductive bias, enabling a 3B VLM to outperform GPT-5 on agent benchmarks. Finally, we show that when this structure is learned matters. With Self-Play, we find that acquiring these priors via self-play before RL is significantly more effective than learning them during RL. Overall, this suggests a simple paradigm: learn how the world works first, then learn what to do, with mutual information as a key diagnostic to prevent reasoning collapse.
14:20-14:35 CDT
Q&A
14:35-15:00 CDT
Coffee break
15:00-15:45 CDT
TBA
Speaker: Zhaoran Wang (Northwestern University)
15:45-16:00 CDT
Q&A
Thursday, April 2, 2026
8:30-9:00 CDT
Breakfast/Check-in
9:00-9:45 CDT
Reward-Guided Generation in Diffusion Models
Speaker: Masatoshi Uehara (Evolutionary Scale)
Diffusion models are celebrated for their strong generative capabilities. However, practical applications often demand sample generation that not only produces realistic outputs but also optimizes specific objectives (e.g., human preference scores in computer vision, binding affinity in proteins). To address this, diffusion models can be adapted to explicitly maximize desired reward metrics. While many methods have been developed for domains like computer vision, applying reward-guided generation to biological design poses unique challenges: (1) reward functions are often non-differentiable, and (2) biological data is frequently discrete. In this talk, I will present our recent advances in test-time controlled generation methods that address these challenges. I will also discuss how these techniques enable real-world applications across molecular design tasks, including protein, DNA, RNA, and small molecule generation.
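A minimal baseline for derivative-free, test-time reward guidance is best-of-N sampling, which needs only black-box access to both the generator and the (possibly non-differentiable) reward; the uniform toy generator and GC-content reward below are illustrative stand-ins for the pretrained samplers and objectives discussed in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discrete "generator": samples length-8 DNA-like strings uniformly.
# Stands in for a pretrained diffusion sampler (illustrative only).
alphabet = np.array(list("ACGT"))
def sample_sequence():
    return "".join(rng.choice(alphabet, size=8))

# Non-differentiable reward: GC content (fraction of G/C bases), a simple
# proxy for the black-box objectives mentioned in the abstract.
def reward(seq):
    return sum(c in "GC" for c in seq) / len(seq)

# Best-of-N: draw N samples from the frozen generator and keep the
# highest-reward one; no gradients through reward or generator needed.
samples = [sample_sequence() for _ in range(64)]
best = max(samples, key=reward)
```

More sophisticated test-time schemes (e.g., resampling partial generations by reward during the denoising trajectory) refine this same principle; the appeal is that the pretrained generator stays fixed and the reward can be an arbitrary black box.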
9:45-10:00 CDT
Q&A
10:00-10:05 CDT
Tech break
10:05-10:50 CDT
TBA
Speaker: Gokul Swamy (Carnegie Mellon University)
10:50-11:05 CDT
Q&A
11:05-11:35 CDT
Coffee break
11:35-12:20 CDT
Proactive Agents: Task Performance Isn’t the Only Goal
Speaker: Laixi Shi (Johns Hopkins University)
Decision-making artificial intelligence (AI) has revolutionized human life, ranging from healthcare and daily life to scientific discovery. However, current AI systems often lack reliability and are highly vulnerable to small changes in complex, interactive, and dynamic environments. My research focuses on achieving both reliability and learning efficiency simultaneously when building AI solutions. These two goals seem conflicting, as enhancing robustness against variability often leads to more complex problems that require more data and computational resources, at the cost of learning efficiency. But does it have to?
In this talk, I will overview my work on building reliable decision-making AI without sacrificing learning efficiency, offering insights into effective optimization problem design for reliable AI. To begin, I will focus on reinforcement learning (RL), a key framework for sequential decision-making, and demonstrate how distributional robustness can be achieved provably without paying a statistical premium (additional training data cost) compared to non-robust counterparts. Next, shifting to decision-making in strategic multi-agent systems, I will demonstrate that incorporating realistic risk preferences, a key feature of human decision-making, enables computational tractability, a benefit not present in traditional models. Finally, I will present a vision for building reliable, learning-efficient AI solutions for human-centered applications, through agentic and multi-agentic AI systems.
12:20-12:35 CDT
Q&A
12:35-13:35 CDT
Lunch Break
13:35-14:20 CDT
Exploration from a Primal-Dual Optimization Lens in Reinforcement Learning
Speaker: Bo Dai (Georgia Institute of Technology)
Online reinforcement learning (RL) with complex function approximations such as transformers and deep neural networks plays a significant role in the modern practice of artificial intelligence. Despite its popularity and importance, balancing the fundamental trade-off between exploration and exploitation remains a long-standing challenge; in particular, we still lack efficient and practical schemes that are backed by theoretical performance guarantees. We develop a new exploration mechanism via optimistic regularization, providing an interpretation of the principle of optimism through the lens of optimization. From this fresh perspective, we set forth a new value-incentivized actor-critic (VAC) method, which optimizes a single easy-to-optimize objective integrating exploration and exploitation -- it promotes state-action and policy estimates that are both consistent with collected data transitions and result in higher value functions. Theoretically, the proposed VAC method has near-optimal regret guarantees under linear Markov decision processes (MDPs) in both finite-horizon and infinite-horizon settings, which can be extended to the general function approximation setting under appropriate assumptions. We also test the proposed algorithms on both standard RL tasks and RLHF for LLMs, demonstrating significant improvements.
14:20-14:35 CDT
Q&A
14:35-15:00 CDT
Coffee break
15:00-16:00 CDT
Panel, Open Discussion, Working Groups, Hands-on Sessions, etc.
Friday, April 3, 2026
8:30-9:00 CDT
Breakfast/Check-in
9:00-9:45 CDT
Miles: Open Source RL for Large MoE Models
Speaker: Banghua Zhu (University of Washington and NVIDIA)
IMSI is committed to making all of our programs and events inclusive and accessible.
Contact [email protected] to request disability-related accommodations.
In order to register for this workshop, you must have an IMSI account and be logged in.