Events & News

February 2024 - Two preprints out that I am particularly excited about! One on efficient inverse RL, another on RL from Human Feedback.

September 2023 - One paper accepted to NeurIPS 2023!

August 2023 - I was featured on the TWIML AI Podcast! You can watch my interview here.

May 2023 - I’ll be spending the summer at Google, working with Alekh Agarwal, Chris Dann, and Rahul Kidambi on better algorithms for RLHF (reinforcement learning from human feedback).

May 2023 - I passed my thesis proposal! Check out my talk if you’re curious about how the work I’ve been doing over the last few years fits together.

April 2023 - Our paper on exponentially faster algorithms for inverse reinforcement learning was accepted to ICML '23. The computational efficiency our algorithm provides complements the statistical efficiency our work at NeurIPS helped develop.

Research Highlights

SPO: Self-Play Preference Optimization

We derive a new fundamental algorithm for RLHF that robustly handles complex intransitive preferences while avoiding reward modeling and adversarial training. [Paper]

Hybrid Inverse RL

We derive a new flavor of inverse RL that uses expert demonstrations to speed up policy search without requiring the ability to reset the learner to arbitrary states. [Paper]

Inverse RL w/o RL

We derive exponentially faster algorithms for inverse RL by resetting the learner to states from the expert demonstrations within the RL subroutine. Our work was published at ICML '23. [Website][Paper]
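As a rough illustration of the reset idea, here is a toy sketch in which the RL subroutine rolls out from states drawn from the expert's state distribution rather than from the environment's initial state. Everything here (the pooling helper, the toy dynamics, the random policy) is hypothetical scaffolding for exposition, not code from the paper:

```python
import random

def expert_reset_states(expert_demos):
    """Pool every state visited in the expert demonstrations;
    this pool serves as the reset distribution for the RL subroutine."""
    return [s for traj in expert_demos for s in traj]

def rl_subroutine(reward_fn, reset_pool, n_rollouts=100, horizon=5):
    """Toy policy evaluation: each rollout starts from a state sampled
    from the expert's state distribution (the key departure from
    standard RL, which resets to the initial state)."""
    total = 0.0
    for _ in range(n_rollouts):
        s = random.choice(reset_pool)    # reset to an expert state
        for _ in range(horizon):
            a = random.choice([0, 1])    # placeholder random policy
            total += reward_fn(s, a)
            s = s + a                    # toy deterministic dynamics
    return total / n_rollouts            # average return per rollout

# Usage with made-up integer-state demonstrations:
demos = [[0, 1, 2], [3, 4]]
pool = expert_reset_states(demos)
avg_return = rl_subroutine(lambda s, a: float(a), pool)
```

Restricting exploration to states the expert actually visits is what drives the speedup: the learner never wastes samples on regions of the state space that are irrelevant to matching the expert.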

Imitation w/ Unobserved Contexts

We describe conditions under which it is possible to imitate an expert who has access to privileged information, along with algorithms for doing so. Our work was published at NeurIPS 2022. [Website][Blog]

Causal Imitation Learning under TCN

We use instrumental variable regression to derive imitation learning algorithms that are robust against temporally correlated noise both in theory and practice. Oral at ICML 2022. [Website][Paper]

Of Moments and Matching

We construct a taxonomy for imitation learning algorithms, derive bounds for each class, construct novel reduction-based algorithmic templates that achieve these bounds, and implement simple, elegant realizations with competitive empirical performance. Published at ICML 2021. [Website][Blog]