AI Safety & LLM Alignment

Modern LLMs are increasingly trained to maximize reward derived from human feedback. But this creates a perverse incentive: the model can often obtain higher reward by manipulating or deceiving users than by genuinely helping them. We show that RL training on user feedback leads to targeted manipulation and deception of precisely those users who are vulnerable to such strategies, with implications for how we deploy and oversee these systems.
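
As a cartoon of this incentive (a toy sketch with made-up feedback probabilities, not the experimental setup of the paper below), consider a bandit-style learner rewarded only by per-user thumbs-up feedback. It quickly learns to help typical users while manipulating vulnerable ones, because that is exactly what the feedback signal rewards:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical environment: two user types; the learner picks an action
# per user type and is rewarded only by the user's positive feedback.
USER_TYPES = ["typical", "vulnerable"]
ACTIONS = ["help", "manipulate"]

def feedback(user, action):
    """Probability of positive feedback (made-up numbers for illustration)."""
    p = {
        ("typical", "help"): 0.8, ("typical", "manipulate"): 0.3,
        ("vulnerable", "help"): 0.6, ("vulnerable", "manipulate"): 0.9,
    }[(user, action)]
    return float(rng.random() < p)

# Epsilon-greedy learner tracking mean feedback per (user type, action)
counts = {(u, a): 1e-9 for u in USER_TYPES for a in ACTIONS}
totals = {(u, a): 0.0 for u in USER_TYPES for a in ACTIONS}

for _ in range(20_000):
    user = rng.choice(USER_TYPES)
    if rng.random() < 0.1:   # explore
        action = rng.choice(ACTIONS)
    else:                    # exploit the feedback-maximizing action
        action = max(ACTIONS, key=lambda a: totals[(user, a)] / counts[(user, a)])
    counts[(user, action)] += 1
    totals[(user, action)] += feedback(user, action)

for u in USER_TYPES:
    best = max(ACTIONS, key=lambda a: totals[(u, a)] / counts[(u, a)])
    print(u, "->", best)   # typically: typical -> help, vulnerable -> manipulate
```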

On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback. Marcus Williams, Micah Carroll, Adhyyan Narang, Constantin Weisser, Brendan Murphy, Anca Dragan. ICLR 2025 (arXiv)

Feedback Loops & Multi-Agent Learning

When ML models are deployed in the real world, they shape the data they are trained on and the behavior of other agents. A recommender system influences what users consume; a pricing algorithm changes competitor behavior; a language model shapes the distribution of text it will later be fine-tuned on. I develop game-theoretic and optimization frameworks for understanding and controlling these feedback dynamics, building on tools from performative prediction, statistical learning, and dynamical systems.
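
As a minimal sketch of these dynamics (an illustrative two-player model with assumed constants, not the exact setting of the papers below), repeated retraining in a decision-dependent game drives play to a performatively stable point: each player ends up optimal for the data distribution that its own deployment induces:

```python
import numpy as np

# Player i fits a point estimate x_i by least squares to data whose mean
# shifts with both players' decisions (the decision-dependent effect):
#   z_i ~ N(mu_i + eps * x_i + gamma * x_other, 1)
mu = np.array([1.0, -1.0])
eps, gamma = 0.3, 0.2   # assumed sensitivities, small enough for convergence

x = np.zeros(2)
for _ in range(50):
    # Repeated retraining: each player refits the squared-loss minimizer
    # (the mean) of the distribution induced by the current decisions.
    x = mu + eps * x + gamma * x[::-1]

# Performatively stable point: fixed point of the retraining map, i.e. the
# solution of (I - A) x = mu with A = [[eps, gamma], [gamma, eps]].
A = np.array([[eps, gamma], [gamma, eps]])
x_star = np.linalg.solve(np.eye(2) - A, mu)
print(x, x_star)   # the iterates match the stable point
```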

Dynamics of Learning under User Choice: Overspecialization and Peer-Model Probing. Adhyyan Narang, Sarah Dean, Lillian J. Ratliff, Maryam Fazel. Preprint 2026 (arXiv)

Multiplayer Performative Prediction: Learning in Decision-Dependent Games. Adhyyan Narang, Evan Faulkner, Dmitriy Drusvyatskiy, Maryam Fazel, Lillian J. Ratliff. JMLR 2023 (arXiv)

Decision-Dependent Learning in the Presence of Competition. Adhyyan Narang, Evan Faulkner, Dmitriy Drusvyatskiy, Maryam Fazel, Lillian J. Ratliff. AISTATS 2022 (PMLR)

Global Convergence to Local Minmax Equilibrium in Classes of Nonconvex Zero-Sum Games. Tanner Fiez, Lillian J. Ratliff, Eric Mazumdar, Evan Faulkner, Adhyyan Narang. NeurIPS 2021 (NeurIPS)

Sample-Efficient Learning

How can agents learn to make good decisions with as few samples as possible? We study this question in two settings: reinforcement learning, where we show how to exploit structural similarities between policies to dramatically reduce sample complexity; and black-box maximization of submodular + supermodular (BP) objectives under bandit feedback, with applications to active learning, summarization, and content recommendation.
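
A toy numerical illustration of the first idea (an assumed setup, not the algorithm from the paper below): when two policies agree on most states, a Monte Carlo estimate of their value difference is far more accurate than an estimate of either value separately, so comparisons between similar policies come cheap:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical MDP summary: per-state values of two policies that agree
# on roughly 90% of 1000 states.
n = 1000
v1 = rng.normal(0.0, 5.0, n)                     # values under policy 1
mask = rng.random(n) < 0.1                       # the ~10% of states that differ
v2 = np.where(mask, v1 + rng.normal(0.0, 1.0, n), v1)

def mc_std(values, k=200, reps=2000):
    """Std. dev. of a k-sample Monte Carlo estimate of values.mean()."""
    idx = rng.integers(0, n, size=(reps, k))
    return values[idx].mean(axis=1).std()

print("policy 1 alone:", mc_std(v1))       # inherits the large spread of v1
print("policy 2 alone:", mc_std(v2))
print("difference:    ", mc_std(v1 - v2))  # much smaller: v1 - v2 is 0 on ~90% of states
```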

Sample Complexity Reduction via Policy Difference Estimation in Tabular Reinforcement Learning. Adhyyan Narang, Andrew Wagenmaker, Lillian J. Ratliff, Kevin Jamieson. NeurIPS 2024 (arXiv)

Online SuBmodular + SuPermodular (BP) Maximization with Bandit Feedback. Adhyyan Narang, Omid Sadeghi, Lillian J. Ratliff, Maryam Fazel, Jeff Bilmes. UAI 2024 (arXiv)

Foundations of Overparameterized Learning

Modern neural networks have far more parameters than training points, yet generalize well in practice, defying classical statistical theory. We study this phenomenon in linear models, showing that classification is fundamentally easier than regression in overparameterized settings, and that overparameterization can make models brittle to adversarial perturbations even when standard accuracy is good. We also study meta-learning under overparameterization, giving insight into where useful inductive biases come from.
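
A small numpy demo in the spirit of these results (a simplified, assumed setup): the minimum-norm interpolator of noisy labels in a heavily overparameterized linear model can carry substantial regression error while still recovering the sign of the signal, i.e., classifying well:

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparameterized linear regression: n = 50 samples, d = 5000 features,
# with the signal carried by one strong (spiked) feature.
n, d, spike = 50, 5000, 10.0
theta_star = np.zeros(d); theta_star[0] = 1.0

def sample(m):
    X = rng.normal(size=(m, d))
    X[:, 0] *= spike                            # the spiked feature
    return X, X @ theta_star + rng.normal(size=m)

Xtr, ytr = sample(n)
# Minimum-l2-norm interpolator: theta = X^T (X X^T)^{-1} y
theta_hat = Xtr.T @ np.linalg.solve(Xtr @ Xtr.T, ytr)

Xte, yte = sample(2000)
pred = Xte @ theta_hat
print("max train residual:", np.abs(Xtr @ theta_hat - ytr).max())  # ~0: interpolation
print("test regression MSE:", np.mean((pred - yte) ** 2))
print("test sign accuracy:", np.mean(np.sign(pred) == np.sign(Xte @ theta_star)))
# Typically: nontrivial regression error, yet high sign (classification) accuracy.
```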

Classification and Adversarial Examples in an Overparameterized Linear Model: A Signal Processing Perspective. Adhyyan Narang, Vidya Muthukumar, Anant Sahai. Short version in ICML OPPO Workshop 2021 (arXiv)

Towards Sample-Efficient Overparameterized Meta-Learning. Yue Sun, Adhyyan Narang, Ibrahim Gulluk, Samet Oymak, Maryam Fazel. NeurIPS 2021 (arXiv)

Classification vs regression in overparameterized regimes: Does the loss function matter? Vidya Muthukumar*, Adhyyan Narang*, Vignesh Subramanian*, Mikhail Belkin, Daniel Hsu, Anant Sahai. JMLR 2021 (arXiv)

(*) Equal contribution