Alexander Wei

I'm Alex Wei, a research scientist at OpenAI working on LLMs and reasoning.

Previously, I worked at the intersections of machine learning, game theory, and algorithms. I was part of the team at FAIR that built CICERO (Science, 2022), the first human-level AI for the game of Diplomacy. My research has been recognized with a SODA Best Student Paper award and an INFORMS Auctions & Market Design Rothkopf Prize.

I received my Ph.D. in Computer Science from UC Berkeley in 2023, advised by Nika Haghtalab, Michael I. Jordan, and Jacob Steinhardt. Before that, I completed my A.B. and S.M. at Harvard in 2020, advised by Jelani Nelson and Scott Kominers.

Selected Works

  denotes alphabetical ordering

Jailbroken: How Does LLM Safety Training Fail?

NeurIPS 2023

Alexander Wei, Nika Haghtalab, and Jacob Steinhardt

NeurIPS 2023 Oral Presentation


@article{wei2024jailbroken,
  title={Jailbroken: How does {LLM} safety training fail?},
  author={Wei, Alexander and Haghtalab, Nika and Steinhardt, Jacob},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}

Learning Equilibria in Matching Markets from Bandit Feedback

Journal of the ACM, 2023

Meena Jagadeesan, Alexander Wei, Yixin Wang, Michael I. Jordan, and Jacob Steinhardt

NeurIPS 2021 Spotlight Presentation


@article{jagadeesan2023learning,
  title={Learning Equilibria in Matching Markets from Bandit Feedback},
  author={Jagadeesan, Meena and Wei, Alexander and Wang, Yixin and Jordan, Michael I and Steinhardt, Jacob},
  journal={Journal of the ACM},
  volume={70},
  number={3},
  pages={1--46},
  year={2023}
}

Human-Level Play in the Game of Diplomacy by Combining Language Models with Strategic Reasoning

Science, 2022

Meta Fundamental AI Research Diplomacy Team (FAIR)


@article{fair2022diplomacy,
  author = {{Meta Fundamental AI Research Diplomacy Team (FAIR)} and Anton Bakhtin and Noam Brown and Emily Dinan and Gabriele Farina and Colin Flaherty and Daniel Fried and Andrew Goff and Jonathan Gray and Hengyuan Hu and Athul Paul Jacob and Mojtaba Komeili and Karthik Konath and Minae Kwon and Adam Lerer and Mike Lewis and Alexander H. Miller and Sasha Mitts and Adithya Renduchintala and Stephen Roller and Dirk Rowe and Weiyan Shi and Joe Spisak and Alexander Wei and David Wu and Hugh Zhang and Markus Zijlstra},
  title = {Human-level play in the game of \emph{Diplomacy} by combining language models with strategic reasoning},
  journal = {Science},
  volume = {378},
  number = {6624},
  pages = {1067--1074},
  year = {2022},
}

Designing Approximately Optimal Search on Matching Platforms

Management Science, 2022

Nicole Immorlica, Brendan Lucier, Vahideh Manshadi, and Alexander Wei

INFORMS Auctions & Market Design Rothkopf Junior Researcher Paper Prize, 3rd place


@article{immorlica2022designing,
  author = {Immorlica, Nicole and Lucier, Brendan and Manshadi, Vahideh and Wei, Alexander},
  title = {Designing Approximately Optimal Search on Matching Platforms},
  journal = {Management Science},
  volume = {69},
  number = {8},
  pages = {4609--4626},
  year = {2022},
}

Optimal Las Vegas Approximate Near Neighbors in $\ell_p$

ACM Transactions on Algorithms, 2022

Alexander Wei

SODA 2019 Best Student Paper


@article{wei2022optimal,
  title={Optimal {Las Vegas} Approximate Near Neighbors in $\ell_p$},
  author={Wei, Alexander},
  journal={ACM Transactions on Algorithms},
  volume={18},
  number={1},
  pages={1--27},
  year={2022}
}

Selected Awards

Meta Research PhD Fellowship (2022-2023)

NSF Graduate Research Fellowship (2020-2023)

CRA Outstanding Undergraduate Researcher (2020)

Gold Medal, International Olympiad in Informatics (2015)
