Alexander Wei | Researcher Profile | Sotabase

Career

· PhD in Computer Science, UC Berkeley, 2020–

Publications (22)

Jailbroken: How Does LLM Safety Training Fail?

Neural Information Processing Systems · 2023

1,483

cited

Human-level play in the game of Diplomacy by combining language models with strategic reasoning

Science · 2022

476

cited

Optimal Robustness-Consistency Trade-offs for Learning-Augmented Online Algorithms

Neural Information Processing Systems · 2020

112

cited

More Than a Toy: Random Matrix Models Predict How Real-World Neural Representations Generalize

International Conference on Machine Learning · 2022


Better and Simpler Learning-Augmented Online Caching

Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM) · 2020


Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation

International Conference on Machine Learning · 2024


Predicting Out-of-Distribution Error with the Projection Norm

International Conference on Machine Learning · 2022


Learning Equilibria in Matching Markets from Bandit Feedback

Neural Information Processing Systems · 2021


Learning in Stackelberg Games with Non-myopic Agents

ACM Conference on Economics and Computation · 2022


TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels

Neural Information Processing Systems · 2022


Temperature Self-Calibration of Always-On, Field-Deployed Ion-Selective Electrodes Based on Differential Voltage Measurement

ACS Sensors · 2022


Designing Approximately Optimal Search on Matching Platforms

ACM Conference on Economics and Computation · 2021


Optimal Las Vegas Approximate Near Neighbors in ℓ_p

ACM-SIAM Symposium on Discrete Algorithms · 2018


Allocation for Social Good: Auditing Mechanisms for Utility Maximization

ACM Conference on Economics and Computation · 2019


X-Guard: Multilingual Guard Agent for Content Moderation

arXiv.org · 2025


An Interscholastic Network To Generate LexA Enhancer Trap Lines in Drosophila

G3: Genes, Genomes, Genetics · 2019


Varying the Number of Signals in Matching Markets

Workshop on Internet and Network Economics · 2018


Learning Equilibria in Matching Markets with Bandit Feedback

Journal of the ACM · 2023


Optimal Las Vegas Approximate Near Neighbors in ℓ_p

ACM-SIAM Symposium on Discrete Algorithms · 2019

