Sotabase
Career
· Researcher, UC Berkeley, Berkeley Artificial Intelligence Research Lab (BAIR), 2024–
· PhD student (paused) in computer science, UC Berkeley
Publications (11)
Alignment faking in large language models
arXiv.org · 2024 · 148 cited
COLA: Consistent Learning with Opponent-Learning Awareness
International Conference on Machine Learning · 2022 · 59 cited
A New Formalism, Method and Open Issues for Zero-Shot Coordination
International Conference on Machine Learning · 2021 · 42 cited
Auditing language models for hidden objectives
arXiv.org · 2025 · 26 cited
Path Independent Equilibrium Models Can Better Exploit Test-Time Computation
Neural Information Processing Systems · 2022 · 23 cited
Incentivizing honest performative predictions with proper scoring rules
Conference on Uncertainty in Artificial Intelligence · 2023 · 10 cited
Normative Disagreement as a Challenge for Cooperative AI
arXiv.org · 2021 · 10 cited
Conditioning Predictive Models: Risks and Strategies
arXiv.org · 2023 · 8 cited
The Evidentialist's Wager
2021 · 8 cited
Similarity-based cooperative equilibrium
Neural Information Processing Systems · 2022 · 7 cited
Modeling evidential cooperation in large worlds
2023
Johannes Treutlein | Researcher Profile | Sotabase