Sotabase
Career
Systems Researcher, Anthropic · 2023–
Publications (11)
Discovering Language Model Behaviors with Model-Written Evaluations
Annual Meeting of the Association for Computational Linguistics · 2022
Cited by 621
Towards Understanding Sycophancy in Language Models
International Conference on Learning Representations · 2023
Cited by 512
Measuring Faithfulness in Chain-of-Thought Reasoning
arXiv.org · 2023
Cited by 321
Question Decomposition Improves the Faithfulness of Model-Generated Reasoning
arXiv.org · 2023
Cited by 108
Specific versus General Principles for Constitutional AI
arXiv.org · 2023
Cited by 44
A data-centric optimization framework for machine learning
International Conference on Supercomputing · 2021
Cited by 19
Ask Again, Then Fail: Large Language Models’ Vacillations in Judgment
Volume 1 · 2024
Cited by 6
Python FPGA Programming with Data-Centric Multi-Level Design
arXiv.org · 2022
Cited by 1
Oliver Rausch | Researcher Profile | Sotabase