Xuandong Zhao | Researcher Profile | Sotabase

Career

· Postdoctoral Researcher, UC Berkeley, 2024–

Publications (65)

Humanity's Last Exam

Robotics · 2025

284

cited

Provable Robust Watermarking for AI-Generated Text

International Conference on Learning Representations · 2023

277

cited

Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews

International Conference on Machine Learning · 2024

182

cited

Mapping the Increasing Use of LLMs in Scientific Papers

arXiv.org · 2024

129

cited

Invisible Image Watermarks Are Provably Removable Using Generative AI

Neural Information Processing Systems · 2023

114

cited

Learning to Reason without External Rewards

arXiv.org · 2025

114

cited

Protecting Language Generation Models via Invisible Watermarking

International Conference on Machine Learning · 2023

111

cited

Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement

Annual Meeting of the Association for Computational Linguistics · 2024

cited

Weak-to-Strong Jailbreaking on Large Language Models

International Conference on Machine Learning · 2024

cited

Scalable Best-of-N Selection for Large Language Models via Self-Certainty

arXiv.org · 2025

cited

MarkLLM: An Open-Source Toolkit for LLM Watermarking

Conference on Empirical Methods in Natural Language Processing · 2024

cited

A Survey on Detection of LLMs-Generated Content

Conference on Empirical Methods in Natural Language Processing · 2023

cited

The Hidden Risks of Large Reasoning Models: A Safety Assessment of R1

arXiv.org · 2025

cited

DE-COP: Detecting Copyrighted Content in Language Models Training Data

International Conference on Machine Learning · 2024

cited

SoK: Watermarking for AI-Generated Content

IEEE Symposium on Security and Privacy · 2024

cited

An undetectable watermark for generative image models

IACR Cryptology ePrint Archive · 2024

cited

Reward Shaping to Mitigate Reward Hacking in RLHF

arXiv.org · 2025

cited

Pre-trained Language Models Can be Fully Zero-Shot Learners

Annual Meeting of the Association for Computational Linguistics · 2022

cited

PromptArmor: Simple yet Effective Prompt Injection Defenses

arXiv.org · 2025

cited

CodeHalu: Investigating Code Hallucinations in LLMs via Execution-based Verification

AAAI Conference on Artificial Intelligence · 2024

cited
