Sotabase
Career
· Software Engineer, Mixpanel, 2018–2020
· MS in Computer Science, New York University
· PhD Student, New York University
· Graduate Assistant, NYU Tandon School of Engineering
Publications (79)
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
ACM Computing Surveys · 2021 · 4,937 cited
BARTScore: Evaluating Generated Text as Text Generation
Neural Information Processing Systems · 2021 · 1,008 cited
Self-Rewarding Language Models
International Conference on Machine Learning · 2024 · 489 cited
FacTool: Factuality Detection in Generative AI - A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
arXiv.org · 2023 · 275 cited
Iterative Reasoning Preference Optimization
Neural Information Processing Systems · 2024 · 207 cited
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Conference on Empirical Methods in Natural Language Processing · 2024 · 162 cited
Generative Judge for Evaluating Alignment
International Conference on Learning Representations · 2023 · 153 cited
O1 Replication Journey: A Strategic Progress Report - Part 1
arXiv.org · 2024 · 139 cited
Can We Automate Scientific Reviewing?
Journal of Artificial Intelligence Research · 2021 · 109 cited
O1 Replication Journey - Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson?
arXiv.org · 2024 · 90 cited
ExplainaBoard: An Explainable Leaderboard for NLP
Annual Meeting of the Association for Computational Linguistics · 2021 · 57 cited
Self-Taught Evaluators
arXiv.org · 2024 · 55 cited
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
arXiv.org · 2025 · 47 cited
Thinking LLMs: General Instruction Following with Thought Generation
arXiv.org · 2024 · 43 cited
Following Length Constraints in Instructions
Conference on Empirical Methods in Natural Language Processing · 2024 · 35 cited
Self-Consistency Preference Optimization
International Conference on Machine Learning · 2024 · 23 cited
T5Score: Discriminative Fine-tuning of Generative Evaluation Metrics
Conference on Empirical Methods in Natural Language Processing · 2022 · 22 cited
An Overview of Large Language Models for Statisticians
arXiv.org · 2025 · 16 cited
Bridging Offline and Online Reinforcement Learning for LLMs
arXiv.org · 2025 · 16 cited
CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks
arXiv.org · 2025 · 14 cited
Weizhe Yuan | Researcher Profile | Sotabase