Siyuan Zhuang | Researcher Profile | Sotabase

Career

· Leadership Team, Stealth Startup, 2024–

· Software Engineer Intern, Anyscale, 2020

· Graduate Student Researcher, Berkeley RISE Lab, 2019–2024

· Teaching Assistant, University of Science and Technology of China, 2016–2017

Publications (28)

Judging LLM-as-a-judge with MT-Bench and Chatbot Arena

Neural Information Processing Systems · 2023

6,903 citations

Efficient Memory Management for Large Language Model Serving with PagedAttention

Symposium on Operating Systems Principles · 2023

4,514 citations

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

International Conference on Learning Representations · 2023

346 citations

TeraPipe: Token-Level Pipeline Parallelism for Training Large-Scale Language Models

International Conference on Machine Learning · 2021

155 citations

SkyPilot: An Intercloud Broker for Sky Computing

Symposium on Networked Systems Design and Implementation · 2023

Hoplite: Efficient and Fault-Tolerant Collective Communication for Task-Based Distributed Systems

Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication · 2020

This paper is included in the Proceedings of the 20th USENIX Symposium on Networked Systems Design and Implementation.

HuatuoGPT, Towards Taming Language Models To Be a Doctor

sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data

Conference on Machine Learning and Systems · 2021

Composing MPC With LQR and Neural Network for Amortized Efficiency and Stable Control

IEEE Transactions on Automation Science and Engineering · 2021

Ask Again, Then Fail: Large Language Models’ Vacillations in Judgment

Volume 1 · 2024

Rearchitecting In-Memory Object Stores for Low Latency

Proceedings of the VLDB Endowment · 2021

Starburst: A Cost-aware Scheduler for Hybrid Cloud

USENIX Annual Technical Conference · 2024

Composing MPC with LQR and Neural Networks for Efficient and Stable Control

arXiv.org · 2021

From KMMLU-Redux to Pro: A Professional Korean Benchmark Suite for LLM Evaluation

Conference on Empirical Methods in Natural Language Processing · 2025

RAG+: Enhancing Retrieval-Augmented Generation with Application-Aware Reasoning

2025

A Statistical Framework for Ranking LLM-Based Chatbots

2024

Bearing run-to-failure datasets of UNSW

2021

B ENCH : Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions

Budget-aware Test-time Scaling via Discriminative Verification

2025
