Sotabase
Career
PhD in Computer Science, UC Berkeley, 2023–
Publications (17)
An Illusion of Progress? Assessing the Current State of Web Agents
arXiv.org · 2025
51
cited
PromptArmor: Simple yet Effective Prompt Injection Defenses
arXiv.org · 2025
38
cited
Progent: Programmable Privilege Control for LLM Agents
arXiv.org · 2025
34
cited
Frontier AI's Impact on the Cybersecurity Landscape
arXiv.org · 2025
23
cited
Improving LLM Safety Alignment with Dual-Objective Optimization
International Conference on Machine Learning · 2025
20
cited
AgentVigil: Generic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents
2025
18
cited
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs
arXiv.org · 2025
16
cited
UniFed: All-In-One Federated Learning Platform to Unify Open-Source Frameworks
2022
9
cited
CyberGym: Evaluating AI Agents' Real-World Cybersecurity Capabilities at Scale
2025
6
cited
Measuring Agents in Production
arXiv.org · 2025
6
cited
AGENTVIGIL: Automatic Black-Box Red-teaming for Indirect Prompt Injection against LLM Agents
Conference on Empirical Methods in Natural Language Processing · 2025
4
cited
Can LLMs Ask Good Questions?
2025
4
cited
AGENTFUZZER: Generic Black-Box Fuzzing for Indirect Prompt Injection against LLM Agents
arXiv.org · 2025
3
cited
DeServe: Towards Affordable Offline LLM Inference via Decentralization
arXiv.org · 2025
3
cited
A benchmark of expert-level academic questions to assess AI capabilities
Nature · 2026
1
cited
DevOps-Gym: Benchmarking AI Agents in Software DevOps Cycle
2026
FrontierCS: Evolving Challenges for Evolving Intelligence
arXiv.org · 2025
Tianneng Shi | Researcher Profile | Sotabase