Philipp Moritz | Researcher Profile | Sotabase

Career

· Co-founder and CTO, Anyscale, 2019–

· PhD Student, UC Berkeley, 2013–2019

Publications (22)

Trust Region Policy Optimization

International Conference on Machine Learning · 2015

7,583 citations

High-Dimensional Continuous Control Using Generalized Advantage Estimation

International Conference on Learning Representations · 2015

4,104 citations

Ray: A Distributed Framework for Emerging AI Applications

USENIX Symposium on Operating Systems Design and Implementation · 2017

1,516 citations

Tune: A Research Platform for Distributed Model Selection and Training

arXiv.org · 2018

1,056 citations

RLlib: Abstractions for Distributed Reinforcement Learning

International Conference on Machine Learning · 2017

981 citations

A Linearly-Convergent Stochastic L-BFGS Algorithm

International Conference on Artificial Intelligence and Statistics · 2015

253 citations

Ray RLLib: A Composable and Scalable Reinforcement Learning Library

Neural Information Processing Systems · 2017

178 citations

SparkNet: Training Deep Networks in Spark

International Conference on Learning Representations · 2015

175 citations

Real-Time Machine Learning: The Missing Pieces

USENIX Workshop on Hot Topics in Operating Systems · 2017

Lineage stash: fault tolerance off the critical path

Symposium on Operating Systems Principles · 2019

Policy Gradient Search: Online Planning and Expert Iteration without Search Trees

arXiv.org · 2019

Hoplite: efficient and fault-tolerant collective communication for task-based distributed systems

Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication · 2020

ESCHER: expressive scheduling with ephemeral resources

ACM Symposium on Cloud Computing · 2022

SkyRL-Agent: Efficient RL Training for Multi-turn LLM Agent

arXiv.org · 2025

Ray: A Distributed Execution Engine for the Machine Learning Ecosystem

2019

Discriminating between causal structures in Bayesian Networks given partial observations

Kybernetika (Praha) · 2014

Flexible Primitives for Distributed Deep Learning in Ray

2018

Ray RLlib: A Framework for Distributed Reinforcement Learning

2017

ARF-RLHF: Adaptive Reward-Following for RLHF through Emotion-Driven Self-Supervision and Trace-Biased Dynamic Optimization

Distributed Training for Reinforcement Learning

2020
