Sotabase
Career
Lecturer in Machine Learning and Departmental Lecturer in Engineering Science, University of Oxford, 2022–2025
Publications (28)
Maxmin Q-learning: Controlling the Estimation Bias of Q-learning
International Conference on Learning Representations · 2020
210 cited
Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains
International Joint Conference on Artificial Intelligence · 2018
48 cited
The In-Sample Softmax for Offline Reinforcement Learning
International Conference on Learning Representations · 2023
30 cited
Accelerated Gradient Temporal Difference Learning
AAAI Conference on Artificial Intelligence · 2016
28 cited
Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online
International Conference on Learning Representations · 2021
23 cited
Hill Climbing on Value Estimates for Search-control in Dyna
International Joint Conference on Artificial Intelligence · 2019
19 cited
Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control
International Conference on Machine Learning · 2018
19 cited
Understanding and mitigating the limitations of prioritized experience replay
Conference on Uncertainty in Artificial Intelligence · 2020
19 cited
Frequency-based Search-control in Dyna
International Conference on Learning Representations · 2020
15 cited
An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient
Neural Information Processing Systems · 2023
13 cited
An implicit function learning approach for parametric modal regression
Neural Information Processing Systems · 2020
12 cited
Effective sketching methods for value function approximation
Conference on Uncertainty in Artificial Intelligence · 2017
12 cited
Incremental Truncated LSTD
International Joint Conference on Artificial Intelligence · 2015
11 cited
Actor-Expert: A Framework for using Action-Value Methods in Continuous Action Spaces
arXiv.org · 2018
10 cited
Adapting Kernel Representations Online Using Submodular Maximization
International Conference on Machine Learning · 2017
10 cited
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement
International Conference on Learning Representations · 2018
10 cited
Improving Adversarial Transferability via Model Alignment
European Conference on Computer Vision · 2023
10 cited
Conditionally Optimistic Exploration for Cooperative Deep Multi-Agent Reinforcement Learning
Conference on Uncertainty in Artificial Intelligence · 2023
9 cited
Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation
Transactions on Machine Learning Research · 2022
9 cited
Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods
Transactions on Machine Learning Research · 2023
8 cited
Yangchen Pan | Researcher Profile | Sotabase