Shixiang Shane Gu | Researcher Profile | Sotabase

Career

· Senior Staff Research Scientist, Google DeepMind, 2024–

· Staff Research Scientist, Google DeepMind, 2023–2024

· Research Scientist, Google, 2018–2023

Publications (71)

Large Language Models are Zero-Shot Reasoners

Neural Information Processing Systems · 2022

6,286

cited

Categorical Reparameterization with Gumbel-Softmax

International Conference on Learning Representations · 2016

5,977

cited

Scaling Instruction-Finetuned Language Models

Journal of Machine Learning Research · 2022

3,860

cited

Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates

IEEE International Conference on Robotics and Automation · 2016

1,555

cited

Continuous Deep Q-Learning with Model-based Acceleration

International Conference on Machine Learning · 2016

1,056

cited

A Minimalist Approach to Offline Reinforcement Learning

Neural Information Processing Systems · 2021

1,019

cited

Data-Efficient Hierarchical Reinforcement Learning

Neural Information Processing Systems · 2018

927

cited

Towards Deep Neural Network Architectures Robust to Adversarial Examples

International Conference on Learning Representations · 2014

883

cited

Large Language Models Can Self-Improve

Conference on Empirical Methods in Natural Language Processing · 2022

775

cited

Dynamics-Aware Unsupervised Discovery of Skills

International Conference on Learning Representations · 2019

460

cited

Aligning Text-to-Image Models using Human Feedback

arXiv.org · 2023

400

cited

Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog

arXiv.org · 2019

376

cited

Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic

International Conference on Learning Representations · 2016

357

cited

A Divergence Minimization Perspective on Imitation Learning Methods

Conference on Robot Learning · 2019

276

cited

Temporal Difference Models: Model-Free Deep RL for Model-Based Control

International Conference on Learning Representations · 2018

255

cited

Categorical Reparameterization with Gumbel-Softmax

International Conference on Learning Representations · 2017

246

cited

Language as an Abstraction for Hierarchical Deep Reinforcement Learning

Neural Information Processing Systems · 2019

240

cited

Near-Optimal Representation Learning for Hierarchical Reinforcement Learning

International Conference on Learning Representations · 2018

225

cited

Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control

International Conference on Machine Learning · 2016

204

cited

Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning

Neural Information Processing Systems · 2017

172

cited
