Sotabase
Career
ML Research Engineer, Cerebras
Doctoral Assistant, Machine Learning and Optimization Laboratory, EPFL (École Polytechnique Fédérale de Lausanne)
MS Graduate, Stanford University
Autopilot Engineer, Tesla
Publications
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations (2024), 104 citations
Online Normalization for Training Neural Networks (2019), 59 citations
Pipelined Backpropagation at Scale: Training Large Models without Batches (2020), 35 citations
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks (2023), 33 citations
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training (2024), 15 citations
Multiplication-Free Transformer Training via Piecewise Affine Operations (2023), 8 citations
Weight Decay may matter more than muP for Learning Rate Transfer in Practice (2025), 7 citations
Adaptive Braking for Mitigating Gradient Delay (2020), 4 citations
Training Dynamics of the Cooldown Stage in Warmup-Stable-Decay Learning Rate Scheduler (2025), 4 citations
Analyzing & Eliminating Learning Rate Warmup in GPT Pre-Training, 2 citations
Ghost Noise for Regularizing Deep Neural Networks (2023), 2 citations
Memory Efficient Mixed-Precision Optimizers (2023), 2 citations
Rotational Optimizers: Simple & Robust DNN Training (2023)