December 1, 2024
ZeRO to Hero: How to Build Your Own Adaptive Learning-Rate Optimizers
Learn how to implement popular adaptive optimizers such as Adam, AdamW, Adagrad, and RMSProp.
A collection of the posts I've written in this series.
November 24, 2024
In this blog, we begin our journey of rebuilding FSDP from scratch by re-implementing popular variants of the Stochastic Gradient Descent algorithm, building intuition for how they work and what their limitations are.