Training LLMs end to end is hard. Very excited to share our new blog (book?) that covers the full pipeline: pre-training, post-training, and infra. 200+ pages of what worked, what didn’t, and how to make it run reliably.
huggingface.co/spaces/Huggin…
Oct 30, 2025 · 4:13 PM UTC
pretraining thread by @LoubnaBenAllal1 (who led this project 🫶)
After ~4 years building SOTA models & datasets, we're sharing everything we learned in ⚡The Smol Training Playbook
We cover the full LLM cycle: designing ablations, choosing an architecture, curating data, post-training, and building solid infrastructure.
We'll help you navigate the messy training reality that LLM papers don't cover. Chapter highlights in the 🧵
post training thread by @_lewtun
We've just published the Smol Training Playbook: a distillation of hard-earned knowledge to share exactly what it takes to train SOTA LLMs ⚡️
Featuring our protagonist SmolLM3, we cover:
🧭 Strategy on whether to train your own LLM and burn all your VC money
🪨 Pretraining, aka turning a mountain of text into a fancy auto-completer
🗿 How to sculpt base models with post-training alchemy
🛠️ The underlying infra and how to debug your way out of NCCL purgatory (a taste of that below)
Highlights from the post-training chapter in the thread 👇
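To give a flavor of the NCCL debugging the infra chapter deals with, here is a minimal sketch (not from the playbook itself): it turns on NCCL's built-in debug logging before running a tiny collective, which is typically the first move when a multi-GPU job hangs. The chosen env-var values and the all_reduce smoke test are illustrative assumptions, not the book's code.

```python
# Minimal sketch, assuming a PyTorch + NCCL setup launched via torchrun.
import os

# Real NCCL environment variables; the values here are illustrative.
# Must be set before NCCL initializes (i.e., before init_process_group).
os.environ["NCCL_DEBUG"] = "INFO"               # log NCCL's internal decisions
os.environ["NCCL_DEBUG_SUBSYS"] = "INIT,COLL"   # limit noise to init + collectives

import torch
import torch.distributed as dist


def main():
    # torchrun sets RANK, WORLD_SIZE, and LOCAL_RANK for each process.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

    # Tiny smoke test: if this hangs, the NCCL logs usually reveal
    # which rank or communicator stalled.
    x = torch.ones(1, device="cuda")
    dist.all_reduce(x)
    print(f"rank {dist.get_rank()}: all_reduce ok, x={x.item()}")

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

Run it with e.g. `torchrun --nproc_per_node=2 smoke_test.py` (the filename is hypothetical); a clean run prints one line per rank, while a hang plus the NCCL INFO logs is your starting point for debugging.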
infra thread by @Nouamanetazi