I built my own ChatGPT from scratch, and you can too.
Karpathy's nanochat is a single, clean, minimal, hackable codebase for building a modern LLM.
By setting this up, you'll learn how to:
> train a tokenizer from the ground up
> pre-train the model to master next-word prediction
> mid-train it to hold conversations
> fine-tune it (SFT) on high-quality dialogue datasets
> evaluate and log every step of the process
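For intuition on the pre-training step, here's a toy sketch of next-word prediction in plain Python. It's a simple bigram counter, not nanochat's actual code (nanochat trains a Transformer); the function names and the sample sentence are made up for illustration.

```python
import math
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_token_probs(counts, prev):
    """Turn the follow-up counts for `prev` into a probability distribution."""
    total = sum(counts[prev].values())
    return {tok: c / total for tok, c in counts[prev].items()}

def avg_loss(counts, tokens):
    """Average negative log-likelihood of each actual next token."""
    losses = []
    for prev, nxt in zip(tokens, tokens[1:]):
        p = next_token_probs(counts, prev).get(nxt, 0.0)
        losses.append(-math.log(p) if p > 0 else float("inf"))
    return sum(losses) / len(losses)

# Hypothetical toy corpus; whitespace splitting stands in for a real tokenizer.
text = "the cat sat on the mat and the cat ran"
tokens = text.split()
model = train_bigram(tokens)
probs = next_token_probs(model, "the")
print(max(probs, key=probs.get))  # cat
```

Real pre-training minimizes the same kind of loss (average negative log-likelihood of the next token), just with a neural network instead of a lookup table.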
I've done this on a LightningAI studio, and you can reproduce everything with a single click (zero setup required).
link in the next tweet!