An exciting new course: Fine-tuning and Reinforcement Learning for LLMs: Intro to Post-training, taught by @realSharonZhou, VP of AI at @AMD. Available now at DeepLearning.AI.
Post-training is the key technique frontier labs use to turn a base LLM (a model trained on massive amounts of unlabeled text to predict the next word/token) into a helpful, reliable assistant that can follow instructions. I've also seen many applications where post-training is what turns a demo that works only 80% of the time into a system that performs reliably and consistently. This course will teach you the most important post-training techniques!
In this 5-module course, Sharon walks you through the complete post-training pipeline: supervised fine-tuning, reward modeling, RLHF, and reinforcement learning algorithms like PPO and GRPO. You'll also learn to use LoRA for efficient training and to design evals that catch problems before and after deployment.
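For a flavor of what GRPO involves: its core idea is to sample a group of responses per prompt, score each with a reward model, and normalize each reward against the group's mean and standard deviation, which removes the need for a separate learned value (critic) model. A minimal sketch of that advantage computation in NumPy (reward values are illustrative, not from the course):

```python
import numpy as np

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages as used in GRPO: normalize each
    sampled response's reward against the group's mean and std."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# One prompt, four sampled completions scored by a reward model
# (made-up reward values for illustration).
adv = grpo_advantages([0.2, 0.9, 0.4, 0.5])
print(adv.round(3))  # above-average completions get positive advantage
```

Completions scoring above the group mean get a positive advantage (their tokens are reinforced); those below get a negative one.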
Skills you'll gain:
- Apply supervised fine-tuning and reinforcement learning (RLHF, PPO, GRPO) to align models to desired behaviors
- Use LoRA for efficient fine-tuning without retraining entire models
- Prepare datasets and generate synthetic data for post-training
- Understand how to operate LLM production pipelines, with go/no-go decision points and feedback loops
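To illustrate why LoRA makes fine-tuning efficient: instead of updating a full d×d weight matrix, it freezes the pretrained weights and trains two small matrices whose product is a low-rank additive update. A minimal NumPy sketch of the idea (sizes and names are illustrative assumptions, not code from the course):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4  # hidden size and LoRA rank (r much smaller than d)

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-init

def lora_forward(x):
    # Base output plus low-rank update; only A and B receive gradients.
    return x @ W.T + x @ (B @ A).T

x = rng.normal(size=(8, d))
full_params = d * d        # parameters a full fine-tune would update
lora_params = 2 * d * r    # parameters LoRA updates instead
print(f"trainable params: {lora_params} vs full fine-tune: {full_params}")
```

Because B starts at zero, the model's behavior is unchanged at initialization, and the update B @ A can later be merged back into W for inference at no extra cost.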
These advanced methods aren’t limited to frontier AI labs anymore, and you can now use them in your own applications.
Learn here:
deeplearning.ai/courses/fine…