This video covers pretty much all of the foundational AI papers (rough code sketches for each one below the list):
1. Attention Is All You Need (2017) - Transformer architecture with self-attention for parallel training; the foundation of all modern LLMs
2. GPT-3 (2020) - Demonstrated in-context learning at scale; models can learn tasks from prompts alone, without fine-tuning
3. InstructGPT (2022) - RLHF alignment technique; smaller aligned models outperform larger unaligned ones
4. LoRA (2021) - Low-rank adapters for efficient fine-tuning; up to 10,000x fewer trainable parameters, making single-GPU fine-tuning feasible
5. RAG (2020) - Retrieval-Augmented Generation; models pull in external data to reduce hallucination and outdated knowledge
6. LLM-Based Agents Survey (2023) - Framework for AI agents with brain (planning), perception (context), and action (tools) components
7. Switch Transformers (2021) - Mixture of Experts architecture; trillion-parameter models with sparse activation for efficiency
8. DistilBERT (2019) - Knowledge distillation for compression; 40% smaller, 60% faster, 97% performance retention
9. LLM.int8() (2022) - Outlier-aware quantization; halves memory with 8-bit storage while preserving accuracy
10. Model Context Protocol (2024) - Anthropic's open standard for connecting models to tools, databases, and APIs seamlessly
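To see how little code the core Transformer idea takes, here's a minimal single-head scaled dot-product attention in NumPy. Real Transformers add multi-head projections, masking, and positional encodings; this is just the kernel of the paper:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention (Vaswani et al., 2017)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # (seq, seq): every token attends to every token
    return softmax(scores) @ V               # each output is a weighted mix of all value vectors

rng = np.random.default_rng(0)
seq_len, d = 4, 8
X = rng.normal(size=(seq_len, d))                       # toy token embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)              # (4, 8)
```

Because every pair of positions is scored in one matrix multiply, the whole sequence trains in parallel, unlike an RNN.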
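GPT-3's "learning from prompts" is just few-shot prompting: the examples live in the prompt itself. The translation example below is the one from the paper; the completion is what a large enough model produces, with zero gradient updates:

```python
# Few-shot prompting: the "training data" lives in the prompt itself.
prompt = """Translate English to French.
sea otter => loutre de mer
peppermint => menthe poivrée
cheese =>"""
# A sufficiently large model completes this with "fromage"
# without any fine-tuning. That is in-context learning.
```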
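RLHF is a three-step pipeline (supervised fine-tuning, reward model, PPO). The sketch below is only the reward model's pairwise preference loss, not the full pipeline:

```python
import numpy as np

def reward_model_loss(r_chosen, r_rejected):
    """Pairwise preference loss: -log sigmoid(r_chosen - r_rejected).
    Pushes the reward model to score the human-preferred response
    above the rejected one for the same prompt."""
    return -np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected))))

# Toy scores the reward model assigned to two responses for one prompt
print(reward_model_loss(r_chosen=1.2, r_rejected=0.3))  # small loss: ranking is right
print(reward_model_loss(r_chosen=0.1, r_rejected=0.9))  # larger loss: ranking is wrong
```

The trained reward model then scores the policy's outputs during the PPO stage.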
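The LoRA trick in a few lines: freeze the pretrained weight W and train only a low-rank update B @ A on the side. Shapes and init below are a rough sketch, not the paper's exact recipe:

```python
import numpy as np

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update B @ A (Hu et al., 2021).
    Only A and B, i.e. rank * (d_in + d_out) numbers, would get gradients."""
    def __init__(self, W, rank=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = W.shape
        self.W = W                                          # frozen pretrained weight
        self.A = rng.normal(scale=0.01, size=(rank, d_in))  # trainable
        self.B = np.zeros((d_out, rank))                    # trainable, zero-init
        self.scale = alpha / rank

    def __call__(self, x):
        return x @ self.W.T + self.scale * (x @ self.A.T @ self.B.T)

W = np.random.default_rng(1).normal(size=(64, 64))
layer = LoRALinear(W, rank=4)
print(layer(np.ones((2, 64))).shape)  # (2, 64); the LoRA path is zero at init
# Trainable params: 4 * (64 + 64) = 512 vs 64 * 64 = 4096 for full fine-tuning
```

Zero-initializing B means the adapted model starts out identical to the pretrained one.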
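RAG mechanics in miniature. The hash-based embedding is a stand-in for a real encoder, so the retrieval quality here is arbitrary; the plumbing (embed, retrieve by similarity, prepend to the prompt) is the point:

```python
import numpy as np

def embed(text, dim=64):
    """Stand-in embedding: a hash-seeded random unit vector.
    A real RAG stack uses a learned encoder here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

docs = [
    "The Eiffel Tower is in Paris.",
    "LoRA adds low-rank adapters to frozen weights.",
    "RAG retrieves documents and feeds them to the generator.",
]
doc_vecs = np.stack([embed(d) for d in docs])

def retrieve(query):
    sims = doc_vecs @ embed(query)      # cosine similarity (all vectors are unit length)
    return docs[int(np.argmax(sims))]   # with random embeddings the top hit is arbitrary

context = retrieve("What is RAG?")
prompt = f"Context: {context}\nQuestion: What is RAG?\nAnswer:"
print(prompt)  # this grounded prompt is what goes to the generator model
```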
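The survey's brain/perception/action split boils down to a loop like the one below. `call_llm` and the tool registry are stubs I made up, not any real API:

```python
def call_llm(prompt):
    # Placeholder "brain": first asks for a tool, then returns the answer.
    return "FINAL: 4" if "Observation: 4" in prompt else "ACTION: calculator 2+2"

TOOLS = {"calculator": lambda expr: str(eval(expr))}  # action: tools the agent may call

def run_agent(task, max_steps=5):
    observation = task                  # perception: the current context
    for _ in range(max_steps):
        decision = call_llm(f"Task: {task}\nObservation: {observation}")
        if decision.startswith("FINAL:"):
            return decision.removeprefix("FINAL:").strip()
        tool, arg = decision.removeprefix("ACTION:").strip().split(" ", 1)
        observation = TOOLS[tool](arg)  # act, then perceive the tool's result
    return observation

print(run_agent("What is 2+2?"))  # "4"
```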
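Switch routing in miniature: every token is sent to exactly one expert, so adding experts grows the parameter count but not the per-token compute:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def switch_layer(X, W_router, experts):
    """Top-1 Mixture-of-Experts routing (Fedus et al., 2021): each token
    activates exactly ONE expert, so compute per token stays constant
    no matter how many experts (parameters) you add."""
    probs = softmax(X @ W_router)           # (tokens, n_experts) router scores
    choice = probs.argmax(axis=-1)          # top-1 expert per token
    out = np.empty_like(X)
    for i, x in enumerate(X):
        e = choice[i]
        out[i] = probs[i, e] * (x @ experts[e])  # scaled by router prob, as in the paper
    return out

rng = np.random.default_rng(0)
d, n_experts, tokens = 8, 4, 5
X = rng.normal(size=(tokens, d))
W_router = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
print(switch_layer(X, W_router, experts).shape)  # (5, 8)
```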
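Distillation's core loss: train the small student on the teacher's softened output distribution rather than hard labels. DistilBERT combines this with other losses; the sketch below is just the soft-target part, and the temperature value is arbitrary:

```python
import numpy as np

def softmax(x, T=1.0):
    z = x / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """Soft-target cross-entropy (Hinton et al., 2015): the student matches
    the teacher's full distribution, not just the argmax label.
    Temperature T softens both distributions."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    return -np.sum(p_teacher * np.log(p_student))

t = np.array([3.0, 1.0, 0.2])   # teacher logits over 3 classes
print(distillation_loss(t, np.array([2.8, 1.1, 0.1])))  # low: student mimics teacher
print(distillation_loss(t, np.array([0.1, 0.1, 3.0])))  # high: distributions disagree
```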
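A toy version of the LLM.int8() storage idea: keep the rare "outlier" columns in fp16 and absmax-quantize everything else to int8. The paper's actual mixed-precision matmul decomposition is more involved than this:

```python
import numpy as np

def quantize_int8(X, outlier_threshold=6.0):
    """Columns with large outlier activations stay in fp16; the rest are
    stored as int8 with one absmax scale per column (Dettmers et al., 2022)."""
    outlier_cols = np.abs(X).max(axis=0) > outlier_threshold
    regular = X[:, ~outlier_cols]
    scales = np.abs(regular).max(axis=0) / 127.0    # per-column absmax scaling
    q = np.round(regular / scales).astype(np.int8)  # 8-bit storage: half of fp16's memory
    return q, scales, X[:, outlier_cols].astype(np.float16), outlier_cols

def dequantize(q, scales):
    return q.astype(np.float32) * scales

X = np.random.default_rng(0).normal(size=(4, 6)).astype(np.float32)
X[:, 2] *= 20                      # plant an outlier column, like the ones LLMs develop
q, scales, outliers, mask = quantize_int8(X)
print(np.abs(dequantize(q, scales) - X[:, ~mask]).max())  # tiny error on regular columns
```

Without the outlier split, those huge columns would blow up the scales and wreck precision for everything else; that's the paper's key observation.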
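MCP is JSON-RPC 2.0 under the hood. Here's roughly what a tool call looks like on the wire; the tool name and arguments are made up for illustration:

```python
import json

# Approximate shape of an MCP tool-call request (JSON-RPC 2.0).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "query_database",            # hypothetical tool exposed by some MCP server
        "arguments": {"sql": "SELECT 1"},    # hypothetical tool arguments
    },
}
print(json.dumps(request, indent=2))
# The same protocol covers listing tools, resources, and prompts,
# so any MCP client can talk to any MCP server.
```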
Just Google them and get everything on arXiv :)