Multi-head attention in LLMs, visually explained:
Replying to @akshay_pachaar
Great! Sharing this thread, which implements the full Transformer architecture and attention from scratch:
- All Meta Llama models use Attention
- All OpenAI GPT models use Attention
- All Alibaba Qwen models use Attention
- All Google Gemma models use Attention

Let's learn how to implement it from scratch:
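Here is a minimal sketch of what a from-scratch multi-head self-attention module might look like, assuming PyTorch; the class name, dimensions, and layer names are illustrative, not the thread's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadAttention(nn.Module):
    """Hypothetical minimal multi-head self-attention sketch."""

    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
        self.num_heads = num_heads
        self.head_dim = d_model // num_heads
        # One projection each for queries, keys, values, plus an output projection
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        batch, seq_len, d_model = x.shape

        # Project, then split into heads: (batch, num_heads, seq_len, head_dim)
        def split_heads(t):
            return t.view(batch, seq_len, self.num_heads, self.head_dim).transpose(1, 2)

        q = split_heads(self.q_proj(x))
        k = split_heads(self.k_proj(x))
        v = split_heads(self.v_proj(x))

        # Scaled dot-product attention, computed independently per head
        scores = q @ k.transpose(-2, -1) / (self.head_dim ** 0.5)
        weights = F.softmax(scores, dim=-1)
        context = weights @ v  # (batch, num_heads, seq_len, head_dim)

        # Merge heads back together and apply the output projection
        context = context.transpose(1, 2).contiguous().view(batch, seq_len, d_model)
        return self.out_proj(context)

# Usage: 2 sequences of 10 tokens with 64-dim embeddings, split across 8 heads
x = torch.randn(2, 10, 64)
attn = MultiHeadAttention(d_model=64, num_heads=8)
print(attn(x).shape)  # torch.Size([2, 10, 64])
```

Causal masking and dropout are left out to keep the sketch short; decoder-only LLMs like Llama, GPT, Qwen, and Gemma would additionally mask out future positions before the softmax.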

Nov 7, 2025 · 12:37 PM UTC
