Sharing this thread, which implements the full Transformer architecture and attention mechanism from scratch:
- All Meta Llama models use Attention
- All OpenAI GPT models use Attention
- All Alibaba Qwen models use Attention
- All Google Gemma models use Attention
Let's learn how to implement it from scratch:
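To set the stage, here is a minimal sketch of the core operation all of these models share, scaled dot-product attention, written in PyTorch. Names, shapes, and the optional mask argument are illustrative assumptions, not taken from the thread itself:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: tensors of shape (batch, seq_len, d_k); hypothetical example shapes
    d_k = q.size(-1)
    # Similarity score between every query and every key, scaled by sqrt(d_k)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Block out masked positions (e.g. future tokens in a decoder)
        scores = scores.masked_fill(mask == 0, float("-inf"))
    # Softmax turns scores into weights that sum to 1 for each query
    weights = torch.softmax(scores, dim=-1)
    # Output is a weighted sum of the value vectors
    return weights @ v

# Toy usage: batch of 1, sequence of 4 tokens, 8-dim vectors
q = k = v = torch.randn(1, 4, 8)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 4, 8])
```

Everything in the models above (multi-head attention, KV caching, grouped-query variants) builds on this one function.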