What happens when you turn a designer into an interpretability researcher? They spend hours staring at feature activations in SVG code to see whether LLMs actually understand SVGs. It turns out: yes!
We found that semantic concepts transfer across text, ASCII, and SVG:
This understanding is context-dependent: the "eyes" feature lights up as soon as you give enough characters to form the top of the head!
Only a single _ and a / \ forehead are needed for the 2nd @ to activate the "eyes" feature above.
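If you're curious what that probe looks like, here's a rough sketch of the prefix-scan loop. The `eyes_feature_activation` helper is a made-up stand-in: in the real setup it would tokenize the prefix, run the model, apply the SAE encoder to the residual stream, and read one feature's activation at the last token. Here it's a hard-coded toy so the loop runs end to end.

```python
# Toy sketch of the "does the eyes feature fire yet?" probe.
# eyes_feature_activation is a hypothetical stand-in for the real pipeline
# (tokenize -> forward pass -> SAE encode -> read one feature at the last token).

ASCII_FACE = (
    "  ___  \n"
    " / @ @ \\\n"
    " \\  -  /\n"
    "  ---  \n"
)

def eyes_feature_activation(prefix: str) -> float:
    # Placeholder heuristic that mimics the observed behavior: the feature
    # only fires on an "@" once the top of the head ("_" and "/") exists above it.
    has_head = "_" in prefix and "/" in prefix
    return 1.0 if has_head and prefix.endswith("@") else 0.0

# Grow the drawing one character at a time and watch when the feature turns on.
for i in range(1, len(ASCII_FACE) + 1):
    prefix = ASCII_FACE[:i]
    if eyes_feature_activation(prefix) > 0:
        print(f"'eyes' feature fires after {i} chars, ending in {prefix[-4:]!r}")
```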
It gets richer as the models improve. Compared to Haiku 3.5, we find a trove of interesting features in the Sonnet 4.5 base model: various animal body parts, "motor neuron"-like features such as "say smile" that activate just ahead of the ASCII mouth, and features that perceive "size"!
And yes, we checked that this works for human-made SVGs too! I drew this (imo cuter) dog and found many of the same features as on the Claude-generated dog above (which honestly looks more like a bear!)
The coolest part is that, just like Golden Gate Claude, we can steer with many of these features, transforming a smiley face into a wrinkly face, an owl, or an eyeball!
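For the curious, "steering" here means the same flavor of intervention as Golden Gate Claude: add a scaled copy of a feature's SAE decoder direction into the residual stream while the model generates. A minimal sketch with dummy tensors (names and shapes are illustrative, and the real thing applies this inside the model via a forward hook):

```python
import torch

def steer_residual(resid: torch.Tensor, w_dec: torch.Tensor, alpha: float) -> torch.Tensor:
    """Add alpha * (unit-norm feature decoder direction) to every token position."""
    return resid + alpha * w_dec / w_dec.norm()

# Dummy shapes: batch=1, seq=16, d_model=512. In practice w_dec would be the
# SAE decoder row for, say, a "wrinkly face" or "owl" feature, and alpha
# controls how hard you push the model toward that concept.
resid = torch.randn(1, 16, 512)
w_dec = torch.randn(512)
steered = steer_residual(resid, w_dec, alpha=8.0)
print(steered.shape)  # torch.Size([1, 16, 512])
```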
Read the full writeup: transformer-circuits.pub/202…
While it's not a full paper, I'm proud to have been a major contributor. Coming from a non-academic background, it's a personal milestone to investigate how LLMs visually reason alongside talented researchers.
thanks to @purvigoel3, @ikauvar, @thebasepoint, @adamsjermyn for all your work and support on getting this out the door. i learned so much and it's changed how i think about research rigor and communication