Wow, language models can talk without words. A new framework, Cache-to-Cache (C2C), lets multiple LLMs communicate directly through their KV-caches instead of text, transferring deep semantics without token-by-token generation. It fuses cache representations via a neural projector and gating mechanism for efficient inter-model exchange. The payoff: up to 10% higher accuracy, 3–5% gains over text-based communication, and 2× faster responses.

Cache-to-Cache: Direct Semantic Communication Between Large Language Models
Code: github.com/thu-nics/C2C
Project: github.com/thu-nics
Paper: arxiv.org/abs/2510.03215
Our report: mp.weixin.qq.com/s/tjDq99VrE…

📬 #PapersAccepted by Jiqizhixin
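If you're wondering what "projector + gating" fusion might look like in practice, here is a minimal PyTorch sketch. It is not the authors' implementation: the class name CacheFuser, the two-layer MLP projector, the hidden sizes, and the sigmoid gate are all my assumptions; see the repo above for the real architecture. The idea it illustrates is the one the post describes: project the sharer model's KV states into the receiver's representation space, then blend them with the receiver's own cache through a learned gate.

```python
# Hypothetical sketch of C2C-style KV-cache fusion (not the official code).
import torch
import torch.nn as nn

class CacheFuser(nn.Module):
    def __init__(self, sharer_dim: int, receiver_dim: int):
        super().__init__()
        # Neural projector: maps sharer KV space -> receiver KV space.
        self.projector = nn.Sequential(
            nn.Linear(sharer_dim, receiver_dim),
            nn.SiLU(),
            nn.Linear(receiver_dim, receiver_dim),
        )
        # Gate conditioned on both caches; decides how much to inject.
        self.gate = nn.Linear(receiver_dim * 2, receiver_dim)

    def forward(self, sharer_kv: torch.Tensor, receiver_kv: torch.Tensor) -> torch.Tensor:
        # sharer_kv: (batch, seq, sharer_dim); receiver_kv: (batch, seq, receiver_dim)
        projected = self.projector(sharer_kv)
        g = torch.sigmoid(self.gate(torch.cat([projected, receiver_kv], dim=-1)))
        # Gated blend: g chooses between injected semantics and the original cache.
        return g * projected + (1.0 - g) * receiver_kv

# Toy usage: fuse one layer's key states from two models with different widths.
fuser = CacheFuser(sharer_dim=2048, receiver_dim=4096)
sharer_keys = torch.randn(1, 16, 2048)
receiver_keys = torch.randn(1, 16, 4096)
fused_keys = fuser(sharer_keys, receiver_keys)
print(fused_keys.shape)  # torch.Size([1, 16, 4096])
```

The gated blend is what lets the receiver fall back to its own cache when the sharer's projected semantics aren't useful, which is presumably why a gate is used rather than simple addition.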
Replying to @lu_sichu
wild to think LLMs are out here telepathically gossiping while we still type like cavemen

Nov 4, 2025 · 9:40 AM UTC
