We've raised $100M from Kleiner Perkins, Index Ventures, Lightspeed, and NVIDIA.
Today we're introducing Sonic-3 - the state-of-the-art model for realtime conversation.
What makes Sonic-3 great:
- Breakthrough naturalness - laughter and full emotional range
- Lightning fast - 90ms model latency, 190ms end-to-end (fastest on the market)
- Supports 42 languages
The difference: We build on State Space Models (SSMs) instead of Transformers.
Transformers (what everyone else uses) are like rewatching the entire conversation from the start before saying each new word. Every word requires reviewing everything.
SSMs (what Sonic-3 uses) are like humans, remembering the topic and vibe of the conversation. Enough context to speak naturally without replaying everything.
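For the technically curious, here is a toy sketch of that analogy. It is not Sonic-3's architecture or any real model: `attention_step`, `ssm_step`, `run`, and the `decay` value are hypothetical names and numbers chosen only for illustration. It counts how many past tokens each approach has to touch while generating a sequence.

```python
import math


def attention_step(history, query):
    """Toy single-head attention: reads every past token for each new one."""
    scores = [sum(q * k for q, k in zip(query, past)) for past in history]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # numerically stable softmax
    total = sum(weights)
    dim = len(query)
    return [
        sum(w * past[i] for w, past in zip(weights, history)) / total
        for i in range(dim)
    ]


def ssm_step(state, token, decay=0.9):
    """Toy linear recurrence: folds the new token into a fixed-size state."""
    return [decay * s + (1 - decay) * x for s, x in zip(state, token)]


def run(num_tokens, dim=4):
    """Return (tokens read by attention, tokens read by the SSM, SSM state size)."""
    tokens = [[math.sin(t + i) for i in range(dim)] for t in range(num_tokens)]
    history, attention_reads = [], 0
    state, ssm_reads = [0.0] * dim, 0
    for token in tokens:
        history.append(token)
        attention_step(history, token)
        attention_reads += len(history)  # grows with conversation length
        state = ssm_step(state, token)
        ssm_reads += 1  # constant work per token
    return attention_reads, ssm_reads, len(state)


if __name__ == "__main__":
    print(run(100))
```

In this toy, the attention side's total reads grow roughly with the square of the sequence length, while the recurrence touches each token once and keeps a state of fixed size. Production Transformers cache intermediate results, so the real gap depends on the implementation; the sketch only illustrates the scaling idea.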
My co-founder, Albert, and I pioneered the SSM paradigm at Stanford AI Lab (S4, Mamba), and it is now being adopted industry-wide.
Thousands of businesses like ServiceNow, Cresta, and Decagon power millions of conversations monthly with Sonic.
Try it for free or book a demo here: cartesia.ai/sonic
If you're qualified and we can't make your voice AI better than what you're using now, I'll donate $5K to your chosen charity.
As part of this launch, we cooked something super cool for you 👇🏻