Today, we're releasing Kimi K2 Thinking, our best open-source model.
What makes it different isn't just the benchmarks, though it achieves SOTA results on Humanity's Last Exam, BrowseComp, and other challenging tests. What matters is how it thinks.
It reminds me of the minds on our team: always asking the next question, refusing to settle for the first answer, following each thread until it leads somewhere true.
This is test-time scaling in its full form, giving models the space to think longer and act more deliberately.
🚀 Hello, Kimi K2 Thinking!
The Open-Source Thinking Agent Model is here.
🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200 – 300 sequential tool calls without human interference
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window
Built as a thinking agent, K2 Thinking marks our latest efforts in test-time scaling — scaling both thinking tokens and tool-calling turns.
K2 Thinking is now live on
kimi.com in chat mode, with full agentic mode coming soon. It is also accessible via API.
🔌 API is live:
platform.moonshot.ai
🔗 Tech blog:
moonshotai.github.io/Kimi-K2…
🔗 Weights & code:
huggingface.co/moonshotai