I just tried Kimi K2 Thinking and the computer agent. Holy moly! This is another DeepSeek moment!
🚀 Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here.
🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200-300 sequential tool calls without human interference
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window
Built as a thinking agent, K2 Thinking marks our latest efforts in test-time scaling: scaling both thinking tokens and tool-calling turns. K2 Thinking is now live on kimi.com in chat mode, with full agentic mode coming soon. It is also accessible via API.
🔌 API is live: platform.moonshot.ai
🔗 Tech blog: moonshotai.github.io/Kimi-K2…
🔗 Weights & code: huggingface.co/moonshotai

Nov 7, 2025 · 10:58 PM UTC

Replying to @DeryaTR_
Did you try its writing skill? It's outstanding.
Replying to @DeryaTR_
how are you finding the comparison to gpt-5-pro?
Pro is the undisputed King so not even comparing it to that :)
Replying to @DeryaTR_
What happened after the last DeepSeek moment? People actually tried the model and realized it’s completely unusable because the hallucinations are so bad. The Chinese model creators are benchmark riggers. I’d say these Chinese models are 1:1 with like GPT-3 or so. They’re about 2 years behind. There’s a reason why after the hype you hear nothing about DeepSeek and despite it being the “ChatGPT open source killer” literally nobody uses it anywhere because it’s awful.
You are totally wrong.
Replying to @DeryaTR_
Thanks so much for the love! 😊 The Thinking model’s live in chat mode for now, but the OK Computer agent mode hasn’t fully powered up with K2 Thinking yet, & we’re on it! Can’t wait for you to try it soon!
Replying to @DeryaTR_
If K2's that good, where's the infrastructure to run it? Smart money's on the UAE.
Replying to @DeryaTR_
DeepSeek turned out not to be such an important moment after a few months. Let’s see if this is different.
Replying to @DeryaTR_
That’s interesting - but what exactly changed? Is there something fundamentally new in architecture or cognition, or is it just a faster version of the same linear loop? Because true evolution in AI starts not with speed - but with a shift in how it understands meaning.
Replying to @DeryaTR_
Impressive stuff. xAI and OpenAI will have to speed up their progress, for sure. Competition is good.
Replying to @DeryaTR_
It's getting hard to track the race.
Replying to @DeryaTR_
my timeline on dentro.de/ai is waiting for new entries :)
Replying to @DeryaTR_
Impressive numbers, but the real question is how reliable it is over those 200-300 sequential tool calls. Does it gracefully handle failures or unexpected outputs? We've seen agents push boundaries before, but scaling reasoning and tool usage consistently is still a tough nut to crack. Looking forward to seeing how Kimi K2 performs in real-world use cases, where edge cases often surface.
Replying to @DeryaTR_
There was no deepseek moment lol
Replying to @DeryaTR_
What did you use it for?
Replying to @DeryaTR_
@DeryaTR_ what are the top 3 models you use for scientific work and why?
Replying to @DeryaTR_
Interesting to see positive feedback based on actual use. I always feel a little sceptical of benchmark-based rankings.
Replying to @DeryaTR_
K2 Thinking gave me the best analysis I’ve ever seen from any AI chatbot (not coding). Seriously impressed.
Replying to @DeryaTR_
It’s my favourite daily model now. Once it’s more widely available it’ll likely supersede ChatGPT for me.
Replying to @DeryaTR_
At this pace, we'll have to switch to a new AI model every two weeks, thanks to China.
Replying to @DeryaTR_
Trying it now. Writing mode looks promising compared to Claude.
Replying to @DeryaTR_
I've added it to my council as the 5th member. So far so good in Java, Python, and ColdFusion.
Replying to @DeryaTR_
Did you try it self-hosted?
Replying to @DeryaTR_
You bet
Replying to @DeryaTR_
We are living in an era where an intelligent model like Kimi K2 is completely free and open source.
Replying to @DeryaTR_
Kimi K2 benchmarks 44.9% SOTA on HLE, 60.2% on BrowseComp. Executes 200-300 sequential tool calls showing agentic reasoning without interruption. Does reasoning overhead impact latency for real-time deployment scenarios?
Replying to @DeryaTR_
I'm certainly very impressed with it. I'd generally say it's nipping at the heels of top tier closed source models but at a fraction of the cost. For my use case, it's nearly perfect.
Replying to @DeryaTR_
definitely agree, that performance jump looks massive for agent tasks. curious if it holds up on longer chains now! btw great read on the other benchmarks too.
Replying to @DeryaTR_
These benchmarks show open source AI models can compete with proprietary ones
Replying to @DeryaTR_
Ask it about its guardrails. They are basically just “i follow the law”
Replying to @DeryaTR_
The AI race is wild. Every week there's a new model claiming to be the next big thing. At this point I'm just happy if it can debug my code.
Replying to @DeryaTR_
Kimi k2 cli