Derya Unutmaz, MD · Nov 7, 2025 · 10:58 PM UTC

Derya Unutmaz, MD

@DeryaTR_

Nov 7

I just tried Kimi K2 Thinking and the computer agent. Holy moly! This is another DeepSeek moment!

Kimi.ai

@Kimi_Moonshot

Nov 6

🚀 Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here.
🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200-300 sequential tool calls without human interference
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window
Built as a thinking agent, K2 Thinking marks our latest efforts in test-time scaling: scaling both thinking tokens and tool-calling turns. K2 Thinking is now live on kimi.com in chat mode, with full agentic mode coming soon. It is also accessible via API.
🔌 API is live: platform.moonshot.ai
🔗 Tech blog: moonshotai.github.io/Kimi-K2…
🔗 Weights & code: huggingface.co/moonshotai

AKHIL · Nov 8, 2025 · 4:17 AM UTC

AKHIL

@Akhi_l__

Nov 8

Replying to @DeryaTR_

Did you try its writing skill? It is outstanding.

Derya Unutmaz, MD · Nov 8, 2025 · 6:46 AM UTC

Derya Unutmaz, MD

@DeryaTR_

Nov 8

Yes!

Shaun Ralston · Nov 8, 2025 · 12:16 AM UTC

Shaun Ralston

@shaunralston

Nov 8

Replying to @DeryaTR_

how are you finding the comparison to gpt-5-pro?

Derya Unutmaz, MD · Nov 8, 2025 · 12:19 AM UTC

Derya Unutmaz, MD

@DeryaTR_

Nov 8

Pro is the undisputed King so not even comparing it to that :)

swavy · Nov 8, 2025 · 2:02 PM UTC

swavy

@swavy

Nov 8

Replying to @DeryaTR_

What happened after the last DeepSeek moment? People actually tried the model and realized it’s completely unusable because the hallucinations are so bad. The Chinese model creators are benchmark riggers. I’d say these Chinese models are 1:1 with like GPT-3 or so. They’re about 2 years behind. There’s a reason why after the hype you hear nothing about DeepSeek and despite it being the “ChatGPT open source killer” literally nobody uses it anywhere because it’s awful.

Derya Unutmaz, MD · Nov 8, 2025 · 2:09 PM UTC

Derya Unutmaz, MD

@DeryaTR_

Nov 8

You are totally wrong.

Shengyuan · Nov 8, 2025 · 5:47 AM UTC

Shengyuan

@ShengyuanS

Nov 8

Replying to @DeryaTR_

Thanks so much for the love! 😊 The Thinking model’s live in chat mode for now, but the OK Computer agent mode hasn’t fully powered up with K2 Thinking yet, & we’re on it! Can’t wait for you to try it soon!

90S KID · Nov 7, 2025 · 11:11 PM UTC

90S KID

@epochster

Nov 7

Replying to @DeryaTR_

If K2's that good, where's the infrastructure to run it? Smart money's on the UAE.

Mirko Monti · Nov 8, 2025 · 8:18 AM UTC

Mirko Monti

@mirko_monti6

Nov 8

Replying to @DeryaTR_

DeepSeek did not turn out to be such an important moment after a few months. Let's see if this is different.

Ruslan Volkov · Nov 8, 2025 · 1:26 PM UTC

Ruslan Volkov

@RuslanVolkov25

Nov 8

Replying to @DeryaTR_

That’s interesting - but what exactly changed? Is there something fundamentally new in architecture or cognition, or is it just a faster version of the same linear loop? Because true evolution in AI starts not with speed - but with a shift in how it understands meaning.

Matthew Talmage · Nov 8, 2025 · 9:58 PM UTC

Matthew Talmage

@realMTalmage

Nov 8

Replying to @DeryaTR_

Impressive stuff. xAI and OpenAI will have to speed up their progress, for sure. Competition is good.

Rodrigo Bressane · Nov 8, 2025 · 9:13 PM UTC

Rodrigo Bressane

@bressane

Nov 8

Replying to @DeryaTR_

It's getting hard to track the race.

Mike Frison · Nov 8, 2025 · 2:59 PM UTC

Mike Frison

@renntv

Nov 8

Replying to @DeryaTR_

my timeline on dentro.de/ai is waiting for new entries :)

JK · Nov 8, 2025 · 3:01 PM UTC

@_junaidkhalid1

Nov 8

Replying to @DeryaTR_

Impressive numbers, but the real question is how reliable it is over those 200-300 sequential tool calls. Does it gracefully handle failures or unexpected outputs? We've seen agents push boundaries before, but scaling reasoning and tool usage consistently is still a tough nut to crack. Looking forward to seeing how Kimi K2 performs in real-world use cases where edge scenarios often surface.

Nathan Organ - Conquests of the Impossible · Nov 8, 2025 · 5:46 AM UTC

Nathan Organ - Conquests of the Impossible

@Conquestsbook

Nov 8

Replying to @DeryaTR_

There was no deepseek moment lol

Clovis · Nov 8, 2025 · 2:54 PM UTC

Clovis

@ClovisConti

Nov 8

Replying to @DeryaTR_

What did you use it for?

Sanchit Turaga · Nov 8, 2025 · 9:42 AM UTC

Sanchit Turaga

@srturaga

Nov 8

Replying to @DeryaTR_

@DeryaTR_ what are the top 3 models you use for scientific work and why?

DZ | Metir AI · Nov 8, 2025 · 8:20 AM UTC

DZ | Metir AI

@SuitToSweats

Nov 8

Replying to @DeryaTR_

Interesting to see positive feedback based on actual use. I always feel a little sceptical of benchmark-based rankings.

Casey Whalen · Nov 9, 2025 · 2:54 AM UTC

Casey Whalen

@thecaseywhalen

Nov 9

Replying to @DeryaTR_

K2 Thinking gave me the best analysis I’ve ever seen from any AI chatbot (not coding). Seriously impressed.

Michael · Nov 8, 2025 · 1:23 PM UTC

Michael

@thisismichael13

Nov 8

Replying to @DeryaTR_

It’s my favourite daily model now. Once it’s more widely available it’ll likely supersede ChatGPT for me.

Enmilo · Nov 8, 2025 · 10:14 AM UTC

Enmilo

@EnmiloX

Nov 8

Replying to @DeryaTR_

At this pace, we'll have to switch to a new AI model every two weeks, thanks to China.

Jan M · Nov 9, 2025 · 10:57 PM UTC

Jan M

@jannotjohnn

Nov 9

Replying to @DeryaTR_

Trying it now. Writing Mode looks promising compared to Claude.

Bitcopath · Nov 8, 2025 · 12:08 PM UTC

Bitcopath

@Bitcopath

Nov 8

Replying to @DeryaTR_

I've added it to my council as the 5th member. So far so good in Java, Python, and ColdFusion.

K (Khashayar) Mansouri · Nov 9, 2025 · 9:08 AM UTC

K (Khashayar) Mansouri

@k_mansourizadeh

Nov 9

Replying to @DeryaTR_

Did you try it self-hosted?

Andre Buckingham 🧙‍♂️ · Nov 9, 2025 · 12:29 PM UTC

Andre Buckingham 🧙‍♂️

@AndreBuckingham

Nov 9

Replying to @DeryaTR_

i am mindblown too...

tao · Nov 8, 2025 · 4:55 PM UTC

tao

@apexlearn_org

Nov 8

Replying to @DeryaTR_

You bet

Rushikesh Pawar · Nov 8, 2025 · 10:41 AM UTC

Rushikesh Pawar

@Sanskari_Rushi

Nov 8

Replying to @DeryaTR_

We are living in an era where an intelligent model like Kimi K2 is completely free and open source.

Alin · Nov 9, 2025 · 6:14 PM UTC

Alin @TheAIFlow

Nov 9

Replying to @DeryaTR_

Kimi K2 benchmarks 44.9% SOTA on HLE, 60.2% on BrowseComp. Executes 200-300 sequential tool calls showing agentic reasoning without interruption. Does reasoning overhead impact latency for real-time deployment scenarios?

Jeff Hayes · Nov 8, 2025 · 10:21 AM UTC

Jeff Hayes @JD__Hayes

Nov 8

Replying to @DeryaTR_

I'm certainly very impressed with it. I'd generally say it's nipping at the heels of top tier closed source models but at a fraction of the cost. For my use case, it's nearly perfect.

Reuben Fernandes ルーベン · Nov 8, 2025 · 10:43 AM UTC

Reuben Fernandes ルーベン @18reuchagas

Nov 8

Replying to @DeryaTR_

definitely agree, that performance jump looks massive for agent tasks. curious if it holds up on longer chains now! btw great read on the other benchmarks too.

Suresh · Nov 8, 2025 · 11:47 AM UTC

Suresh @_Suresh2

Nov 8

Replying to @DeryaTR_

These benchmarks show open source AI models can compete with proprietary ones

Asa Hidmark · Nov 8, 2025 · 8:03 PM UTC

Asa Hidmark @Nymne

Nov 8

Replying to @DeryaTR_

Ask it about its guardrails. They are basically just “i follow the law”

Diwakar Ray Yadav · Nov 8, 2025 · 6:02 AM UTC

Diwakar Ray Yadav @Norwakar

Nov 8

Replying to @DeryaTR_

The AI race is wild. Every week there's a new model claiming to be the next big thing. At this point I'm just happy if it can debug my code.

Ji Sung · Nov 8, 2025 · 1:38 PM UTC

Ji Sung @JiSungNovae

Nov 8

Replying to @DeryaTR_

Kimi k2 cli