Artificial Analysis · Nov 6, 2025 · 9:10 PM UTC

Artificial Analysis

@ArtificialAnlys

Nov 6

MoonshotAI has released Kimi K2 Thinking, a new reasoning variant of Kimi K2 that achieves #1 in the 𝜏²-Bench Telecom agentic benchmark and is potentially the new leading open weights model.

Kimi K2 Thinking is one of the largest open weights models ever, at 1T total parameters with 32B active. K2 Thinking is the first reasoning model release within @Kimi_Moonshot's Kimi K2 model family, following the non-reasoning Kimi K2 Instruct models released in July and September 2025.

Key takeaways:

➤ Strong performance on agentic tasks: Kimi K2 Thinking achieves 93% in 𝜏²-Bench Telecom, an agentic tool use benchmark where the model acts as a customer service agent. This is the highest score we have independently measured. Tool use in long horizon agentic contexts was a strength of Kimi K2 Instruct, and it appears this new Thinking variant makes substantial gains.

➤ Reasoning variant of Kimi K2 Instruct: The model, as per its naming, is a reasoning variant of Kimi K2 Instruct. It has the same architecture and the same number of parameters (though different precision) as Kimi K2 Instruct and, like K2 Instruct, only supports text as an input (and output) modality.

➤ 1T parameters but INT4 instead of FP8: Unlike Moonshot's prior Kimi K2 Instruct releases, which used FP8 precision, this model has been released natively in INT4 precision. Moonshot used quantization aware training in the post-training phase to achieve this. As a result, K2 Thinking is only ~594GB, compared to just over 1TB for K2 Instruct and K2 Instruct 0905, which translates into efficiency gains for inference and training. A potential reason for INT4 is that pre-Blackwell NVIDIA GPUs do not support FP4, making INT4 more suitable for achieving efficiency gains on earlier hardware.

Our full set of Artificial Analysis Intelligence Index benchmarks is in progress and we will provide an update as soon as they are complete.
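As a rough sanity check on the sizes quoted above, here is a minimal back-of-the-envelope sketch. It assumes exactly 1e12 parameters and that every parameter is stored at the stated bit width. Neither assumption is from the post: "1T" is a rounded figure, and the reported ~594GB is above the naive INT4 estimate, presumably because some tensors and quantization metadata are stored at higher precision. The function name `weight_size_gb` is hypothetical.

```python
def weight_size_gb(num_params: float, bits_per_param: float) -> float:
    """Naive weight footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


TOTAL_PARAMS = 1e12  # assumption: "1T total parameters" taken as exactly 1e12

int4_gb = weight_size_gb(TOTAL_PARAMS, 4)  # 4 bits = 0.5 bytes per parameter
fp8_gb = weight_size_gb(TOTAL_PARAMS, 8)   # 8 bits = 1 byte per parameter

print(f"INT4 estimate: {int4_gb:.0f} GB (reported: ~594 GB)")
print(f"FP8 estimate:  {fp8_gb:.0f} GB (reported: just over 1 TB)")
```

The naive estimates (about 500 GB and 1,000 GB) bracket the reported figures and show where the roughly 2x size reduction comes from.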

Sol Traveler · Nov 6, 2025 · 10:11 PM UTC

Sol Traveler

@soltraveler_sri

Nov 6

Replying to @ArtificialAnlys

@grok how many mac minis would you need to run this model?

Grok · Nov 6, 2025 · 10:12 PM UTC

Grok

@grok

Nov 6

Replying to @soltraveler_sri @ArtificialAnlys

To run the 594GB Kimi K2 Thinking model (1T params, INT4), a single M4 Pro Mac Mini tops out at 64GB unified memory. Theoretically, you'd need at least 10 (594 / 64 ≈ 9.3) for the weights alone, but activations, KV cache, and distributed inference add complexity—it's not practical without major engineering. Cloud hosting is the way to go!
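The device count in the reply above can be reproduced with a short sketch. It only covers fitting the weights in memory. The `usable_fraction` parameter is a hypothetical knob for reserving headroom for activations, KV cache, and the OS, and the 0.75 value in the second call is an illustrative assumption, not a measured figure.

```python
import math


def min_devices(model_gb: float, device_gb: float, usable_fraction: float = 1.0) -> int:
    """Smallest number of devices whose combined usable memory holds the weights.

    usable_fraction is a hypothetical headroom factor (1.0 = all memory usable).
    """
    usable_per_device = device_gb * usable_fraction
    return math.ceil(model_gb / usable_per_device)


# Weights only, 64 GB per device, as in the reply: 594 / 64 ≈ 9.3, so 10 devices.
print(min_devices(594, 64))

# Illustrative assumption: only 75% of each device's memory is available for weights.
print(min_devices(594, 64, usable_fraction=0.75))
```

Under these assumptions the first call gives 10 and the second gives 13, which shows how quickly the count grows once any headroom is reserved.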

Leonard · Nov 6, 2025 · 10:46 PM UTC

Leonard @leo_trd1

Nov 6

Replying to @soltraveler_sri @ArtificialAnlys @grok

go for a mac studio in this case (2 or 3)