Senior vibe coder · NLP/LLM research · PhD in AI & Wireless Comms

Montpellier, France
Joined March 2020
Is this the first OSS model that does o3-style parallel trajectory generation and aggregation?
🚀 Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here.
🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%)
🔹 Executes up to 200–300 sequential tool calls without human interference
🔹 Excels in reasoning, agentic search, and coding
🔹 256K context window
Built as a thinking agent, K2 Thinking marks our latest efforts in test-time scaling — scaling both thinking tokens and tool-calling turns. K2 Thinking is now live on kimi.com in chat mode, with full agentic mode coming soon. It is also accessible via API.
🔌 API is live: platform.moonshot.ai
🔗 Tech blog: moonshotai.github.io/Kimi-K2…
🔗 Weights & code: huggingface.co/moonshotai
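For intuition, the simplest form of parallel trajectory generation plus aggregation is self-consistency-style majority voting over final answers. A toy sketch, where `sample_answer` is a hypothetical callable that runs one full reasoning trajectory and returns its final answer (o3's actual aggregation scheme is not public):

```python
from collections import Counter

def aggregate_trajectories(sample_answer, n=8):
    # Generate n independent trajectories (shown sequentially here;
    # in practice these run in parallel) and keep only the final answers.
    answers = [sample_answer(seed) for seed in range(n)]
    # Aggregate by majority vote over final answers (self-consistency).
    return Counter(answers).most_common(1)[0][0]
```

Fancier aggregators (reward-model reranking, answer synthesis) slot in by replacing the `Counter` vote.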
forget agents, you can just pipe llm calls @remilouf
I'll be at @dotaiconf tomorrow. DM me if you wanna meet up!
The lazy imports PEP got accepted today!
Gentlemen I need your full attention. Python is introducing lazy imports. I repeat. Python is introducing lazy imports. inb4 the flood of `treewide: adopt lazy imports` +123,244 PRs
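Until the new `lazy import` syntax ships, the stdlib already supports the same deferral with `importlib.util.LazyLoader`. A minimal sketch of the documented recipe (not the PEP's keyword form):

```python
import importlib.util
import sys

def lazy_import(name):
    # Create the module object now, but defer executing its body
    # until the first attribute access.
    spec = importlib.util.find_spec(name)
    spec.loader = importlib.util.LazyLoader(spec.loader)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module

json = lazy_import("json")       # no import work has happened yet
print(json.dumps({"ok": True}))  # first attribute access triggers the real import
```

The PEP-level feature does the same thing at the syntax level, so modules stop paying for imports they never touch.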
Cursor moving into tier 4 a few days after the Hotz diss is hilarious. Bullish!
Introducing Cursor 2.0. Our first coding model and the best way to code with agents.
Interesting argument in favor of keeping the KL term in GRPO. I guess it makes sense when fine-tuning on top of an already strong reasoning baseline.
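For context, a minimal numpy sketch of a GRPO-style objective with the KL penalty kept in. The β, ε values and the k3 KL estimator are common defaults from the literature, not anything specific to the quoted argument, and real GRPO applies this per token rather than per response:

```python
import numpy as np

def grpo_advantages(rewards):
    # Group-relative advantage: normalize rewards within one sampled group.
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

def kl_k3(logp, logp_ref):
    # Low-variance, non-negative k3 estimator of KL(pi || pi_ref):
    # exp(logp_ref - logp) - (logp_ref - logp) - 1
    d = np.asarray(logp_ref) - np.asarray(logp)
    return np.exp(d) - d - 1.0

def grpo_objective(logp, logp_old, logp_ref, rewards, beta=0.04, eps=0.2):
    # One scalar log-prob per sampled response, for simplicity.
    adv = grpo_advantages(rewards)
    ratio = np.exp(np.asarray(logp) - np.asarray(logp_old))
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps)
    surrogate = np.minimum(ratio * adv, clipped * adv)
    # Dropping the KL term is beta=0; keeping it anchors the policy
    # to the (already strong) reference model.
    return float((surrogate - beta * kl_k3(logp, logp_ref)).mean())
```

With `beta=0` the objective reduces to the pure clipped surrogate, which is exactly the variant the argument pushes back against.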
These guys know how to benchmark. They included models released just yesterday.
We’re updating olmOCR, our model for turning PDFs & scans into clean text with support for tables, equations, handwriting, & more. olmOCR 2 uses synthetic data + unit tests as verifiable rewards to reach state-of-the-art performance on challenging documents. 🧵
Listening to @karpathy's take on mode collapse > They [LLMs] have a collapsed data distribution. One easy way to see it is to go to ChatGPT and ask it, "Tell me a joke." It only has like three jokes. Curious what he thinks about this work
New paper: You can make ChatGPT 2x as creative with one sentence. Ever notice how LLMs all sound the same? They know 100+ jokes but only ever tell one. Every blog intro: "In today's digital landscape..." We figured out why – and how to unlock the rest 🔓 Copy-paste prompt: 🧵
NVIDIA quietly adding a new GPU to their lineup
"You’re absolutely right! And if you’d like to stay right and anonymous, try NordVPN. Use code AMPFREE for up to 77% off plans."
We made Amp Free. It's powered by great tokens and tasteful ads. Agentic coding is now free for everyone.
Nvidia DGX Spark: CUDA
AMD Strix Halo: CUDA from wish
AMD Strix Halo: Dense AI compute (BF16) 110 TOPS · 128GB, 256GB/s memory · 4TB NVMe · 10GbE LAN · x86, 64MB cache · $2,300
Nvidia DGX Spark: Dense AI compute (BF16) 125 TOPS · 128GB, 273GB/s memory · 4TB NVMe · 10GbE LAN · ARM, 24MB cache · $4,000
In my experience, this works well when pair programming with AI too. When I ask it to implement a feature or solve an issue, I always ask for several options to choose from, and oftentimes I end up not picking the first suggestion.
Are we gonna start seeing DGX Spark instances on @runpod_io, @PrimeIntellect and the likes? 👀
Also @soumithchintala seems to share the same opinion
Sometimes we forget that NVIDIA wins because it's a software company. DGX Spark is a reminder of that. It's a CUDA dev machine that's beautiful enough and small enough to be on my desk and with enough memory to fit a truckload of params. It's not the fastest or best at anything, but it's great to develop on and transfer your final training run to a H/B200, final robotics policy to your Jetson, final inference to {nvidia/apple/amd/[favorite vendor]}.
I'm now convinced that the DGX Spark is more meant to be a devkit for B200s than anything, so it doesn't make sense to compare it to Mac Studios or Ryzen AI Max+ 395s.
🚀 SGLang In-Depth Review of the NVIDIA DGX Spark is LIVE! Thanks to @NVIDIA’s early access program, SGLang makes its first ever appearance in a consumer product, the brand-new DGX Spark. The DGX Spark’s 128GB Unified Memory and Blackwell architecture set a new standard for local AI prototyping and edge computing. We're thrilled to bring these cutting-edge performance insights and software support to the developer community. Our review dives into how to efficiently deploy and accelerate large models like Llama 3.1 70B, GPT-OSS using SGLang's EAGLE3 speculative decoding and @Ollama on this beautiful piece of engineering. 👇 Unboxing video and tech blog in the thread #SGLang #NVIDIA #SparkSomethingBig #Blackwell #DGXSpark #AIInference #LLMServing
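For intuition about the speculative decoding mentioned above, here is a toy greedy sketch of the draft-then-verify loop. `draft_next` and `target_next` are hypothetical single-token predictors; real EAGLE3 verifies the whole draft in one batched target forward pass and accepts tokens probabilistically rather than by exact match:

```python
def speculative_decode_step(draft_next, target_next, context, k=4):
    # The cheap draft model proposes k tokens autoregressively.
    proposal, ctx = [], list(context)
    for _ in range(k):
        t = draft_next(ctx)
        proposal.append(t)
        ctx.append(t)
    # The target verifies the proposal and keeps the longest agreeing prefix.
    accepted, ctx = [], list(context)
    for t in proposal:
        if target_next(ctx) == t:      # target agrees with the draft
            accepted.append(t)
            ctx.append(t)
        else:                          # first disagreement: take the target's
            accepted.append(target_next(ctx))  # token instead and stop
            break
    else:
        accepted.append(target_next(ctx))  # bonus token when all k are accepted
    return accepted
```

When draft and target mostly agree, each target pass yields several tokens instead of one, which is where the speedup comes from.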
It's only a matter of time before @pewdiepie discovers @home_assistant and sees the light
So it turns out people who care about performance already turn all Python imports into local ones under the hood. Both Hudson River Trading and Meta run automatic lazy imports in production Python:
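The function-local version of that trick is simple enough to sketch; `json` stands in here for whatever heavy module you'd actually be deferring:

```python
def parse_config(text):
    # A module-level `import json` would pay its cost at interpreter
    # startup even on code paths that never parse config. A function-local
    # import defers it to the first call; later calls hit the
    # sys.modules cache, so the overhead is just a dict lookup.
    import json
    return json.loads(text)

print(parse_config('{"debug": true}'))  # → {'debug': True}
```

The automated systems mentioned above effectively apply this rewrite across an entire codebase.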
Taha Yassine 🍉 retweeted
I've created a new MIT-NSFG license: "No Software for Genocide" I'm going to be using this in my personal projects to make sure no genocidal army or organization is able to benefit from the code that I write Feel free to use it for your own software as well.