Happy to share a new paper! Designing model behavior is hard -- desirable values often pull in opposite directions. Jifan's approach systematically generates scenarios where values conflict, helping us see where specs lack coverage and how different models balance tradeoffs.
New research paper with Anthropic and Thinking Machines
AI companies use model specifications to define desirable behaviors during training. Do model specs clearly express what we want models to do? And do different frontier models have different personalities?
We generated thousands of scenarios to find out. 🧵