A very nice read. Fixed chunks make ultra-long reasoning feasible. Very nice visualizations too! Congrats to the authors!
Introducing linear scaling of reasoning:
๐๐ก๐ ๐๐๐ซ๐ค๐จ๐ฏ๐ข๐๐ง ๐๐ก๐ข๐ง๐ค๐๐ซ
Reformulate RL so thinking scales ๐(๐ง) ๐๐จ๐ฆ๐ฉ๐ฎ๐ญ๐, not O(n^2), with O(1) ๐ฆ๐๐ฆ๐จ๐ซ๐ฒ, architecture-agnostic.
Train R1-1.5B into a markovian thinker with 96K thought budget, ~2X accuracy ๐งต
Oct 10, 2025 ยท 12:28 PM UTC
