A very nice read. Fixed chunks make ultra-long reasoning feasible. Very nice visualizations too! Congrats to the authors!
Introducing linear scaling of reasoning: ๐“๐ก๐ž ๐Œ๐š๐ซ๐ค๐จ๐ฏ๐ข๐š๐ง ๐“๐ก๐ข๐ง๐ค๐ž๐ซ Reformulate RL so thinking scales ๐Ž(๐ง) ๐œ๐จ๐ฆ๐ฉ๐ฎ๐ญ๐ž, not O(n^2), with O(1) ๐ฆ๐ž๐ฆ๐จ๐ซ๐ฒ, architecture-agnostic. Train R1-1.5B into a markovian thinker with 96K thought budget, ~2X accuracy ๐Ÿงต

Oct 10, 2025 ยท 12:28 PM UTC

1
19