Introducing RSA (Recursive Self-Aggregation): unlocking deep thinking with test-time scaling
🔥 Qwen3-4B + RSA > DeepSeek-R1
📈 Gains across Qwen, Nemo, GPT-OSS
📊 Benchmarks: Math • Reasoning Gym • Code
⚡ Aggregation-aware RL lets Qwen3-4B surpass o3-mini
RSA (Recursive Self-Aggregation) = Sequential refinement 🔁 + Parallel exploration ⚡
→ Unified into a hybrid evolutionary loop for deeper reasoning.
📄 Paper + website: rsa-llm.github.io/
🧵 Details in the thread
RSA is simple 👇
1️⃣ Generate a population of N reasoning chains in parallel
2️⃣ Subsample into N subsets of K chains
3️⃣ Prompt the model to aggregate each subset → a new, improved population of CoTs
4️⃣ Repeat for T loops
That's the whole algorithm: Recursive Self-Aggregation. A minimal sketch is below.
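In code, the loop could look like this. This is a minimal sketch, not the paper's implementation: `generate` stands in for a single LLM sampling call, and the aggregation prompt wording is our own illustration, not the paper's template.

```python
import random

def rsa(problem, generate, N=8, K=4, T=3):
    """Recursive Self-Aggregation over a population of reasoning chains."""
    # 1) Generate a population of N reasoning chains (parallel in practice,
    #    sequential here for simplicity).
    population = [generate(problem) for _ in range(N)]
    # 4) Repeat the aggregation loop for T rounds.
    for _ in range(T):
        new_population = []
        for _ in range(N):
            # 2) Subsample a subset of K chains from the population.
            subset = random.sample(population, K)
            # 3) Prompt the model to aggregate the subset into one
            #    improved chain (prompt wording is illustrative only).
            prompt = (
                f"Problem:\n{problem}\n\nCandidate solutions:\n\n"
                + "\n\n".join(f"[{i + 1}] {s}" for i, s in enumerate(subset))
                + "\n\nCombine the correct ideas above into a single, improved solution."
            )
            new_population.append(generate(prompt))
        population = new_population
    return population  # reduce to one answer, e.g. by majority vote
```

The final population can be reduced to a single answer, e.g. by majority vote over extracted answers. T, K, and N are the compute knobs discussed next.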
RSA scales both sequentially and in parallel: more loops T, or a larger aggregation size K together with a larger population N → better performance!
🔥 Gains:
AIME-25: 47% → 73%
HMMT-25: 29% → 50%
Reasoning Gym Games: 55% → 70%
LiveCodeBench: 49.5% → 56.3%
No verifiers • No prompt optimization • No RL yet!
RSA boosts performance across every model and reasoning task we tested.
Tested on Qwen, Nemo, and GPT-OSS: thinking & non-thinking, MoE & dense, full-attention or hybrid SSM.
Benchmarks: AIME, HMMT, LiveCodeBench, Reasoning Gym.
📈 Gains are consistent and significant throughout.
RL makes RSA even stronger 💪
Naive RL can hurt aggregation, but aggregation-aware RL fixes this:
1️⃣ Generate K responses
2️⃣ Create aggregation prompts from them
3️⃣ Train the model to aggregate
Boosts performance & generalizes to new tasks → RSA is worth it! (Sketch of the data construction below.)
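A rough sketch of that data construction, under our own assumptions: `generate` samples from the current policy, `make_agg_prompt` mirrors the inference-time aggregation prompt, and the RL update itself (e.g. GRPO/PPO against a task reward) is omitted.

```python
def make_agg_prompt(problem, candidates):
    # Illustrative aggregation prompt; the paper's exact template may differ.
    body = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(candidates))
    return (
        f"Problem:\n{problem}\n\nCandidate solutions:\n\n{body}\n\n"
        "Combine the correct ideas above into a single, improved solution."
    )

def aggregation_training_prompts(problems, generate, K=4):
    # For each problem: sample K responses from the current policy, then
    # wrap them in an aggregation prompt. An RL algorithm (e.g. GRPO/PPO
    # with a task reward) then trains the model to answer these prompts,
    # i.e. to aggregate well.
    return [
        make_agg_prompt(p, [generate(p) for _ in range(K)])
        for p in problems
    ]
```

Training on aggregation prompts like these, rather than only on raw problems, is what makes the RL aggregation-aware.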
RSA beats all simple budget-matched test-time scaling baselines!
Shoutout to concurrent work by @wzhao_nlp & team on AggLM, which RL-trains single-step aggregators. We discovered this direction independently, and we also find that sequential aggregation with a larger population is key for scaling.
Amazing work jointly with @siddarthv66 & @thevineetjain as equal contributors, with help from a fantastic team: @veds_12 @johanobandoc @Yoshua_Bengio @bartoldson @bkailkhu @g_lajoie_ @GlenBerseth @FelineAutomaton @JainMoksh !







