Kunhao Zheng · Apr 27, 2025 · 4:30 PM UTC

Kunhao Zheng @KunhaoZ

🚨 Your RL only improves 𝗽𝗮𝘀𝘀@𝟭, not 𝗽𝗮𝘀𝘀@𝗸? 🚨 That’s not a bug — it’s a 𝗳𝗲𝗮𝘁𝘂𝗿𝗲 𝗼𝗳 𝘁𝗵𝗲 𝗼𝗯𝗷𝗲𝗰𝘁𝗶𝘃𝗲 you’re optimizing. You get what you optimize for. If you want better pass@k, you need to optimize for pass@k at training time. 🧵 How?

Mellen Y. Pu · Apr 28, 2025 · 7:48 AM UTC

Mellen Y. Pu @CassielYM

Replying to @KunhaoZ

Pretty interesting. Sorry, this isn't my domain; just wondering, what are pass@1 and pass@k?

damdandustu · Apr 28, 2025 · 11:04 AM UTC

damdandustu @damdandusmus

Replying to @CassielYM @KunhaoZ

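pass@k is the probability that at least one of k independent solution attempts succeeds. pass@1 is the k = 1 case: the probability that a single attempt succeeds. pass^k is the probability that all k independent attempts succeed, so it measures consistency rather than best-of-k coverage. Each is usually reported as an average over the test problems.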
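To make the difference concrete, here is a minimal sketch of how the two metrics can be estimated for one problem, assuming you drew n sample solutions and c of them were correct. The function names `pass_at_k` and `pass_hat_k` are made up for illustration; the formulas are the standard combinatorial estimators (the chance that a random size-k subset of the n samples contains at least one correct solution, or only correct solutions).

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate P(at least one of k attempts is correct).

    n: total samples drawn for the problem, c: how many were correct.
    Equals 1 minus the chance that a random k-subset has no correct sample.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Estimate P(all k attempts are correct), i.e. pass^k.

    Equals the chance that a random k-subset contains only correct samples.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    return comb(c, k) / comb(n, k)


if __name__ == "__main__":
    n, c = 10, 3  # hypothetical numbers: 10 samples, 3 correct
    for k in (1, 2, 5):
        print(k, pass_at_k(n, c, k), pass_hat_k(n, c, k))
```

At k = 1 the two metrics coincide (both are just the fraction of correct samples). As k grows, pass@k rises toward 1 while pass^k falls toward 0, which is why improving one does not automatically improve the other.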

Mellen Y. Pu · Apr 28, 2025 · 1:00 PM UTC

Mellen Y. Pu @CassielYM

Thank you for your kind explanation. I see the difference between pass^k and pass@k now; it really helps me a lot!