🚨 Your RL only improves pass@1, not pass@k? 🚨 That's not a bug; it's a feature of the objective you're optimizing. You get what you optimize for. If you want better pass@k, you need to optimize for pass@k at training time. 🧵 How?
Replying to @KunhaoZ
Pretty interesting. Sorry, this isn't my domain; just wondering, what are pass@1 and pass@k?

Apr 28, 2025 · 7:48 AM UTC

Replying to @CassielYM @KunhaoZ
pass@k is the probability that at least one of k independent solution attempts succeeds. pass@1 is the k = 1 case: the probability that a single attempt succeeds. pass^k is the probability that all k independent attempts succeed.
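To make the difference concrete, here is a minimal sketch of how the three numbers could be estimated for one problem. It assumes you sampled n attempts and c of them passed; the function names `pass_at_k` and `pass_hat_k` are hypothetical, and the formulas are the usual combinatorial estimators (the chance that a random size-k subset of the n attempts contains at least one pass, or only passes), not something spelled out in this thread.

```python
from math import comb


def _check(n: int, c: int, k: int) -> None:
    # n attempts sampled, c of them correct, k attempts per evaluation.
    if not (0 <= c <= n and 1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate P(at least one of k attempts succeeds)."""
    _check(n, c, k)
    # 1 - P(all k drawn attempts come from the n - c failures).
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_hat_k(n: int, c: int, k: int) -> float:
    """Estimate pass^k = P(all k attempts succeed)."""
    _check(n, c, k)
    # P(all k drawn attempts come from the c successes).
    return comb(c, k) / comb(n, k)


if __name__ == "__main__":
    n, c = 10, 3  # assumed example: 3 of 10 sampled attempts passed
    for k in (1, 2, 5):
        print(k, round(pass_at_k(n, c, k), 4), round(pass_hat_k(n, c, k), 4))
```

With k = 1 both estimators reduce to c / n, which is pass@1. As k grows, pass@k can only rise and pass^k can only fall, which is why a model can look similar on pass@1 and very different on the other two.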
Thank you for your kind explanation. I see the difference between pass^k and pass@k now; it really helps me a lot!