researchers when asked to switch from bf16 to fp16 and do loss scaling because it is way better for RL

Oct 31, 2025 · 2:22 PM UTC

i am just playing guys, you know i love you.
FP16 can have a smaller training-inference gap than BFloat16, so it fits RL better. Even the differences between RL algorithms vanish once FP16 is adopted. Surprising!
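A rough sketch of the precision intuition behind this claim (a toy example, not from the thread): FP16 keeps 10 mantissa bits to BF16's 7, so values in a typical activation range round-trip with less rounding error.

```python
import torch

# Toy comparison (illustrative only): round-trip a few FP32 values through
# FP16 and BF16 and look at the absolute error. FP16's extra mantissa bits
# give a finer grid at these magnitudes, hence smaller rounding error.
x = torch.tensor([0.3141592, 1.2345678, 7.6543210], dtype=torch.float32)

fp16_err = (x - x.to(torch.float16).float()).abs()
bf16_err = (x - x.to(torch.bfloat16).float()).abs()

print("fp16 round-trip error:", fp16_err.tolist())
print("bf16 round-trip error:", bf16_err.tolist())
```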
Replying to @tokenbender
loss scaling is so simple i don't understand why anyone would have a problem with this
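For reference, this is roughly what loss scaling looks like with PyTorch AMP; the model, optimizer, and data below are placeholders, not anything from the thread:

```python
import torch

# Minimal FP16 + dynamic loss scaling loop (sketch; assumes a CUDA device).
# GradScaler multiplies the loss before backward so tiny FP16 gradients don't
# underflow to zero, then unscales them before the optimizer step.
model = torch.nn.Linear(128, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = torch.nn.MSELoss()
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):
    x = torch.randn(32, 128, device="cuda")
    y = torch.randn(32, 1, device="cuda")
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = loss_fn(model(x), y)
    scaler.scale(loss).backward()  # scale up so gradients stay representable
    scaler.step(optimizer)         # unscales grads; skips the step on inf/nan
    scaler.update()                # grows/shrinks the scale factor over time
```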
Replying to @tokenbender
fortunately there is a karpathy tutorial for this
Replying to @tokenbender
I just canceled my mid-training run and switched to fp16. I hope it helps.
Replying to @tokenbender
what's up with their faces
Replying to @tokenbender
🧍‍♂️🧍‍♂️🧍‍♂️
Replying to @tokenbender
too many big words
Replying to @tokenbender
FP16 eliminates the training-inference mismatch in RL, while BF16's rounding errors break consistency. But yeah, everyone's been defaulting to BF16 for so long that it feels wrong to switch back.
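To make the mismatch point concrete, a hedged toy example (FP32 as the reference; the cast round-trips stand in for a trainer and an inference engine keeping logits at different precisions):

```python
import math
import torch

# Toy illustration: the same logits rounded through FP16 vs BF16 give slightly
# different log-probs than FP32. In RL that gap acts like a spurious
# importance ratio between the policy that sampled and the policy being
# updated; exp(gap) shows how far that ratio can drift from 1.
torch.manual_seed(0)
logits = 4.0 * torch.randn(4, 32_000)  # fake per-token vocabulary logits

def logprobs_stored_as(dtype):
    rounded = logits.to(dtype).float()   # mimic storing logits at this dtype
    return torch.log_softmax(rounded, dim=-1)

ref = torch.log_softmax(logits, dim=-1)  # fp32 reference
for dtype in (torch.float16, torch.bfloat16):
    gap = (logprobs_stored_as(dtype) - ref).abs().max().item()
    print(f"{dtype}: max |logprob gap| = {gap:.5f}, ratio drift up to ~{math.exp(gap):.5f}")
```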
Replying to @tokenbender
lol, classic research advice, except changing datatypes always breaks something random down the line. anyone actually seen improvements or just more pain?