FP16 can have a smaller training-inference gap than BFloat16, so it fits RL better. Even the differences between RL algorithms largely vanish once FP16 is adopted. Surprising!
FP16 essentially eliminates the training-inference mismatch in RL, whereas BF16's coarser rounding breaks consistency between the training and inference engines.
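If you want to see the precision gap concretely, here's a minimal sketch (assuming PyTorch; the random tensor is just a stand-in for real logits/activations) that round-trips the same FP32 values through BF16 and FP16 and compares the rounding error:

```python
# Sketch: BF16 keeps ~7 explicit mantissa bits (wide range, coarse precision),
# FP16 keeps ~10 (narrower range, finer precision), so FP16 round-trips sit
# closer to the original FP32 values in a typical activation range.
import torch

torch.manual_seed(0)
x = torch.randn(1_000_000, dtype=torch.float32)  # stand-in for activations/logits

for dtype in (torch.bfloat16, torch.float16):
    roundtrip = x.to(dtype).to(torch.float32)
    rel_err = ((roundtrip - x).abs() / x.abs().clamp_min(1e-12)).mean()
    print(f"{dtype}: mean relative round-trip error = {rel_err.item():.2e}")

# Expect the BF16 error to be roughly 8x the FP16 error (3 fewer mantissa bits).
# That per-op discrepancy is the kind of thing that can compound into a
# training-vs-inference policy mismatch in RL fine-tuning.
```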
But yeah, everyone's been defaulting to BF16 for so long that it feels wrong to switch back.
lol, classic research advice, except changing datatypes always breaks something random down the line. Anyone actually seen improvements, or just more pain?