Berry farmer @ OpenAI | o3, o1, GPT-4, ChatGPT, Codex, solved Rubik’s cube with a robotic hand | cautious AI optimist

San Francisco, CA
Joined January 2013
RL really feels like a technical revolution within a revolution. It spun up a completely new wave of startups, products, and thought leaders on top of a huge wave we were already riding.
Codex is giving me Factorio dopamine hits times a healthy multiplier
managing fleets of agents should be more fun than playing Factorio, with the UI/UX to boot
Technical decisions matter, kids
Who would have thought that a multi-trillion-dollar-cap company could be thrown into such chaos (layoffs) by a single technical decision made a year ago - using expert-choice MoEs for their frontier model.
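For readers unfamiliar with the routing decision the tweet refers to: in expert-choice MoE routing, each expert picks its top-k tokens by router score, rather than each token picking its top-k experts. A minimal NumPy sketch of just that routing rule (function name and shapes are illustrative, not from any particular codebase):

```python
import numpy as np

def expert_choice_route(scores, capacity):
    """Expert-choice routing: each expert selects its top-`capacity`
    tokens by router score (token-choice routing would instead have
    each token select its top-k experts).

    scores: (num_tokens, num_experts) router scores.
    Returns a boolean (num_tokens, num_experts) assignment matrix.
    """
    assign = np.zeros_like(scores, dtype=bool)
    for e in range(scores.shape[1]):
        # indices of the `capacity` highest-scoring tokens for expert e
        top = np.argsort(scores[:, e])[-capacity:]
        assign[top, e] = True
    return assign

rng = np.random.default_rng(0)
scores = rng.standard_normal((8, 4))  # 8 tokens, 4 experts
assign = expert_choice_route(scores, capacity=2)
print(assign.sum(axis=0))  # every expert serves exactly `capacity` tokens
```

The design gives perfectly balanced expert load by construction, but a token may be selected by zero experts and get dropped, and the top-k over the whole batch couples tokens together - properties that matter very differently at training time versus serving time.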
Jerry Tworek retweeted
🧵 Announcing a $30B collaboration with Pringles for datacenter chips (1/N)👇
🎯
Working on LLM RL is one of the most intellectually satisfying things I have ever done, both from a systems and an ML perspective
I guess people are doing cool things with good old chatty
I used ChatGPT to solve an open problem in convex optimization. *Part I* (1/N)
Chess grandmasters must have something amazing going on in their brains
sometimes it just is mid
it can take hundreds of man-years, sometimes thousands to make some creative project, and mere minutes for you to call it mid. what power you have
Looking at the training dynamics of neural networks, it is an amazing feat of nature that humans crash out so infrequently
AIs today are not perfect, but that alone is the most bullish argument for why they have a lot of low-hanging-fruit improvements ahead
> Codex, do X.
*writes wrong code*
> This is wrong because <reason>. Instead, do <solution>.
*writes wrong code*
> ?? Can you EXPLAIN why your code is wrong?
*writes perfect explanation*
> If you get it, why didn't you apply that to your code?
*writes perfect code*
...
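The failure pattern above - a model that can explain its own bug yet doesn't apply the fix until asked - is the intuition behind the common critique-then-revise prompting loop. A minimal sketch, with a hypothetical `ask_model` callable standing in for any LLM call (no real API is assumed):

```python
def critique_then_revise(ask_model, task, max_rounds=3):
    """Generic critique-then-revise loop: rather than only telling
    the model its output is wrong, first ask it to explain the
    failure, then ask it to rewrite using its own explanation."""
    code = ask_model(f"Task: {task}\nWrite the code.")
    for _ in range(max_rounds):
        critique = ask_model(
            f"Task: {task}\nCode:\n{code}\n"
            "Explain precisely what, if anything, is wrong with this code."
        )
        if "nothing" in critique.lower():  # toy stopping heuristic
            break
        code = ask_model(
            f"Task: {task}\nCode:\n{code}\nCritique:\n{critique}\n"
            "Rewrite the code, applying your own critique."
        )
    return code

# Toy stand-in for a real model call, for demonstration only.
responses = iter([
    "def add(a, b): return a - b",                 # first attempt (buggy)
    "The function subtracts instead of adding.",   # critique round 1
    "def add(a, b): return a + b",                 # revision
    "Nothing is wrong.",                           # critique round 2
])
result = critique_then_revise(lambda prompt: next(responses), "add two numbers")
print(result)  # -> def add(a, b): return a + b
```

The separate "explain" step forces the model to surface the diagnosis the tweet shows it already has, instead of regenerating from the same prompt that failed.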
Thanks to Twitter vibes optimisation pressure, we’re likely getting models great at one-shot generating SVG files, which is probably something no one ever asked for 😄
It is kind of sweet to think that most viable research ideas have already been tried and there is no overhang that could make our models 10 thousand times better on current hardware. That's some confidence.
vibe coding. feels. so. good.
An interesting fact about nature is that it stumbled into general intelligence while trying to make monkeys that are hungry less often.
Politely and constructively disagreeing is an art humanity should get much better at
unlearning is harder than learning
This is mostly how I imagine postagi life
nothin better than kicking off a couple codex jobs and going back to watching a physics lecture i don't understand (from Prof. Coskun Kocabas's talk on topological effects in graphene at @periodiclabs)
I don't do podcasts very often - in reality this is my first one ever, but if anyone wants to listen to someone talk about RL for an hour, this is it
How GPT-5 thinks, with @OpenAI VP of Research @MillionInt
00:00 - Intro
01:01 - What Reasoning Actually Means in AI
02:32 - Chain of Thought: Models Thinking in Words
05:25 - How Models Decide How Long to Think
07:24 - Evolution from o1 to o3 to GPT-5
11:00 - The Road to OpenAI: Growing up in Poland, Dropping out of School, Trading
20:32 - Working on Robotics and Rubik's Cube Solving
23:02 - A Day in the Life: Talking to Researchers
24:06 - How Research Priorities Are Determined
26:53 - OpenAI's Culture of Transparency
29:32 - Balancing Research with Shipping Fast
31:52 - Using OpenAI's Own Tools Daily
32:43 - Pre-Training Plus RL: The Modern AI Stack
35:10 - Reinforcement Learning 101: Training Dogs
40:17 - The Evolution of Deep Reinforcement Learning
42:09 - When GPT-4 Seemed Underwhelming at First
45:39 - How RLHF Made GPT-4 Actually Useful
48:02 - Unsupervised vs Supervised Learning
49:59 - GRPO and How DeepSeek Accelerated US Research
53:05 - What It Takes to Scale Reinforcement Learning
55:36 - Agentic AI and Long-Horizon Thinking
59:19 - Alignment as an RL Problem
1:01:11 - Winning ICPC World Finals Without Specific Training
1:05:53 - Applying RL Beyond Math and Coding
1:09:15 - The Path from Here to AGI
1:12:23 - Pure RL vs Language Models
A necessary prerequisite to doing something that matters is believing that what you’re doing matters
Jerry Tworek retweeted
NEWS: Taylor Swift to enter into a multibillion dollar deal with OpenAI to deploy 10 gigawatts of AI data centers
Taylor Swift will make an announcement on GMA (Good Morning America) tomorrow, October 13th.