Berry farmer @ OpenAI | o3, o1, GPT-4, ChatGPT, Codex, solved Rubik’s cube with a robotic hand | cautious AI optimist

San Francisco, CA
Joined January 2013
RL really feels like a technical revolution within a revolution. It spun up a completely new wave of startups, products, and thought leaders on top of a huge wave we were already riding.
Codex is giving me Factorio dopamine hits times a healthy multiplier
managing fleets of agents should be more fun than playing Factorio, with the UI/UX to boot
Technical decisions matter, kids
Who would have thought that a multi-trillion-dollar-cap company could be thrown into such chaos (layoffs) by a single technical decision made a year ago - using expert-choice MoEs for their frontier model.
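For readers unfamiliar with the routing decision the tweet refers to: in expert-choice MoE routing, each expert picks its top-k tokens by router score, rather than each token picking its top-k experts. A minimal NumPy sketch of just that routing rule (function name and shapes are illustrative, not from any particular codebase):

```python
import numpy as np

def expert_choice_route(scores, capacity):
    """Expert-choice routing: each expert selects its top-`capacity`
    tokens by router score (token-choice routing would instead have
    each token select its top-k experts).

    scores: (num_tokens, num_experts) router scores.
    Returns a boolean (num_tokens, num_experts) assignment matrix.
    """
    assign = np.zeros_like(scores, dtype=bool)
    for e in range(scores.shape[1]):
        # indices of the `capacity` highest-scoring tokens for expert e
        top = np.argsort(scores[:, e])[-capacity:]
        assign[top, e] = True
    return assign

rng = np.random.default_rng(0)
scores = rng.standard_normal((8, 4))  # 8 tokens, 4 experts
assign = expert_choice_route(scores, capacity=2)
print(assign.sum(axis=0))  # every expert serves exactly `capacity` tokens
```

The design gives perfectly balanced expert load by construction, but a token may be selected by zero experts and get dropped, and the top-k over the whole batch couples tokens together - properties that matter very differently at training time versus serving time.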
Jerry Tworek retweeted
🧵 Announcing a $30B collaboration with Pringles for datacenter chips (1/N)👇
🎯
Working on LLM RL is one of the most intellectually satisfying things I have ever done, both from a systems and an ML perspective
I guess people are doing cool things with good old chatty
I used ChatGPT to solve an open problem in convex optimization. *Part I* (1/N)
Chess grandmasters must have something amazing going on in their brains
sometimes it just is mid
it can take hundreds of man-years, sometimes thousands to make some creative project, and mere minutes for you to call it mid. what power you have
Looking at the training dynamics of neural networks, it is an amazing feat of nature that humans crash out so infrequently
AIs today are not perfect, but that alone is the most bullish argument for why they have a lot of low-hanging-fruit improvements ahead
> Codex, do X.
*writes wrong code*
> This is wrong because <reason>. Instead, do <solution>.
*writes wrong code*
> ?? Can you EXPLAIN why your code is wrong?
*writes perfect explanation*
> If you get it, why didn't you apply that to your code?
*writes perfect code*
...
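The failure pattern above - a model that can explain its own bug yet doesn't apply the fix until asked - is the intuition behind the common critique-then-revise prompting loop. A minimal sketch, with a hypothetical `ask_model` callable standing in for any LLM call (no real API is assumed):

```python
def critique_then_revise(ask_model, task, max_rounds=3):
    """Generic critique-then-revise loop: rather than only telling
    the model its output is wrong, first ask it to explain the
    failure, then ask it to rewrite using its own explanation."""
    code = ask_model(f"Task: {task}\nWrite the code.")
    for _ in range(max_rounds):
        critique = ask_model(
            f"Task: {task}\nCode:\n{code}\n"
            "Explain precisely what, if anything, is wrong with this code."
        )
        if "nothing" in critique.lower():  # toy stopping heuristic
            break
        code = ask_model(
            f"Task: {task}\nCode:\n{code}\nCritique:\n{critique}\n"
            "Rewrite the code, applying your own critique."
        )
    return code

# Toy stand-in for a real model call, for demonstration only.
responses = iter([
    "def add(a, b): return a - b",                 # first attempt (buggy)
    "The function subtracts instead of adding.",   # critique round 1
    "def add(a, b): return a + b",                 # revision
    "Nothing is wrong.",                           # critique round 2
])
result = critique_then_revise(lambda prompt: next(responses), "add two numbers")
print(result)  # -> def add(a, b): return a + b
```

The separate "explain" step forces the model to surface the diagnosis the tweet shows it already has, instead of regenerating from the same prompt that failed.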
Thanks to Twitter vibes optimisation pressure, we’re likely getting models great at one-shot generating SVG files, which is probably something no one ever asked for 😄
It is kind of sweet to think that most viable research ideas have already been tried and there is no overhang that could make our models 10 thousand times better on current hardware. That's some confidence.
vibe coding. feels. so. good.
An interesting fact about nature is that it stumbled into general intelligence while trying to make monkeys that are hungry less often.
Politely and constructively disagreeing is an art humanity should get much better at
unlearning is harder than learning
This is mostly how I imagine postagi life
nothin better than kicking off a couple codex jobs and going back to watching a physics lecture i don't understand (from Prof. Coskun Kocabas's talk on topological effects in graphene at @periodiclabs)
I don't do podcasts very often - in reality this is my first one ever, but if anyone wants to listen to someone talk about RL for an hour, this is it
How GPT-5 thinks, with @OpenAI VP of Research @MillionInt
00:00 - Intro
01:01 - What Reasoning Actually Means in AI
02:32 - Chain of Thought: Models Thinking in Words
05:25 - How Models Decide How Long to Think
07:24 - Evolution from o1 to o3 to GPT-5
11:00 - The Road to OpenAI: Growing up in Poland, Dropping out of School, Trading
20:32 - Working on Robotics and Rubik's Cube Solving
23:02 - A Day in the Life: Talking to Researchers
24:06 - How Research Priorities Are Determined
26:53 - OpenAI's Culture of Transparency
29:32 - Balancing Research with Shipping Fast
31:52 - Using OpenAI's Own Tools Daily
32:43 - Pre-Training Plus RL: The Modern AI Stack
35:10 - Reinforcement Learning 101: Training Dogs
40:17 - The Evolution of Deep Reinforcement Learning
42:09 - When GPT-4 Seemed Underwhelming at First
45:39 - How RLHF Made GPT-4 Actually Useful
48:02 - Unsupervised vs Supervised Learning
49:59 - GRPO and How DeepSeek Accelerated US Research
53:05 - What It Takes to Scale Reinforcement Learning
55:36 - Agentic AI and Long-Horizon Thinking
59:19 - Alignment as an RL Problem
1:01:11 - Winning ICPC World Finals Without Specific Training
1:05:53 - Applying RL Beyond Math and Coding
1:09:15 - The Path from Here to AGI
1:12:23 - Pure RL vs Language Models
A necessary prerequisite to doing something that matters is believing that what you’re doing matters
Jerry Tworek retweeted
NEWS: Taylor Swift to enter into a multibillion dollar deal with OpenAI to deploy 10 gigawatts of AI data centers
Taylor Swift will make an announcement on GMA (Good Morning America) tomorrow, October 13th.