Dwarkesh Patel · Sep 26, 2025 · 4:01 PM UTC

@dwarkesh_sp

.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled.

My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don’t need a special training phase - the agent just learns on-the-fly - like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete.

I did my best to represent the view that LLMs will function as the foundation on which this experiential learning can happen. Some sparks flew.

0:00:00 – Are LLMs a dead-end?
0:13:51 – Do humans do imitation learning?
0:23:57 – The Era of Experience
0:34:25 – Current architectures generalize poorly out of distribution
0:42:17 – Surprises in the AI field
0:47:28 – Will The Bitter Lesson still apply after AGI?
0:54:35 – Succession to AI

Jacob Burgess · Sep 27, 2025 · 9:58 PM UTC

@Burgess_33

Replying to @dwarkesh_sp @RichardSSutton

I really like the point on kids, especially having an active 2-year-old and a 3-month-old. Kids mimic to an extent, but they also push boundaries and surprise you with new types of behavior you don’t expect. It’s more about challenging and limit-testing the world for a reaction than mimicking.
