.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training phase - the agent just learns on-the-fly - like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete. I did my best to represent the view that LLMs will function as the foundation on which this experiential learning can happen. Some sparks flew.
0:00:00 – Are LLMs a dead-end?
0:13:51 – Do humans do imitation learning?
0:23:57 – The Era of Experience
0:34:25 – Current architectures generalize poorly out of distribution
0:42:17 – Surprises in the AI field
0:47:28 – Will The Bitter Lesson still apply after AGI?
0:54:35 – Succession to AI
much of Sutton’s critique of LLMs is virtually identical to what I have been arguing for many many years. it is disappointing @dwarkesh_sp that you would not let me present my views.

Sep 26, 2025 · 7:11 PM UTC

I don't think that's true.
Yes, this is about you
You are an unpleasant person.
"would not let me present my views" - what does this mean? He did not invite you on his podcast?
I'd like to see you debate someone on his show.
Gary when someone asks him about Dwarkesh podcast
GIF
Bc Gary is the kind of guy that would get a negative view count
Sutton gets his own frame bias of TD learning algorithm?
He had to be taken over the threshold by a (former) believer.
Gary, the last thing we want on this podcast is to have it polluted by your relentless, self-aggrandizing, intellectually incoherent noise generation.
Nobody cares about Gary