.@RichardSSutton, father of reinforcement learning, doesn't think LLMs are bitter-lesson-pilled.
My steel man of Richard's position: we need some new architecture to enable continual (on-the-job) learning.
And if we have continual learning, we don't need a special training phase - the agent just learns on-the-fly - like all humans, and indeed, like all animals.
This new paradigm will render our current approach with LLMs obsolete.
I did my best to represent the view that LLMs will function as the foundation on which this experiential learning can happen. Some sparks flew.
0:00:00 - Are LLMs a dead-end?
0:13:51 - Do humans do imitation learning?
0:23:57 - The Era of Experience
0:34:25 - Current architectures generalize poorly out of distribution
0:42:17 - Surprises in the AI field
0:47:28 - Will The Bitter Lesson still apply after AGI?
0:54:35 - Succession to AI
much of Sutton's critique of LLMs is virtually identical to what I have been arguing for many many years.
it is disappointing @dwarkesh_sp that you would not let me present my views.
Sutton gets his own frame bias of TD learning algorithm?
Sep 27, 2025 · 2:58 AM UTC


