Jeffrey Seely retweeted
not gonna screen-shot & dump on those commenting on the depth of linear algebra... for i too was once a fool and did not appreciate its beauty (a long time ago)... better to point out beauty than ridicule stupidity... so here is something beautiful about linear algebra (or, even, matrices, if you must)

== Fundamental Theorem of Linear Algebra ==

for any linear transformation T : V => W

V = kernel T (+) coimage T
W = image T (+) cokernel T

and the image and coimage are naturally isomorphic

=====================

this is utterly beautiful, as it explains:
* what is done and what is left undone by T
* most of the properties of rank & nullity
* what the pseudoinverse means
* why least squares works (with inner products)
* & so much more....
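A quick numerical aside (not from the tweet): with inner products, the SVD makes the four pieces above concrete. The right singular vectors split V into coimage (+) kernel, the left singular vectors split W into image (+) cokernel, and the pseudoinverse inverts T exactly on the image/coimage pair. A minimal numpy sketch, using an arbitrary rank-deficient matrix as T:

```python
import numpy as np

# A rank-deficient map T : R^4 -> R^3 (rank 2), chosen arbitrarily for illustration.
T = np.array([[1., 2., 0., 1.],
              [0., 1., 1., 1.],
              [1., 3., 1., 2.]])

U, s, Vt = np.linalg.svd(T)
r = int(np.sum(s > 1e-10))            # numerical rank

coimage  = Vt[:r].T                   # orthonormal basis of the row space (coimage)
kernel   = Vt[r:].T                   # orthonormal basis of the null space (kernel)
image    = U[:, :r]                   # orthonormal basis of the column space (image)
cokernel = U[:, r:]                   # orthonormal basis of the left null space (cokernel)

# V = kernel (+) coimage, W = image (+) cokernel; rank & nullity fall out of the dimensions.
assert coimage.shape[1] + kernel.shape[1] == T.shape[1]
assert image.shape[1] + cokernel.shape[1] == T.shape[0]

# T kills the kernel and maps the coimage isomorphically onto the image...
print(np.allclose(T @ kernel, 0))                      # True
# ...and the pseudoinverse inverts exactly that isomorphism, ignoring the cokernel.
T_pinv = np.linalg.pinv(T)
print(np.allclose(T_pinv @ T, coimage @ coimage.T))    # projector onto the coimage
print(np.allclose(T @ T_pinv, image @ image.T))        # projector onto the image
```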
+100. Learning math is fun and easy when it’s for a research goal.
Exactly. I learned a ton of math during my PhD, and it was fun and easy *because I had a goal* to use it in my research. Coding it up is also a great way to detect gaps in your understanding. Totally different from learning in class.

Another common fallacy is that you need to follow the logical curriculum and complete all the prerequisites for a topic before learning it. Instead I find that going up and down the curriculum repeatedly is much more effective. That way, you have an understanding of where the basics fit in, and why you're learning it, which helps with comprehension and motivation.

Inspired by the success of LLM pretraining, I even started reading random papers by Grothendieck, Scholze and Mochizuki that are way above my head, soaking my brain in genius vibes so to speak, in the hope of imitation-learning some good representations. Not sure if it has worked but it feels good 😂
Jeffrey Seely retweeted
Below is a deep dive into why self play works for two-player zero-sum (2p0s) games like Go/Poker/Starcraft but is so much harder to use in "real world" domains. tl;dr: self play converges to minimax in 2p0s games, and minimax is really useful in those games.

Every finite 2p0s game has a minimax equilibrium, which is essentially an unbeatable strategy in expectation (assuming the players alternate sides). In rock paper scissors, for example, minimax is 1/3rd on each action.

Is minimax what we want? Not necessarily. If you're playing minimax in Rock Paper Scissors when most opponents' strategies are "always throw Rock" then you're clearly suboptimal, even though you're not losing in expectation. This especially matters in a game like poker because playing minimax means you might not make as much money off of weak players as you could if you maximally exploited them.

But the guarantee of "you will not lose in expectation" is really nice to have. And in games like Chess and Go, the difference between a minimax strategy and a strategy that optimally exploits the population of opponents is negligible. For that reason, minimax is typically considered the goal for a two-player zero-sum game. Even in poker, the conventional wisdom among top pros is to play minimax (game theory optimal) and then only deviate if you spot clear weaknesses in the opponent.

Sound self play, even from scratch, is guaranteed to converge to a minimax equilibrium in finite 2p0s games. That's amazing! By simply scaling memory and compute, and with no human data, we can converge to a strategy that's unbeatable in expectation.

What about non-2p0s games? Sadly, pure self play, with no human data, is no longer guaranteed to converge to a useful strategy. This can be clearly seen in the Ultimatum Game. Alice must offer Bob $0-100. Bob then accepts or rejects. If Bob accepts, the money is split according to Alice's proposal. If Bob rejects, both receive $0. The equilibrium (specifically, subgame perfect equilibrium) strategy is for Alice to offer 1 penny and for Bob to accept. But in the real world, people aren't so rational. If Alice were to try that strategy with real humans she would end up with very little money. Self play becomes untethered from what we as humans find useful.

A lot of folks have proposed games like "an LLM teacher proposes hard math problems, and a student LLM tries to solve them" to achieve self-play training, but this runs into similar problems as the Ultimatum Game, where the equilibrium is untethered from what we as humans find useful. What should the reward for the teacher be in such a game? If it's 2p0s then the teacher is rewarded if the student couldn't solve the problem, so the teacher will pose impossible problems. Okay, what if we reward it for the student having a 50% success rate? Then the teacher could just flip a coin and ask the student if it landed Heads. Or the teacher could ask the student to decrypt a message via an exhaustive key search. Reward shaping to achieve intended behavior becomes a major challenge. This isn't an issue in 2p0s games.

I do believe in self play. It provides an infinite source of training, and it continuously matches an agent with an equally skilled peer. We've also seen it work in some complex non-2p0s settings like Diplomacy and Hanabi. But applying it outside of 2p0s games is a lot harder than it was for Go, Poker, Dota, and Starcraft.
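As an illustration of the "self play converges to minimax" claim (not from the thread, and the thread doesn't name an algorithm): two regret-matching agents playing Rock Paper Scissors against each other have average strategies that converge to the 1/3-1/3-1/3 minimax mix. A minimal sketch:

```python
import numpy as np

# Payoff matrix for player 0 in Rock Paper Scissors (row = my action, col = opponent's).
# 0 = rock, 1 = paper, 2 = scissors; the game is zero-sum, so player 1's payoff is the negative.
A = np.array([[ 0., -1.,  1.],
              [ 1.,  0., -1.],
              [-1.,  1.,  0.]])

def strategy(regrets):
    """Regret matching: play in proportion to positive cumulative regret."""
    pos = np.maximum(regrets, 0.0)
    return pos / pos.sum() if pos.sum() > 0 else np.ones(3) / 3

regrets = [np.zeros(3), np.zeros(3)]
avg = [np.zeros(3), np.zeros(3)]

for t in range(100_000):
    s0, s1 = strategy(regrets[0]), strategy(regrets[1])
    avg[0] += s0
    avg[1] += s1
    # Counterfactual payoffs: what each pure action earns against the opponent's current mix.
    u0 = A @ s1            # player 0's expected payoff per action
    u1 = -A.T @ s0         # player 1's expected payoff per action (zero-sum)
    regrets[0] += u0 - s0 @ u0
    regrets[1] += u1 - s1 @ u1

print(avg[0] / avg[0].sum())   # ~[0.333, 0.333, 0.333] — the minimax mix
print(avg[1] / avg[1].sum())
```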
Self play works so well in chess, go, and poker because those games are two-player zero-sum. That simplifies a lot of problems. The real world is messier, which is why we haven’t seen many successes from self play in LLMs yet. Btw @karpathy did great and I mostly agree with him!
Jeffrey Seely retweeted
nobody wants to be in a code-pendant relationship
Jeffrey Seely retweeted
Super happy to announce that Continuous Thought Machines has been accepted as a spotlight for NeurIPS2025. We are working on so very many fascinating directions - the CTM architecture just keeps opening doors to fun, thought-provoking projects.
We are excited to share that “Continuous Thought Machines” has been accepted as a Spotlight at #NeurIPS2025! 🧠✨ The CTM is an AI that mimics biological brains by using neural dynamics & synchronization to think over time. It can solve complex mazes by building internal maps, gaze around images to classify them, and learn algorithms—all emergent from its core design. This is just the beginning. A hint of what we're exploring next… (video attached!) The team: @LearningLukeD @ciaran_regan_ @risi1979 @jeffreyseely @YesThisIsLion
I like this tweet. Active inference is "obviously true," but the details of how it all fits together are still not really fleshed out. You don't get GUT status just for choosing the right problem to work on. It's the unsolved details here that are so compelling.
Replying to @fchollet
The FEP posits that intelligent agents don't just passively absorb data to build a world model (the way DL models are trained); instead, they actively sample their environment to reduce uncertainty and bring their internal model into alignment with reality. This is 100% correct
Exhibit A
Replying to @poetengineer__
working hypothesis that mathematicians with artistic sensibilities tend to gravitate towards geometry and topology over analysis
can you find the cat?
Jeffrey Seely retweeted
You can’t spell bellman without llm
.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled.

My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training phase - the agent just learns on-the-fly - like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete.

I did my best to represent the view that LLMs will function as the foundation on which this experiential learning can happen. Some sparks flew.

0:00:00 – Are LLMs a dead-end?
0:13:51 – Do humans do imitation learning?
0:23:57 – The Era of Experience
0:34:25 – Current architectures generalize poorly out of distribution
0:42:17 – Surprises in the AI field
0:47:28 – Will The Bitter Lesson still apply after AGI?
0:54:35 – Succession to AI
Jeffrey Seely retweeted
Replying to @leothecurious
You'd be amazed at the kind of stuff buried in papers from 30 years ago with a dozen citations.
the G in AGI actually stands for Grothendieck
This is an unwise statement that can only make people confused about what LLMs can or cannot do.

Let me tell you something: Math is NOT about solving this kind of ad hoc optimization problem. Yeah, by scraping available data and then clustering it, LLMs can sometimes solve some very minor math problems. It's an achievement, and I applaud you for that. But let's be honest: this is NOT the REAL Math. Not by 10,000 miles.

REAL Math is about concepts and ideas - things like "schemes" introduced by the great Alexander Grothendieck, who revolutionized algebraic geometry; the Atiyah-Singer Index Theorem; or the Langlands Program, tying together Number Theory, Analysis, Geometry, and Quantum Physics. That's the REAL Math. Can LLMs do that? Of course not.

So, please, STOP confusing people - especially given the atrocious state of our math education. LLMs give us great tools, which I appreciate very much. Useful stuff! Go ahead and use them AS TOOLS (just as we use calculators to crunch numbers or cameras to render portraits and landscapes), an enhancement of human abilities, and STOP pretending that LLMs are somehow capable of replicating everything that human beings can do.

In this one area, mathematics, LLMs are no match for human mathematicians. Period. Not to mention many other areas.

Calling on my friend @EricRWeinstein and @GaryMarcus, who has been one of the few sane expert voices on these matters lately. 🙏

h/t @hellheff
thinking of adding some Stephen Boyd flair to my paper by referring to the next section as "in the sequel"
Jeffrey Seely retweeted
Coming March 17, 2026! Just got my advance copy of Emergence — a memoir about growing up in group homes and somehow ending up in neuroscience and AI. It’s personal, it’s scientific, and it’s been a wild thing to write. Grateful and excited to share it soon.
The linear systems book is a bit straightforward, but it's a good no-frills speedrun of all things linear dynamical systems and control. Linearity is the price you pay if you want to model systems that interact with the external world.
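For anyone unfamiliar with the formalism (my gloss, not the book's code): a discrete-time linear state-space model is x_{t+1} = A x_t + B u_t with output y_t = C x_t, where u_t is the input from the external world. A toy numpy sketch with made-up matrices:

```python
import numpy as np

# Minimal discrete-time linear state-space model: x_{t+1} = A x_t + B u_t, y_t = C x_t.
# The matrices below are placeholders; the point is just the shape of the formalism.
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

x = np.zeros((2, 1))
ys = []
for t in range(50):
    u = np.array([[1.0]]) if t < 10 else np.array([[0.0]])  # a simple pulse input
    x = A @ x + B @ u          # state evolves linearly in state and input
    ys.append((C @ x).item())  # observed output

print(ys[:5])
```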
Chronological...
What four math books had a big influence on your mathematical thinking? I'll start:
someone's going to write a paper with the title "the grothendieck machine" a la boltzmann machine, or schmidhuber's godel machine. but i think 50 years need to pass before you're allowed to do that
markov blankets? in the middle of july?!?!
Jeffrey Seely retweeted
Submissions for papers and artworks are open until August 2 for the #CreativeAI Track at @NeurIPSConf! Whether you're an academic, artist, or a bit of both, we'd love to see your work!
We just opened the submissions portal for the #CreativeAI Track at @NeurIPSConf 🥳 🤖 Submit papers and art until 2nd August Details: bit.ly/NeurIPSCreativeAI cc @marcelocoelho @priyascape @alanyttian #NeurIPS2025
this one page from @stevenstrogatz has ripple effects across entire academic careers (good ripples of course)