“Hope, however, is a self-modifying recurrent architecture that can take advantage of unbounded levels of in-context learning and also is augmented with CMS blocks to scale to larger context windows.” That will be so cool.
Introducing Nested Learning: A new ML paradigm for continual learning that views models as nested optimization problems to enhance long context processing. Our proof-of-concept model, Hope, shows improved performance in language modeling. Learn more: goo.gle/47LJrzI @GoogleAI
piped.video/K09erFsOnxA?t=1227 Koltun's team established a rule that they should never kick a robot. That reminds me of my bad impression of Unitree: their PR videos so often show robots being treated badly that it makes me wonder whether they are really building "intelligent" robots.
and knowledge as described in incompleteideas.net/IncIdeas…, outcome-conditioned prediction.
In our new work - Algorithm Distillation - we show that transformers can improve themselves autonomously through trial and error without ever updating their weights. No prompting, no finetuning. A single transformer collects its own data and maximizes rewards on new tasks. 1/N
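The mechanism the tweet describes can be sketched as an evaluation loop: the transformer's weights stay frozen, and improvement across episodes comes only from conditioning on the growing cross-episode history. This is a minimal illustrative sketch, not the paper's implementation; `model`, `env`, and their interfaces are all hypothetical placeholders.

```python
# Hypothetical sketch of in-context RL at evaluation time.
# No gradient steps anywhere: the only thing that changes across
# episodes is the history the (frozen) model conditions on.
def evaluate_in_context(model, env, num_episodes):
    history = []   # cross-episode (obs, action, reward) tokens
    returns = []
    for _ in range(num_episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            action = model.act(history, obs)   # frozen weights
            obs, reward, done = env.step(action)
            history.append((obs, action, reward))
            total += reward
        returns.append(total)
    # If in-context learning works, later entries of `returns`
    # should beat earlier ones on the same (new) task.
    return returns
```

If the model was trained (by distillation) on histories of an RL algorithm improving, its next-action predictions reproduce that improvement behavior purely in context.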
They all look quite equivalent to me: induced factors, parity-check matrix, codebook.
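One direction of that equivalence is concrete: a parity-check matrix induces a codebook as its null space over GF(2). A minimal sketch with an arbitrary example matrix (the specific `H` here is my own illustration, not from the tweet):

```python
import itertools
import numpy as np

# Example parity-check matrix H over GF(2).
# The codebook is exactly the set of bit vectors x with H @ x = 0 (mod 2),
# i.e. the null space of H; each row of H is one parity constraint (factor).
H = np.array([[1, 1, 0, 1],
              [0, 1, 1, 1]])

codebook = [x for x in itertools.product([0, 1], repeat=H.shape[1])
            if not (H @ x % 2).any()]

# rank(H) = 2 over GF(2), so the code has 2^(4-2) = 4 codewords.
```

Going the other way, any linear codebook determines a parity-check matrix up to row operations, which is one way to read the "equivalent views" point.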
And it looks to me like we most often transform information in the following order (a first-order approximation): Reality => see/act/(w * reason with a language model in mind) => language, where w changes with growth.
Compared to VAE and Diffusion models, the "noise channel" a GPT must overcome is the inherent ambiguity and uncertainty of language; hence information acquisition from this source is in theory limited by how well humans can acquire information from reality using language.
The lecture slides and videos for the first six weeks of my new course are now posted on the open book website: ma-lab-berkeley.github.io/de…
Machines that can predict what their sensors (touch, cameras, keyboard, temperature, microphones, gyros, …) will perceive are already aware and have subjective experience. It’s all a matter of degree now. More sensors, data, compute, tasks will lead without any doubt to the “I think therefore I am” moment for computers, and we’re not ready for it yet. arxiv.org/pdf/1804.06318 share.google/kxx6WyqHpwPmo6Q…
Looking fwd to playing w this @GoogleDeepMind developers.googleblog.com/en…
Over the past year, my lab has been working on fleshing out theory/applications of the Platonic Representation Hypothesis. Today I want to share two new works on this topic: Eliciting higher alignment: arxiv.org/abs/2510.02425 Unpaired rep learning: arxiv.org/abs/2510.08492 1/9
Excited to share our new paper on AI-Driven Research for Systems. We show that AI can autonomously generate and verify novel solutions for classic systems performance problems, matching or exceeding human designs. A glimpse into how AI might transform not only systems, but the research process itself.
🚀 Excited to release our new paper: “Barbarians at the Gate: How AI is Upending Systems Research” We show how AI-Driven Research for Systems (ADRS) can rediscover or outperform human-designed algorithms across cloud scheduling, MoE expert load balancing, LLM-SQL optimization, transaction scheduling, and more — all within hours ⚡️ and under $20 💰. 🧵👇 Check it out!
Real-time online 3D reconstruction of scenes and humans represented with SMPL. fanegg.github.io/Human3R/ I don't get tired of looking at these results
Today's mood also reminds me of Alex, the free-solo climber, who wants to climb whenever he sees a mountain or an uneven surface. In the end it is the mountain, the object being climbed, that keeps shaping his muscle memory and technique. That is the opposite of Michelangelo.
A couple bits of news: 1. Happy to share my first (human) NetHack ascension; next step is RL agents :) 2. I wrote a post discussing some @NetHack_LE challenges & how they map to open problems in RL & agentic AI. Still the best RL benchmark imo. mikaelhenaff.substack.com/p/…
Introducing Scalable Option Learning (SOL☀️), a blazingly fast hierarchical RL algorithm that makes progress on long-horizon tasks and demonstrates positive scaling trends on the largely unsolved NetHack benchmark, when trained for 30 billion samples. Details, paper and code in >