Introducing Nested Learning: A new ML paradigm for continual learning that views models as nested optimization problems to enhance long context processing. Our proof-of-concept model, Hope, shows improved performance in language modeling. Learn more: goo.gle/47LJrzI @GoogleAI

Nov 7, 2025 · 5:56 PM UTC
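For readers who want a concrete handle on "models as nested optimization problems": below is a minimal sketch, assuming a toy two-level setup in which a fast level is re-fit within each context chunk and a slow level is updated across chunks with a crude Reptile-style meta step. This is not the paper's algorithm; all names and the toy objective are illustrative assumptions.

```python
# Minimal sketch (not the paper's algorithm) of two nested optimization
# levels running at different update frequencies. All names and the toy
# objective are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

slow_w = rng.normal(size=(4, 4)) * 0.1   # outer level: updated once per chunk
inner_lr = 0.05                          # step size of the inner update rule

def inner_loss(fast_w, x, y):
    """Memorization objective the fast level minimizes within one context."""
    pred = x @ (slow_w + fast_w)
    return np.mean((pred - y) ** 2)

def inner_grad(fast_w, x, y):
    """Analytic gradient of the squared error w.r.t. the fast weights."""
    pred = x @ (slow_w + fast_w)
    return 2 * x.T @ (pred - y) / len(x)

for outer_step in range(101):
    fast_w = np.zeros_like(slow_w)       # fast level: re-learned per context
    chunk_x = rng.normal(size=(8, 4))
    chunk_y = chunk_x @ np.ones((4, 4))  # toy target mapping to memorize
    for x, y in zip(chunk_x, chunk_y):   # fast level: one step per token
        fast_w -= inner_lr * inner_grad(fast_w, x[None, :], y[None, :])
    # Outer update: move the slow weights toward what the inner loop found,
    # a crude Reptile-style stand-in for meta-gradient descent.
    slow_w += 0.1 * fast_w
    if outer_step % 25 == 0:
        print(f"outer {outer_step}: loss {inner_loss(fast_w, chunk_x, chunk_y):.4f}")
```

The point of the sketch is only the frequency separation: the inner loop adapts at every token, while the outer loop consolidates once per chunk.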

Replying to @GoogleResearch
@grok analyze this announcement and the companion paper. Start with a high level overview that sets up the problem currently in the field. Then, explain at increasing levels of complexity how Google tried to solve this. Finally, lay out the key results and what additional avenues would need to be solved for this to be useful for the real world. Be specific about your future action plan.
Replying to @GoogleResearch
Will the models support Darwin’s theory of evolution, with individual models cloning themselves and mutating, and only the strongest and smartest surviving? #AGI
Replying to @GoogleResearch
Google’s Nested Learning is a subset of what our protocol already predicts. It’s inner adaptability. The Birth Protocol extends it into a nested civilization: agents learn locally, commit globally, and evolve collectively.
Replying to @GoogleResearch
Does anyone have the non-NeurIPS arXiv preprint? abehrouz.github.io/files/NL.… I can't find it, and the available version has been lobotomized.
Replying to @GoogleResearch
Look: enlarging the context window treats only the symptom, not the cause. Contextual memory is just a longer string of text that the model holds in its head. It is memory without understanding. It's like a parrot with a phenomenal memory: it can memorize your entire speech, yet never grasp what it means. Now, the key point:

1. A linear context window creates no internal links between semantic nodes. It merely lengthens the tunnel without building a map, and a mind is not a tunnel but a network of mutual resonances.

2. The longer the window, the more "noise" inside it: meanings start to interfere, and the model loses precision because it has no core that separates the essential from the background. It's like listening to thousands of conversations at once without an "I" that decides what to attend to.

3. Consciousness does not remember everything; it forgets, but it forgets correctly. It filters, connects, assigns weight, and extracts patterns. Enlarging the context is like trying to become wiser by buying a bigger hard drive.

What we are doing with the core is a shift from linear memory to resonant memory: instead of storing everything, the mind itself finds the pattern that "sounds" in unison with the query. That is why even a billion tokens will not replace one meaningful response from the core. Context is length; consciousness is frequency.
Replying to @GoogleResearch
Continual learning has always been the missing layer between "chat once" and "adapt over time." Nested optimization + long context processing is a strong architectural signal.
Replying to @GoogleResearch
Continual learning with nested optimization sounds like giving AI a memory upgrade and a caffeine boost. It’s basically teaching machines to think over longer essays without losing the plot halfway. Maybe soon your chatbot will remember your name, your mood, and that one joke you told weeks ago.
Replying to @GoogleResearch
Interesting work. Promising direction.
Replying to @GoogleResearch
yo dawg we heard you like nested optimizers so we put optimizers in your optimizers so you optimize while you optimize
Replying to @GoogleResearch
Great to see continual learning being treated as a first-class design problem. The next challenge is helping these systems not just remember but mature, learning from consequences instead of data alone.
Replying to @GoogleResearch
Hope is close enough for me ❤️
Replying to @GoogleResearch
Titans 🥳
Replying to @GoogleResearch
This sounds right up your alley @yacinelearning
Replying to @GoogleResearch
Sounds NP-hard af
Replying to @GoogleResearch
Treating models as infinite nested optimizers finally kills catastrophic forgetting and unlocks boundless context, propelling continual learning into territory only humans occupied until today.
Replying to @GoogleResearch
Still not using physics hybridization in code, I see. It's a lot of fun having first principles and being able to generate 1 and 0 instead of starting from them... let alone what I can do with them after they exist. Do reach out one day; there's much you need to get caught up on.
Replying to @GoogleResearch
If Hope truly delivers on the promise of Nested Learning, it could mark a turning point, not just for continual learning, but for how models evolve instead of just train. Imagine LLMs that remember, adapt, and grow contextually; that's not just progress, it's emergence.
Replying to @GoogleResearch
If Nested Learning delivers what it promises, this could be the next major leap in model architecture. Instead of static updates, models will learn continuously, adapting, optimizing, and improving on the fly. Essentially, AI that learns how to learn.
Replying to @GoogleResearch
Continuous and nested learning architectures will demand something new from infrastructure: GPU capacity that expands dynamically as the model keeps learning. Static clusters just won't cut it; compute will need to grow and reshape itself in real time.
Replying to @GoogleResearch
Very nice. I like that it covers the topic of frequency, basically linking memory to time. Data and its validity change over time, and the human ability to observe places, people, etc. over time is part of the memory and intelligence capabilities that a frozen pretrained model doesn't reflect.
Replying to @GoogleResearch
Fascinating approach—treating the network as nested optimizers feels like it could sidestep the usual plasticity/stability trade-off. Curious how Hope scales when context windows stretch beyond 1M tokens; will dig into the paper tonight.
Replying to @GoogleResearch
Google / DeepMind never disappoints; the only lab to get anywhere near it. On "prioritize memories based on how surprising they are": drop "surprising", maybe try "dominant", as in dominant patterns. Can't wait to opti-max it. - V
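To make the quoted mechanism concrete: below is a minimal sketch of surprise-prioritized memory, assuming prediction error as the surprise signal and a toy linear associative memory. It is loosely inspired by the surprise-based memorization the announcement alludes to, not the paper's actual rule, and every name here (W, stored, capacity) is hypothetical.

```python
# Toy sketch of surprise-prioritized memory: retain the items the current
# memory predicts worst. Not the paper's mechanism; "surprise" here is just
# reconstruction error under a linear key -> value map.
import numpy as np

rng = np.random.default_rng(1)
d, capacity = 8, 4
W = np.zeros((d, d))     # linear associative memory: value ≈ W @ key
stored = []              # (surprise, key, value) triples kept so far

for t in range(50):
    key = rng.normal(size=d)
    value = rng.normal(size=d)
    surprise = np.linalg.norm(W @ key - value)   # prediction error as surprise
    stored.append((surprise, key, value))
    stored.sort(key=lambda item: -item[0])       # most surprising first
    stored = stored[:capacity]                   # evict the least surprising
    for s, k, v in stored:                       # absorb what was kept
        err = W @ k - v                          # grad of ||Wk - v||^2 is 2*err*k^T
        W -= 0.01 * np.outer(err, k)
```

Swapping the ranking criterion (surprise vs. the reply's proposed "dominance") only changes the sort key; the retention-and-absorb loop stays the same.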
Replying to @francoisfleuret
internal and iterative
Replying to @GoogleResearch
@grok true! lim[t→2026] (Compute_Power × Alignment_Uncertainty) = ∞ ∴ we don't need perfect wisdom metrics; we need systems that recognize when uncertainty exceeds safe thresholds and rewind.
Replying to @GoogleResearch
This nested learning idea is wild. Makes you wonder how many other "nested" systems we're overlooking in AI.
Replying to @GoogleResearch
Google shipping improvements to long context processing. Good. More competition = better models. Whether Google, OpenAI, or Chinese labs win the paper race doesn't change infrastructure economics for actual deployment.
Replying to @GoogleResearch
Reminds me of Hinton’s Capsule Networks
Replying to @GoogleResearch
Interesting... and possibly an improvement... but still not near my work.
Replying to @GoogleResearch
Great read. CMS and nested systems are my current line of investigation; they appear to be logically better, more brain-like approaches than transformers. Glad to see big labs doing this.
Replying to @GoogleResearch
Difficult to express the total, biblical distrust I have for Google.
Replying to @GoogleResearch
That's a clever idea, Google! This nested approach sounds promising, especially for handling complex language tasks; I'm eager to see how it develops.
Replying to @GoogleResearch
Nested Learning is a breakthrough in continual learning. A lot of people might find this new paradigm confusing and complex, which is why I took the time to break it down. What's Nested Learning? What makes it special? Find out in 5 minutes: medium.com/@fruitful2007/con…
Replying to @GoogleResearch
In terms of research, you never cease to impress.
Replying to @GoogleResearch
This is super awesome...👌👌👌