This is one-shot assembly: you show examples of what to build, and the robot just does it. (See the original post: generalistai.com/blog)
To share more on how this works: the robot is controlled in real time by a neural network that takes in video pixels and outputs actions at 100Hz. The video below is part of the raw input passed directly into the model. I also like this view (at 1x speed) because it shows more of the subtle moments of dexterity near the fingertips, which I think are very cool 👌
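For a rough mental model of that loop, here is a minimal sketch of a pixels-in, actions-out control loop ticking at 100Hz. Every name, frame size, window length, and action dimension below is a placeholder assumption for illustration, not Generalist's actual interfaces or architecture.

```python
import time
import numpy as np

def get_camera_frame() -> np.ndarray:
    """Stand-in for grabbing the latest RGB frame from the robot's cameras (hypothetical size)."""
    return np.zeros((224, 224, 3), dtype=np.uint8)

def policy(frames: np.ndarray) -> np.ndarray:
    """Stand-in for the neural network: a short window of raw frames in, one low-level action out."""
    return np.zeros(16)  # placeholder 16-dim action vector

def send_action(action: np.ndarray) -> None:
    """Stand-in for the robot's low-level command interface."""
    pass

CONTROL_HZ = 100              # actions are emitted at 100Hz
PERIOD = 1.0 / CONTROL_HZ
frames: list[np.ndarray] = []

for _ in range(1000):         # ~10 seconds of closed-loop control
    t0 = time.perf_counter()
    frames.append(get_camera_frame())
    frames = frames[-8:]                  # keep a short rolling window of pixels
    action = policy(np.stack(frames))     # pixels in, action out
    send_action(action)
    # sleep off the rest of the 10ms budget to hold the 100Hz rate
    time.sleep(max(0.0, PERIOD - (time.perf_counter() - t0)))
```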
One-shot assembly seemed like a dream even just a year ago, and it's not easy. It requires both the high-level reasoning of "what to build" (recognizing the geometry of the structures presented by the human) and the low-level visuomotor control of "how to build it" (purposefully re-orienting individual pieces and nudging them together in place). While it's possible to manually engineer a complex system for this (e.g. with hierarchical control or explicit state representations), we were curious whether our own Foundation model could do it all end-to-end with just some post-training data.
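For concreteness, here is a minimal sketch of what post-training on demonstration data can look like in the end-to-end setting: supervised fine-tuning (plain behavior cloning) of a pretrained pixels-to-actions policy on a small set of demos. This is a generic recipe under assumed names and shapes, not Generalist's actual training code.

```python
import torch
import torch.nn as nn

# Hypothetical pretrained pixels-to-actions policy; a stand-in module with made-up sizes.
policy = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 256),
    nn.ReLU(),
    nn.Linear(256, 16),  # placeholder 16-dim action
)
optimizer = torch.optim.AdamW(policy.parameters(), lr=1e-4)

# Post-training data: (frames, action) pairs collected from demonstrations.
# Shapes are placeholders: 64x64 RGB frames, 16-dim actions.
demo_frames = torch.rand(512, 3, 64, 64)
demo_actions = torch.rand(512, 16)

# Behavior cloning: regress the demonstrated action directly from pixels.
for epoch in range(10):
    for i in range(0, len(demo_frames), 64):
        obs, target = demo_frames[i:i + 64], demo_actions[i:i + 64]
        loss = nn.functional.mse_loss(policy(obs), target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```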
Surprisingly, it just worked. Nothing about the recipe is substantially different from any other demo we’ve run in the past, and we’re excited about its implications for model capabilities:
• On contextual reasoning, these models can (i) attend to task-related pixels in the peripheral view of the video inputs, and (ii) retain this knowledge in-context while ignoring irrelevant background. This is useful for generalizing to a wide range of real workflows: e.g. paying attention to what’s coming down the conveyor line, or glancing at the instructions displayed on a nearby monitor.
• On dexterity, these models can produce contact-rich "commonsense" behaviors that can be difficult to pre-program or to describe with language instructions: e.g. rolling a brick slightly to align its studs against the bottom of another, re-grasping to get a better grip or to move out of the way before a forceful press, or gently pushing the corners of a brick against the mat to rotate it in hand and stand it up vertically (i.e. extrinsic dexterity).
These aspects work together to form a capability that resembles fast adaptation: a hallmark of intelligence, and one that matters for real use cases. This has also expanded my own perspective on what's possible with robot learning, with a recipe that's repeatable for many more skills.
This milestone stands on top of the solid technical foundations we’ve built here at Generalist: hardcore controls & hardware, models built entirely in-house, and a data engine that "just works." We're a small group of hyper-focused engineers, and hands-down the highest talent-density team I’ve ever worked with. We're accelerating and scaling aggressively towards unlocking next-generation robot intelligence. Building Legos is just one example, and it's clear to me that we're headed towards a future where robots can do just about anything we want them to.
It's coming, and we're going to make it happen.