Pinned Tweet
Turbocharge speed & quality gains in Diffusion World Models! 🚨
- 8x8 AE w/ depth latents → 4x fewer tokens, 4x FPS boost
- 4x4 flow+depth AE in progress → next-level consistency
- DMD distillation: 16→2 steps = 8x faster sampling
- Custom RoPE fix → 20x faster attention
- Strong KV caching → O(n) rollout, targeting 400 FPS
- PoE on physics sims, 1 Kh dataset

What’s next: product-of-experts, full-time hires! 🧵 1/3
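Some back-of-envelope arithmetic behind the multipliers above, as a minimal sketch: the frame resolution is an illustrative assumption (not the actual training setting), and the 4x token reduction is read as doubling the spatial downsampling factor from 4 to 8.

```python
# Illustrative arithmetic only -- the resolution and 4x4 baseline are assumptions.
H, W = 360, 640                     # assumed frame resolution

tokens_4x4 = (H // 4) * (W // 4)    # 14400 tokens at 4x4 spatial compression
tokens_8x8 = (H // 8) * (W // 8)    # 3600 tokens at 8x8 spatial compression
print(tokens_4x4 / tokens_8x8)      # -> 4.0  (4x fewer tokens)

steps_before, steps_after = 16, 2   # DMD distillation: 16 -> 2 denoising steps
print(steps_before / steps_after)   # -> 8.0  (8x faster sampling)
```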
cooking simulator 2025, coming to a waypoint near you. Getting a steady 100fps+ on our test devices (frame rate in top right is for something else)
OWL retweeted
Is YOUR KV cache eating all your VRAM? Are you getting OOMs before your model can even generate a single frame? Is your GPU pleading with you to just go back to UNets? 🧵 1/N
The key (haha) is to differentiate your Keys and Values. They do different things and have different sensitivities. The big insight: an error in a Key derails the entire attention lookup, but the model is surprisingly robust to errors in Values. 2/N
Quantizing without losing performance is all about choosing what to quantize. We're chopping our KV cache in half by keeping Keys in BF16 while storing Values in FP8. A noisy Key is a wrong address. 🗺️ A noisy Value is just a fuzzy picture. 🖼️ 3/N
Compress your picture, but don't get your addresses wrong. Result: ~50% less KV cache memory with near-zero overhead, thanks to torch.compile. 4/N
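A minimal sketch of the mixed-precision KV cache this thread describes, as one plausible reading rather than the actual implementation: Keys stay in BF16, Values are stored in FP8 with per-token absmax scales and dequantized on read (a cast torch.compile can typically fuse away). The class and method names are hypothetical.

```python
import torch

class MixedPrecisionKVCache:
    """Keys in BF16 (errors derail the attention lookup),
    Values in FP8 (errors just blur the picture)."""

    def __init__(self, max_len, n_heads, head_dim, device="cuda"):
        self.k = torch.empty(max_len, n_heads, head_dim,
                             dtype=torch.bfloat16, device=device)
        self.v = torch.empty(max_len, n_heads, head_dim,
                             dtype=torch.float8_e4m3fn, device=device)
        self.v_scale = torch.empty(max_len, n_heads, 1,
                                   dtype=torch.bfloat16, device=device)
        self.len = 0

    def append(self, k_new, v_new):
        # k_new, v_new: (t, n_heads, head_dim) in BF16.
        t = k_new.shape[0]
        self.k[self.len:self.len + t] = k_new
        # Per-token absmax scaling keeps Values inside FP8's dynamic range.
        scale = v_new.abs().amax(dim=-1, keepdim=True).clamp(min=1e-4)
        self.v_scale[self.len:self.len + t] = scale
        self.v[self.len:self.len + t] = (v_new / scale).to(torch.float8_e4m3fn)
        self.len += t

    def read(self):
        k = self.k[:self.len]
        # Dequantize Values back to BF16 on the fly before attention.
        v = self.v[:self.len].to(torch.bfloat16) * self.v_scale[:self.len]
        return k, v
```

Per-token absmax is the simplest scaling scheme; finer-grained (per-channel or per-block) scales trade a little extra metadata for lower quantization error.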
We're incredibly excited to be crowdsourcing data for our model being trained with @nebiusai
Do you want to contribute to the future of open science world models? We are paying $5/h to gamers of all skill levels to record their gameplay w/ owl-control. 🧵 1/N
OWL retweeted
👀 World Models feel almost like magic. Finally starting to see some hope after countless hours of debugging with @praymesh and @Summer_1932005. Also, @wayfarerlabs has an amazing open-source ecosystem for training world models on any game you can think of.
We're building something truly amazing @wayfarerlabs.
Haven't done quantization or max-autotune yet; getting 30FPS just from basic compile on a laptop RTX 5070 :D Next step is to hit 60fps with fp8, then after that I gotta plug the upsampler in. These models will run on everyone's hardware!
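For reference, what "basic compile" vs. max-autotune means in stock PyTorch, shown with a stand-in module (not the actual world model):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 512), nn.GELU(), nn.Linear(512, 512)).cuda()

# "Basic compile": default Inductor settings, quick warm-up.
basic = torch.compile(model)

# "max-autotune": benchmarks candidate kernels at compile time for extra speed,
# at the cost of a much longer first-call warm-up.
tuned = torch.compile(model, mode="max-autotune")

x = torch.randn(1, 512, device="cuda")
with torch.no_grad():
    _ = basic(x)  # first call triggers compilation
    _ = tuned(x)
```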
they yearn for the holodeck 😔
We're hiring GAN people btw. DM if interested
I'm currently working on building out the optimization stuff, and even before quantization/proper compile we're hitting 30fps on a 5070, so this will run on consumer GPUs! While we get distillation and quantization sorted, it's fun to look at base model samples for motivation :D
Our models run in real time on a laptop 5090 at 100+ FPS. We're so excited for people to start playing with it.
.@shah_bu_land just engineered a new VAE. Is it better than the old one?
This AI-generated video game is running end-to-end on a mobile 5090 at 500FPS+. ⚡️ More coming soon.
Why are world models important? AKA "the simulator," "continually hallucinated reality," or "the Holodeck." The founders of wayfarerlabs.ai go in depth about what it all means. Grok, in reply below, tells you what you will learn by watching this.
So as expected, when we tried to generate higher resolutions, the flickering got worse! Fun to pause and look at the little details the model decides to generate. Temporal consistency and a better "weak" decoder are definitely needed!