PhD student @ Tsinghua University.

Joined May 2020
🚀 Introducing Nano3D — a training-free framework for precise, coherent 3D object editing without masks! By integrating FlowEdit into TRELLIS and introducing Voxel/Slat-Merge, Nano3D preserves structure & consistency while delivering superior 3D quality.
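A minimal sketch of the mask-free merge idea behind Voxel/Slat-Merge, assuming the source and edited objects live in dense voxel feature grids; the function name, threshold, and normalization below are illustrative stand-ins, not Nano3D's actual implementation:

```python
import torch

def voxel_merge(src_feats, edit_feats, tau=0.5):
    """Hypothetical mask-free merge of two voxel feature grids.

    Voxels whose edited features stay close to the source are copied
    from the source grid (preserving original structure); voxels that
    changed substantially take the edited features.
    src_feats, edit_feats: (C, D, H, W) feature grids.
    """
    # Per-voxel change magnitude, normalized to [0, 1].
    delta = (edit_feats - src_feats).norm(dim=0)
    delta = delta / (delta.max() + 1e-8)
    keep_src = (delta < tau).unsqueeze(0)  # low-change voxels keep source
    return torch.where(keep_src, src_feats, edit_feats)

# Toy usage on random grids with a localized "edit".
src = torch.randn(8, 16, 16, 16)
edit = src + 0.1 * torch.randn_like(src)
edit[:, :4] += 2.0
merged = voxel_merge(src, edit)
```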
Zhengyi Wang retweeted
This is very solid and promising research that scales consistency models to 10B+ video diffusion models. The combination of sCM and Variational Score Distillation is a compelling direction for few-step generation!
🚀Try out rCM—the most advanced diffusion distillation!
✅ First to scale sCM/MeanFlow up to 10B+ video models
✅ Open-sourced FlashAttention-2 JVP kernel & FSDP/CP support
✅ High-quality, diverse videos in 2–4 steps
Paper: arxiv.org/abs/2510.08431
Code: github.com/NVlabs/rcm
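For context on the JVP kernel: sCM-style consistency training differentiates the model along a tangent direction, which is a Jacobian-vector product. Below is a tiny reference illustration with torch.func.jvp on plain attention; the open-sourced kernel fuses this computation into FlashAttention-2, so this snippet is only a conceptual sketch, not rCM's code:

```python
import torch
from torch.func import jvp

# Plain scaled dot-product attention as a reference function.
def attention(q, k, v):
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ v

q = torch.randn(2, 16, 64)
k = torch.randn(2, 16, 64)
v = torch.randn(2, 16, 64)
tq, tk, tv = (torch.randn_like(t) for t in (q, k, v))

# out: attention output; out_tangent: its directional derivative along
# (tq, tk, tv) -- the quantity consistency training differentiates through.
out, out_tangent = jvp(attention, (q, k, v), (tq, tk, tv))
```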
Meet RDT2, our latest foundation model that zero-shot deploys on any robot arm with unseen scenes, objects & instructions.🔥 Fully open-sourced: github.com/thu-ml/RDT2 Project page: rdt-robotics.github.io/rdt2/
😠💢😵‍💫Tired of endless data collection & fine-tuning every time you try out a VLA? Meet RDT2, the first foundation model that zero-shot deploys on any robot arm with unseen scenes, objects & instructions. No collection. No tuning. Just plug and play🚀 Witness a clear sign of embodied superintelligence:
- 7B one-step diffusion → 23 Hz inference⚡
- Re-designed UMI (from @chichengcc @SongShuran) and manufactured 100 portable devices
- Trained on 10K hours of UMI data from 100 real houses
- Zero-shot: pick, place, press, wipe… open-vocabulary
- Demos: blocks 30 m/s arrows within 500 ms🛡️; first to play ping-pong with an end-to-end model 🏓; extinguishes burning incense with a quick shake🥢
Fully open source at github.com/thu-ml/RDT2 Project page: rdt-robotics.github.io/rdt2/ Thanks to awesome collaborators @bang_guo96535 @D0g4M74794 @EthanNg51931527
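A toy sketch of why one-step diffusion enables 23 Hz control: sampling is a single forward pass from noise to an action chunk, with no iterative denoising loop. ToyPolicy and its interface are hypothetical stand-ins, not RDT2's API:

```python
import torch
import torch.nn as nn

class ToyPolicy(nn.Module):
    """Hypothetical stand-in for a one-step diffusion policy head."""
    def __init__(self, obs_dim=32, action_dim=7, horizon=8):
        super().__init__()
        self.net = nn.Linear(obs_dim + horizon * action_dim,
                             horizon * action_dim)
        self.horizon, self.action_dim = horizon, action_dim

    def forward(self, noise, obs):
        x = torch.cat([obs, noise.flatten(1)], dim=-1)
        return self.net(x).view(-1, self.horizon, self.action_dim)

@torch.no_grad()
def act(policy, obs):
    # One-step sampling: a single forward pass maps Gaussian noise plus
    # the observation straight to an action chunk. No denoising loop,
    # so the control rate is the cost of one network call.
    noise = torch.randn(1, policy.horizon, policy.action_dim)
    return policy(noise, obs)

actions = act(ToyPolicy(), torch.randn(1, 32))
```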
Zhengyi Wang retweeted
Cool
Vibe coding with @xai Grok 4 and @Alibaba_Qwen Image on my phone
So excited to share Qwen-Image—a 20B MMDiT model! 🚀 It’s been amazing to watch its accurate text rendering emerge and steadily improve during training. It’s also beginning to show preliminary abilities in understanding 3D space and handling spatial transformations.
🚀 Meet Qwen-Image — a 20B MMDiT model for next-gen text-to-image generation. Especially strong at creating stunning graphic posters with native text. Now open-source.
🔍 Key Highlights:
🔹 SOTA text rendering — rivals GPT-4o in English, best-in-class for Chinese
🔹 In-pixel text generation — no overlays, fully integrated
🔹 Bilingual support, diverse fonts, complex layouts
🎨 Also excels at general image generation — from photorealistic to anime, impressionist to minimalist. A true creative powerhouse.
Blog: qwenlm.github.io/blog/qwen-i…
Hugging Face: huggingface.co/Qwen/Qwen-Ima…
ModelScope: modelscope.cn/models/Qwen/Qw…
Github: github.com/QwenLM/Qwen-Image
Technical report: qianwen-res.oss-cn-beijing.a…
Demo: modelscope.cn/aigc/imageGene…
DeepMesh V2 drops soon! Upgraded autoregressive 3D mesh generator.🔥🔥🔥
🚀 Introducing ShapeLLM-Omni, a 3D-native multimodal large language model fine-tuned from Qwen2.5-VL-7B. It builds on a voxel-based 3D VQVAE and a 2.56M-dialogue 3D-Alpaca dataset, enabling four tasks: text-to-3D, image-to-3D, 3D comprehension, and 3D editing. Code, model, and data are open-sourced!
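A minimal sketch of the voxel-tokenization step a 3D VQVAE performs, assuming flattened voxel features and a learned codebook; the shapes and codebook size here are illustrative, not ShapeLLM-Omni's actual configuration:

```python
import torch

def quantize(voxel_feats, codebook):
    """Hypothetical nearest-neighbor VQ step: map continuous voxel
    features to discrete codebook indices, i.e. the 3D "tokens" a
    language model can be fine-tuned to read and emit.

    voxel_feats: (N, C) flattened voxel features; codebook: (K, C).
    """
    dists = torch.cdist(voxel_feats, codebook)  # (N, K) pairwise distances
    return dists.argmin(dim=-1)                 # (N,) token ids

feats = torch.randn(16 ** 3, 64)   # a flattened 16^3 voxel grid
codebook = torch.randn(8192, 64)   # illustrative codebook size
tokens = quantize(feats, codebook)
```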
Illustration video for DeepMesh, our latest model for 3D mesh generation! 🔥 DeepMesh generates high-quality meshes from raw point clouds. It can also refine existing meshes, improving their structure and quality. Open-sourced: github.com/zhaorw02/DeepMesh
Thanks to @_akhaliq for sharing our work! We're thrilled to announce DeepMesh, our latest auto-regressive artist-mesh generative model. The model weights and inference code are fully open-sourced!🎉 Code: github.com/zhaorw02/DeepM… Project page: zhaorw02.github.io/DeepMesh/
DeepMesh is out on Hugging Face
Auto-Regressive Artist-mesh Creation with Reinforcement Learning
Conditioned on point clouds and images, DeepMesh generates meshes with intricate details and precise topology, outperforming state-of-the-art methods in both precision and quality.
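A toy sketch of the greedy decode loop such token-based mesh generators run, with a hypothetical stand-in decoder; DeepMesh's real tokenizer, vocabulary, and point-cloud conditioning are far more elaborate:

```python
import torch
import torch.nn as nn

class ToyMeshDecoder(nn.Module):
    """Hypothetical stand-in for an autoregressive mesh-token decoder."""
    def __init__(self, vocab=128, dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, tokens, cond):
        # Real models attend over point-cloud features; here we just
        # add a conditioning vector to each token embedding.
        return self.head(self.emb(tokens) + cond)

@torch.no_grad()
def generate(model, cond, max_len=64, bos=0, eos=1):
    # Greedy autoregressive decode: emit one discretized-mesh token at
    # a time, conditioned on the point-cloud embedding, until EOS.
    tokens = torch.tensor([[bos]])
    for _ in range(max_len):
        nxt = model(tokens, cond)[:, -1].argmax(-1, keepdim=True)
        tokens = torch.cat([tokens, nxt], dim=1)
        if nxt.item() == eos:
            break
    return tokens

seq = generate(ToyMeshDecoder(), cond=torch.randn(1, 1, 32))
```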
🚀 So excited to see LLaMa-Mesh integrated into Blender with #meshgen! 🎉 Now you can generate 3D meshes locally with AI in Blender. Open-source and available now! 🙌 #AI #3D #Blender #LLaMaMesh #OpenSource
Generate meshes with AI locally in Blender 📢 meshgen, a local Blender integration of LLaMa-Mesh, is now open source and available 🤗
Zhengyi Wang retweeted
LLaMa-Mesh running locally in Blender. Official @huggingface release soon 🤗
🚀 Introducing LLaMA-Mesh! 🎉 We fine-tuned LLaMA on 3D mesh data, enabling LLMs to natively generate 3D meshes through chat while retaining their original language capabilities. ✨ Model weights and inference code are fully open-sourced. 🌐 Project page: research.nvidia.com/labs/tor…
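Natively generating meshes through chat works because a mesh can be serialized as plain text the LLM emits, e.g. OBJ-style vertex/face lines. A minimal parser sketch follows; LLaMA-Mesh's coordinate quantization and exact format details are omitted, this only illustrates the mesh-as-text idea:

```python
def parse_obj(text):
    """Parse OBJ-style mesh text (the plain-text format an LLM can emit
    directly) into vertex and face lists. Minimal sketch: triangles
    only, no normals or UVs."""
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            vertices.append(tuple(float(x) for x in parts[1:4]))
        elif parts[0] == "f":
            # OBJ face indices are 1-based.
            faces.append(tuple(int(p.split("/")[0]) - 1 for p in parts[1:4]))
    return vertices, faces

# A single tetrahedron as an LLM might print it in a chat reply.
mesh_text = """v 0 0 0
v 1 0 0
v 0 1 0
v 0 0 1
f 1 2 3
f 1 2 4
f 1 3 4
f 2 3 4"""
verts, faces = parse_obj(mesh_text)
```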
CRM was accepted to #ECCV2024! CRM generates a high-fidelity 3D textured mesh from a single image in 10 seconds. 🔥
CRM
Single Image to 3D Textured Mesh with Convolutional Reconstruction Model
Feed-forward 3D generative models like the Large Reconstruction Model (LRM) have demonstrated exceptional generation speed. However, the transformer-based methods do not leverage the geometric…
High-resolution 4D results.
Vidu4D
Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels
Video generative models are receiving particular attention given their ability to generate realistic and imaginative frames. Besides, these models are also observed to exhibit strong…
Amazing video generation results🔥
Thanks, everyone, for spreading the word about Vidu 🚀: elevating video creation with our revolutionary U-ViT tech. Vidu supports crafting 16-second, 1080p HD videos with multi-camera shots and seamless transitions. Explore the cutting edge of AI-driven video magic. Guess what's next! #Vidu #AIGC