3D Vision Researcher | Hello Algo (《Hello 算法》) hello-algo.com

Shanghai
Joined November 2021
Yudong Jin retweeted
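Kane (凯恩) is preparing for next year's CSP-J (this year there is a 12-year-old age limit). After searching around, we found that the best algorithms book is @krahets's Hello Algo (《Hello算法》). We bought the Python edition (open-source editions in other languages are available online), but modern C++ syntax doesn't actually look much different from Python, so Kane has no trouble reading it. This book really is readable for adults and kids alike.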
Yudong Jin retweeted
Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models
Contributions:
• We introduce Diffuman4D, a novel diffusion model that generates spatio-temporally consistent and high-resolution (1024p) human videos from sparse-view video inputs.
• We propose a sliding iterative denoising mechanism that enhances both the spatial and temporal consistency of generated long-term videos while maintaining efficient inference.
• We design a human pose conditioning scheme to enhance the appearance quality and motion accuracy of generated human videos.
• We plan to release our processed version of the DNA-Rendering dataset, which we believe will benefit future research in this area.
Want to model reflective scenes and render them in real-time? Check out EnvGS!
EnvGS: Modeling View-Dependent Appearance with Environment Gaussian
Contributions:
• We propose a novel scene representation for accurately modeling complex near-field and high-frequency reflections in real-world environments.
• We developed a real-time ray-tracing renderer for 2DGS, enabling joint optimization of our representation for accurate scene reconstruction while achieving real-time rendering speeds.
• Extensive experiments show that EnvGS significantly outperforms previous methods. To the best of our knowledge, EnvGS is the first method to achieve real-time photorealistic specular reflections synthesis in real-world scenes.
Yudong Jin retweeted
Check out our new work, Prompt Depth Anything, which achieves accurate metric depth estimation at up to 4K resolution! Thanks to all our collaborators!
Want to use Depth Anything, but need metric depth rather than relative depth? Thrilled to introduce Prompt Depth Anything, a new paradigm for accurate metric depth estimation with up to 4K resolution.
👉Key Message: Depth foundation models like DA have already internalized rich geometric knowledge of the 3D world but lack a proper way to elicit it. Inspired by the success of prompting in LLMs, we propose prompting Depth Anything with metric cues to produce metric depth. This method proves to be very effective when using a low-cost lidar (e.g., iPhone's LiDAR), which is widely available, as prompts. We believe the prompt can generalize to other forms as long as scale information is provided.
Prompt Depth Anything offers
1⃣A series of models for iPhone lidars.
2⃣4D reconstruction from monocular videos (captured with iPhone).
3⃣Improved generalization ability for robot manipulation, e.g., training on cans but generalizing to glasses.
4⃣More detailed depth annotations for the ScanNet++ dataset.
The first author is our excellent intern @HaotongLin.
Paper: huggingface.co/papers/2412.1…
Huggingface: huggingface.co/papers/2412.1…
Project Page: promptda.github.io
Code: github.com/DepthAnything/Pro…
Yudong Jin retweeted
CAT3D + time => CAT4D! 🐈 Check out our latest work on turning text/image(s)/video into dynamic 3D models that one can explore in real time, led by brilliant @ChrisWu6080!
🚀 Introducing CAT4D! 🚀 CAT4D transforms any real or generated video into dynamic 3D scenes with a multi-view video diffusion model. The outputs are dynamic 3D models that we can freeze and look at from novel viewpoints, in real-time! Be sure to try our interactive viewer!
Awesome! The transformer version of cnn-explainer.
Project #2: LLM Visualization So I created a web-page to visualize a small LLM, of the sort that's behind ChatGPT. Rendered in 3D, it shows all the steps to run a single token inference. (link in bio)
Yudong Jin retweeted
🗣️
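Today I read a comment from a reader, and it took me a long time to calm down... May hard work always pay off!
Article link attached (Mr. Ruan is the GOAT! 😭) ruanyifeng.com/blog/2010/08/…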
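A while ago I reread the blog post Mr. Ruan wrote in 2010, "On the IT Publishing Industry" (「关于 IT 出版业」), and it struck a chord. Fifteen years on, the bestseller C++ Primer discussed in the post still ranks near the top, which shows real staying power. That would be hard to imagine on an internet platform. Topics such as "author royalties" and "translator pay" read as if time had stood still. Publishing is an industry with "the power of insensitivity" (钝感力).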
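Friday sharing
- Hello Algo (image 1): an open-source introductory book on algorithms hello-algo.com/chapter_paper…
- StockCake (image 2): unlimited downloads of copyright-free AI-generated images stockcake.com/
- KanjiVG (image 3): downloadable SVG files of Chinese characters, with stroke animations kanjivg.tagaini.net/index.ht…
#科技爱好者周刊 (Issue 294) ruanyifeng.com/blog/2024/03/…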
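❔ Why should we learn data structures and algorithms? 📗 What does the print edition of Hello Algo (《Hello算法》) look like? 🌈 Why make an open-source book? I'm a new uploader (UP 主), so please be kind and give the video a like, a coin, and a favorite~ bilibili.com/video/BV1QH4y15…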
“Son Goku from Dragon Ball is the ultimate Shōnen Jump model that made me think ‘now THIS is a main character’ I wanted a character like Goku in my manga. It’s that clear & simple mindset that makes readers feel great. It motivated me too. That’s my image of a Hero” — Kishimoto
Yudong Jin retweeted
Announcing Stable Diffusion 3, our most capable text-to-image model, utilizing a diffusion transformer architecture for greatly improved performance in multi-subject prompts, image quality, and spelling abilities. Today, we are opening the waitlist for early preview. This phase is crucial for gathering insights to improve its performance and safety ahead of open release. You can sign up to join the waitlist and learn more here: bit.ly/3OR2qQF #stablediffusion3 Prompt: Epic anime artwork of a wizard atop a mountain at night casting a cosmic spell into the dark sky that says "Stable Diffusion 3" made out of colorful energy
Yudong Jin retweeted
Introducing Sora, our text-to-video model. Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. openai.com/sora Prompt: “Beautiful, snowy Tokyo city is bustling. The camera moves through the bustling city street, following several people enjoying the beautiful snowy weather and shopping at nearby stalls. Gorgeous sakura petals are flying through the wind along with snowflakes.”
Yudong Jin retweeted
ok this demo is better than apple's actual ads for vision pro 🤯
Yudong Jin retweeted
It’s incredibly hard to execute a vision this simple
Yudong Jin retweeted
Introducing Multi Motion Brush. Control multiple areas of your video generations with independent motion. Available now for Gen-2 at runwayml.com
Yudong Jin retweeted
If you cannot explain something in simple terms, you don't understand it.