Robotics & Computer Vision Professor at Georgia Tech, and part-time CAIO at Verdant Robotics. Before: stints at KUL, Skydio, Facebook B*8, Google AI.

San Mateo, CA
Joined June 2008
My Annual Reviews article on Factor Graphs in Robotics is finally out with a publicly accessible link: annualreviews.org/eprint/85P…
Wow, walking in a GS reconstruction is pretty cool. Try finding the castle (another environment) and entering it. It’s wild.
We turned the entire Murodo area around Mikurigaike, the most beautiful volcanic lake in the Northern Alps, into a 3D scene you can walk through with an avatar. Anyone can view it on Arrival Space. URL: arrival.space/murodou_mikuri…
I was one of the 510 (!!) area chairs for #ICCV2025 :-) Check out this report, with a great overview of best paper candidates.
We've released the ICCV 2025 Report! hirokatsukataoka.net/temp/pr… Compiled during ICCV in collaboration with LIMIT.Lab, cvpaper.challenge, and Visual Geometry Group (VGG), this report offers meta insights into the trends and tendencies observed at this year's conference. #ICCV2025
This! If we can’t all work in OCaml, at least impose a “non-imperative” coding discipline.
When I started working in python, I got lazy with “single assignment”, and I need to nudge myself about it. You should strive to never reassign or update a variable outside of true iterative calculations in loops. Having all the intermediate calculations still available is helpful in the debugger, and it avoids problems where you move a block of code and it silently uses a version of the variable that wasn’t what it originally had. In C/C++, making almost every variable const at initialization is good practice. I wish it was the default, and mutable was a keyword.
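A minimal Python sketch of the single-assignment discipline described in the quoted tweet; the function and variable names are hypothetical, used only for illustration:

```python
import numpy as np

def normalize_scores_mutating(scores: np.ndarray) -> np.ndarray:
    # Imperative style: `scores` is reassigned, so the intermediate values
    # are gone in the debugger, and a moved block of code may silently pick
    # up the wrong version of the variable.
    scores = scores - scores.mean()
    scores = scores / (scores.std() + 1e-9)
    return scores

def normalize_scores_single_assignment(scores: np.ndarray) -> np.ndarray:
    # Single-assignment style: every intermediate result gets its own name,
    # stays inspectable, and is never overwritten.
    centered = scores - scores.mean()
    normalized = centered / (centered.std() + 1e-9)
    return normalized
```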
Frank Dellaert retweeted
X-mas came early this year! Nvidia has just released the huge Physical AI AV Dataset - 1727 hrs of driving data: 310K clips of 20s - sensor rig: 7 cameras, lidar, radar - 25 countries, 2.5K cities across the US + Europe. Kudos to Kashyap Chitta et al.! huggingface.co/datasets/nvid…
Frank Dellaert retweeted
Representation representation representation #SpatialAI See the SLAM Handbook Chapter 18 for my views! github.com/SLAM-Handbook-con…
The hot topic at #ICCV2025 was World Models. They come in different flavors — (interactive) video models, neural simulators, reconstruction models, etc. — but the overarching goal is clear: Generative AI that predicts and simulates how the real world works.
Georgia Tech winning streak brings out the fans :-) Midtown Atlanta in the background.
The scene at Bobby Dodd Stadium at the start of the second quarter
Cool work by @taherehtoosi on how learned priors can be used to reinforce recognition of ambiguous stimuli, including visual illusions and Gestalt predictions.
How does our brain excel at complex object recognition, yet get fooled by simple illusory contours? What unifying principle governs all Gestalt laws of perceptual organization? We may have an answer: integration of learned priors through feedback. New paper with Ken Miller! 🧵
GLIM: amazing LIDAR-based SLAM built with @gtsam4
I overlaid the point-cloud map, semi-transparent, on an aerial photo of Tsukuba, and it matches almost perfectly. GLIM is amazing.
Frank Dellaert retweeted
🛰️ Excited to share Skyfall-GS - the FIRST method to create real-time navigable 3D cities from satellite imagery alone! We transform multi-view satellite images into immersive 3D scenes you can freely fly through! 🚁✨ 🌐 Project Page: skyfall-gs.jayinnn.dev 1/5
I guess Lego is trademarked :-)
🏆 Excited to share that BrickGPT (avalovelace1.github.io/Brick…) received the ICCV Best Paper Award! Our first author, @AvaLovelace0, will present it from 1:30 to 1:45 p.m. today in Exhibit Hall III. Huge thanks to all the co-authors @RuixuanLiu_ @RamananDeva @ChangliuL @junyanz89
GaussGym looks amazing!
Simulation drives robotics progress, but how do we close the reality gap? Introducing GaussGym: an open-source framework for learning locomotion from pixels with ultra-fast parallelized photorealistic rendering across >4,000 iPhone, GrandTour, ARKit, and Veo scenes! Thread 🧵
Codex is working incredibly well for me inside VS Code.
GPT-5-Codex, optimized for agentic coding, is rolling out to @code now! Try it out and let us know what you think. github.blog/changelog/2025-0…
Cool visualizations and fast bundle adjustment on GPU.
InstantSfM: Fully Sparse and Parallel Structure-from-Motion
TLDR: InstantSfM is a fully sparse and parallel Structure-from-Motion pipeline. It leverages GPU acceleration to achieve up to 40× speedup over traditional methods like COLMAP, while maintaining or improving reconstruction accuracy across diverse datasets.
Contributions:
• We extend sparse-aware bundle adjustment techniques to global positioning, introducing a complete global SfM system in PyTorch.
• We demonstrate state-of-the-art efficiency while achieving accuracy comparable to both traditional, well-established SfM pipelines and learning-based methods.
• We incorporate depth priors into the optimization so that camera parameters and 3D point clouds are recovered at metric scale.
InstantSfM significantly outperforms compared methods on several datasets.
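This is not the InstantSfM code, but a minimal PyTorch sketch of the basic idea behind GPU bundle adjustment: batch all reprojection residuals as tensor operations and let autodiff refine cameras and points jointly. The toy pinhole model, the identity rotations, and all names below are assumptions made purely for illustration.

```python
import torch

torch.manual_seed(0)
n_cams, n_pts = 4, 200
device = "cuda" if torch.cuda.is_available() else "cpu"

# Toy ground truth: points in front of the cameras, small camera translations.
points_gt = torch.randn(n_pts, 3, device=device) + torch.tensor([0.0, 0.0, 5.0], device=device)
t_gt = torch.randn(n_cams, 3, device=device) * 0.1
f = 500.0  # focal length in pixels

def project(points, t):
    # Pinhole projection of all points into all cameras (identity rotations assumed).
    p_cam = points[None, :, :] + t[:, None, :]    # (n_cams, n_pts, 3)
    return f * p_cam[..., :2] / p_cam[..., 2:3]   # (n_cams, n_pts, 2)

obs = project(points_gt, t_gt)  # noiseless synthetic observations

# Perturbed initial estimates, optimized jointly via autodiff.
points = (points_gt + 0.05 * torch.randn_like(points_gt)).requires_grad_()
t = (t_gt + 0.05 * torch.randn_like(t_gt)).requires_grad_()

opt = torch.optim.Adam([points, t], lr=1e-2)
for it in range(500):
    opt.zero_grad()
    residual = project(points, t) - obs   # all reprojection errors at once
    loss = (residual ** 2).mean()
    loss.backward()
    opt.step()

print(f"final mean squared reprojection error: {loss.item():.2e}")
```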
This DLR paper is quite remarkable: accurate touch localization without artificial skins, just force and torque sensors in the joints. There is some learning in this paper, but you don’t need it to get touch points/trajectories. Non-paywalled link below.
DLR researchers gave a robotic arm full-body touch sensitivity with no artificial skin needed. They used internal force-torque sensors at 8 kHz + deep learning. The robot can feel where you touch it, recognize letters drawn on its surface, and respond to virtual buttons placed anywhere on its body.
What's interesting is the infrastructure behind it. To train these models, you need high-frequency sensor streams, manifold learning to unfold trajectories, and the ability to iterate fast. They collected 2,300 samples from 20 people and hit 95.5% accuracy on digit recognition. This is what's possible when you have the right data infrastructure.
📄 lnkd.in/exgWfeXf
Video credit: @DLR_en
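For intuition on why learning isn't strictly required to get the touch points: the classic "intrinsic tactile sensing" argument is that a single point-contact force f applied at point r produces a moment m = r × f at the wrench sensor, so all candidate contact points lie on a line that can be intersected with the link surface. A toy Python sketch of that idea (not the DLR pipeline, which works from joint torque signals and adds learning for the letter recognition; all names here are hypothetical):

```python
import numpy as np

def contact_line(force: np.ndarray, moment: np.ndarray):
    """Return a point on, and the direction of, the line of candidate contact points."""
    point = np.cross(force, moment) / np.dot(force, force)
    direction = force / np.linalg.norm(force)
    return point, direction

# Simulate a touch at r_true with force fvec, then recover the line from the measured wrench.
r_true = np.array([0.10, 0.05, 0.30])   # contact point on the link, meters
fvec = np.array([1.0, -2.0, 0.5])       # applied force, Newtons
m = np.cross(r_true, fvec)              # moment measured at the sensor origin

p0, d = contact_line(fvec, m)
# The true contact point must lie on the recovered line: (r_true - p0) is parallel to d.
assert np.allclose(np.cross(r_true - p0, d), 0.0, atol=1e-9)
print("line point:", p0, "direction:", d)
```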
Amazing stuff!
Unitree G1 Kungfu Kid V6.0 A year and a half as a trainee — I'll keep working hard! Hope to earn more of your love🥰
Incredible work!
Real-time online 3D reconstruction of the 3D scene and of humans represented with SMPL. fanegg.github.io/Human3R/ I don't get tired of looking at these results
I'm absolutely right! (says Claude)
First Waymo ride in the ATL :-)
Agreed! Data will be plentiful once you have 100k humanoids in the world :-) Still, @rodneyabrooks made good points about getting the *right* kind of data. The experiment he wrote about, where a person with numbed fingertips struggled with manipulation, was especially insightful.
I think the real bitter lesson for robotics is going to be that *cheap robots deployed at scale* are going to let people solve the core robotics problems. And this is what we see from e.g. Unitree now, and increasingly companies like Galaxea.