Want to use Depth Anything, but need metric depth rather than relative depth?
Thrilled to introduce Prompt Depth Anything, a new paradigm for accurate metric depth estimation at up to 4K resolution.
👉Key Message: Depth foundation models like Depth Anything (DA) have already internalized rich geometric knowledge of the 3D world but lack a proper way to elicit it. Inspired by the success of prompting in LLMs, we propose prompting Depth Anything with metric cues to produce metric depth. This proves very effective when the prompt comes from a low-cost, widely available LiDAR (e.g., the iPhone's LiDAR). We believe the prompt can generalize to other forms as long as it provides scale information.
Prompt Depth Anything offers:
1⃣A series of models for iPhone LiDAR.
2⃣4D reconstruction from monocular videos (captured with iPhone).
3⃣Improved generalization ability for robot manipulation, e.g., training on cans and generalizing to glasses.
4⃣More detailed depth annotations for the ScanNet++ dataset.
The first author is our excellent intern @HaotongLin.
Paper: huggingface.co/papers/2412.1…
Hugging Face: huggingface.co/papers/2412.1…
Project Page: promptda.github.io
Code: github.com/DepthAnything/Pro…