Categories: FAANG

Matrix3D: Large Photogrammetry Model All-in-One

We present Matrix3D, a unified model that performs several photogrammetry subtasks, including pose estimation, depth prediction, and novel view synthesis using just the same model. Matrix3D utilizes a multi-modal diffusion transformer (DiT) to integrate transformations across several modalities, such as images, camera parameters, and depth maps. The key to Matrix3D’s large-scale multi-modal training lies in the incorporation of a mask learning strategy. This enables full-modality model training even with partially complete data, such as bi-modality data of image-pose and image-depth pairs…
AI Generated Robotic Content

Recent Posts

3 Actionable AI Recommendations for Businesses in 2026

TL;DR In 2026, the businesses that win with AI will do three things differently: redesign…

12 hours ago

Revolutionizing Construction

How Cavanagh and Palantir Are Building Construction’s OS for the 21st CenturyEditor’s Note: This blog post…

2 days ago

Building a voice-driven AWS assistant with Amazon Nova Sonic

As cloud infrastructure becomes increasingly complex, the need for intuitive and efficient management interfaces has…

2 days ago

Cloud CISO Perspectives: Our 2026 Cybersecurity Forecast report

Welcome to the first Cloud CISO Perspectives for December 2025. Today, Francis deSouza, COO and…

2 days ago

As AI Grows More Complex, Model Builders Rely on NVIDIA

Unveiling what it describes as the most capable model series yet for professional knowledge work,…

2 days ago