Apple Releases Depth Pro, an Open Source Monocular Depth Estimation AI Model

By: gadgets.ndtv.com

Oct 07 2024

Apple has released several open-source artificial intelligence (AI) models this year, most of them small language models designed for specific tasks. Adding to the list, the Cupertino-based tech giant has now released a new AI model dubbed Depth Pro. It is a vision model that can generate a depth map from a single (monocular) image. This technology is useful for generating 3D textures, for augmented reality (AR), and more. The researchers behind the project claim that the depth maps generated by the model are better than those produced with the help of multiple cameras.

Apple Releases Depth Pro AI Model

Depth estimation is an important process in 3D modelling as well as in other technologies such as AR, autonomous driving systems, and robotics. The human eye is a complex lens system that can accurately gauge the depth of objects even while observing them from a single-point perspective. Cameras, however, are not that good at it. An image taken with a single camera is two-dimensional, so the depth information in the scene is lost.

So, for technologies where the depth of an object plays an important role, multiple cameras are used. However, modelling objects like this can be time-consuming and resource-intensive. Instead, in a research paper titled “Depth Pro: Sharp Monocular Metric Depth in Less Than a Second”, Apple highlighted how it used a vision-based AI model to generate zero-shot depth maps of monocular images of objects.
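For context, the multi-camera approach mentioned above usually recovers depth by triangulation: a point appears shifted between two views, and the size of that shift (the disparity) determines its distance. The sketch below illustrates this standard stereo relation, depth = focal length x baseline / disparity, assuming two rectified, parallel cameras. It is not Depth Pro's method, and the function name and sample values are hypothetical.

```python
def stereo_depth_m(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth (in metres) of a point seen by two rectified, parallel cameras.

    focal_length_px: focal length expressed in pixels
    baseline_m:      distance between the two camera centres, in metres
    disparity_px:    horizontal shift of the point between the two images, in pixels
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity; negative is invalid.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical rig: 1000 px focal length, cameras 10 cm apart, 50 px disparity.
    print(stereo_depth_m(1000.0, 0.1, 50.0))
```

A monocular model such as Depth Pro has no second view to compare against, which is why it has to infer depth from the content of a single image instead.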

How the Depth Pro AI model generates depth maps
Photo Credit: Apple

To develop the AI model, the researchers used a Vision Transformer (ViT)-based architecture. The ViT encoders operate on 384 x 384 patches, while the input and processing resolution is fixed at 1536 x 1536, a multiple of the patch size, giving the AI model more room to capture fine details.
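To give a sense of scale between the two resolutions mentioned above, the sketch below counts how many 384 x 384 windows are needed to cover a 1536 x 1536 image. It assumes simple non-overlapping tiles, which is a simplification for illustration only; the article does not describe the tiling scheme Apple actually uses, and the helper name is hypothetical.

```python
def tile_boxes(image_size: int, tile_size: int) -> list[tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes of non-overlapping square tiles
    that cover a square image exactly."""
    if image_size % tile_size != 0:
        raise ValueError("image_size must be a multiple of tile_size")
    steps = range(0, image_size, tile_size)
    # Row-major order: top-left tile first, bottom-right tile last.
    return [(x, y, x + tile_size, y + tile_size) for y in steps for x in steps]


if __name__ == "__main__":
    boxes = tile_boxes(1536, 384)
    print(len(boxes))   # number of tiles in the 4 x 4 grid
    print(boxes[0], boxes[-1])
```

Under this assumption, each side of the image is split into four, giving 16 tiles per image.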

In the pre-print version of the paper, which is currently available on the pre-print server arXiv, the researchers claimed that the AI model can accurately generate depth maps of visually complex objects such as a cage, a furry cat's body and whiskers, and more. The generation time is said to be less than a second. The weights of the open-source AI model are currently hosted on a GitHub listing. Interested individuals can run inference with the model on a single GPU.
