Apple is partnering with Nvidia in an effort to improve the inference speed of artificial intelligence (AI) models. On Wednesday, the Cupertino-based tech giant announced that it has been researching inference acceleration on Nvidia's platform to see whether the efficiency of a large language model (LLM) can be improved while keeping latency in check. The iPhone maker used a technique dubbed Recurrent Drafter (ReDrafter), published in a research paper earlier this year, and combined it with Nvidia's TensorRT-LLM inference acceleration framework.
Apple Uses Nvidia Platform to Improve AI Performance
In a blog post, Apple researchers detailed the new collaboration with Nvidia and the LLM performance results it achieved. The company highlighted that it has been researching the problem of improving inference efficiency while maintaining latency in AI models.
Inference in machine learning refers to the process of making predictions, decisions, or conclusions from a given input using a trained model. Put simply, it is the processing step of an AI model where it decodes a prompt and generates an output, such as new tokens or predictions, that was not part of its training data.
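To make the distinction concrete, here is a minimal, hypothetical sketch of an inference step in Python; the tiny bigram table stands in for a trained model and is not related to Apple's or Nvidia's code:

```python
# Illustration only: the "trained" model here is a fixed table of next-word
# probabilities, and inference is the step of applying it to a new prompt.

# Hypothetical bigram model, assumed to have been learned in a training phase.
TRAINED_BIGRAMS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"down": 0.9, "up": 0.1},
}

def infer_next_word(prompt_word: str) -> str:
    """Inference step: use the frozen, trained model to predict the next word."""
    candidates = TRAINED_BIGRAMS.get(prompt_word, {})
    if not candidates:
        return "<unk>"
    # Pick the most probable continuation from the trained probabilities.
    return max(candidates, key=candidates.get)

print(infer_next_word("the"))  # -> "cat"
```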
Earlier this year, Apple published and open-sourced the ReDrafter technique, which brings a new approach to speculative decoding. Using a recurrent neural network (RNN) draft model, it combines beam search (a mechanism where the AI explores multiple candidate continuations in parallel) and dynamic tree attention (an attention mechanism applied over the tree of candidate tokens). The researchers stated that ReDrafter can speed up LLM token generation, producing as many as 3.5 tokens per generation step.
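For readers unfamiliar with speculative decoding in general, the sketch below shows the basic draft-and-verify loop that techniques like ReDrafter build on. The draft_model and target_model functions are hypothetical stand-ins, and this is not Apple's implementation, which additionally relies on an RNN drafter, beam search, and dynamic tree attention:

```python
import numpy as np

VOCAB = 50  # toy vocabulary size

def draft_model(context, k):
    # Hypothetical cheap drafter: proposes k candidate tokens for the next positions.
    rng = np.random.default_rng(sum(context) + len(context))
    return rng.integers(0, VOCAB, size=k).tolist()

def target_model(context):
    # Hypothetical large target model: returns its single most likely next token.
    rng = np.random.default_rng(sum(context) * 31 + len(context))
    return int(rng.integers(0, VOCAB))

def speculative_decode(prompt, steps=5, k=4):
    tokens = list(prompt)
    for _ in range(steps):
        proposal = draft_model(tokens, k)
        accepted = []
        for tok in proposal:
            # Verification: keep a drafted token only if the target model agrees.
            # (Shown token by token for clarity; real systems verify the whole draft
            # in a single forward pass of the target model, which is where the
            # speed-up comes from.)
            if target_model(tokens + accepted) == tok:
                accepted.append(tok)
            else:
                break
        # The target model always contributes at least one token, so decoding never stalls.
        accepted.append(target_model(tokens + accepted))
        tokens.extend(accepted)
    return tokens

print(speculative_decode([1, 2, 3]))
```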
While combining these techniques improved efficiency to a degree, Apple highlighted that it did not deliver a significant boost in speed on its own. To solve this, the researchers integrated ReDrafter into Nvidia's TensorRT-LLM inference acceleration framework.
As part of the collaboration, Nvidia added new operators and exposed existing ones to improve the speculative decoding process. The post claimed that when ReDrafter was used with the Nvidia platform, the researchers found a 2.7x speed-up in generated tokens per second for greedy decoding (a decoding strategy in which the model always picks the single most probable next token).
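As a point of reference, greedy decoding itself can be summarised in a few lines; the sketch below uses a hypothetical toy_logits function in place of a real LLM forward pass and is not TensorRT-LLM code:

```python
import numpy as np

VOCAB_SIZE = 1000  # toy vocabulary size

def toy_logits(token_ids):
    # Hypothetical model forward pass returning one score per vocabulary entry.
    rng = np.random.default_rng(sum(token_ids))
    return rng.standard_normal(VOCAB_SIZE)

def greedy_decode(prompt_ids, max_new_tokens=8):
    tokens = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = toy_logits(tokens)
        # Greedy step: always take the single highest-scoring token.
        tokens.append(int(np.argmax(logits)))
    return tokens

print(greedy_decode([42, 7]))
```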
Apple highlighted that this technology can be used to reduce the latency of AI processing while also using fewer GPUs and consuming less power.