
Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for GeForce RTX PC and NVIDIA RTX workstation users.
From games and content creation apps to software development and productivity tools, AI is increasingly being integrated into applications to enhance user experiences and boost efficiency.
Those efficiency boosts extend to everyday tasks, like web browsing. Brave, a privacy-focused web browser, recently launched a smart AI assistant called Leo AI that, in addition to providing search results, helps users summarize articles and videos, surface insights from documents, answer questions and more.
The technology behind Brave and other AI-powered tools is a combination of hardware, libraries and ecosystem software that's optimized for the unique needs of AI.
Why Software Matters
NVIDIA GPUs power the world's AI, whether running in the data center or on a local PC. They contain Tensor Cores, which are specifically designed to accelerate AI applications like Leo AI through massively parallel number crunching - rapidly processing the huge number of calculations needed for AI simultaneously, rather than doing them one at a time.
But great hardware only matters if applications can make efficient use of it. The software running on top of GPUs is just as critical for delivering the fastest, most responsive AI experience.
The first layer is the AI inference library, which acts like a translator that takes requests for common AI tasks and converts them to specific instructions for the hardware to run. Popular inference libraries include NVIDIA TensorRT, Microsoft's DirectML and the one used by Brave and Leo AI via Ollama, called llama.cpp.
Llama.cpp is an open-source library and framework. Through CUDA - the NVIDIA software application programming interface that enables developers to optimize for GeForce RTX and NVIDIA RTX GPUs - it provides Tensor Core acceleration for hundreds of models, including popular large language models (LLMs) like Gemma, Llama 3, Mistral and Phi.
On top of the inference library, applications often use a local inference server to simplify integration. The inference server handles tasks like downloading and configuring specific AI models so that the application doesn't have to.
Ollama is an open-source project that sits on top of llama.cpp and provides access to the library's features. It supports an ecosystem of applications that deliver local AI capabilities. Across the entire technology stack, NVIDIA works to optimize tools like Ollama for NVIDIA hardware to deliver faster, more responsive AI experiences on RTX.
Applications like Brave's Leo AI can access RTX-powered AI acceleration to enhance user experiences.
NVIDIA's focus on optimization spans the entire technology stack - from hardware to system software to the inference libraries and tools that enable applications to deliver faster, more responsive AI experiences on RTX.
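To make the layering concrete, here is a minimal sketch of how an application might hand a prompt to a local inference server instead of calling the inference library directly. It assumes Ollama is listening on its default local address and that a model tagged `llama3` has already been downloaded. The function names are illustrative, and this is not how Brave itself is implemented.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; adjust if the server is configured differently.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming text-generation request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["response"]
```

With a server running, `generate("llama3", "Summarize this article in two sentences: ...")` would return the model's reply as a string. The application never deals with model files or GPU setup, which is the point of the server layer.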
Local vs. Cloud
Brave's Leo AI can run in the cloud or locally on a PC through Ollama.
There are many benefits to processing inference using a local model. By not sending prompts to an outside server for processing, the experience is private and always available. For instance, Brave users can get help with their finances or medical questions without sending anything to the cloud. Running locally also eliminates the need to pay for unrestricted cloud access. With Ollama, users can take advantage of a wider variety of open-source models than most hosted services, which often support only one or two varieties of the same AI model.
Users can also interact with models that have different specializations, such as bilingual models, compact-sized models, code generation models and more.
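As a rough illustration of working with several local models, the sketch below asks a local Ollama server which models are already installed. It assumes the server's default address and its model-listing endpoint. The helper names and the sample model names in the usage note are illustrative only.

```python
import json
import urllib.request

# Assumption: Ollama's default local address and its model-listing endpoint.
TAGS_URL = "http://localhost:11434/api/tags"


def parse_model_names(tags_response: dict) -> list[str]:
    """Extract a sorted list of model names from a model-listing response body."""
    return sorted(entry["name"] for entry in tags_response.get("models", []))


def list_local_models(timeout: float = 10.0) -> list[str]:
    """Query the local server for the models that are currently installed."""
    with urllib.request.urlopen(TAGS_URL, timeout=timeout) as response:
        return parse_model_names(json.load(response))
```

For example, `parse_model_names({"models": [{"name": "mistral:latest"}, {"name": "llama3:latest"}]})` yields the two names in alphabetical order, so an application could offer them in a model picker.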
RTX enables a fast, responsive experience when running AI locally. Using the Llama 3 8B model with llama.cpp, users can expect responses at up to 149 tokens per second - or approximately 110 words per second. When using Brave with Leo AI and Ollama, this means snappier responses to questions, requests for content summaries and more.
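The arithmetic behind those figures can be sketched in a few lines. The words-per-token ratio below is simply the one implied by the article's own numbers (110 words for 149 tokens, about 0.74). Real ratios vary by model and text. The `eval_count` and `eval_duration_ns` inputs assume the token count and nanosecond timing that Ollama reports with a generated response, so treat this as an estimate rather than a benchmark method.

```python
# Ratio implied by the figures quoted above (110 words/s at 149 tokens/s); an approximation.
WORDS_PER_TOKEN = 110 / 149


def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Throughput from a generated-token count and a duration in nanoseconds."""
    return eval_count / (eval_duration_ns / 1e9)


def words_per_second(tps: float, words_per_token: float = WORDS_PER_TOKEN) -> float:
    """Convert token throughput to an approximate word throughput."""
    return tps * words_per_token


def generation_time_seconds(num_tokens: int, tps: float) -> float:
    """Time needed to generate num_tokens at a steady throughput."""
    return num_tokens / tps
```

At the quoted peak of 149 tokens per second, generating 100 tokens takes roughly two-thirds of a second, which is why short summaries feel close to instant.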
NVIDIA internal throughput performance measurements on NVIDIA GeForce RTX GPUs, featuring a Llama 3 8B model with an input sequence length of 100 tokens, generating 100 tokens.
Get Started With Brave With Leo AI and Ollama
Installing Ollama is easy - download the installer from the project's website and let it run in the background. From a command prompt, users can download and install a wide variety of supported models, then interact with the local model from the command line.
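The command-line steps above can also be scripted. The sketch below wraps the `ollama pull` and `ollama run` commands with Python's `subprocess` module. It assumes Ollama is installed and on the PATH. The model tag `llama3` in the usage note and the helper names are illustrative choices.

```python
import shutil
import subprocess


def pull_command(model: str) -> list[str]:
    """Command that downloads a model to the local machine."""
    return ["ollama", "pull", model]


def run_command(model: str, prompt: str) -> list[str]:
    """Command that sends a single prompt to a local model and exits."""
    return ["ollama", "run", model, prompt]


def ask_local_model(model: str, prompt: str) -> str:
    """Download the model if needed, then return its answer to one prompt."""
    if shutil.which("ollama") is None:
        raise RuntimeError("ollama was not found on PATH; install it first")
    subprocess.run(pull_command(model), check=True)
    result = subprocess.run(
        run_command(model, prompt), check=True, capture_output=True, text=True
    )
    return result.stdout.strip()
```

Calling `ask_local_model("llama3", "Why is the sky blue?")` is equivalent to typing the two commands by hand. The first call may take a while because the model has to be downloaded.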
For simple instructions on how to add local LLM support via Ollama, read Brave's blog. Once configured to point to Ollama, Leo AI will use the locally hosted LLM for prompts and queries. Users can also switch between cloud and local models at any time.
Brave with Leo AI running on Ollama and accelerated by RTX is a great way to get more out of your browsing experience. You can even summarize and ask questions about AI Decoded blogs! Developers can learn more about how to use Ollama and llama.cpp in the NVIDIA Technical Blog.
Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what's new and what's next by subscribing to the AI Decoded newsletter.