
Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for GeForce RTX PC and NVIDIA RTX workstation users.
From games and content creation apps to software development and productivity tools, AI is increasingly being integrated into applications to enhance user experiences and boost efficiency.
Those efficiency boosts extend to everyday tasks, like web browsing. Brave, a privacy-focused web browser, recently launched a smart AI assistant called Leo AI that, in addition to providing search results, helps users summarize articles and videos, surface insights from documents, answer questions and more.
The technology behind Brave and other AI-powered tools is a combination of hardware, libraries and ecosystem software that's optimized for the unique needs of AI.
Why Software Matters
NVIDIA GPUs power the world's AI, whether running in the data center or on a local PC. They contain Tensor Cores, which are specifically designed to accelerate AI applications like Leo AI through massively parallel number crunching - rapidly processing the huge number of calculations needed for AI simultaneously, rather than doing them one at a time.
But great hardware only matters if applications can make efficient use of it. The software running on top of GPUs is just as critical for delivering the fastest, most responsive AI experience.
The first layer is the AI inference library, which acts like a translator that takes requests for common AI tasks and converts them to specific instructions for the hardware to run. Popular inference libraries include NVIDIA TensorRT, Microsoft's DirectML and llama.cpp, the one used by Brave's Leo AI via Ollama.
Llama.cpp is an open-source library and framework. Through CUDA - the NVIDIA software application programming interface that enables developers to optimize for GeForce RTX and NVIDIA RTX GPUs - it provides Tensor Core acceleration for hundreds of models, including popular large language models (LLMs) like Gemma, Llama 3, Mistral and Phi.
On top of the inference library, applications often use a local inference server to simplify integration. The inference server handles tasks like downloading and configuring specific AI models so that the application doesn't have to.
Ollama is an open-source project that sits on top of llama.cpp and provides access to the library's features. It supports an ecosystem of applications that deliver local AI capabilities. Across the entire technology stack, NVIDIA works to optimize tools like Ollama for NVIDIA hardware to deliver faster, more responsive AI experiences on RTX.
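To give a rough sense of what this looks like from an application's side, the sketch below asks a local Ollama server which models it has already downloaded. It assumes Ollama's default local address (http://localhost:11434) and its /api/tags listing endpoint. The helper names and the sample payload are hypothetical and only illustrate the shape of the exchange.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434"


def parse_model_names(tags_json: str) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    payload = json.loads(tags_json)
    return [entry["name"] for entry in payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL) -> list[str]:
    """Ask a running Ollama server which models are already downloaded."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return parse_model_names(resp.read().decode("utf-8"))


# Illustrative payload in the shape /api/tags returns (trimmed to the name field).
sample = '{"models": [{"name": "llama3:latest"}, {"name": "mistral:latest"}]}'
print(parse_model_names(sample))  # ['llama3:latest', 'mistral:latest']
```

Because the server owns model storage, an application like this never needs to know where the model files live or how they were configured.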
Applications like Brave's Leo AI can access RTX-powered AI acceleration to enhance user experiences. NVIDIA's focus on optimization spans the entire technology stack - from hardware to system software to the inference libraries and tools that enable applications to deliver faster, more responsive AI experiences on RTX.
Local vs. Cloud
Brave's Leo AI can run in the cloud or locally on a PC through Ollama.
There are many benefits to processing inference using a local model. By not sending prompts to an outside server for processing, the experience is private and always available. For instance, Brave users can get help with their finances or medical questions without sending anything to the cloud. Running locally also eliminates the need to pay for unrestricted cloud access. With Ollama, users can take advantage of a wider variety of open-source models than most hosted services, which often support only one or two varieties of the same AI model.
Users can also interact with models that have different specializations, such as bilingual models, compact-sized models, code generation models and more.
RTX enables a fast, responsive experience when running AI locally. Using the Llama 3 8B model with llama.cpp, users can expect responses at up to 149 tokens per second - or approximately 110 words per second. When using Brave with Leo AI and Ollama, this means snappier responses to questions, requests for content summaries and more.
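The two figures quoted above imply roughly 0.74 words per token. The short sketch below uses that ratio to estimate a best-case generation time for a response of a given length. The function name is hypothetical. Because the 149 tokens-per-second figure is an "up to" number, the result is a lower bound on generation time, and it ignores the time spent processing the prompt.

```python
# Figures quoted in the text: peak throughput and its approximate word equivalent.
TOKENS_PER_SECOND = 149
WORDS_PER_SECOND = 110

# Implied conversion ratio between words and tokens (about 0.74).
words_per_token = WORDS_PER_SECOND / TOKENS_PER_SECOND


def seconds_to_generate(word_count: float) -> float:
    """Best-case time to generate `word_count` words at the quoted peak rate."""
    tokens_needed = word_count / words_per_token
    return tokens_needed / TOKENS_PER_SECOND


print(f"{words_per_token:.2f} words per token")                   # 0.74 words per token
print(f"{seconds_to_generate(330):.1f} s for a 330-word summary")  # 3.0 s for a 330-word summary
```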
NVIDIA internal throughput performance measurements on NVIDIA GeForce RTX GPUs, featuring a Llama 3 8B model with an input sequence length of 100 tokens, generating 100 tokens.
Get Started With Brave With Leo AI and Ollama
Installing Ollama is easy - download the installer from the project's website and let it run in the background. From a command prompt, users can download and install a wide variety of supported models, then interact with the local model from the command line.
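Besides the command line, a downloaded model can also be reached over Ollama's local HTTP API, which is how applications typically integrate with it. The following is a minimal sketch, not a definitive integration. It assumes the default address http://localhost:11434, the /api/generate endpoint and a model that was downloaded under the name llama3. The function names are hypothetical.

```python
import json
import urllib.request

# Assumption: default local Ollama address and its text-generation endpoint.
OLLAMA_GENERATE_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generation request for a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_GENERATE_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str, model: str = "llama3") -> str:
    """Send the prompt and return the generated text (needs Ollama running)."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Example call, left commented out because it needs a running server:
# print(ask_local_model("Summarize what an inference server does in one sentence."))
```

Setting "stream" to false returns the whole answer in one JSON object, which keeps the sketch short. An interactive assistant would more likely stream tokens as they are generated.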
For simple instructions on how to add local LLM support via Ollama, read Brave's blog. Once configured to point to Ollama, Leo AI will use the locally hosted LLM for prompts and queries. Users can also switch between cloud and local models at any time.
Brave with Leo AI running on Ollama and accelerated by RTX is a great way to get more out of your browsing experience. You can even summarize and ask questions about AI Decoded blogs! Developers can learn more about how to use Ollama and llama.cpp in the NVIDIA Technical Blog.
Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what's new and what's next by subscribing to the AI Decoded newsletter.