
Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for GeForce RTX PC and NVIDIA RTX workstation users.
From games and content creation apps to software development and productivity tools, AI is increasingly being integrated into applications to enhance user experiences and boost efficiency.
Those efficiency boosts extend to everyday tasks, like web browsing. Brave, a privacy-focused web browser, recently launched a smart AI assistant called Leo AI that, in addition to providing search results, helps users summarize articles and videos, surface insights from documents, answer questions and more.
The technology behind Brave and other AI-powered tools is a combination of hardware, libraries and ecosystem software that's optimized for the unique needs of AI.
Why Software Matters
NVIDIA GPUs power the world's AI, whether running in the data center or on a local PC. They contain Tensor Cores, which are specifically designed to accelerate AI applications like Leo AI through massively parallel number crunching - rapidly processing the huge number of calculations needed for AI simultaneously, rather than doing them one at a time.
But great hardware only matters if applications can make efficient use of it. The software running on top of GPUs is just as critical for delivering the fastest, most responsive AI experience.
The first layer is the AI inference library, which acts like a translator that takes requests for common AI tasks and converts them to specific instructions for the hardware to run. Popular inference libraries include NVIDIA TensorRT, Microsoft's DirectML and the one used by Brave and Leo AI via Ollama, called llama.cpp.
Llama.cpp is an open-source library and framework. Through CUDA - the NVIDIA software application programming interface that enables developers to optimize for GeForce RTX and NVIDIA RTX GPUs - it provides Tensor Core acceleration for hundreds of models, including popular large language models (LLMs) like Gemma, Llama 3, Mistral and Phi.
On top of the inference library, applications often use a local inference server to simplify integration. The inference server handles tasks like downloading and configuring specific AI models so that the application doesn't have to.
Ollama is an open-source project that sits on top of llama.cpp and provides access to the library's features. It supports an ecosystem of applications that deliver local AI capabilities. Across the entire technology stack, NVIDIA works to optimize tools like Ollama for NVIDIA hardware to deliver faster, more responsive AI experiences on RTX.
Applications like Brave's Leo AI can access RTX-powered AI acceleration to enhance user experiences. NVIDIA's focus on optimization spans the entire technology stack - from hardware to system software to the inference libraries and tools that enable applications to deliver faster, more responsive AI experiences on RTX.
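To make the layering concrete, here is a minimal sketch of how an application might hand a prompt to a local inference server instead of loading a model itself. It assumes Ollama is running on its default local port (11434) and uses its `/api/generate` endpoint; the helper names `build_generate_request` and `ask_local_model` are hypothetical, and this is not a description of how Leo AI itself is implemented.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port (11434).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for the local inference server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # In non-streaming mode the generated text is in the "response" field.
    return body["response"]
```

With the server running and a model already downloaded, a call such as `ask_local_model("llama3", "Summarize this article in two sentences.")` would return the model's reply as a string. The application never touches model files or GPU settings; the server takes care of those.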
Local vs. Cloud
Brave's Leo AI can run in the cloud or locally on a PC through Ollama.
There are many benefits to processing inference using a local model. By not sending prompts to an outside server for processing, the experience is private and always available. For instance, Brave users can get help with their finances or medical questions without sending anything to the cloud. Running locally also eliminates the need to pay for unrestricted cloud access. With Ollama, users can take advantage of a wider variety of open-source models than most hosted services, which often support only one or two varieties of the same AI model.
Users can also interact with models that have different specializations, such as bilingual models, compact-sized models, code generation models and more.
RTX enables a fast, responsive experience when running AI locally. Using the Llama 3 8B model with llama.cpp, users can expect response speeds of up to 149 tokens per second - or approximately 110 words per second. When using Brave with Leo AI and Ollama, this means snappier responses to questions, requests for content summaries and more.
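The tokens-to-words conversion can be worked through in a few lines. The sketch below uses only the two figures quoted above (149 tokens per second, roughly 110 words per second); the implied words-per-token ratio and the 300-word summary length are illustrative assumptions, not measurements.

```python
def words_per_second(tokens_per_second: float, words_per_token: float) -> float:
    """Convert a token throughput into an approximate word throughput."""
    return tokens_per_second * words_per_token


def seconds_to_generate(word_count: int, tokens_per_second: float,
                        words_per_token: float) -> float:
    """Estimate how long a response of the given length takes to generate."""
    return word_count / words_per_second(tokens_per_second, words_per_token)


# Figures quoted in the article: 149 tokens/s is roughly 110 words/s,
# which implies about 0.74 words per token.
TOKENS_PER_SECOND = 149
WORDS_PER_TOKEN = 110 / 149

# Assumption for illustration: a 300-word summary.
print(round(words_per_second(TOKENS_PER_SECOND, WORDS_PER_TOKEN)))             # 110
print(round(seconds_to_generate(300, TOKENS_PER_SECOND, WORDS_PER_TOKEN), 1))  # 2.7
```

At that peak rate, a summary of a few hundred words would take only a few seconds to generate, not counting the time needed to process the prompt.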
NVIDIA internal throughput performance measurements on NVIDIA GeForce RTX GPUs, featuring a Llama 3 8B model with an input sequence length of 100 tokens, generating 100 tokens.
Get Started With Brave With Leo AI and Ollama
Installing Ollama is easy - download the installer from the project's website and let it run in the background. From a command prompt, users can download and install a wide variety of supported models, then interact with the local model from the command line.
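On the command line, downloading and chatting with a model is typically done with `ollama pull <model>` and `ollama run <model>`. To check from a script which models are already installed, one option is to ask the background server directly. The sketch below assumes Ollama's default local port and its `/api/tags` endpoint; the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is running in the background on its default port.
TAGS_URL = "http://localhost:11434/api/tags"


def parse_model_names(tags_response: dict) -> list[str]:
    """Extract model names from the JSON body returned by /api/tags."""
    return [model["name"] for model in tags_response.get("models", [])]


def list_installed_models(timeout: float = 10.0) -> list[str]:
    """Ask the local server which models have been downloaded."""
    with urllib.request.urlopen(TAGS_URL, timeout=timeout) as response:
        return parse_model_names(json.load(response))
```

Calling `list_installed_models()` after pulling a model should return a list that includes its name, which is a quick way to confirm the server is up before pointing an application at it.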
For simple instructions on how to add local LLM support via Ollama, read Brave's blog. Once configured to point to Ollama, Leo AI will use the locally hosted LLM for prompts and queries. Users can also switch between cloud and local models at any time.
Brave with Leo AI running on Ollama and accelerated by RTX is a great way to get more out of your browsing experience. You can even summarize and ask questions about AI Decoded blogs! Developers can learn more about how to use Ollama and llama.cpp in the NVIDIA Technical Blog.
Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what's new and what's next by subscribing to the AI Decoded newsletter.