Sony Pixel Power calrec Sony

Brave New World: Leo AI and Ollama Bring RTX-Accelerated Local LLMs to Brave Browser Users

02/10/2024

Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for GeForce RTX PC and NVIDIA RTX workstation users.

From games and content creation apps to software development and productivity tools, AI is increasingly being integrated into applications to enhance user experiences and boost efficiency.

Those efficiency boosts extend to everyday tasks, like web browsing. Brave, a privacy-focused web browser, recently launched a smart AI assistant called Leo AI that, in addition to providing search results, helps users summarize articles and videos, surface insights from documents, answer questions and more.

Leo AI helps users summarize articles and videos, surface insights from documents, answer questions and more. The technology behind Brave and other AI-powered tools is a combination of hardware, libraries and ecosystem software that's optimized for the unique needs of AI.

Why Software Matters NVIDIA GPUs power the world's AI, whether running in the data center or on a local PC. They contain Tensor Cores, which are specifically designed to accelerate AI applications like Leo AI through massively parallel number crunching - rapidly processing the huge number of calculations needed for AI simultaneously, rather than doing them one at a time.

But great hardware only matters if applications can make efficient use of it. The software running on top of GPUs is just as critical for delivering the fastest, most responsive AI experience.

The first layer is the AI inference library, which acts like a translator that takes requests for common AI tasks and converts them to specific instructions for the hardware to run. Popular inference libraries include NVIDIA TensorRT, Microsoft's DirectML and the one used by Brave and Leo AI via Ollama, called llama.cpp.

Llama.cpp is an open-source library and framework. Through CUDA - the NVIDIA software application programming interface that enables developers to optimize for GeForce RTX and NVIDIA RTX GPUs - provides Tensor Core acceleration for hundreds of models, including popular large language models (LLMs) like Gemma, Llama 3, Mistral and Phi.

On top of the inference library, applications often use a local inference server to simplify integration. The inference server handles tasks like downloading and configuring specific AI models so that the application doesn't have to.

Ollama is an open-source project that sits on top of llama.cpp and provides access to the library's features. It supports an ecosystem of applications that deliver local AI capabilities. Across the entire technology stack, NVIDIA works to optimize tools like Ollama for NVIDIA hardware to deliver faster, more responsive AI experiences on RTX.

Applications like Brave's Leo AI can access RTX-powered AI acceleration to enhance user experiences. NVIDIA's focus on optimization spans the entire technology stack - from hardware to system software to the inference libraries and tools that enable applications to deliver faster, more responsive AI experiences on RTX.

Local vs. Cloud Brave's Leo AI can run in the cloud or locally on a PC through Ollama.

There are many benefits to processing inference using a local model. By not sending prompts to an outside server for processing, the experience is private and always available. For instance, Brave users can get help with their finances or medical questions without sending anything to the cloud. Running locally also eliminates the need to pay for unrestricted cloud access. With Ollama, users can take advantage of a wider variety of open-source models than most hosted services, which often support only one or two varieties of the same AI model.

Users can also interact with models that have different specializations, such as bilingual models, compact-sized models, code generation models and more.

RTX enables a fast, responsive experience when running AI locally. Using the Llama 3 8B model with llama.cpp, users can expect responses up to 149 tokens per second - or approximately 110 words per second. When using Brave with Leo AI and Ollama, this means snappier responses to questions, requests for content summaries and more.

NVIDIA internal throughput performance measurements on NVIDIA GeForce RTX GPUs, featuring a Llama 3 8B model with an input sequence length of 100 tokens, generating 100 tokens. Get Started With Brave With Leo AI and Ollama Installing Ollama is easy - download the installer from the project's website and let it run in the background. From a command prompt, users can download and install a wide variety of supported models, then interact with the local model from the command line.

For simple instructions on how to add local LLM support via Ollama, read the company's blog. Once configured to point to Ollama, Leo AI will use the locally hosted LLM for prompts and queries. Users can also switch between cloud and local models at any time.

Brave with Leo AI running on Ollama and accelerated by RTX is a great way to get more out of your browsing experience. You can even summarize and ask questions about AI Decoded blogs! Developers can learn more about how to use Ollama and llama.cpp in the NVIDIA Technical Blog.

Generative AI is transforming gaming, videoconferencing and interactive experiences of all kinds. Make sense of what's new and what's next by subscribing to the AI Decoded newsletter.
LINK: https://blogs.nvidia.com/blog/rtx-ai-brave-browser/...
See more stories from nvidia

North America Stories

21/05/2026

CBS Sports Expands WNBA Tip-Off Show To Cover Half of 20-Game, Regular-Season Package

Game Creek Video Columbia and Celtic, NEP Supershooter 8 will house onsite produ...

21/05/2026

SVG Students To Watch: Christina Zelin, Rowan University

Freshly graduated, this upstart producer, director, and camera operator is already working as an AP on videoboard shows for the Philadelphia Phillies In the li...

21/05/2026

Media Links Announces Channel Partnership with Clearcast Asia Ahead of BroadcastAsia 2026

Media Links has announced a channel partnership with Clearcast Asia, a broadcast...

21/05/2026

SiriusXM and NASCAR Announce Multi-Year Broadcasting Agreement Renewal

SiriusXM and NASCAR have announced a multi-year renewal of their broadcasting agreement. SiriusXM will continue to carry live broadcasts of every NASCAR Cup Ser...

21/05/2026

Audio-Technica Hosts High-Density Wireless Microphone Demo at Technica House in New York City

Audio-Technica held a demonstration event at its Technica House location in New ...

21/05/2026

RTL Deutschland Selects Ateme Frame-Rate Conversion Technology for Live Event Workflows

Ateme has announced that RTL Deutschland has selected Ateme's software-based...

21/05/2026

BCC Live Deploys LiveU LU900Q for Record-Breaking IRONMAN Texas Broadcast

LiveU has announced that BCC Live deployed the LU900Q intelligent production unit for the first time during the 2026 Memorial Hermann IRONMAN Texas North Americ...

21/05/2026

Mark Aitken to Receive 2026 ATSC Mark Richer Industry Leadership Medal

ATSC has announced that Mark Aitken, President of ONE Media and Senior VP of Advanced Technology at Sinclair Broadcast Group, will receive the 2026 Mark Richer ...

21/05/2026

BBright Outlines MXL Integration Strategy for Software-Defined Broadcast Workflows

BBright has published a technical analysis of the Media eXchange Layer (MXL), de...

21/05/2026

Esports World Cup 2026 to Be Held in Paris, France

The Esports Foundation has announced that the 2026 Esports World Cup (EWC) will be hosted in Paris, France, from July 6 through August 23. The event marks the f...

21/05/2026

Chyron Announces PRIME Scorebug and Expanded Chyron LIVE Scorebug Capabilities

Chyron has announced PRIME Scorebug, a scorebug solution built on the PRIME Platform for on-premises sports production, and has expanded Chyron LIVE with purpos...

21/05/2026

Media Links Integrates Xscend Platform with DataMiner at BroadcastAsia 2026

Media Links has announced the integration of its Xscend IP transport platform with Skyline Communications' DataMiner xOps platform. The integration will be ...

21/05/2026

SVG New Sponsor Spotlight: Nova Lume CEO Jim Casey on Bringing Flexible IP Intercom to Live Production

As live sports productions continue to demand more flexible, scalable, and cost-...

21/05/2026

SVG Rewind: ESPN's Use of POVORA Wireless Tilt Control CapCam Gives College Football Fans a First-Person View of the Game

In advance of this year's Sports Emmy Awards, SVG is taking a deep dive into...

21/05/2026

Join Shade in Miami and Atlanta for Post-Production Networking Events

Hey Miami & Atlanta post-production folks! Shade is hosting a free private suite at a Braves game (6/2) and Marlins game (6/5) and have about a dozen extra tic...

21/05/2026

Phoenix Suns, Mercury, and Gray Media Extend Broadcast Partnership Through 2030

The Suns and Mercury become the first NBA and WNBA teams to make games available under a single broadcast partner across both over-the-air and streaming....

21/05/2026

Apple TV, MLS to Produce First Major Pro Sporting Event Shot Entirely on iPhone 17 Pros

iPhones are part of the the regular production rotation for Friday Night Baseba...

21/05/2026

SVG Rewind: AIQ: Where Data Meets Dirt Technology Brings Real-Time Data Analysis to Rodeo

In advance of this year's Sports Emmy Awards, SVG is taking a deep dive into...

21/05/2026

Filmmakers' Favorites: Sundance Film Festival Alums on Welcome to the Dollhouse

Heather Matarazzo as Dawn Wiener in Todd Solondz's Welcome to the Dollhouse...

21/05/2026

Redefining Persistent and Affordable Airpower for Special Operations Forces

SKY RAIDER II INTERNATIONAL's modular open systems architecture delivers expanded operational reach and mission flexibility....

21/05/2026

From Payload to Platform: Autonomous ISR Where It Actually Matters

ASO-enabled WESCAM MX-10 systems conduct systematic wide-area maritime search patterns, autonomously managing sensor scan operations to expand coverage, reduce ...

21/05/2026

Sports content accounts for fastest-growing portion of top global SVOD catalogs

HBO Max, a new addition to Gracenote Data Hub, is home to the most sports programming among major streamers NEW YORK May 21, 2026 New analysis by Gracenote...

21/05/2026

Study: Nearly Half of U.S. Viewers Watch Video With Captions

Share Copy link Facebook X Linkedin Bluesky Email...

21/05/2026

Apple TV to Capture MLS Game Entirely on iPhone 17 Pro

Share Copy link Facebook X Linkedin Bluesky Email...

21/05/2026

MPTS 2026 Draws Record Numbers for Landmark 10th Annivers...

The UK's leading event for the creative industries united thousands of professionals for two days of networking, debate, industry insight and getting hands-...

21/05/2026

Recreating Doug Trumbull's Slitscan VFX - After Effects Mastery

Recreating Doug Trumbull's Slitscan VFX - After Effects Mastery Graham Quince May 21, 2026 0 Comments In this After Effects tutorial, I'm divi...

21/05/2026

Cavalry: An Array of Fun Stuff

Cavalry: An Array of Fun Stuff Simon Ubsdell May 21, 2026 0 Comments Arrays are a really powerful feature of Cavalry and here we'll go over some o...

21/05/2026

Chyron Launches PRIME Scorebug

Share Copy link Facebook X Linkedin Bluesky Email...

21/05/2026

Watch Brasil Selects Bitmovin to Improve Streaming Perfor...

Bitmovin, a leading provider of video streaming solutions, has announced that streaming service provider, Watch Brasil, has replaced its legacy systems with Bit...

21/05/2026

OWC Launches Dads and Grads Promotion with Exclusive Savings

OWC Launches Dads and Grads Promotion with Exclusive Savings Brie Clayton May 21, 2026 0 Comments The gifts they'll actually use - save on cutting...

21/05/2026

Spectrum Launches New California News Networks

Share Copy link Facebook X Linkedin Bluesky Email...

21/05/2026

Vivid Unease Ashley Barron ACS Lights Netflixs How to Get...

For cinematographer Ashley Barron ACS ( Rivals , Dangerous Liaisons , Disney's Doctor Who ), setting the look of Netflix's How to Get to Heaven from ...

21/05/2026

Hensgens Pairs Airglow and LiteMat on Cannes 2026 Premier...

Jean-Fran ois Hensgens, AFC, SBC, (Tueurs, Six Days in Spring, Le Fil) brings his signature visual approach to Cannes 2026 premiering biopic L'Affaire Marie...

21/05/2026

Beeble launches Canvas, a node-based AI compositor for VFX and Virtual Production pipelines

Beeble launches Canvas, a node-based AI compositor for VFX and Virtual Productio...

21/05/2026

Shelly Johnson Elected President of American Society of Cinematographers

Shelly Johnson Elected President of American Society of Cinematographers Brie Clayton May 20, 2026 0 Comments The American Society of Cinematographers...

21/05/2026

ARRI expands Management Board to accelerate next phase of growth and innovation

ARRI expands Management Board to accelerate next phase of growth and innovation Brie Clayton May 20, 2026 0 Comments ARRI expands its Management Board...

21/05/2026

25th Tribeca Festival Announces 2026 Jury And Tribeca & Chanel Artist Awards Program

May 21st, 2026 Press Assets Available Here 25TH TRIBECA FESTIVAL ANNOUNCES 202...

21/05/2026

In a Broken System, How Far Will You Bend? Netflix Releases Main Trailer for Thai Legal Drama The Evil Lawyer'

Back to All News In a Broken System, How Far Will You Bend? Netflix Releases Ma...

21/05/2026

'RHYTHM + FLOW ITALY': The Trailer Is Out Now For the Third Season of the Rap Show

Back to All News RHYTHM FLOW ITALY: The Trailer Is Out Now For the Third Seas...

21/05/2026

Our Latest Steps to Make Content More Accessible

Back to All News Our Latest Steps to Make Content More Accessible Product 21 May 2026 Global Link copied to clipboard Download all assets Listen to the a...

21/05/2026

License to Stream: 007 First Light' Coming to GeForce NOW With an Ultimate Bundle

The mission begins now. GeForce NOW is dialing up the action with a blockbuster...

20/05/2026

New Nissan Stadium Is First NFL Venue With Permanent Built-In Music Stage

True to the Music City theme, the Tennessee Titans plan to make live music a prominent fixture on game days at the new venue Reflecting the continuing converge...

20/05/2026

NHL Taps Honeywell To Make Arenas, Facilities More Energy-Efficient

The long-term goal is innovation for the next generation of hockey rinks at both the professional and the community level The National Hockey League (NHL) toda...

20/05/2026

A Look Into the Future: What It Takes To Power the Next Generation of Sports Stadiums

End-users and vendors alike work toward cost-effective yet high-quality solution...

20/05/2026

Nasvhvilles New Nissan Stadium To Host Super Bowl LXIV in 2030

The National Football League has announced that Nashville will host Super Bowl LXIV in 2030 at the new Nissan Stadium. The announcement was made at the NFL Spri...

20/05/2026

NFL Awards Minnesota as Host Site of 2028 NFL Draft

The NFL has announced that the 2028 NFL Draft presented by Bud Light will take place in Minnesota, uniting fans from around the world to celebrate one of the mo...

20/05/2026

Lawo Highlights Edge One at InfoComm 2026

As professional AV workflows continue to evolve, expectations around audio, video, networking, and control are rising rapidly. At InfoComm 2026 in Las Vegas, La...

20/05/2026

Haivision to Showcase Video Wall, IPTV, and Ultra-Low Latency Video Innovations at InfoComm 2026

Haivision will showcase its latest innovations at InfoComm 2026, taking place fr...

20/05/2026

L3Harris Achieves Major Milestones With Latest Series of Rotating Detonation Engine Tests

A Rotating Detonation Engine being hot fire tested at Purdue University's Zu...

20/05/2026

VAMPIRE Confirms Integrated Capability During Live Exercises

A U.S. Army VAMPIRE system, assigned to Bravo Battery, 1st Battalion, 51st Air Defense Artillery Regiment, 7th Infantry Division/Multi-Domain Command - Pacific ...