Sony Pixel Power calrec Sony

LM Studio Accelerates LLM Performance With NVIDIA GeForce RTX GPUs and CUDA 12.8

08/05/2025

As AI use cases continue to expand - from document summarization to custom software agents - developers and enthusiasts are seeking faster, more flexible ways to run large language models (LLMs).

Running models locally on PCs with NVIDIA GeForce RTX GPUs enables high-performance inference, enhanced data privacy and full control over AI deployment and integration. Tools like LM Studio - free to try - make this possible, giving users an easy way to explore and build with LLMs on their own hardware.

LM Studio has become one of the most widely adopted tools for local LLM inference. Built on the high-performance llama.cpp runtime, the app allows models to run entirely offline and can also serve as OpenAI-compatible application programming interface (API) endpoints for integration into custom workflows.

The release of LM Studio 0.3.15 brings improved performance for RTX GPUs thanks to CUDA 12.8, significantly improving model load and response times. The update also introduces new developer-focused features, including enhanced tool use via the tool_choice parameter and a redesigned system prompt editor.

The latest improvements to LM Studio improve its performance and usability - delivering the highest throughput yet on RTX AI PCs. This means faster responses, snappier interactions and better tools for building and integrating AI locally.

Where Everyday Apps Meet AI Acceleration LM Studio is built for flexibility - suited for both casual experimentation or full integration into custom workflows. Users can interact with models through a desktop chat interface or enable developer mode to serve OpenAI-compatible API endpoints. This makes it easy to connect local LLMs to workflows in apps like VS Code or bespoke desktop agents.

For example, LM Studio can be integrated with Obsidian, a popular markdown-based knowledge management app. Using community-developed plug-ins like Text Generator and Smart Connections, users can generate content, summarize research and query their own notes - all powered by local LLMs running through LM Studio. These plug-ins connect directly to LM Studio's local server, enabling fast, private AI interactions without relying on the cloud.

Example of using LM Studio to generate notes accelerated by RTX. The 0.3.15 update adds new developer capabilities, including more granular control over tool use via the tool_choice parameter and an upgraded system prompt editor for handling longer or more complex prompts.

The tool_choice parameter lets developers control how models engage with external tools - whether by forcing a tool call, disabling it entirely or allowing the model to decide dynamically. This added flexibility is especially valuable for building structured interactions, retrieval-augmented generation (RAG) workflows or agent pipelines. Together, these updates enhance both experimentation and production use cases for developers building with LLMs.

LM Studio supports a broad range of open models - including Gemma, Llama 3, Mistral and Orca - and a variety of quantization formats, from 4-bit to full precision.

Common use cases span RAG, multi-turn chat with long context windows, document-based Q&A and local agent pipelines. And by using local inference servers powered by the NVIDIA RTX-accelerated llama.cpp software library, users on RTX AI PCs can integrate local LLMs with ease.

Whether optimizing for efficiency on a compact RTX-powered system or maximizing throughput on a high-performance desktop, LM Studio delivers full control, speed and privacy - all on RTX.

Experience Maximum Throughput on RTX GPUs At the core of LM Studio's acceleration is llama.cpp - an open-source runtime designed for efficient inference on consumer hardware. NVIDIA partnered with the LM Studio and llama.cpp communities to integrate several enhancements to maximize RTX GPU performance.

Key optimizations include:

CUDA graph enablement: Groups multiple GPU operations into a single CPU call, reducing CPU overhead and improving model throughput by up to 35%.

Flash attention CUDA kernels: Boosts throughput by up to 15% by improving how LLMs process attention - a critical operation in transformer models. This optimization enables longer context windows without increasing memory or compute requirements.

Support for the latest RTX architectures: LM Studio's update to CUDA 12.8 ensures compatibility with the full range of RTX AI PCs - from GeForce RTX 20 Series to NVIDIA Blackwell-class GPUs, giving users the flexibility to scale their local AI workflows from laptops to high-end desktops.

Data measured using different versions of LM Studio and CUDA backends on GeForce RTX 5080 on DeepSeek-R1-Distill-Llama-8B model. All configurations measured using Q4_K_M GGUF (Int4) quantization at BS=1, ISL=4000, OSL=200, with Flash Attention ON. Graph showcases 27% speedup with the latest version of LM Studio due to NVIDIA contributions to the llama.cpp inference backend. With a compatible driver, LM Studio automatically upgrades to the CUDA 12.8 runtime, enabling significantly faster model load times and higher overall performance.

These enhancements deliver smoother inference and faster response times across the full range of RTX AI PCs - from thin, light laptops to high-performance desktops and workstations.

Get Started With LM Studio LM Studio is free to download and runs on Windows, macOS and Linux. With the latest 0.3.15 release and ongoing optimizations, users can expect continued improvements in performance, customization and usability - making local AI faster, more flexible and more accessible.

Users can load a model through the desktop chat interface or enable developer mode to expose an OpenAI-compatible API.

To quickly get started, download the latest version of LM Studio and open up the application.

Click the magnifying glass icon on the left panel to open up the Discover
LINK: https://blogs.nvidia.com/blog/rtx-ai-garage-lmstudio-llamacpp-blackwel...
See more stories from nvidia

North America Stories

08/05/2025

What to Watch: 6 Sundance Institute-Supported Films by Filipino Directors

A sinister fairy infiltrates a desperate family in Kenneth Dagatan's In My Mother's Skin, which premiered at the 2023 Sundance Film Festival. Photo co...

08/05/2025

Managing the Mission: Teaching Technique to C3ISR Operators

For skyward-bound operators, training focuses on the unique aspects of flying ISR missions, including the management of onboard surveillance equipment and the e...

08/05/2025

Cable Industry Backs Broadcasters' Move to Software-Based EAS

The cable industry has told the Federal Communications Commission it supports the National Association of Broadcasters' proposal to allow broadcasters to us...

08/05/2025

CTA Tells FCC: Dont Mandate ATSC 3.0 Tuners

WASHINGTON The Consumer Technology Association has continued its opposition to mandates requiring that NextGen TV/ATSC 3.0 tuners be included in new TV sets, sa...

08/05/2025

TAG Video Systems Appoints Paul Maroni as Vice President...

TAG Video Systems, the leader in software-based IP end-to-end workflow monitoring, deep probing, and real time visualization, has named Paul Maroni as Vice Pres...

08/05/2025

BroadcastAsia 2025 Showcases Best of British Innovation

This year's UK Pavilion in hall 5, once again managed by Tradefair, will provide visitors with the unique opportunity to discuss and be involved in cutting ...

08/05/2025

Rohde & Schwarz to highlight innovative broadcast technol...

Rohde & Schwarz will showcase its latest energy-efficient transmitters and 5G Broadcast technologies, designed to support network operators and content provider...

08/05/2025

Nexstar Appoints Bill Nardi VP of Station Operations

IRVING, Texas Nexstar Media Group has tapped Bill Nardi as vice president of station operations, responsible for overseeing the day-to-day broadcast operations ...

08/05/2025

LumaTouch Partners With CNN Academy on Training

SEATTLE LumaTouch is partnering with CNN Academy to improve mobile storytelling techniques and support training across all of CNN Academy's training simulat...

08/05/2025

SBE Backs NAB Proposals to Change EAS Rules

WASHINGTON The Society of Broadcast Engineers has filed comments with the Federal Communications Commission that support a proposal by the National Association ...

08/05/2025

OAN to Provide News to VOA, USAGM Networks

Senior adviser to the United States Agency for Global Media Kari Lake has announced that One America News Network (OAN) will provide newsfeed services for fre...

08/05/2025

EdMon Expands as AI-Driven Post Production Workflows Gains Traction in Sweden and Beyond

EdMon Expands as AI-Driven Post Production Workflows Gains Traction in Sweden an...

08/05/2025

Using Luma Mattes in Adobe Premiere Pro

Using Luma Mattes in Adobe Premiere Pro Graham Quince May 7, 2025 0 Comments This very quick tutorial shows you how to take an RGB clip and apply its ...

08/05/2025

OpenDrives Unveils Free Your Data' Initiative with New Astraeus Cloud-Native Data Services Platform

OpenDrives Unveils Free Your Data' Initiative with New Astraeus Cloud-Nativ...

08/05/2025

Student Spotlight: Grigori Balasanyan

Student Spotlight: Grigori Balasanyan The Armenian composer, who was named Boston Conservatory at Berklees 2025 student commencement speaker, talks about his ...

08/05/2025

Tribeca Festival 2025 Unveils New Premieres Spanning Film and Music

May 8th, 2025 Press Materials Available Here Tribeca Festival 2025 Unveils New Premieres Spanning Film and Music Slick Rick's Victory with Idris Elba a...

08/05/2025

Tribeca Festival 2025 Announces Lineup for Inaugural Storytelling Summit

May 8th, 2025 Press Materials Available Here Tribeca Festival 2025 Announces Lineup for Inaugural Storytelling Summit 11-Day Industry Event Launches with Tal...

08/05/2025

SVG Sit-Down: Vizrt's Nicholas Jameson on AI in Workflows, Pushing Boundaries With XR/AR

SVG Sit-Down: Vizrt's Nicholas Jameson on AI in Workflows, Pushing Boundarie...

08/05/2025

Creating Alternative Brand Experiences: Live Sports in the Age of Fortnite, Meta Horizon, and Beyond

Creating Alternative Brand Experiences: Live Sports in the Age of Fortnite, Meta...

08/05/2025

PGA TOUR's David Piccolo: Advanced Graphics and Virtual Production Tools are Elevating Live Golf Coverage

PGA TOUR's David Piccolo: Advanced Graphics and Virtual Production Tools are...

08/05/2025

Tech Focus: Advancing Immersion in Sports Broadcasting with AR and Virtual Production

Tech Focus: Advancing Immersion in Sports Broadcasting with AR and Virtual Produ...

08/05/2025

Now in Production: Comedy Action Film Husbands in Action' Puts Unlikely Allies on a Rescue Mission

Back to All News Now in Production: Comedy Action Film Husbands in Action'...

08/05/2025

Wildfire Prevention: AI Startups Support Prescribed Burns, Early Alerts

Artificial intelligence is helping identify and treat diseases faster with better results for humankind. Natural disasters like wildfires are next. Fires in th...

08/05/2025

Join the Family: GeForce NOW Welcomes 2K's Acclaimed Mafia' Franchise to the Cloud

Calling all wiseguys - 2K's acclaimed Mafia franchise is available to stream...

08/05/2025

LM Studio Accelerates LLM Performance With NVIDIA GeForce RTX GPUs and CUDA 12.8

As AI use cases continue to expand - from document summarization to custom software agents - developers and enthusiasts are seeking faster, more flexible ways t...

07/05/2025

March 2025 Less Time Spent Watching Video

Warsaw, Poland - April 28, 2025 - Nielsen, a global leader in audience measurement, data and analytics, has released its latest March All Screens Video Landscap...

07/05/2025

Studios Delay Moving Films to Streaming to Protect Box Office

LONDON Movie fans hoping to save money by waiting until their favorite new films appear on streaming services will have to wait a bit longer now, according to a...

07/05/2025

Saudi Broadcasting Authority Turns to Grass Valley for Major Tech Upgrade

MECCA, Saudi Arabia Saudi Broadcasting Authority (SBA) has selected Grass Valley to provide a major technology upgrade of its broadcast facility here....

07/05/2025

Sony and Nevion provide guidance on IP network architecture options for live production in new whitepaper

Sony and Nevion provide guidance on IP network architecture options for live pro...

07/05/2025

Media Pioneer Publishing AG Expands Editorial Capacity

Media Pioneer Publishing AG Expands Editorial Capacity Brie Clayton May 7, 2025 0 Comments Pioneer 2 boat production environment powered by Blackmagic...

07/05/2025

Studios Delaying Moving Films to Streaming to Protect Box Office

LONDON Movie fans hoping to save money by waiting until their favorite new films appear on streaming services will have to wait a bit longer now, according to a...

07/05/2025

FCC Seeks Comments on LPTV Adoption of 5G Broadcasting

WASHINGTON The Federal Communications Commission's Media Bureau is seeking public comment on a Petition for Rulemaking from HC2 Broadcasting Holdings asking...

07/05/2025

U.S. Department of Education Terminates CPB's Ready to Learn Grant

WASHINGTON Following a decision by U.S. Department of Education to terminate its 2020-2025 Ready To Learn to the Corporation for Public Broadcasting, CPB has in...

07/05/2025

Tubi Announces New Interactive Ad Formats

NEW YORK Fox's ad-supported streaming Tubi made a series of product and partnership announcements during IAB NewFronts in New York, including the launch of ...

07/05/2025

The GFiber App Gets an Upgrade

MOUNTAIN VIEW, Calif. Google Fiber (GFiber) has announced a redesigned app that the company said will simplify how customers set up service, manage devices, and...

07/05/2025

NASA+ Launches FAST Channel on Prime Video

WASHINGTON NASAs on-demand streaming service, NASA+, has launched a FAST (Free Ad-Supported Television) channel on Prime Video....

07/05/2025

The WNET Group Names Randall T. Decker Senior Director, Technology

NEW YORK The WNET Group, parent company of the PBS station Thirteen, has announced the appointment of Randall T. Decker to senior director, technology, effectiv...

07/05/2025

Peter Barber appointed new Chief Executive Officer at Ato...

Atomos announced an executive leadership transition as the Company continues to evolve and expand its strategic focus. Peter Barber, currently serving as Chie...

07/05/2025

Steadicam Vol Honored with AMPAS 2025 Scientific and Tech...

Steve Wagner, Jerry Holway and Robert Orf at the 2024 Scientific and Technical Awards at the Academy Museum of Motion Pictures on Tuesday, April 29, 2025. The ...

07/05/2025

nxtedition and TASCAM bring precision audio control to TV...

nxtedition, the Swedish company behind the leading integrated platform for news and program production, and TASCAM, the iconic Japanese manufacturer of professi...

07/05/2025

Signiant Brings Real-Time Camera Raw-to-Cloud Innovation...

Signiant is bringing its Camera-Raw-to-Any-Cloud workflow to the UK for the first time at the Media Production & Technology Show 2025 (Booth# M69) with a live d...

07/05/2025

MNC Software extends its global reach with Gencom Technol...

MNC Software Inc., a global leader in software and network solutions tailored to the broadcast and media industry, has appointed Gencom Technology as an officia...

07/05/2025

Rise AV Announces Inaugural UK Cohort for 2025 Mentoring...

Rise AV, the award-winning advocacy group championing gender diversity and professional development in the AV sector, is proud to announce 31 mentor-mentee pair...

07/05/2025

Leader to show Test and Measurement solutions for workflo...

Test & measurement innovator, Leader Electronics of Europe, is to bring a selection of its leading products for IP, SDI and hybrid workflow requirements to this...

07/05/2025

World Skills Cafe Returns at IBC2025 with Expanded Talent...

The Global Media and Entertainment Talent Manifesto announces that the World Skills Caf will return at IBC2025 with an expanded skills and diversity programme,...

07/05/2025

Moments Lab and LucidLink Expand AI-Powered Workflow Inte...

Moments Lab, a leader in AI video discovery, and LucidLink, the pioneer in real-time cloud collaboration, are proud to announce the integration of Moments Lab&#...

07/05/2025

Obvious C Broadcasts Skiing World Cup with Blackmagic Design

Obvious C Broadcasts Skiing World Cup with Blackmagic Design Brie Clayton May 6, 2025 0 Comments Blackmagic Design cameras capture cinematic sports pr...

07/05/2025

Larry Jordan Interviews Signiant's Jon Finegold at NAB 2025

Larry Jordan Interviews Signiant's Jon Finegold at NAB 2025 Brie Clayton May 6, 2025 0 Comments Jon Finegold, Chief Marketing Officer at Signiant,...

07/05/2025

Cadence Taps NVIDIA Blackwell to Accelerate AI-Driven Engineering Design and Scientific Simulation

A new supercomputer offered by Cadence, a leading provider of technology for ele...