Sony Pixel Power calrec Sony

LM Studio Accelerates LLM Performance With NVIDIA GeForce RTX GPUs and CUDA 12.8

08/05/2025

As AI use cases continue to expand - from document summarization to custom software agents - developers and enthusiasts are seeking faster, more flexible ways to run large language models (LLMs).

Running models locally on PCs with NVIDIA GeForce RTX GPUs enables high-performance inference, enhanced data privacy and full control over AI deployment and integration. Tools like LM Studio - free to try - make this possible, giving users an easy way to explore and build with LLMs on their own hardware.

LM Studio has become one of the most widely adopted tools for local LLM inference. Built on the high-performance llama.cpp runtime, the app allows models to run entirely offline and can also serve as OpenAI-compatible application programming interface (API) endpoints for integration into custom workflows.

The release of LM Studio 0.3.15 brings improved performance for RTX GPUs thanks to CUDA 12.8, significantly improving model load and response times. The update also introduces new developer-focused features, including enhanced tool use via the tool_choice parameter and a redesigned system prompt editor.

The latest improvements to LM Studio improve its performance and usability - delivering the highest throughput yet on RTX AI PCs. This means faster responses, snappier interactions and better tools for building and integrating AI locally.

Where Everyday Apps Meet AI Acceleration LM Studio is built for flexibility - suited for both casual experimentation or full integration into custom workflows. Users can interact with models through a desktop chat interface or enable developer mode to serve OpenAI-compatible API endpoints. This makes it easy to connect local LLMs to workflows in apps like VS Code or bespoke desktop agents.

For example, LM Studio can be integrated with Obsidian, a popular markdown-based knowledge management app. Using community-developed plug-ins like Text Generator and Smart Connections, users can generate content, summarize research and query their own notes - all powered by local LLMs running through LM Studio. These plug-ins connect directly to LM Studio's local server, enabling fast, private AI interactions without relying on the cloud.

Example of using LM Studio to generate notes accelerated by RTX. The 0.3.15 update adds new developer capabilities, including more granular control over tool use via the tool_choice parameter and an upgraded system prompt editor for handling longer or more complex prompts.

The tool_choice parameter lets developers control how models engage with external tools - whether by forcing a tool call, disabling it entirely or allowing the model to decide dynamically. This added flexibility is especially valuable for building structured interactions, retrieval-augmented generation (RAG) workflows or agent pipelines. Together, these updates enhance both experimentation and production use cases for developers building with LLMs.

LM Studio supports a broad range of open models - including Gemma, Llama 3, Mistral and Orca - and a variety of quantization formats, from 4-bit to full precision.

Common use cases span RAG, multi-turn chat with long context windows, document-based Q&A and local agent pipelines. And by using local inference servers powered by the NVIDIA RTX-accelerated llama.cpp software library, users on RTX AI PCs can integrate local LLMs with ease.

Whether optimizing for efficiency on a compact RTX-powered system or maximizing throughput on a high-performance desktop, LM Studio delivers full control, speed and privacy - all on RTX.

Experience Maximum Throughput on RTX GPUs At the core of LM Studio's acceleration is llama.cpp - an open-source runtime designed for efficient inference on consumer hardware. NVIDIA partnered with the LM Studio and llama.cpp communities to integrate several enhancements to maximize RTX GPU performance.

Key optimizations include:

CUDA graph enablement: Groups multiple GPU operations into a single CPU call, reducing CPU overhead and improving model throughput by up to 35%.

Flash attention CUDA kernels: Boosts throughput by up to 15% by improving how LLMs process attention - a critical operation in transformer models. This optimization enables longer context windows without increasing memory or compute requirements.

Support for the latest RTX architectures: LM Studio's update to CUDA 12.8 ensures compatibility with the full range of RTX AI PCs - from GeForce RTX 20 Series to NVIDIA Blackwell-class GPUs, giving users the flexibility to scale their local AI workflows from laptops to high-end desktops.

Data measured using different versions of LM Studio and CUDA backends on GeForce RTX 5080 on DeepSeek-R1-Distill-Llama-8B model. All configurations measured using Q4_K_M GGUF (Int4) quantization at BS=1, ISL=4000, OSL=200, with Flash Attention ON. Graph showcases 27% speedup with the latest version of LM Studio due to NVIDIA contributions to the llama.cpp inference backend. With a compatible driver, LM Studio automatically upgrades to the CUDA 12.8 runtime, enabling significantly faster model load times and higher overall performance.

These enhancements deliver smoother inference and faster response times across the full range of RTX AI PCs - from thin, light laptops to high-performance desktops and workstations.

Get Started With LM Studio LM Studio is free to download and runs on Windows, macOS and Linux. With the latest 0.3.15 release and ongoing optimizations, users can expect continued improvements in performance, customization and usability - making local AI faster, more flexible and more accessible.

Users can load a model through the desktop chat interface or enable developer mode to expose an OpenAI-compatible API.

To quickly get started, download the latest version of LM Studio and open up the application.

Click the magnifying glass icon on the left panel to open up the Discover
LINK: https://blogs.nvidia.com/blog/rtx-ai-garage-lmstudio-llamacpp-blackwel...
See more stories from nvidia

Most recent headlines

06/10/2025

France Tlvisions Wins Prestigious 2025 EBU Technology & Innovation Award in Groundbreaking Collaboration with Dalet

France T l visions, France's leading broadcaster, has received the 2025 EBU ...

04/09/2025

Monumental Sports & Entertainment and Dalet Win Prestigious 2025 NAB Show Project of the Year Award

Monumental Sports & Entertainment (MSE), in collaboration with Dalet, has been a...

01/07/2025

Sinclair Pays $500,000 to Settle FCC Investigations

WASHINGTON The Federal Communications Commission's Enforcement and Media Bureaus have entered into a consent decree with Sinclair to resolve a variety of in...

01/07/2025

LPTV Broadcasters Association to Hold Webinar on Station Sales

DENVER Low-power television (LPTV) station owners looking to navigate the complexities of selling their assets in todays dynamic media environment are invited t...

01/07/2025

Netflix to Carry Live Programming from NASA+

NASA announced today that live programming from its NASA+ channel will be available on Netflix starting sometime this summer....

01/07/2025

FCC Chair Carr Hires Katie McAuliffe as Policy Advisor

WASHINGTON Federal Communications Commission Chair Brendan Carr has appointed Katie McAuliffe to serve as policy advisor in his office....

01/07/2025

GFiber Demonstrates Network Slicing to Improve Home Internet Performance

MOUNTAIN VIEW, Calif. Alphabet's GFiber pay TV and broadband provider has announced that it recently worked with Nokia to demonstrate network slicing....

01/07/2025

DoubleVerify Debuts Attention Measurement for Social Media

NEW YORK, N.Y. DoubleVerify (DV) has announced the launch of DV Authentic Attention for Social. The product will first launch with Snap, the owner of Snapchat....

01/07/2025

FCC Rejects License Challenges to Three Baltimore TV Stations

WASHINGTON The Federal Communications Commission has rejected license challenges to three full-power Baltimore TV stations and agreed to renew the license for C...

01/07/2025

Magewell Brings NDI into Conferencing Software and More w...

Compact new converter lets users capture live NDI and streaming sources into software over a USB interface Video interface and IP workflow innovator Magewell ...

01/07/2025

Disneys Hercules Brings Mosaic Visuals To Life On Stage W...

Disguise, the award-winning tech company driving visuals for Broadway and West End hits including Redwood, Stranger Things: The First Shadow and Disney's Fr...

01/07/2025

KIT Plugins release NOIZ One Vox

Vocal-processing plug-in joins NOIZ Hub series Launched in 2024, KIT Plugins' NOIZ Hub series was created with the aim of providing a range of professio...

01/07/2025

7 Day Mastering: New course from Mastering.com

New self-paced learning programme announced Mastering.com have announced the availability of a new online course designed to cover the fundamentals of maste...

01/07/2025

Heather Gray Named Vice President and General Manager of WRAL and FOX 50

Historic appointment ushers in unified leadership for WRAL-TV, New Media, and Digital Solutions RALEIGH, N.C. - 6-27-25 - Capitol Broadcasting Company is prou...

30/06/2025

Discover Weekly Turns 10: Celebrating 100 Billion+ Tracks Streamed and a Decade of Personalized Discovery

There's nothing quite like the magic of finding music that feels made just f...

30/06/2025

Spotify's Editors Pick Their Best Songs of the Year (So Far)

When it comes to new music, Spotify's team of editors across North America is always on the hunt for songs that make them feel, think, and move. They're...

30/06/2025

SBS On Demand boosts global news offering with launch of France 24 FAST Channel

SBS On Demand boosts global news offering with launch of France 24 FAST Channel 30 June, 2025 Media releases SBS is expanding its international news offeri...

30/06/2025

The Forsytes Season 2 Commissioned by MASTERPIECE on PBS

Star Studded Ensemble Cast Are Joined by Richard Rankin as Filming Begins on the Second Season [June 12, 2025 - Boston, MA]: The Forsytes, Debbie Horsfield...

30/06/2025

Artemis II Mission Advances with Successful RS-25 Engine Checkout Tests

The Artemis II Space Launch System core stage is integrated with the solid rocket boosters inside High Bay 3 of the Vehicle Assembly Building at NASAs Kennedy S...

30/06/2025

WRAL-WRAZ Raleigh Names Heather Gray as VP and GM

RALEIGH, N.C. Capitol Broadcasting Co. has named Heather Gray vice president and general manager of WRAL-TV and WRAZ-TV here....

30/06/2025

VAB Awards JJ Freeman Engineering Award to Bill Sewell of WTKR/WGNT

The Virginia Association of Broadcasters has recognized Bill Sewell, Director of Engineering at WTKR & WGNT in Norfolk, Va. as the recipient of the 2025 J.J. Fr...

30/06/2025

SBE Recruits 49 New Members

The Society of Broadcast Engineers said its annual member drive resulted in the recruitment of 49 individual members....

30/06/2025

Avid Releases Full Integration of MediaCentral, Wolftech News

BURLINGTON, Mass. Avid today released its fully integrated news platform, uniting MediaCentral and Wolftech News in a single newsroom solution, and will demonst...

30/06/2025

FCC Fines Sinclair $500,000

WASHINGTON The Federal Communication's Enforcement and Media Bureaus have entered into a Consent Decree with Sinclair Broadcast Group to resolve a variety o...

30/06/2025

Qu-Bit announce the Bloom v2

Eurorack sequencer module reimagined California-based modular synth innovators Qu-Bit have announced the launch of a new module that offers a fresh new take...

30/06/2025

Berklee at Umbria Jazz Clinics to Host 40th Anniversary Concert

Berklee at Umbria Jazz Clinics to Host 40th Anniversary Concert The celebration will be held on July 10 in Perugia, Italy. By Colette Greenstein June 30, 202...

30/06/2025

PremiumBeat Tips and Tricks

PremiumBeat Tips and Tricks Brie Clayton June 30, 2025 0 Comments When editing to impress, you'll need quality music, and if your studio happens t...

30/06/2025

Techivation launch T-De-Esser Pro Mk2

Improved dynamic behaviour, improved audio quality & more Techivation have announced the release of an upgraded edition of their very first premium plug-in,...

30/06/2025

German premiere with live flight demonstration: German industry team showcases electromagnetic combat from the air

German premiere with live flight demonstration: German industry team showcases e...

30/06/2025

Beln Cuesta and Karra Elejalde Star in 'El nio', the New Film by Mariano Barroso

Back to All News Bel n Cuesta and Karra Elejalde Star in El ni o, the New Film ...

30/06/2025

A New Dangerous Troll Awakens: Netflix Unleashes Teaser for 'Troll 2'

Back to All News A New Dangerous Troll Awakens: Netflix Unleashes Teaser for Troll 2Play Video Play Video Entertainment 30 June 2025 GlobalNorwayDenmarkSwe...

30/06/2025

The Focusrite Summer Sale is now on

The Focusrite Summer Sale is now on Don't miss unbeatable deals on Scarlett, Vocaster, and more. Whether you're an artist, a producer, or a podcaste...

30/06/2025

Yellowstone origin story 1923 starring Harrison Ford and Helen Mirren comes to RT One and RT Player

All 8 episodes of Season 1 of 1923 will be available on RT Player from Tuesday ...

30/06/2025

Thales 2025 Global Cloud Security Study Reveals Organizations Struggle to Secure Expanding, AI-Driven Cloud Environments

Facebook Twitter LinkedIn 52% report AI security spending is displacing tr...

30/06/2025

Thales Alenia Space to develop SOLiS very-high-throughput laser communications demonstrator

Facebook Twitter LinkedIn Cannes, June 30th, 2025 - Thales Alenia Space, t...

29/06/2025

Roland introduce the Mood Pan

Handpan-inspired instrument announced Roland have announced the launch of the Mood Pan, a unique electronic hand percussion instrument that has been designe...

29/06/2025

A Secret Society, Ritualistic Killings, and a Century-Old Curse Netflix and YRF Entertainment's 'Mandala Murders' Premieres July 25

Back to All News A Secret Society, Ritualistic Killings, and a Century-Old Curs...

28/06/2025

Press Release: NFVF Marks Youth Month by Empowering Future Creatives Through Film & TV Bursaries

Johannesburg, 27 June 2025 - As the nation commemorates Youth Month 2025, the N...

28/06/2025

FCC Chair Brendan Carr Promises Very, Very Busy, Productive Summer

WASHINGTON In a press conference following the Federal Communications Commission's May Open Meeting, Chair Brendan Carr promised the agency would move rapid...

28/06/2025

Spectrum Awards $1.1 Million in 2025 Spectrum Digital Education Grants

STAMFORD, Conn. Charter Communications has awarded $1.1 million in Spectrum Digital Education grants to 55 nonprofit organizations that work to expand access to...

28/06/2025

Sonnet Announces Echo 20 Thunderbolt 4 SuperDock Now Veri...

LAKE FOREST, Calif. June 19, 2025 What's New: Sonnet Technologies today announced the certification of its Echo 20 Thunderbolt 4 SuperDock as an Engin...

28/06/2025

IDC Names MASV One of Three Most Innovative Companies in...

MASV (massive.io), the fastest and most reliable large file transfer platform for media professionals, has been named an IDC Innovator in the IDC Innovators: Me...

28/06/2025

TV SKYLINE Expands Live Production Capabilities with Late...

Grass Valley today announced that TV SKYLINE GmbH, one of Europe's top mobile production providers, has expanded its camera inventory with 30 LDX 135 UHD/HD...

28/06/2025

AgileTV has been selected to develop and implement LIWEST...

AgileTV, a European leader in TV and video technology solutions, signed an agreement with Austrian telco LIWEST to develop and implement its TV service in Austr...

28/06/2025

Scaler 3.1 update from Scaler Music

Music theory plug-in updated Three months on from the release of the latest version of their renowned music theory plug in, Scaler Music have launched an up...

28/06/2025

The 48th Annual Indian National Finals Rodeo Shot with Blackmagic PYXIS 6K

The 48th Annual Indian National Finals Rodeo Shot with Blackmagic PYXIS 6K Brie Clayton June 27, 2025 0 Comments Filmmaker Cameron Mackey relied on Bl...

28/06/2025

Social, Streaming Don't Compete, They Compliment

Social, Streaming Don't Compete, They Compliment Andy Marken June 27, 2025 0 Comments I think we've all arrived at a very special place. Spir...

28/06/2025

Blackmagic Design Captures Filipino Rock Band Drama Singtala

Blackmagic Design Captures Filipino Rock Band Drama Singtala Brie Clayton June 27, 2025 0 Comments Blackmagic URSA Mini Pro 12K and DaVinci Resolve St...

28/06/2025

Enhance Videos Faster with Aiarty Video Enhancer - Offline, Sharp, and Natural

Enhance Videos Faster with Aiarty Video Enhancer - Offline, Sharp, and Natural Brie Clayton June 27, 2025 0 Comments If you've used AI video tools...

27/06/2025

Give Me the Backstory: Get to Know Eva Victor, the Writer-Director Behind Sorry, Baby

By Jessica Herndon One of the most exciting things about the Sundance Film Fest...