
As AI use cases continue to expand - from document summarization to custom software agents - developers and enthusiasts are seeking faster, more flexible ways to run large language models (LLMs).
Running models locally on PCs with NVIDIA GeForce RTX GPUs enables high-performance inference, enhanced data privacy and full control over AI deployment and integration. Tools like LM Studio - free to try - make this possible, giving users an easy way to explore and build with LLMs on their own hardware.
LM Studio has become one of the most widely adopted tools for local LLM inference. Built on the high-performance llama.cpp runtime, the app allows models to run entirely offline and can also serve as OpenAI-compatible application programming interface (API) endpoints for integration into custom workflows.
The release of LM Studio 0.3.15 brings improved performance for RTX GPUs thanks to CUDA 12.8, significantly improving model load and response times. The update also introduces new developer-focused features, including enhanced tool use via the tool_choice parameter and a redesigned system prompt editor.
The latest improvements to LM Studio improve its performance and usability - delivering the highest throughput yet on RTX AI PCs. This means faster responses, snappier interactions and better tools for building and integrating AI locally.
Where Everyday Apps Meet AI Acceleration LM Studio is built for flexibility - suited for both casual experimentation or full integration into custom workflows. Users can interact with models through a desktop chat interface or enable developer mode to serve OpenAI-compatible API endpoints. This makes it easy to connect local LLMs to workflows in apps like VS Code or bespoke desktop agents.
For example, LM Studio can be integrated with Obsidian, a popular markdown-based knowledge management app. Using community-developed plug-ins like Text Generator and Smart Connections, users can generate content, summarize research and query their own notes - all powered by local LLMs running through LM Studio. These plug-ins connect directly to LM Studio's local server, enabling fast, private AI interactions without relying on the cloud.
Example of using LM Studio to generate notes accelerated by RTX. The 0.3.15 update adds new developer capabilities, including more granular control over tool use via the tool_choice parameter and an upgraded system prompt editor for handling longer or more complex prompts.
The tool_choice parameter lets developers control how models engage with external tools - whether by forcing a tool call, disabling it entirely or allowing the model to decide dynamically. This added flexibility is especially valuable for building structured interactions, retrieval-augmented generation (RAG) workflows or agent pipelines. Together, these updates enhance both experimentation and production use cases for developers building with LLMs.
LM Studio supports a broad range of open models - including Gemma, Llama 3, Mistral and Orca - and a variety of quantization formats, from 4-bit to full precision.
Common use cases span RAG, multi-turn chat with long context windows, document-based Q&A and local agent pipelines. And by using local inference servers powered by the NVIDIA RTX-accelerated llama.cpp software library, users on RTX AI PCs can integrate local LLMs with ease.
Whether optimizing for efficiency on a compact RTX-powered system or maximizing throughput on a high-performance desktop, LM Studio delivers full control, speed and privacy - all on RTX.
Experience Maximum Throughput on RTX GPUs At the core of LM Studio's acceleration is llama.cpp - an open-source runtime designed for efficient inference on consumer hardware. NVIDIA partnered with the LM Studio and llama.cpp communities to integrate several enhancements to maximize RTX GPU performance.
Key optimizations include:
CUDA graph enablement: Groups multiple GPU operations into a single CPU call, reducing CPU overhead and improving model throughput by up to 35%.
Flash attention CUDA kernels: Boosts throughput by up to 15% by improving how LLMs process attention - a critical operation in transformer models. This optimization enables longer context windows without increasing memory or compute requirements.
Support for the latest RTX architectures: LM Studio's update to CUDA 12.8 ensures compatibility with the full range of RTX AI PCs - from GeForce RTX 20 Series to NVIDIA Blackwell-class GPUs, giving users the flexibility to scale their local AI workflows from laptops to high-end desktops.
Data measured using different versions of LM Studio and CUDA backends on GeForce RTX 5080 on DeepSeek-R1-Distill-Llama-8B model. All configurations measured using Q4_K_M GGUF (Int4) quantization at BS=1, ISL=4000, OSL=200, with Flash Attention ON. Graph showcases 27% speedup with the latest version of LM Studio due to NVIDIA contributions to the llama.cpp inference backend. With a compatible driver, LM Studio automatically upgrades to the CUDA 12.8 runtime, enabling significantly faster model load times and higher overall performance.
These enhancements deliver smoother inference and faster response times across the full range of RTX AI PCs - from thin, light laptops to high-performance desktops and workstations.
Get Started With LM Studio LM Studio is free to download and runs on Windows, macOS and Linux. With the latest 0.3.15 release and ongoing optimizations, users can expect continued improvements in performance, customization and usability - making local AI faster, more flexible and more accessible.
Users can load a model through the desktop chat interface or enable developer mode to expose an OpenAI-compatible API.
To quickly get started, download the latest version of LM Studio and open up the application.
Click the magnifying glass icon on the left panel to open up the Discover
Most recent headlines
05/01/2027
Worlds first 802.15.4ab-UWB chip verified by Calterah and Rohde & Schwarz to be ...
01/06/2026
January 6 2026, 05:30 (PST) Dolby Sets the New Standard for Premium Entertainment at CES 2026
Throughout the week, Dolby brings to life the latest innovatio...
01/05/2026
January 5 2026, 18:30 (PST) NBCUniversal's Peacock to Be First Streamer to ...
01/04/2026
January 4 2026, 18:00 (PST) DOLBY AND DOUYIN EMPOWER THE NEXT GENERATON OF CREATORS WITH DOLBY VISION
Douyin Users Can Now Create And Share Videos With Stun...
31/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
31/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
31/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
31/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
30/01/2026
Top L-R: The Friend's House is Here, Josephine, The Lake, Bedford Park, Who Killed Alex Odeh?
Second Row L-R: Take Me Home, American Pachuco: The Legend of...
30/01/2026
Spotify, Haziran ay sonunda kadar stanbul'da yeni bir ofis a aca n ve T rkiye pazar n y netmek zere yeni bir atama ger ekle tirdi ini duyurdu. Bu kaps...
30/01/2026
The Artemis II wet dress rehearsal will simulate the launch countdown, fully loading fuel and verifying systems ahead of the first SLS and Orion crewed flight....
30/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
30/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
30/01/2026
Grass Valley , the leading technology provider for live production solutions, and NETGEAR Inc. (NASDAQ: NTGR), a global leader in network solutions, today anno...
30/01/2026
tvONE, a leading video processor, signal distribution technology and media server developer, announces the expansion of Amit Singh's role to Regional Sales ...
30/01/2026
With a career that spans four decades across television, film and post-production, Freelance Sound Designer and Post-production Sound Mixer Mike Aiton has built...
30/01/2026
DPA Microphones will feature its new, fully integrated wireless microphone ecosystem, designed to let audio professionals work faster, cleaner and with total co...
30/01/2026
As the Middle East continues to accelerate investment in next-generation media, broadcast, and immersive content technologies, Ventum Tech today announced a str...
30/01/2026
Mark Roberts Motion Control (MRMC), a Nikon company and global leader in robotic camera systems, today announced its participation at Integrated Systems Europe ...
30/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
30/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
30/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
30/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
30/01/2026
Boston Conservatory at Berklee Hosts the National Opera Association's 2026 C...
30/01/2026
Student Spotlight: Sriram Narayanan The classical pianist shares his experience growing up with a language disability and finding his voice through music.
Ja...
30/01/2026
Heading into 2026, the pace of change across radio, TV, and digital media is reaching an inflection point. Audience behaviors continue to evolve, measurement mo...
30/01/2026
30 Jan 2026
VEON Partners with MindBridge to Enhance Financial Analytics, Audit...
30/01/2026
Introducing the NEW Techtel.tv! | FEB 5% OFF Offer
30 Jan Written By Suzanne Costello
Our Website & Online Store: Now Unified for a Seamless Experience.We...
30/01/2026
Friday 30 January 2026
Easels at the ready! All new judging line up for series ...
30/01/2026
Friday 30 January 2026
Britain can switch off terrestrial TV in the 2030s, with...
30/01/2026
Back to All News
The Danish Crime Series The Asset' Returns for a Second Season
Entertainment
30 January 2026
GlobalDenmark
Link copied to clipboard
...
30/01/2026
Two key themes came through strongly:
Inconsistent measurement remains a major barrier to comparing performance across Retail Media Networks
Independent cer...
29/01/2026
The National Film and Video Foundation (NFVF), in collaboration with a distribut...
29/01/2026
Michele Fracchiolla Succeeds Andrew Barr as President of EMEA region from April 1, 2026
London, January 29, 2026
Hitachi Europe Ltd. today announces the appoi...
29/01/2026
MELBOURNE, Fla., January 29, 2026 - L3Harris Technologies (NYSE: LHX) reports fu...
29/01/2026
Bluey' Wins Second Consecutive Top Streaming Title of the Year with 45 Billi...
29/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
29/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
29/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
29/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
29/01/2026
Share Share by:
Copy link
Facebook
X
Linkedin
Bluesky
Email...
29/01/2026
Boston Conservatory Orchestra Presents East Coast Premiere of Peter and Leonardo...
29/01/2026
29 Jan 2026
Kyivstar Announces Pricing of Secondary Offering of Common Shares Held by VEON NEW YORK, New York, January 29, 2026 -- VEON Ltd. (Nasdaq: VEON), a ...
29/01/2026
Mercedes-Benz is marking 140 years of automotive innovation with a new S-Class b...
29/01/2026
X-Rite Pantone Appoints Cindy Cooperman as Vice President and General Manager of...
29/01/2026
New two-part true crime documentary, OUTBACK TERROR: THE FALCONIO MURDER, aims to shed new light on a case that continues to intrigue on both sides of the world...
29/01/2026
Back to All News
Love is Blind: Sweden Returns for a Third Season - Premiering ...
29/01/2026
Back to All News
Unmask Bridgerton' Season 4 With Our Complete Coverage Guide
Yerin Ha as Sophie Baek and Luke Thompson as Benedict Bridgerton in Season ...
29/01/2026
Back to All News
Extraordinary Crime Mysteries, Mythical Worlds and High-Stakes...
29/01/2026
FOX Sports Unveils Historic FIFA World Cup 2026 Broadcast Schedule Monumental Slate Features 340 Hours of Live First-Run Programming Across FOX Sports Platfo...