
As AI use cases continue to expand - from document summarization to custom software agents - developers and enthusiasts are seeking faster, more flexible ways to run large language models (LLMs).
Running models locally on PCs with NVIDIA GeForce RTX GPUs enables high-performance inference, enhanced data privacy and full control over AI deployment and integration. Tools like LM Studio - free to try - make this possible, giving users an easy way to explore and build with LLMs on their own hardware.
LM Studio has become one of the most widely adopted tools for local LLM inference. Built on the high-performance llama.cpp runtime, the app allows models to run entirely offline and can also serve as OpenAI-compatible application programming interface (API) endpoints for integration into custom workflows.
The release of LM Studio 0.3.15 brings improved performance for RTX GPUs thanks to CUDA 12.8, significantly improving model load and response times. The update also introduces new developer-focused features, including enhanced tool use via the tool_choice parameter and a redesigned system prompt editor.
The latest improvements to LM Studio improve its performance and usability - delivering the highest throughput yet on RTX AI PCs. This means faster responses, snappier interactions and better tools for building and integrating AI locally.
Where Everyday Apps Meet AI Acceleration LM Studio is built for flexibility - suited for both casual experimentation or full integration into custom workflows. Users can interact with models through a desktop chat interface or enable developer mode to serve OpenAI-compatible API endpoints. This makes it easy to connect local LLMs to workflows in apps like VS Code or bespoke desktop agents.
For example, LM Studio can be integrated with Obsidian, a popular markdown-based knowledge management app. Using community-developed plug-ins like Text Generator and Smart Connections, users can generate content, summarize research and query their own notes - all powered by local LLMs running through LM Studio. These plug-ins connect directly to LM Studio's local server, enabling fast, private AI interactions without relying on the cloud.
Example of using LM Studio to generate notes accelerated by RTX. The 0.3.15 update adds new developer capabilities, including more granular control over tool use via the tool_choice parameter and an upgraded system prompt editor for handling longer or more complex prompts.
The tool_choice parameter lets developers control how models engage with external tools - whether by forcing a tool call, disabling it entirely or allowing the model to decide dynamically. This added flexibility is especially valuable for building structured interactions, retrieval-augmented generation (RAG) workflows or agent pipelines. Together, these updates enhance both experimentation and production use cases for developers building with LLMs.
LM Studio supports a broad range of open models - including Gemma, Llama 3, Mistral and Orca - and a variety of quantization formats, from 4-bit to full precision.
Common use cases span RAG, multi-turn chat with long context windows, document-based Q&A and local agent pipelines. And by using local inference servers powered by the NVIDIA RTX-accelerated llama.cpp software library, users on RTX AI PCs can integrate local LLMs with ease.
Whether optimizing for efficiency on a compact RTX-powered system or maximizing throughput on a high-performance desktop, LM Studio delivers full control, speed and privacy - all on RTX.
Experience Maximum Throughput on RTX GPUs At the core of LM Studio's acceleration is llama.cpp - an open-source runtime designed for efficient inference on consumer hardware. NVIDIA partnered with the LM Studio and llama.cpp communities to integrate several enhancements to maximize RTX GPU performance.
Key optimizations include:
CUDA graph enablement: Groups multiple GPU operations into a single CPU call, reducing CPU overhead and improving model throughput by up to 35%.
Flash attention CUDA kernels: Boosts throughput by up to 15% by improving how LLMs process attention - a critical operation in transformer models. This optimization enables longer context windows without increasing memory or compute requirements.
Support for the latest RTX architectures: LM Studio's update to CUDA 12.8 ensures compatibility with the full range of RTX AI PCs - from GeForce RTX 20 Series to NVIDIA Blackwell-class GPUs, giving users the flexibility to scale their local AI workflows from laptops to high-end desktops.
Data measured using different versions of LM Studio and CUDA backends on GeForce RTX 5080 on DeepSeek-R1-Distill-Llama-8B model. All configurations measured using Q4_K_M GGUF (Int4) quantization at BS=1, ISL=4000, OSL=200, with Flash Attention ON. Graph showcases 27% speedup with the latest version of LM Studio due to NVIDIA contributions to the llama.cpp inference backend. With a compatible driver, LM Studio automatically upgrades to the CUDA 12.8 runtime, enabling significantly faster model load times and higher overall performance.
These enhancements deliver smoother inference and faster response times across the full range of RTX AI PCs - from thin, light laptops to high-performance desktops and workstations.
Get Started With LM Studio LM Studio is free to download and runs on Windows, macOS and Linux. With the latest 0.3.15 release and ongoing optimizations, users can expect continued improvements in performance, customization and usability - making local AI faster, more flexible and more accessible.
Users can load a model through the desktop chat interface or enable developer mode to expose an OpenAI-compatible API.
To quickly get started, download the latest version of LM Studio and open up the application.
Click the magnifying glass icon on the left panel to open up the Discover
Most recent headlines
04/09/2025
Monumental Sports & Entertainment (MSE), in collaboration with Dalet, has been a...
08/05/2025
The cable industry has told the Federal Communications Commission it supports the National Association of Broadcasters' proposal to allow broadcasters to us...
08/05/2025
WASHINGTON The Consumer Technology Association has continued its opposition to mandates requiring that NextGen TV/ATSC 3.0 tuners be included in new TV sets, sa...
08/05/2025
TAG Video Systems, the leader in software-based IP end-to-end workflow monitoring, deep probing, and real time visualization, has named Paul Maroni as Vice Pres...
08/05/2025
This year's UK Pavilion in hall 5, once again managed by Tradefair, will provide visitors with the unique opportunity to discuss and be involved in cutting ...
08/05/2025
Rohde & Schwarz will showcase its latest energy-efficient transmitters and 5G Broadcast technologies, designed to support network operators and content provider...
08/05/2025
IRVING, Texas Nexstar Media Group has tapped Bill Nardi as vice president of station operations, responsible for overseeing the day-to-day broadcast operations ...
08/05/2025
SEATTLE LumaTouch is partnering with CNN Academy to improve mobile storytelling techniques and support training across all of CNN Academy's training simulat...
08/05/2025
WASHINGTON The Society of Broadcast Engineers has filed comments with the Federal Communications Commission that support a proposal by the National Association ...
08/05/2025
Senior adviser to the United States Agency for Global Media Kari Lake has announced that One America News Network (OAN) will provide newsfeed services for fre...
08/05/2025
EdMon Expands as AI-Driven Post Production Workflows Gains Traction in Sweden an...
08/05/2025
Using Luma Mattes in Adobe Premiere Pro
Graham Quince May 7, 2025
0 Comments
This very quick tutorial shows you how to take an RGB clip and apply its ...
08/05/2025
OpenDrives Unveils Free Your Data' Initiative with New Astraeus Cloud-Nativ...
08/05/2025
Student Spotlight: Grigori Balasanyan The Armenian composer, who was named Boston Conservatory at Berklees 2025 student commencement speaker, talks about his ...
08/05/2025
TenneT relies on Arvato Systems for market communication
Energy industry: Impressive market communication know-how and system integration expertise
G tersloh...
08/05/2025
When Taiki Hamamoto, 22, came across a Hanafuda deck at his local game shop, he was intrigued. He had grown up playing the traditional Japanese card game with f...
08/05/2025
The Liveline is now open , said Joe Duffy earlier today, as he previewed this af...
08/05/2025
RT , in association with the BBC, Screen Ireland and Cineflix Rights has reveale...
08/05/2025
Artificial intelligence is helping identify and treat diseases faster with better results for humankind. Natural disasters like wildfires are next.
Fires in th...
08/05/2025
Calling all wiseguys - 2K's acclaimed Mafia franchise is available to stream...
08/05/2025
As AI use cases continue to expand - from document summarization to custom software agents - developers and enthusiasts are seeking faster, more flexible ways t...
07/05/2025
Discovering music should feel effortless and fun. That's why Spotify continu...
07/05/2025
SBS and NITV mark National Reconciliation Week with compelling premieres recogni...
07/05/2025
SBS commences search for a new Western Sydney production hub location
7 May, 2025
Media releases
SBS has today launched a Request for Expressions of Intere...
07/05/2025
Warsaw, Poland - April 28, 2025 - Nielsen, a global leader in audience measurement, data and analytics, has released its latest March All Screens Video Landscap...
07/05/2025
LONDON Movie fans hoping to save money by waiting until their favorite new films appear on streaming services will have to wait a bit longer now, according to a...
07/05/2025
MECCA, Saudi Arabia Saudi Broadcasting Authority (SBA) has selected Grass Valley to provide a major technology upgrade of its broadcast facility here....
07/05/2025
Sony and Nevion provide guidance on IP network architecture options for live pro...
07/05/2025
Media Pioneer Publishing AG Expands Editorial Capacity
Brie Clayton May 7, 2025
0 Comments
Pioneer 2 boat production environment powered by Blackmagic...
07/05/2025
LONDON Movie fans hoping to save money by waiting until their favorite new films appear on streaming services will have to wait a bit longer now, according to a...
07/05/2025
WASHINGTON The Federal Communications Commission's Media Bureau is seeking public comment on a Petition for Rulemaking from HC2 Broadcasting Holdings asking...
07/05/2025
WASHINGTON Following a decision by U.S. Department of Education to terminate its 2020-2025 Ready To Learn to the Corporation for Public Broadcasting, CPB has in...
07/05/2025
NEW YORK Fox's ad-supported streaming Tubi made a series of product and partnership announcements during IAB NewFronts in New York, including the launch of ...
07/05/2025
MOUNTAIN VIEW, Calif. Google Fiber (GFiber) has announced a redesigned app that the company said will simplify how customers set up service, manage devices, and...
07/05/2025
WASHINGTON NASAs on-demand streaming service, NASA+, has launched a FAST (Free Ad-Supported Television) channel on Prime Video....
07/05/2025
NEW YORK The WNET Group, parent company of the PBS station Thirteen, has announced the appointment of Randall T. Decker to senior director, technology, effectiv...
07/05/2025
Atomos announced an executive leadership transition as the Company continues to evolve and expand its strategic focus.
Peter Barber, currently serving as Chie...
07/05/2025
Steve Wagner, Jerry Holway and Robert Orf at the 2024 Scientific and Technical Awards at the Academy Museum of Motion Pictures on Tuesday, April 29, 2025.
The ...
07/05/2025
nxtedition, the Swedish company behind the leading integrated platform for news and program production, and TASCAM, the iconic Japanese manufacturer of professi...
07/05/2025
Signiant is bringing its Camera-Raw-to-Any-Cloud workflow to the UK for the first time at the Media Production & Technology Show 2025 (Booth# M69) with a live d...
07/05/2025
MNC Software Inc., a global leader in software and network solutions tailored to the broadcast and media industry, has appointed Gencom Technology as an officia...
07/05/2025
Rise AV, the award-winning advocacy group championing gender diversity and professional development in the AV sector, is proud to announce 31 mentor-mentee pair...
07/05/2025
Test & measurement innovator, Leader Electronics of Europe, is to bring a selection of its leading products for IP, SDI and hybrid workflow requirements to this...
07/05/2025
The Global Media and Entertainment Talent Manifesto announces that the World Skills Caf will return at IBC2025 with an expanded skills and diversity programme,...
07/05/2025
Moments Lab, a leader in AI video discovery, and LucidLink, the pioneer in real-time cloud collaboration, are proud to announce the integration of Moments Lab...
07/05/2025
Obvious C Broadcasts Skiing World Cup with Blackmagic Design
Brie Clayton May 6, 2025
0 Comments
Blackmagic Design cameras capture cinematic sports pr...
07/05/2025
Larry Jordan Interviews Signiant's Jon Finegold at NAB 2025
Brie Clayton May 6, 2025
0 Comments
Jon Finegold, Chief Marketing Officer at Signiant,...
07/05/2025
A new supercomputer offered by Cadence, a leading provider of technology for ele...
07/05/2025
Abu Dhabi, UAE [Date] Space42 (ADX: SPACE42), the UAE-based AI-powered Space...
07/05/2025
07 May 2025
VEON's QazCode Signs MoU with Seekr to Develop AI-Powered Solut...