Sony Pixel Power calrec Sony

Igniting the Future: TensorRT-LLM Release Accelerates AI Inference Performance, Adds Support for New Models Running on RTX-Powered Windows 11 PCs

15/11/2023

Artificial intelligence on Windows 11 PCs marks a pivotal moment in tech history, revolutionizing experiences for gamers, creators, streamers, office workers, students and even casual PC users.

It offers unprecedented opportunities to enhance productivity for users of the more than 100 million Windows PCs and workstations that are powered by RTX GPUs. And NVIDIA RTX technology is making it even easier for developers to create AI applications to change the way people use computers.

New optimizations, models and resources announced at Microsoft Ignite will help developers deliver new end-user experiences, quicker.

An upcoming update to TensorRT-LLM - open-source software that increases AI inference performance - will add support for new large language models and make demanding AI workloads more accessible on desktops and laptops with RTX GPUs starting at 8GB of VRAM.

TensorRT-LLM for Windows will soon be compatible with OpenAI's popular Chat API through a new wrapper. This will enable hundreds of developer projects and applications to run locally on a PC with RTX, instead of in the cloud - so users can keep private and proprietary data on Windows 11 PCs.

Custom generative AI requires time and energy to maintain projects. The process can become incredibly complex and time-consuming, especially when trying to collaborate and deploy across multiple environments and platforms.

AI Workbench is a unified, easy-to-use toolkit that allows developers to quickly create, test and customize pretrained generative AI models and LLMs on a PC or workstation. It provides developers a single platform to organize their AI projects and tune models to specific use cases.

This enables seamless collaboration and deployment for developers to create cost-effective, scalable generative AI models quickly. Join the early access list to be among the first to gain access to this growing initiative and to receive future updates.

To support AI developers, NVIDIA and Microsoft will release DirectML enhancements to accelerate one of the most popular foundational AI models, Llama 2. Developers now have more options for cross-vendor deployment, in addition to setting a new standard for performance.

Portable AI Last month, NVIDIA announced TensorRT-LLM for Windows, a library for accelerating LLM inference.

The next TensorRT-LLM release, v0.6.0 coming later this month, will bring improved inference performance - up to 5x faster - and enable support for additional popular LLMs, including the new Mistral 7B and Nemotron-3 8B. Versions of these LLMs will run on any GeForce RTX 30 Series and 40 Series GPU with 8GB of RAM or more, making fast, accurate, local LLM capabilities accessible even in some of the most portable Windows devices.

Up to 5X performance with the new TensorRT-LLM v0.6.0. The new release of TensorRT-LLM will be available for install on the /NVIDIA/TensorRT-LLM GitHub repo. New optimized models will be available on ngc.nvidia.com.

Conversing With Confidence Developers and enthusiasts worldwide use OpenAI's Chat API for a wide range of applications - from summarizing web content and drafting documents and emails to analyzing and visualizing data and creating presentations.

One challenge with such cloud-based AIs is that they require users to upload their input data, making them impractical for private or proprietary data or for working with large datasets.

To address this challenge, NVIDIA is soon enabling TensorRT-LLM for Windows to offer a similar API interface to OpenAI's widely popular ChatAPI, through a new wrapper, offering a similar workflow to developers whether they are designing models and applications to run locally on a PC with RTX or in the cloud. By changing just one or two lines of code, hundreds of AI-powered developer projects and applications can now benefit from fast, local AI. Users can keep their data on their PCs and not worry about uploading datasets to the cloud.

Perhaps the best part is that many of these projects and applications are open source, making it easy for developers to leverage and extend their capabilities to fuel the adoption of generative AI on Windows, powered by RTX.

The wrapper will work with any LLM that's been optimized for TensorRT-LLM (for example, Llama 2, Mistral and NV LLM) and is being released as a reference project on GitHub, alongside other developer resources for working with LLMs on RTX.

Model Acceleration Developers can now leverage cutting-edge AI models and deploy with a cross-vendor API. As part of an ongoing commitment to empower developers, NVIDIA and Microsoft have been working together to accelerate Llama on RTX via the DirectML API.

Building on the announcements for the fastest inference performance for these models announced last month, this new option for cross-vendor deployment makes it easier than ever to bring AI capabilities to PC.

Developers and enthusiasts can experience the latest optimizations by downloading the latest ONNX runtime and following the installation instructions from Microsoft, and installing the latest driver from NVIDIA, which will be available on Nov. 21.

These new optimizations, models and resources will accelerate the development and deployment of AI features and applications to the 100 million RTX PCs worldwide, joining the more than 400 partners shipping AI-powered apps and games already accelerated by RTX GPUs.

As models become even more accessible and developers bring more generative AI-powered functionality to RTX-powered Windows PCs, RTX GPUs will be critical for enabling users to take advantage of this powerful technology.
LINK: https://blogs.nvidia.com/blog/ignite-rtx-ai-tensorrt-llm-chat-api/...
See more stories from nvidia

Most recent headlines

28/11/2025

Brides Asks for Compassion for Our Youths

Nadia Fall attends the 2025 Sundance Film Festival premiere of Brides at the Egyptian Theatre on January 24, 2025, in Park City, Utah. (Photo by Donyale West/...

28/11/2025

4 Reasons Why Keeping Your Spotify App Updated Matters and What You Might Be Missing

It's easy to ignore those little red update available badges. But when it ...

28/11/2025

FCC to Vote on LPTV Rules at Dec. Public Meeting

WASHINGTON Federal Communications Commission has released a tentative agenda for the December Open Commission Meeting scheduled for Thursday, December 18, 2025 ...

28/11/2025

Professional Fighters League Packs a Domestic, International MMA Punch (TV Sportsplay)

The Professional Fighters League is looking to super-serve fans of mixed martial...

28/11/2025

Fubo Launches Multiview Beta on Roku

Fubo has released in beta on select Roku devices a new feature that lets users display up to four simultaneous streams at once....

28/11/2025

WNBA Playoffs Continue: What's On This Weekend in TV Sports (Sept. 28-29)

The WNBA playoffs and Week 4 of the NFL regular season highlight the list of live sports events airing on television this weekend....

28/11/2025

Freeze Frame: B+C Hall of Fame 2024

The 32nd class of honorees to the B+C Hall of Fame took to the stage at New York's Ziegfeld Ballroom on September 26 for a gala induction event. Click below...

28/11/2025

Next Text: As DirecTV and Dish Try to Seize the Remains of the Day, Does It Even Matter?

We hold in our hands the very last Next Text for Next TV, the weekly back-and-fo...

28/11/2025

DirecTV Acquires Dish, Unifying Struggling Satellite Business

DirecTV said it made a deal with EchoStar to buy EchoStar's video businesses, including satellite-TV provider Dish TV and virtual MVPD Sling TV, for $1 plus...

28/11/2025

B+C Hall of Fame Announces Its Class of 2025

The Broadcasting+Cable Hall of Fame, the premier industry event paying tribute to the influencers, innovators and shining lights of broadcast, cable and streami...

28/11/2025

Sky Sports x Slawn drop limited-edition football jersey that unlocks a month of free content from the home of sport

Friday 28 November 2025 Sky Sports x Slawn drop limited-edition football jersey...

28/11/2025

Rohde & Schwarz shows resilience in a challenging environment, revenue exceeds three billion euros for the first time

Rohde & Schwarz shows resilience in a challenging environment, revenue exceeds t...

28/11/2025

Changing children's lives for good: Donations for the RT Toy Show Appeal 2025 open tonight

Unwrapped: The Toy Show Appeal - airing this Sunday on RT One and RT Player- s...

27/11/2025

Vizrt Launches Viz One 8.1 With AI-Powered Features

LONDON Vizrt has added several AI-driven advanced features offering improved speed, intelligence and accuracy in the newest version of its media asset managemen...

27/11/2025

Prime Video Debuts AI-Powered Video Recaps

Prime Video has launched AI-powered video season recaps in a beta version for select English-language Prime Original series in the U.S., a move Amazon is callin...

27/11/2025

Netflix's 'Raat Akeli Hai: The Bansal Murders' Marks a Grand World Premiere at IFFI Ahead of Its Global Release on 19th December

Back to All News Netflix's Raat Akeli Hai: The Bansal Murders Marks a Grand...

27/11/2025

Sky unveils first look image from high-stakes action thriller Prisoner, coming 2026

Tahar Rahim and Izuka Hoyle star in the gripping six-part Sky Original from Acad...

27/11/2025

Sky Arts Reveals the Nations Greatest Basslines and Queen Reign Supreme

Thursday 27 November 2025 Sky Arts Reveals the Nation's Greatest Basslines - and Queen Reign Supreme The UK's most iconic basslines have been revealed...

27/11/2025

Stranger Things 5': Prepare for One Last Adventure With Our Final Season Coverage Guide

Back to All News Stranger Things 5': Prepare for One Last Adventure With O...

27/11/2025

Elastic Compute for a Sustainable Media Industry

The media industry has a paradox at its core. It's an industry built on light, color and imagination, yet behind the scenes, it's powered by one of the ...

27/11/2025

Arqiva Achieves Five-Star GRESB Rating

Rating reflects rating progress across areas including policies, diversity & inclusion, health & safety and Net Zero leadership Winchester, UK, 27 November 202...

27/11/2025

Retail Media Audits Explained: What Networks Need to Know

What are the industry standards for Retail Media? Kathryn explains that certification is based on the IAB Europe Retail Media Measurement Standards and the IAB ...

27/11/2025

Katie Taylor, Rachael Blackmore and Arthur Gourounlian among the guests on this week's Late Late Show

World champion boxer and Irish sporting icon Katie Taylor will be in studio this...

27/11/2025

Tonight on RT Prime Time, serious child protection concerns emerge over online gaming platform, Roblox

Roblox, one of the world's most popular online gaming platforms for primary ...

27/11/2025

The Ultimate Black Friday Deal Is Here

Black Friday is leveling up. Get ready to score one of the biggest deals of the season - 50% off the first three months of a new GeForce NOW Ultimate membership...

26/11/2025

SVG Sit-Down: Prime Video EP Mike Muriano Previews Massive Black Friday Slate Featuring NFL, NBA, and Golf

SVG Sit-Down: Prime Video EP Mike Muriano Previews Massive Black Friday Slate Fe...

26/11/2025

Inside the Archives: Winter Is in the Air and in Our Festival Films

A cinematic snow sculpture at the 1995 Sundance Film Festival. Photo by Randall Michelson...

26/11/2025

10 Book Podcasts You Can't Miss

Book podcasts are booming. On Spotify, you'll find everything from celebrity book clubs to deep dives with bestselling authors. And in markets where audiobo...

26/11/2025

JioStar and Nielsen Unveil Breakthrough Cross-Screen MeasurementStudy, Redefining Advertising Effectiveness in Live Sports

Mumbai, November 24, 2025: In a first-of-its-kind initiative, JioStar, in collab...

26/11/2025

ITN Deploys IP-Based Production Control Room

LONDON Factual content producer ITN Productions has launched a new low-latency IP gallery for news bulletins....

26/11/2025

YouTube TV, TelevisaUnivision End Lengthy Blackout

MIAMI TelevisaUnivision said it struck a new multiyear distribution agreement with YouTube TV that includes distribution of TelevisaUnivision's U.S. network...

26/11/2025

OpenDrives Bridges the Gap Between IT and Creatives with...

OpenDrives, Inc., a leader in software-defined data storage and data services, today announced the launch of the Atlas Corporate Creative Solution. This new Atl...

26/11/2025

Disguise to Showcase Future of Event Visuals at LDI 2025

Disguise, the industry-leading company powering the world's biggest live performances, is partnering with pioneering LED wall manufacturer DVS to give atten...

26/11/2025

HighField AI Expands Global Channel Partner Network to Ac...

HighField AI, the pioneer in agentic and multimodal automation for broadcast and media production, today announced the expansion of its global channel partner n...

26/11/2025

Mono Streaming selects PlayBox Neo to manage English Prem...

As high-stakes Premier League fixtures approach and additional premium content launches, with MONO positioning themselves to dominate Thailand's sports stre...

26/11/2025

Bell Centre arena in Montreal elevates fan experience wit...

Hosting a wide variety of events from high-intensity NHL games to complex live music concerts and major entertainment productions, Montreal's 21,000 capacit...

26/11/2025

Vizrt launches AI-powered advances for speed and accuracy...

Vizrt, the leader in live production technology revolutionizing viewer engagement and experience, releases AI-driven advances focusing on speed, intelligence, a...

26/11/2025

ITN Launches Low-Latency IP Control Room Powered by Teche...

ITN Productions, an award-winning factual content producer, today launched a new low-latency IP gallery for news bulletins. Responsible for delivering a leading...

26/11/2025

Ikegami Maintains Initiative in Broadcast Systems Develop...

Ikegami reports ongoing advances throughout 2025 in developing and delivering coordinated television production solutions that maximize quality, versatility and...

26/11/2025

Fubo, NBCUniversal Trade Barbs in Carriage Dispute

Following the Nov. 21 blackout of NBCUniversal channels on Fubo, the two sides have traded barbs about their inability to reach a new carriage deal....

26/11/2025

Global Sports Rights Spending to Top $78 Billion in 2030

LONDON As TV sports rights become increasingly important for both broadcasters and streamers, Ampere Analysis predicts global investment in the genre will surpa...

26/11/2025

Vubiquity Earns AWS Media & Entertainment Competency Status

LOS ANGELES Vubiquity said it has achieved the Amazon Web Services (AWS) Media & Entertainment Competency as part of the AWS Partner Network (APN). This designa...

26/11/2025

Comcast Pays $1.5 Million to Settle FCC Data Breach Probe

WASHINGTON The Federal Communications Commission's Enforcement Bureau said it has entered into a consent decree with Comcast calling for the cable company t...

26/11/2025

Berklee Named to the Hollywood Reporters Top Music Schools List

Berklee Named to the Hollywood Reporters Top Music Schools List The publication highlights the college's screen scoring program, industry partnerships, and ...

26/11/2025

Animated Series Love Through a Prism' Casts New Light on Romance Between Aristocrat and Exchange Student in London

Back to All News Animated Series Love Through a Prism' Casts New Light on ...

26/11/2025

NALIP Unveils Fifth Cohort of Director Incubator

Back to All News NALIP Unveils Fifth Cohort of Director Incubator Social Impact 26 November 2025 United States Link copied to clipboard The National Assoc...

26/11/2025

YouView Achieves Greenly Gold Certification for Sustainability

YouView Achieves Greenly Gold Certification for SustainabilityNov 26, 2025 YouView is proud to announce a Gold Certification award from Greenly for our perform...