Sony Pixel Power calrec Sony

Igniting the Future: TensorRT-LLM Release Accelerates AI Inference Performance, Adds Support for New Models Running on RTX-Powered Windows 11 PCs

15/11/2023

Artificial intelligence on Windows 11 PCs marks a pivotal moment in tech history, revolutionizing experiences for gamers, creators, streamers, office workers, students and even casual PC users.

It offers unprecedented opportunities to enhance productivity for users of the more than 100 million Windows PCs and workstations that are powered by RTX GPUs. And NVIDIA RTX technology is making it even easier for developers to create AI applications to change the way people use computers.

New optimizations, models and resources announced at Microsoft Ignite will help developers deliver new end-user experiences, quicker.

An upcoming update to TensorRT-LLM - open-source software that increases AI inference performance - will add support for new large language models and make demanding AI workloads more accessible on desktops and laptops with RTX GPUs starting at 8GB of VRAM.

TensorRT-LLM for Windows will soon be compatible with OpenAI's popular Chat API through a new wrapper. This will enable hundreds of developer projects and applications to run locally on a PC with RTX, instead of in the cloud - so users can keep private and proprietary data on Windows 11 PCs.

Custom generative AI requires time and energy to maintain projects. The process can become incredibly complex and time-consuming, especially when trying to collaborate and deploy across multiple environments and platforms.

AI Workbench is a unified, easy-to-use toolkit that allows developers to quickly create, test and customize pretrained generative AI models and LLMs on a PC or workstation. It provides developers a single platform to organize their AI projects and tune models to specific use cases.

This enables seamless collaboration and deployment for developers to create cost-effective, scalable generative AI models quickly. Join the early access list to be among the first to gain access to this growing initiative and to receive future updates.

To support AI developers, NVIDIA and Microsoft will release DirectML enhancements to accelerate one of the most popular foundational AI models, Llama 2. Developers now have more options for cross-vendor deployment, in addition to setting a new standard for performance.

Portable AI Last month, NVIDIA announced TensorRT-LLM for Windows, a library for accelerating LLM inference.

The next TensorRT-LLM release, v0.6.0 coming later this month, will bring improved inference performance - up to 5x faster - and enable support for additional popular LLMs, including the new Mistral 7B and Nemotron-3 8B. Versions of these LLMs will run on any GeForce RTX 30 Series and 40 Series GPU with 8GB of RAM or more, making fast, accurate, local LLM capabilities accessible even in some of the most portable Windows devices.

Up to 5X performance with the new TensorRT-LLM v0.6.0. The new release of TensorRT-LLM will be available for install on the /NVIDIA/TensorRT-LLM GitHub repo. New optimized models will be available on ngc.nvidia.com.

Conversing With Confidence Developers and enthusiasts worldwide use OpenAI's Chat API for a wide range of applications - from summarizing web content and drafting documents and emails to analyzing and visualizing data and creating presentations.

One challenge with such cloud-based AIs is that they require users to upload their input data, making them impractical for private or proprietary data or for working with large datasets.

To address this challenge, NVIDIA is soon enabling TensorRT-LLM for Windows to offer a similar API interface to OpenAI's widely popular ChatAPI, through a new wrapper, offering a similar workflow to developers whether they are designing models and applications to run locally on a PC with RTX or in the cloud. By changing just one or two lines of code, hundreds of AI-powered developer projects and applications can now benefit from fast, local AI. Users can keep their data on their PCs and not worry about uploading datasets to the cloud.

Perhaps the best part is that many of these projects and applications are open source, making it easy for developers to leverage and extend their capabilities to fuel the adoption of generative AI on Windows, powered by RTX.

The wrapper will work with any LLM that's been optimized for TensorRT-LLM (for example, Llama 2, Mistral and NV LLM) and is being released as a reference project on GitHub, alongside other developer resources for working with LLMs on RTX.

Model Acceleration Developers can now leverage cutting-edge AI models and deploy with a cross-vendor API. As part of an ongoing commitment to empower developers, NVIDIA and Microsoft have been working together to accelerate Llama on RTX via the DirectML API.

Building on the announcements for the fastest inference performance for these models announced last month, this new option for cross-vendor deployment makes it easier than ever to bring AI capabilities to PC.

Developers and enthusiasts can experience the latest optimizations by downloading the latest ONNX runtime and following the installation instructions from Microsoft, and installing the latest driver from NVIDIA, which will be available on Nov. 21.

These new optimizations, models and resources will accelerate the development and deployment of AI features and applications to the 100 million RTX PCs worldwide, joining the more than 400 partners shipping AI-powered apps and games already accelerated by RTX GPUs.

As models become even more accessible and developers bring more generative AI-powered functionality to RTX-powered Windows PCs, RTX GPUs will be critical for enabling users to take advantage of this powerful technology.
LINK: https://blogs.nvidia.com/blog/ignite-rtx-ai-tensorrt-llm-chat-api/...
See more stories from nvidia

Most recent headlines

05/01/2027

Worlds first 802.15.4ab-UWB chip verified by Calterah and Rohde & Schwarz to be demoed at CES 2026

Worlds first 802.15.4ab-UWB chip verified by Calterah and Rohde & Schwarz to be ...

01/06/2026

Dolby Sets the New Standard for Premium Entertainment at CES 2026

January 6 2026, 05:30 (PST) Dolby Sets the New Standard for Premium Entertainment at CES 2026 Throughout the week, Dolby brings to life the latest innovatio...

01/05/2026

NBCUniversal's Peacock to Be First Streamer to Integrate Dolby's Full Suite of Premium Picture and Sound Innovations

January 5 2026, 18:30 (PST) NBCUniversal's Peacock to Be First Streamer to ...

01/04/2026

DOLBY AND DOUYIN EMPOWER THE NEXT GENERATON OF CREATORS WITH DOLBY VISION

January 4 2026, 18:00 (PST) DOLBY AND DOUYIN EMPOWER THE NEXT GENERATON OF CREATORS WITH DOLBY VISION Douyin Users Can Now Create And Share Videos With Stun...

27/01/2026

EVS Officially Launches eShop

Share Share by: Copy link Facebook X Linkedin Pinterest Bluesky Email...

27/01/2026

Miri to Showcase New V410 Video Encoder/Decoder at ISE 2026

Share Share by: Copy link Facebook X Linkedin Pinterest Bluesky Email...

27/01/2026

FCC Settles Two Investigations, Renews Three TV Station Licenses

Share Share by: Copy link Facebook X Linkedin Pinterest Bluesky Email...

27/01/2026

UML Tsongas Center Upgrades to Ikegami UHK-X600 Cameras

Share Share by: Copy link Facebook X Linkedin Pinterest Bluesky Email...

27/01/2026

Australian Defence Force Secures Satellite Communications on SES IS-22

New agreement for uninterrupted UHF connectivity for Australian Defence Force through 2033, With Options Extending to 2041 Luxembourg, January 13, 2026 - Satel...

26/01/2026

New Directed By' Series Explores the Making of Music Videos and Kicks Off With Maisie Peters

Music videos play a huge part in how fans connect with their favorite songs, but...

26/01/2026

NAIDOC launches 2026 theme 50 Years of Deadly, marks major milestone

NAIDOC launches 2026 theme 50 Years of Deadly , marks major milestone 25 January, 2026 Media releases The National NAIDOC Committee has today unveiled its...

26/01/2026

Amagi Lists on Indian Stock Exchanges

Share Share by: Copy link Facebook X Linkedin Pinterest Bluesky Email...

26/01/2026

FCC Releases Tentative Agenda for Jan. 29 Open Meeting

Share Share by: Copy link Facebook X Linkedin Pinterest Bluesky Email...

26/01/2026

Dante Turns 20

Share Share by: Copy link Facebook X Linkedin Pinterest Bluesky Email...

26/01/2026

APTS, PBS Tell FCC Not to Set a Firm Date for ATSC 1.0 Sunset

Share Share by: Copy link Facebook X Linkedin Pinterest Bluesky Email...

26/01/2026

Advanced Systems Group and Nu Studio Announce Exclusive P...

Advanced Systems Group (ASG) today announced an exclusive partnership with Nu Studio, the company behind the first modular, portable studio designed for immersi...

26/01/2026

OpenDrives Announces New Funding Led by IAG Capital Partn...

OpenDrives, the leader in high-end video data management and workflow solutions, today announced an add-on investment to its previous funding rounds, led by IAG...

26/01/2026

SmallHD Releases Major PageOS Update - Featuring Portrait...

SmallHD today announced a major update to its award-winning PageOS software, introducing Fleet Control, Portrait Mode, expanded Camera Control features, and Dow...

26/01/2026

Capturing the Chaos - Filipic Uses MSE Grip to Go Live a...

Whether rigging a PTZ on a truss, sliding a Blackmagic camera across a mezzanine railing, or capturing the mayhem from inside a drum, Filipic delivers visuals t...

26/01/2026

VEON Unveils the New Beeline Uzbekistan Network Operations Center, Launches BuildX to Accelerate Software Development in Uzbekistan

26 Jan 2026 VEON Unveils the New Beeline Uzbekistan Network Operations Center, ...

26/01/2026

FIRST LOOK images released for Mark Gatiss' Bookish S2, as Ruth Codd joins cast

Bookish is created by and stars Mark Gatiss Images available HERE Following th...

26/01/2026

New Sky Original Series explores Entertainment Juggernaut, The X Factor

Monday 26 January 2026 New Sky Original Series explores Entertainment Juggernaut, The X Factor Sky today confirmed it has greenlit a premium, definitive docum...

26/01/2026

Sky confirmsmajorinvestment to transform Livingston campus, supporting thousands oflocaljobs

Monday 26 January 2026 Sky confirms major investment to transform Livingston ca...

26/01/2026

Netflix Unveils Official Trailer for 'Salvador'

Back to All News Netflix Unveils Official Trailer for Salvador Entertainment 26 January 2026 GlobalSpain Link copied to clipboard DISCOVER THE TRAILER DO...

26/01/2026

Netflix Presents Our 2026 Series and Films and Announces New Projects

Back to All News Netflix Presents Our 2026 Series and Films and Announces New Projects Entertainment 26 January 2026 GlobalSpain Link copied to clipboard ...

26/01/2026

Made in Louisiana: People We Meet on Vacation' Lights Up New Orleans

Back to All News Made in Louisiana: People We Meet on Vacation' Lights Up New Orleans Emily Bader and Tom Blyth film on Royal Street in New Orleans' ...

26/01/2026

LinkedIn Gives Professionals the Edge with Verified Skills ...

LinkedIn Gives Professionals the Edge with Verified Skills and Tools to Navigate the Job Search Show Proficiency of AI Tools such as Descript, Lovable, Relay.ap...

26/01/2026

Industry Leaders Unite to Create First-Ever End-to-End Critical Connectivity Ecosystem Showcase at ISE Barcelona 2026

Alfalite, Brainstorm, Dejero, Domo Broadcast Systems, FOR-A, KitPlus, Ontario So...

26/01/2026

Join Broadcast Pix at ACM West 2026

Tyngsboro, Mass. - January 26, 2026 - Broadcast Pix is excited to exhibit at the Alliance for Community Media West Region Conference and Trade Show, taking plac...

26/01/2026

Software for Stable Data Management and Lasting Business Success in Uncertain Times

Software for Stable Data Management and Lasting Business Success in Uncertain Ti...

26/01/2026

RT publishes Register of External Activities for Q3/2025 (statistical summary)

RT is today publishing a statistical summary of the Register of External Activities for the third quarter of 2025. The RT Register of External Activities com...

25/01/2026

Sins of Kujo' Premieres April 2: Teaser Trailer, Art and Additional Cast Unveiled

Back to All News Sins of Kujo' Premieres April 2: Teaser Trailer, Art and ...

24/01/2026

Merata Mita and Graton Fellows Celebrated at the 2026 Sundance Film Festival

Masami Kawai Selected as the 2026 Merata Mita Fellow; Isabella Madrigal and Tsanavi Spoonhunter Named 2026 Graton Fellows During Native Forum Celebration in Par...

24/01/2026

L3Harris Delivers Multi-Intelligence Aircraft to US Air Force

The MC-55A Peregrine aircraft will give the Royal Australian Air Force information superiority and serve as strategic assets for future Australian Defence Force...

24/01/2026

Pay TV Groups Rebut NABs ATSC 3.0 Transition Plans

Share Share by: Copy link Facebook X Linkedin Pinterest Bluesky Email...

24/01/2026

APTS, PBS Tell FCC Not to Set a Firm Date for ATSC Sunset

Share Share by: Copy link Facebook X Linkedin Pinterest Bluesky Email...

24/01/2026

Baron Weather Promotes Senior Executives Steve Bray, Cherie Smyly

Share Share by: Copy link Facebook X Linkedin Pinterest Bluesky Email...

24/01/2026

RT and Virgin Media Television kick off comprehensive free-to-air coverage of the 2026 Six Nations Championship

RT and Virgin Media Television kick off comprehensive free-to-air coverage of t...

23/01/2026

UEFA Women's EURO 2025 drives surge in ad-funded streaming as Yospace powers 6 billion one-to-one advertisements

Staines-upon-Thames, UK, 29 July, 2025 - Yospace, the global leader in Dynamic A...

23/01/2026

WWE's Virtual Production Playbook: How the Professional Wrestling Super Power Built Creative Flexibility in the Studio

WWE's Virtual Production Playbook: How the Professional Wrestling Super Powe...

23/01/2026

Tight Set Up: Squeezing the PSA's Tournament of Champions Into Grand Central Station for the Annual Squash Extravaganza

Tight set up: Squeezing the PSA's Tournament of Champions into Grand Central...

23/01/2026

AFC Championship Preview: Behind the Scenes With NFL on CBS' Producer Jim Rikhoff and Director Mike Arnold

AFC Championship Preview: Behind the Scenes With NFL on CBS' Producer Jim R...

23/01/2026

NFC Championship Preview: FOX Sports Director Rich Russo Talks Technology, Storytelling Heading Into Season Finale

NFC Championship Preview: FOX Sports Director Rich Russo Talks Technology, Story...

23/01/2026

Introducing the Best New Artist 2026 Nominees

Spotify's annual Best New Artist celebration honors the rising stars whose talent, creativity, and dedication have propelled them to the music industry'...

23/01/2026

L3Harris Data Links: Resilient, Mission-Proven C5ISR Solutions

Coalition military forces operating across the vast geography of the Indo-Pacific rely on interoperable, secure data links to share intelligence, surveillance a...

23/01/2026

Why Air Forces Are Choosing L3Harris Airborne Early Warning and Control

Artist rendering of L3Harris Technologies' AERIS next generation airborne early warning and control solution....

23/01/2026

The C-130 Cockpit Advantage

The U.S. Air Force AMP Increment II aircraft at L3Harris' facility in Waco, Texas. L3Harris has modernized C-130 avionics since 1985, delivering digital coc...

23/01/2026

A Game-Changer for Caption & Subtitle QC - Now Integrated!

Paramount is transforming its operations by unifying the media supply chains of their top brands into a scalable global pipeline. This transformation enhances ...

23/01/2026

Should we automate or augment with AI?

Every delay costs. When a subtitle fails QC, even the smallest issue can mean missed deadlines, extra vendor costs, or frustrated teams. The new Accurate.Video ...