Striking Performance: Large Language Models up to 4x Faster on RTX With TensorRT-LLM for Windows
17/10/2023
GeForce RTX and NVIDIA RTX GPUs, which are packed with dedicated AI processors called Tensor Cores, are bringing the power of generative AI natively to more than 100 million Windows PCs and workstations.
Today, generative AI on PC is getting up to 4x faster via TensorRT-LLM for Windows, an open-source library that accelerates inference performance for the latest AI large language models, like Llama 2 and Code Llama. This follows the announcement of TensorRT-LLM for data centers last month.
NVIDIA has also released tools to help developers accelerate their LLMs, including scripts that optimize custom models with TensorRT-LLM, TensorRT-optimized open-source models and a developer reference project that showcases both the speed and quality of LLM responses.
TensorRT acceleration is now available for Stable Diffusion in the popular Web UI by Automatic1111 distribution. It speeds up the generative AI diffusion model by up to 2x over the previous fastest implementation.
Plus, RTX Video Super Resolution (VSR) version 1.5 is available as part of today's Game Ready Driver release - and will be available in the next NVIDIA Studio Driver, releasing early next month.
Supercharging LLMs With TensorRT
LLMs are fueling productivity - engaging in chat, summarizing documents and web content, drafting emails and blogs - and are at the core of new pipelines of AI and other software that can automatically analyze data and generate a vast array of content.
TensorRT-LLM, a library for accelerating LLM inference, gives developers and end users the benefit of LLMs that can now operate up to 4x faster on RTX-powered Windows PCs.
At higher batch sizes, this acceleration significantly improves the experience for more sophisticated LLM use - like writing and coding assistants that output multiple, unique auto-complete results at once. The result is accelerated performance and improved quality that lets users select the best of the bunch.
TensorRT-LLM acceleration is also beneficial when integrating LLM capabilities with other technology, such as in retrieval-augmented generation (RAG), where an LLM is paired with a vector library or vector database. RAG enables the LLM to deliver responses based on a specific dataset, like user emails or articles on a website, to provide more targeted answers.
To show this in practical terms, when the question "How does NVIDIA ACE generate emotional responses?" was put to the Llama 2 base model, it returned an unhelpful response.
Better responses, faster.
Conversely, using RAG with recent GeForce news articles loaded into a vector library and connected to the same Llama 2 model not only returned the correct answer - using NeMo SteerLM - but did so much quicker with TensorRT-LLM acceleration. This combination of speed and proficiency gives users smarter solutions.
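The retrieval step in RAG can be sketched in a few lines. What follows is a minimal illustration, not the NVIDIA reference project: the `embed` function is a toy bag-of-words stand-in for a real embedding model, `VectorLibrary` and `build_prompt` are hypothetical names, and the documents are placeholders rather than actual GeForce news articles. The resulting prompt would then be passed to an LLM, which is where TensorRT-LLM acceleration would apply.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real embedding model)."""
    words = [w.strip(".,?!").lower() for w in text.split()]
    return Counter(w for w in words if w)


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class VectorLibrary:
    """Hypothetical in-memory vector store: keeps (embedding, text) pairs."""

    def __init__(self):
        self.entries = []

    def add(self, text):
        self.entries.append((embed(text), text))

    def search(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(question, library, k=1):
    """Prepend the retrieved context to the question before it goes to the LLM."""
    context = "\n".join(library.search(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Placeholder documents (assumed for illustration only).
docs = [
    "Placeholder article about video upscaling and compression artifacts.",
    "Placeholder article about how a digital character can generate emotional responses.",
    "Placeholder article about image generation with diffusion models.",
]

library = VectorLibrary()
for doc in docs:
    library.add(doc)

question = "How does NVIDIA ACE generate emotional responses?"
prompt = build_prompt(question, library)
print(prompt)
```

Because the model sees the retrieved text alongside the question, it can answer from the specific dataset rather than from its general training data alone.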
TensorRT-LLM will soon be available to download from the NVIDIA Developer website. TensorRT-optimized open-source models and the RAG demo with GeForce news as a sample project are available at ngc.nvidia.com and GitHub.com/NVIDIA.
Automatic Acceleration
Diffusion models, like Stable Diffusion, are used to imagine and create stunning, novel works of art. Image generation is an iterative process that can take hundreds of cycles to achieve the perfect output. When done on an underpowered computer, this iteration can add up to hours of wait time.
TensorRT is designed to accelerate AI models through layer fusion, precision calibration, kernel auto-tuning and other capabilities that significantly boost inference efficiency and speed. This makes it indispensable for real-time applications and resource-intensive tasks.
And now, TensorRT doubles the speed of Stable Diffusion.
Compatible with the most popular distribution, WebUI from Automatic1111, Stable Diffusion with TensorRT acceleration helps users iterate faster and spend less time waiting on the computer, delivering a final image sooner. On a GeForce RTX 4090, it runs 7x faster than the top implementation on Macs with an Apple M2 Ultra. The extension is available for download today.
The TensorRT demo of a Stable Diffusion pipeline provides developers with a reference implementation on how to prepare diffusion models and accelerate them using TensorRT. This is the starting point for developers interested in turbocharging a diffusion pipeline and bringing lightning-fast inferencing to applications.
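Layer fusion, one of the TensorRT capabilities mentioned above, can be illustrated with a generic example. The sketch below folds a batch-normalization step into the linear layer that precedes it, so two passes over the data become one. This is a textbook form of fusion shown under assumed toy values; it is not TensorRT's internal implementation or its API, and all function names are hypothetical.

```python
import math


def linear(W, b, x):
    """y = W x + b for a small dense layer (lists of floats)."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch norm: per-channel scale and shift using stored statistics."""
    return [
        g * (v - m) / math.sqrt(s + eps) + bt
        for v, g, bt, m, s in zip(y, gamma, beta, mean, var)
    ]


def fuse_linear_bn(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold the batch-norm scale and shift into the linear layer's weights and bias."""
    scale = [g / math.sqrt(s + eps) for g, s in zip(gamma, var)]
    W_fused = [[w * sc for w in row] for row, sc in zip(W, scale)]
    b_fused = [(bi - m) * sc + bt for bi, m, sc, bt in zip(b, mean, scale, beta)]
    return W_fused, b_fused


# Toy parameters (assumed values for illustration).
W = [[1.0, 2.0], [3.0, 4.0]]
b = [0.5, -0.5]
gamma, beta = [2.0, 0.5], [0.1, 0.2]
mean, var = [0.0, 1.0], [1.0, 4.0]
x = [1.0, -1.0]

# Two separate layers versus one fused layer.
unfused = batch_norm(linear(W, b, x), gamma, beta, mean, var)
W_fused, b_fused = fuse_linear_bn(W, b, gamma, beta, mean, var)
fused = linear(W_fused, b_fused, x)

print(unfused, fused)
```

The fused layer produces the same output up to floating-point rounding while doing the work of two layers in a single step, which is the kind of saving that adds up over the many iterations of a diffusion pipeline.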
Video That's Super
AI is improving everyday PC experiences for all users. Streaming video - from nearly any source, like YouTube, Twitch, Prime Video, Disney+ and countless others - is among the most popular activities on a PC. Thanks to AI and RTX, it's getting another update in image quality.
RTX VSR is a breakthrough in AI pixel processing that improves the quality of streamed video content by reducing or eliminating artifacts caused by video compression. It also sharpens edges and details.
Available now, RTX VSR version 1.5 further improves visual quality with updated models, de-artifacts content played in its native resolution and adds support for RTX GPUs based on the NVIDIA Turing architecture - both professional RTX and GeForce RTX 20 Series GPUs.
Retraining the VSR AI model helped it learn to accurately identify the difference between subtle details and compression artifacts. As a result, AI-enhanced images more accurately preserve details during the upscaling process. Finer details are more visible, and the overall image looks sharper and crisper.
RTX Video Super Resolution v1.5 improves detail and sharpness.
New with version 1.5 is the ability to de-artifact video played at the display's native resolution. The original release only enhanced video when it was
LINK: | https://blogs.nvidia.com/blog/2023/10/17/tensorrt-llm-windows-stable-d... |