
Striking Performance: Large Language Models up to 4x Faster on RTX With TensorRT-LLM for Windows

17/10/2023

Generative AI is one of the most important trends in the history of personal computing, bringing advancements to gaming, creativity, video, productivity, development and more.

And GeForce RTX and NVIDIA RTX GPUs, which are packed with dedicated AI processors called Tensor Cores, are bringing the power of generative AI natively to more than 100 million Windows PCs and workstations.

Today, generative AI on PC is getting up to 4x faster via TensorRT-LLM for Windows, an open-source library that accelerates inference performance for the latest AI large language models, like Llama 2 and Code Llama. This follows the announcement of TensorRT-LLM for data centers last month.

NVIDIA has also released tools to help developers accelerate their LLMs, including scripts that optimize custom models with TensorRT-LLM, TensorRT-optimized open-source models and a developer reference project that showcases both the speed and quality of LLM responses.

TensorRT acceleration is now available for Stable Diffusion in the popular Web UI by Automatic1111 distribution. It speeds up the generative AI diffusion model by up to 2x over the previous fastest implementation.

Plus, RTX Video Super Resolution (VSR) version 1.5 is available as part of today's Game Ready Driver release - and will be available in the next NVIDIA Studio Driver, releasing early next month.

Supercharging LLMs With TensorRT

LLMs are fueling productivity - engaging in chat, summarizing documents and web content, drafting emails and blogs - and are at the core of new pipelines of AI and other software that can automatically analyze data and generate a vast array of content.

TensorRT-LLM, a library for accelerating LLM inference, gives developers and end users the benefit of LLMs that can now operate up to 4x faster on RTX-powered Windows PCs.

At higher batch sizes, this acceleration significantly improves the experience for more sophisticated LLM use - like writing and coding assistants that output multiple, unique auto-complete results at once. The result is accelerated performance and improved quality that lets users select the best of the bunch.

TensorRT-LLM acceleration is also beneficial when integrating LLM capabilities with other technology, such as in retrieval-augmented generation (RAG), where an LLM is paired with a vector library or vector database. RAG enables the LLM to deliver responses based on a specific dataset, like user emails or articles on a website, to provide more targeted answers.

To show this in practical terms, when the question "How does NVIDIA ACE generate emotional responses?" was asked of the Llama 2 base model, it returned an unhelpful response.

Better responses, faster.

Conversely, using RAG with recent GeForce news articles loaded into a vector library and connected to the same Llama 2 model not only returned the correct answer - using NeMo SteerLM - but did so much more quickly with TensorRT-LLM acceleration. This combination of speed and proficiency gives users smarter solutions.
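The retrieval step behind RAG can be illustrated with a minimal sketch. This is not NVIDIA's demo code: the bag-of-words `embed` function stands in for a real embedding model, the three sample documents are invented for illustration, and the final prompt would be passed to an LLM that is not shown here. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: bag-of-words counts (a stand-in for a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    query = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(query, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=1):
    """Prepend the retrieved context so the LLM answers from the dataset."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Invented sample "vector library" contents, for illustration only.
DOCS = [
    "RTX Video Super Resolution reduces compression artifacts in streamed video.",
    "TensorRT-LLM accelerates large language model inference on RTX GPUs.",
    "Stable Diffusion generates images through an iterative denoising process.",
]

QUESTION = "How does TensorRT-LLM speed up language model inference?"
PROMPT = build_prompt(QUESTION, DOCS)
print(PROMPT)
```

A production setup would likely replace `embed` with a neural embedding model and the sorted list with a vector database, but the flow is the same: embed the question, fetch the closest documents, and hand both to the model.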

TensorRT-LLM will soon be available to download from the NVIDIA Developer website. TensorRT-optimized open-source models and the RAG demo with GeForce news as a sample project are available at ngc.nvidia.com and GitHub.com/NVIDIA.

Automatic Acceleration

Diffusion models, like Stable Diffusion, are used to imagine and create stunning, novel works of art. Image generation is an iterative process that can take hundreds of cycles to achieve the perfect output. When done on an underpowered computer, this iteration can add up to hours of wait time.

TensorRT is designed to accelerate AI models through layer fusion, precision calibration, kernel auto-tuning and other capabilities that significantly boost inference efficiency and speed. This makes it indispensable for real-time applications and resource-intensive tasks.
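To make the idea of layer fusion concrete, here is a small sketch in plain Python. It folds a per-output scale-and-shift step into the preceding linear layer so that two passes over the data become one. It illustrates the general principle only and makes no claim about how TensorRT implements fusion internally; the function names and the numbers are invented for illustration.

```python
def linear(weights, bias, x):
    """Dense layer: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def scale_shift(scale, shift, y):
    """Per-output affine step: z = scale * y + shift."""
    return [s * yi + t for s, yi, t in zip(scale, y, shift)]


def fuse(weights, bias, scale, shift):
    """Fold the affine step into the dense layer:
    scale * (W x + b) + shift == (scale * W) x + (scale * b + shift)."""
    fused_w = [[s * w for w in row] for row, s in zip(weights, scale)]
    fused_b = [s * b + t for b, s, t in zip(bias, scale, shift)]
    return fused_w, fused_b


# Invented example values.
W = [[1.0, 2.0], [3.0, 4.0]]
B = [0.5, -0.5]
SCALE = [2.0, 0.5]
SHIFT = [1.0, -1.0]
X = [1.0, -1.0]

# Two separate layers: two passes over the activations.
unfused = scale_shift(SCALE, SHIFT, linear(W, B, X))

# One fused layer: the same result in a single pass.
fused_w, fused_b = fuse(W, B, SCALE, SHIFT)
fused = linear(fused_w, fused_b, X)

print(unfused, fused)
```

The fused weights are computed once, ahead of time, so every later inference call skips the second step and the memory traffic that goes with it.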

And now, TensorRT doubles the speed of Stable Diffusion.

Compatible with the most popular distribution, WebUI from Automatic1111, Stable Diffusion with TensorRT acceleration helps users iterate faster and spend less time waiting on the computer, delivering a final image sooner. On a GeForce RTX 4090, it runs 7x faster than the top implementation on Macs with an Apple M2 Ultra. The extension is available for download today.

The TensorRT demo of a Stable Diffusion pipeline provides developers with a reference implementation on how to prepare diffusion models and accelerate them using TensorRT. This is the starting point for developers interested in turbocharging a diffusion pipeline and bringing lightning-fast inferencing to applications.

Video That's Super

AI is improving everyday PC experiences for all users. Streaming video - from nearly any source, like YouTube, Twitch, Prime Video, Disney+ and countless others - is among the most popular activities on a PC. Thanks to AI and RTX, it's getting another update in image quality.

RTX VSR is a breakthrough in AI pixel processing that improves the quality of streamed video content by reducing or eliminating artifacts caused by video compression. It also sharpens edges and details.

Available now, RTX VSR version 1.5 further improves visual quality with updated models, de-artifacts content played in its native resolution and adds support for RTX GPUs based on the NVIDIA Turing architecture - both professional RTX and GeForce RTX 20 Series GPUs.

Retraining the VSR AI model helped it learn to accurately identify the difference between subtle details and compression artifacts. As a result, AI-enhanced images more accurately preserve details during the upscaling process. Finer details are more visible, and the overall image looks sharper and crisper.

RTX Video Super Resolution v1.5 improves detail and sharpness.

New with version 1.5 is the ability to de-artifact video played at the display's native resolution. The original release only enhanced video when it was
LINK: https://blogs.nvidia.com/blog/2023/10/17/tensorrt-llm-windows-stable-d...