
TOPS of the Class: Decoding AI Performance on RTX AI PCs and Workstations

12/06/2024

Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, software, tools and accelerations for RTX PC users.

The era of the AI PC is here, and it's powered by NVIDIA RTX and GeForce RTX technologies. With it comes a new way to evaluate performance for AI-accelerated tasks, and a new language that can be daunting to decipher when choosing between the desktops and laptops available.

While PC gamers understand frames per second (FPS) and similar stats, measuring AI performance requires new metrics.

Coming Out on TOPS

The first baseline is TOPS, or trillions of operations per second. Trillions is the important word here - the processing numbers behind generative AI tasks are absolutely massive. Think of TOPS as a raw performance metric, similar to an engine's horsepower rating. More is better.

Compare, for example, the recently announced Copilot+ PC lineup by Microsoft, which includes neural processing units (NPUs) able to perform upwards of 40 TOPS. Performing 40 TOPS is sufficient for some light AI-assisted tasks, like asking a local chatbot where yesterday's notes are.

But many generative AI tasks are more demanding. NVIDIA RTX and GeForce RTX GPUs deliver unprecedented performance across all generative tasks - the GeForce RTX 4090 GPU offers more than 1,300 TOPS. This is the kind of horsepower needed to handle AI-assisted digital content creation, AI super resolution in PC gaming, generating images from text or video, querying local large language models (LLMs) and more.
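To put the two figures quoted above side by side, here is a minimal Python sketch of the arithmetic. The function names are illustrative only, and the values are peak ratings, so the ratio is a rough guide rather than a measured speedup.

```python
TERA = 10**12  # the "trillions" in TOPS


def tops_to_ops_per_second(tops):
    """Convert a TOPS rating into raw operations per second."""
    return tops * TERA


def tops_ratio(a_tops, b_tops):
    """How many times larger rating a is than rating b."""
    return a_tops / b_tops


npu_tops = 40     # Copilot+ PC NPU figure quoted above
gpu_tops = 1300   # GeForce RTX 4090 figure quoted above ("more than 1,300")

print(f"NPU: {tops_to_ops_per_second(npu_tops):.1e} operations per second")
print(f"GPU: {tops_to_ops_per_second(gpu_tops):.1e} operations per second")
print(f"Ratio of peak ratings: {tops_ratio(gpu_tops, npu_tops):.1f}x")
```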

Insert Tokens to Play

TOPS is only the beginning of the story. LLM performance is measured in the number of tokens generated by the model.

Tokens are the output of the LLM. A token can be a word in a sentence, or even a smaller fragment like punctuation or whitespace. Performance for AI-accelerated tasks can be measured in tokens per second.
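Tokens per second is simply the token count divided by the wall-clock time taken to produce them. The sketch below shows one way this could be measured in Python. `fake_generate` is a hypothetical stand-in for a real model call, not part of any library.

```python
import time


def tokens_per_second(num_tokens, elapsed_seconds):
    """Throughput: tokens generated divided by time taken."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_tokens / elapsed_seconds


def measure(generate):
    """Time one call to `generate` (which returns a list of tokens)
    and report its throughput in tokens per second."""
    start = time.perf_counter()
    tokens = generate()
    elapsed = time.perf_counter() - start
    return tokens_per_second(len(tokens), elapsed)


def fake_generate():
    # Hypothetical placeholder for an LLM: sleeps briefly, then
    # returns a fixed list of word and punctuation tokens.
    time.sleep(0.05)
    return ["The", " era", " of", " the", " AI", " PC", " is", " here", "."]


print(f"{measure(fake_generate):.1f} tokens per second")
```

With a real model, `generate` would wrap the inference call, and the count would come from that model's own tokenizer.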

Another important factor is batch size, or the number of inputs processed simultaneously in a single inference pass. As an LLM will sit at the core of many modern AI systems, the ability to handle multiple inputs (e.g. from a single application or across multiple applications) will be a key differentiator. While larger batch sizes improve performance for concurrent inputs, they also require more memory, especially when combined with larger models.
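One way to reason about batching is to separate aggregate throughput (all inputs together) from the throughput each individual input sees. The following Python sketch assumes every input in a batch finishes in the same inference pass; the numbers in the example are placeholders chosen for easy arithmetic, not benchmark results.

```python
def batch_throughput(tokens_per_request, batch_seconds):
    """Aggregate tokens per second across every request in one batch."""
    if batch_seconds <= 0:
        raise ValueError("batch_seconds must be positive")
    return sum(tokens_per_request) / batch_seconds


def per_request_throughput(tokens_per_request, batch_seconds):
    """Tokens per second as seen by each individual request."""
    if batch_seconds <= 0:
        raise ValueError("batch_seconds must be positive")
    return [count / batch_seconds for count in tokens_per_request]


# Placeholder example: a batch of 4 requests, 100 tokens each, taking 5 seconds.
batch = [100, 100, 100, 100]
print(batch_throughput(batch, 5.0))        # aggregate tokens per second
print(per_request_throughput(batch, 5.0))  # what each request experiences
```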

The more you batch, the more (time) you save. RTX GPUs are exceptionally well-suited for LLMs due to their large amounts of dedicated video random access memory (VRAM), Tensor Cores and TensorRT-LLM software.

GeForce RTX GPUs offer up to 24GB of high-speed VRAM, and NVIDIA RTX GPUs up to 48GB, which can handle larger models and enable higher batch sizes. RTX GPUs also take advantage of Tensor Cores - dedicated AI accelerators that dramatically speed up the computationally intensive operations required for deep learning and generative AI models. That maximum performance is easily accessed when an application uses the NVIDIA TensorRT software development kit (SDK), which unlocks the highest-performance generative AI on the more than 100 million Windows PCs and workstations powered by RTX GPUs.
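A common back-of-the-envelope check for whether a model fits in VRAM is to multiply its parameter count by the bytes stored per parameter. The Python sketch below does only that. It is a lower bound under stated assumptions: it ignores activations, the key-value cache (which grows with batch size) and runtime overhead, and the model sizes in the example are hypothetical.

```python
def weight_memory_gb(params_billion, bytes_per_param):
    """Approximate memory for model weights alone, in GB (1 GB = 10**9 bytes).

    One billion parameters at one byte each is one GB, so the product
    of the two arguments is already in GB.
    """
    return params_billion * bytes_per_param


def fits(params_billion, bytes_per_param, vram_gb):
    """True if the weights alone fit in the given VRAM. Real workloads
    need extra headroom beyond this."""
    return weight_memory_gb(params_billion, bytes_per_param) <= vram_gb


GEFORCE_VRAM_GB = 24     # "up to 24GB" quoted above
NVIDIA_RTX_VRAM_GB = 48  # "up to 48GB" quoted above

# Hypothetical model sizes; 2 bytes per parameter corresponds to FP16,
# 0.5 bytes to 4-bit quantized weights.
for size, width in [(7, 2), (70, 2), (70, 0.5)]:
    need = weight_memory_gb(size, width)
    print(f"{size}B params at {width} bytes each: about {need} GB, "
          f"fits 24GB: {fits(size, width, GEFORCE_VRAM_GB)}, "
          f"fits 48GB: {fits(size, width, NVIDIA_RTX_VRAM_GB)}")
```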

The combination of memory, dedicated AI accelerators and optimized software gives RTX GPUs massive throughput gains, especially as batch sizes increase.

Text-to-Image, Faster Than Ever

Measuring image generation speed is another way to evaluate performance. One of the most straightforward ways uses Stable Diffusion, a popular image-based AI model that allows users to easily convert text descriptions into complex visual representations.

With Stable Diffusion, users can quickly create and refine images from text prompts to achieve their desired output. When using an RTX GPU, these results can be generated faster than processing the AI model on a CPU or NPU.

That performance is even higher when using the TensorRT extension for the popular Automatic1111 interface. RTX users can generate images from prompts up to 2x faster with the SDXL Base checkpoint - significantly streamlining Stable Diffusion workflows.

ComfyUI, another popular Stable Diffusion user interface, added TensorRT acceleration last week. RTX users can now generate images from prompts up to 60% faster, and can even convert these images to videos using Stable Video Diffusion up to 70% faster with TensorRT.

TensorRT acceleration can be put to the test in the new UL Procyon AI Image Generation benchmark, which delivers speedups of 50% on a GeForce RTX 4080 SUPER GPU compared with the fastest non-TensorRT implementation.

TensorRT acceleration will soon be released for Stable Diffusion 3 - Stability AI's new, highly anticipated text-to-image model - boosting performance by 50%. Plus, the new TensorRT-Model Optimizer enables accelerating performance even further. This results in a 70% speedup compared with the non-TensorRT implementation, along with a 50% reduction in memory consumption.
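Percentages like these are easier to compare once converted into time per image. The Python sketch below assumes "N% faster" means throughput rises by N%, so generation time is divided by (1 + N/100); the post does not state which definition it uses, and the 12-second and 16 GB baselines are placeholders rather than measurements.

```python
def time_after_speedup(baseline_seconds, percent_faster):
    """Time per image after a speedup, assuming "N% faster" means
    N% more images in the same amount of time."""
    return baseline_seconds / (1 + percent_faster / 100)


def memory_after_reduction(baseline_gb, percent_reduction):
    """Memory use after an N% reduction."""
    return baseline_gb * (1 - percent_reduction / 100)


baseline = 12.0  # placeholder seconds per image without TensorRT
for pct in (50, 60, 70, 100):  # 100% faster is the same as "2x faster"
    print(f"{pct}% faster: {time_after_speedup(baseline, pct):.2f} s per image")

print(f"50% less memory from a 16 GB baseline: {memory_after_reduction(16.0, 50)} GB")
```

Under this reading, a 50% speedup removes a third of the wait rather than half of it, which is worth keeping in mind when comparing figures quoted as percentages against figures quoted as multiples.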

Of course, seeing is believing - the true test is in the real-world use case of iterating on an original prompt. Users can refine image generation by tweaking prompts significantly faster on RTX GPUs, taking seconds per iteration compared with minutes on a MacBook Pro M3 Max. Plus, users get both speed and security with everything remaining private when running locally on an RTX-powered PC or workstation.

The Results Are in and Open Sourced

But don't just take our word for it. The team of AI researchers and engineers behind the open-source Jan.ai recently integrated TensorRT-LLM into its local chatbot app, then tested these optimizations for themselves.

Source: Jan.ai

The researchers tested its implementation of TensorRT-LLM against the open-source llama.cpp inference engine across a variety of GPUs and CPUs used by the community. They found that TensorRT is 30-70% faster than llam
LINK: https://blogs.nvidia.com/blog/ai-decoded-tops/...