New Performance Optimizations Supercharge NVIDIA RTX AI PCs for Gamers, Creators and Developers


NVIDIA today announced at Microsoft Build new AI performance optimizations and integrations for Windows that help deliver maximum performance on NVIDIA GeForce RTX AI PCs and NVIDIA RTX workstations.

Large language models (LLMs) power some of the most exciting new use cases in generative AI and now run up to 3x faster with ONNX Runtime (ORT) and DirectML using the new NVIDIA R555 Game Ready Driver. ORT and DirectML are high-performance tools used to run AI models locally on Windows PCs.

WebNN, an application programming interface for web developers to deploy AI models, is now accelerated with RTX via DirectML, enabling web apps to incorporate fast, AI-powered capabilities. And PyTorch will support DirectML execution backends, enabling Windows developers to train and infer complex AI models on Windows natively. NVIDIA and Microsoft are collaborating to scale performance on RTX GPUs.

These advancements build on NVIDIA's world-leading AI platform, which accelerates more than 500 applications and games on over 100 million RTX AI PCs and workstations worldwide.

RTX AI PCs - Enhanced AI for Gamers, Creators and Developers

NVIDIA introduced the first PC GPUs with dedicated AI acceleration, the GeForce RTX 20 Series with Tensor Cores, along with the first widely adopted AI model to run on Windows, NVIDIA DLSS, in 2018. Its latest GPUs offer up to 1,300 trillion operations per second of dedicated AI performance.

In the coming months, Copilot+ PCs equipped with new power-efficient systems-on-a-chip and RTX GPUs will be released, giving gamers, creators, enthusiasts and developers increased performance to tackle demanding local AI workloads, along with Microsoft's new Copilot+ features.

For gamers on RTX AI PCs, NVIDIA DLSS boosts frame rates by up to 4x, while NVIDIA ACE brings game characters to life with AI-driven dialogue, animation and speech.

For content creators, RTX powers AI-assisted production workflows in apps like Adobe Premiere, Blackmagic Design DaVinci Resolve and Blender to automate tedious tasks and streamline workflows. From 3D denoising and accelerated rendering to text-to-image and video generation, these tools empower artists to bring their visions to life.

For game modders, NVIDIA RTX Remix, built on the NVIDIA Omniverse platform, provides AI-accelerated tools to create RTX remasters of classic PC games. It makes it easier than ever to capture game assets, enhance materials with generative AI tools and incorporate full ray tracing.

For livestreamers, the NVIDIA Broadcast application delivers high-quality AI-powered background subtraction and noise removal, while NVIDIA RTX Video provides AI-powered upscaling and auto-high-dynamic range to enhance streamed video quality.

Enhancing productivity, LLMs powered by RTX GPUs execute AI assistants and copilots faster, and can process multiple requests simultaneously.

And RTX AI PCs allow developers to build and fine-tune AI models directly on their devices using NVIDIA's AI developer tools, which include NVIDIA AI Workbench, NVIDIA cuDNN and CUDA on Windows Subsystem for Linux. Developers also have access to RTX-accelerated AI frameworks and software development kits like NVIDIA TensorRT, NVIDIA Maxine and RTX Video.

The combination of AI capabilities and performance delivers enhanced experiences for gamers, creators and developers.

Faster LLMs and New Capabilities for Web Developers

Microsoft recently released the generative AI extension for ORT, a cross-platform library for AI inference. The extension adds support for optimization techniques like quantization for LLMs like Phi-3, Llama 3, Gemma and Mistral. ORT supports different execution providers for inferencing via various software and hardware stacks, including DirectML.

ORT with the DirectML backend offers Windows AI developers a quick path to develop AI capabilities, with stability and production-grade support for the broad Windows PC ecosystem. NVIDIA optimizations for the generative AI extension for ORT, available now in R555 Game Ready, Studio and NVIDIA RTX Enterprise Drivers, help developers get up to 3x faster performance on RTX compared to previous drivers.
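As an illustration of how an execution provider is selected, here is a minimal Python sketch. It assumes the `onnxruntime` package built with DirectML support is installed, and it uses the plain ONNX Runtime session API rather than the generative AI extension described above. The helper names `pick_providers` and `create_session`, and any model path passed in, are hypothetical and chosen for illustration only.

```python
from typing import List, Sequence

# Provider names as registered in ONNX Runtime.
DIRECTML = "DmlExecutionProvider"
CPU = "CPUExecutionProvider"


def pick_providers(available: Sequence[str],
                   preferred: Sequence[str] = (DIRECTML, CPU)) -> List[str]:
    """Return the preferred providers that are actually available, in preference order."""
    chosen = [p for p in preferred if p in available]
    if not chosen:
        raise RuntimeError(f"none of {list(preferred)} found in {list(available)}")
    return chosen


def create_session(model_path: str):
    """Open an ONNX model, preferring DirectML and falling back to CPU."""
    # Imported lazily so pick_providers can be used without the package installed.
    import onnxruntime as ort

    providers = pick_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Listing the CPU provider after DirectML gives ONNX Runtime a fallback for any operator the DirectML provider does not handle.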

Figure: Inference performance for three LLMs using ONNX Runtime and the DirectML execution provider with the latest R555 GeForce driver compared to the previous R550 driver. INSEQ=2000 is representative of document summarization workloads. All data was captured on a GeForce RTX 4090 GPU using batch size 1.

The generative AI extension's support for INT4 quantization, combined with the NVIDIA optimizations, results in up to 3x faster performance for LLMs. Developers can unlock the full capabilities of RTX hardware with the new R555 driver, bringing better AI experiences to consumers, faster. The driver includes:

Support for DQ-GEMM metacommand to handle INT4 weight-only quantization for LLMs

New RMSNorm normalization methods for Llama 2, Llama 3, Mistral and Phi-3 models

Grouped-query and multi-query attention mechanisms, and sliding window attention to support Mistral

In-place KV updates to improve attention performance

Support for GEMM of non-multiple-of-8 tensors to improve context phase performance
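To make the first item in the list above concrete, the following pure-Python sketch shows the general idea behind INT4 weight-only quantization: weights are stored as 4-bit integers with one floating-point scale per group, and are dequantized when multiplied with full-precision activations. This is not NVIDIA's DQ-GEMM metacommand. The symmetric scheme, the group size of 4 and all function names are assumptions made for illustration. Production kernels pack two 4-bit values per byte and fuse dequantization into the GPU matrix multiply.

```python
from typing import List, Sequence, Tuple


def quantize_int4(weights: Sequence[float],
                  group_size: int = 4) -> Tuple[List[int], List[float]]:
    """Symmetric per-group quantization of weights to integers in [-8, 7]."""
    q: List[int] = []
    scales: List[float] = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Map the largest magnitude in the group to 7; avoid a zero scale.
        scale = max_abs / 7.0 if max_abs > 0 else 1.0
        scales.append(scale)
        q.extend(max(-8, min(7, round(w / scale))) for w in group)
    return q, scales


def dequantize_int4(q: Sequence[int], scales: Sequence[float],
                    group_size: int = 4) -> List[float]:
    """Recover approximate float weights from 4-bit values and group scales."""
    return [v * scales[i // group_size] for i, v in enumerate(q)]


def dq_dot(activations: Sequence[float], q: Sequence[int],
           scales: Sequence[float], group_size: int = 4) -> float:
    """Dequantize weights on the fly and dot them with float activations."""
    weights = dequantize_int4(q, scales, group_size)
    return sum(a * w for a, w in zip(activations, weights))
```

Only the weights are quantized here; activations stay in floating point, which is what "weight-only" refers to. Each reconstructed weight differs from the original by at most half of its group's scale.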

Additionally, NVIDIA has optimized AI workflows within WebNN to deliver the powerful performance of RTX GPUs directly within browsers. The WebNN standard helps web app developers accelerate deep learning models with on-device AI accelerators, like Tensor Cores.

Now available in developer preview, WebNN uses DirectML and ORT Web, a JavaScript library for in-browser model execution, to make AI applications more accessible across multiple platforms. With this acceleration, popular models like Stable Diffusion, SD Turbo and Whisper run up to 4x faster on WebNN compared to WebGPU and are now available for developers to use. Microsoft Build attendees can learn more about developing on RTX in the Accelerating development on Windows PCs w