NVIDIA Rubin CPX Accelerates Inference Performance and Efficiency for 1M+ Token Context Workloads

09/09/2025

Inference has emerged as the new frontier of complexity in AI. Modern models are evolving into agentic systems capable of multi-step reasoning, persistent memory, and long-horizon context, enabling them to tackle complex tasks across domains such as software development, video generation, and deep research. These workloads place unprecedented demands on infrastructure, introducing new challenges in compute, memory, and networking that require a fundamental rethinking of how inference is scaled and optimized.

Among these challenges, processing massive context for a specific class of workloads has become increasingly critical. In software development, for example, AI systems must reason over entire codebases, maintain cross-file dependencies, and understand repository-level structure, transforming coding assistants from autocomplete tools into intelligent collaborators. Similarly, long-form video and research applications demand sustained coherence and memory across millions of tokens. These requirements are pushing the boundaries of what current infrastructure can support.

To address this shift, the NVIDIA SMART framework provides a path forward, optimizing inference across scale, multidimensional performance, architecture, ROI, and the broader technology ecosystem. It emphasizes a full-stack disaggregated infrastructure that enables efficient allocation of compute and memory resources. Platforms like NVIDIA Blackwell and NVIDIA GB200 NVL72, combined with NVFP4 for low-precision inference and open source software such as NVIDIA TensorRT-LLM and NVIDIA Dynamo, are redefining inference performance across the AI landscape.

This blog explores the next evolution in disaggregated inference infrastructure and introduces NVIDIA Rubin CPX, a purpose-built GPU designed to meet the demands of long-context AI workloads with greater efficiency and ROI.

Disaggregated inference: a scalable approach to AI complexity

Inference consists of two distinct phases: the context phase and the generation phase, each placing fundamentally different demands on infrastructure. The context phase is compute-bound, requiring high-throughput processing to ingest and analyze large volumes of input data and produce the first output token. In contrast, the generation phase is memory bandwidth-bound, relying on fast memory transfers and high-speed interconnects, such as NVLink, to sustain token-by-token output performance.
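The compute-bound versus bandwidth-bound split can be illustrated with a rough roofline-style estimate. The sketch below is a simplification, not a description of any NVIDIA tool: it assumes each model parameter is read from memory once per forward pass and contributes one multiply-add per token, and it ignores activation and KV cache traffic. The accelerator figures (1 PFLOP/s, 4 TB/s) and the function names are hypothetical, chosen only to make the comparison concrete.

```python
def arithmetic_intensity(tokens_per_pass: int, bytes_per_param: float) -> float:
    """FLOPs performed per byte of weights read in one forward pass.

    Simplified model: every parameter does one multiply-add (2 FLOPs) per
    token and is read from memory once per pass.
    """
    return 2.0 * tokens_per_pass / bytes_per_param


def bound(intensity: float, peak_flops: float, mem_bandwidth: float) -> str:
    """Classify a phase by comparing its intensity with the machine balance."""
    machine_balance = peak_flops / mem_bandwidth  # FLOPs the chip can do per byte moved
    if intensity >= machine_balance:
        return "compute-bound"
    return "memory-bandwidth-bound"


# Hypothetical accelerator, for illustration only.
PEAK_FLOPS = 1e15       # 1 PFLOP/s
MEM_BANDWIDTH = 4e12    # 4 TB/s
BYTES_PER_PARAM = 0.5   # 4-bit weights

# Context phase: the whole prompt is processed in one pass.
context_phase = arithmetic_intensity(100_000, BYTES_PER_PARAM)
# Generation phase: one new token per pass for a single sequence.
generation_phase = arithmetic_intensity(1, BYTES_PER_PARAM)

print("context:", bound(context_phase, PEAK_FLOPS, MEM_BANDWIDTH))
print("generation:", bound(generation_phase, PEAK_FLOPS, MEM_BANDWIDTH))
```

Under these assumptions, a long prompt reuses each weight across many tokens and lands far above the machine balance, while single-token decoding reuses each weight only once and falls well below it. Batching many sequences during generation raises the intensity, which is one reason real deployments tune the two phases separately.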

Disaggregated inference allows these phases to be processed independently, enabling targeted optimization of compute and memory resources. This architectural shift improves throughput, reduces latency, and enhances overall resource utilization (Figure 1).

Figure 1. Optimizing inference by aligning GPU capabilities with context and generation workloads. (Diagram of a disaggregated inference pipeline: documents, databases, and videos feed a context processor; its output goes to a key-value cache read by a generation node to produce results. GPU A is optimized for long-context processing, while GPU B delivers strong TCO for both context and generation.)

However, disaggregation introduces new layers of complexity, requiring precise coordination across low-latency KV cache transfers, LLM-aware routing, and efficient memory management. NVIDIA Dynamo serves as the orchestration layer for these components, and its capabilities played a pivotal role in the latest MLPerf Inference results. Learn how disaggregation with Dynamo on GB200 NVL72 set new performance records.
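To make the coordination problem concrete, here is a toy sketch of a router that sends each request to a context worker, then hands the resulting KV cache to a generation worker. It does not use or reflect the NVIDIA Dynamo API. Every class and method name (`KVCache`, `Worker`, `Router`, `handle`) is hypothetical, and the "least queued tokens" policy is just one simple stand-in for LLM-aware routing.

```python
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Stand-in for the key-value cache produced by the context phase."""
    request_id: str
    num_tokens: int


@dataclass
class Worker:
    name: str
    queued_tokens: int = 0  # crude load metric used for routing


@dataclass
class Router:
    context_workers: list
    generation_workers: list
    log: list = field(default_factory=list)

    @staticmethod
    def _least_loaded(pool):
        # Ties resolve to the first worker in the pool.
        return min(pool, key=lambda w: w.queued_tokens)

    def handle(self, request_id: str, prompt_tokens: int, max_new_tokens: int) -> KVCache:
        # Context phase: the whole prompt goes to the least-loaded context worker.
        ctx = self._least_loaded(self.context_workers)
        ctx.queued_tokens += prompt_tokens
        cache = KVCache(request_id, prompt_tokens)

        # The KV cache is handed to a generation worker, which decodes token by token.
        gen = self._least_loaded(self.generation_workers)
        gen.queued_tokens += max_new_tokens
        self.log.append((request_id, ctx.name, gen.name))
        return cache


router = Router(
    context_workers=[Worker("ctx-0"), Worker("ctx-1")],
    generation_workers=[Worker("gen-0"), Worker("gen-1")],
)
router.handle("req-a", prompt_tokens=1_000_000, max_new_tokens=512)
router.handle("req-b", prompt_tokens=2_000, max_new_tokens=256)
print(router.log)
```

Because the two pools are tracked separately, a million-token prompt occupying one context worker does not delay the second request's generation. A production orchestrator would also have to account for the cost of moving the cache between devices and for memory pressure, which this sketch leaves out.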

To capitalize on the benefits of disaggregated inference, particularly in the compute-intensive context phase, specialized acceleration is essential. Addressing this need, NVIDIA is introducing the Rubin CPX GPU, a purpose-built solution designed to deliver high-throughput performance for high-value long-context inference workloads while seamlessly integrating into disaggregated infrastructure.

Rubin CPX: built to accelerate long-context processing

The Rubin CPX GPU is designed to enhance long-context performance, complementing existing infrastructure while delivering scalable efficiency and maximizing ROI in context-aware inference deployments. Rubin CPX, built with the Rubin architecture, delivers breakthrough performance for the compute-intensive context phase of inference. It features 30 petaFLOPs of NVFP4 compute, 128 GB of GDDR7 memory, hardware support for video decoding and encoding, and 3x attention acceleration (compared to NVIDIA GB300 NVL72).

Optimized for efficiently processing long sequences, Rubin CPX is critical for high-value inference use cases like software application development and HD video generation. Designed to complement existing disaggregated inference architectures, it enhances throughput and responsiveness while maximizing ROI for large-scale generative AI workloads.

Rubin CPX works in tandem with NVIDIA Vera CPUs and Rubin GPUs for generation-phase processing, forming a complete, high-performance disaggregated serving solution for long-context use cases. The NVIDIA Vera Rubin NVL144 CPX rack integrates 144 Rubin CPX GPUs, 144 Rubin GPUs, and 36 Vera CPUs to deliver 8 exaFLOPs of NVFP4 compute (7.5x more than the GB300 NVL72), alongside 100 TB of high-speed memory and 1.7 PB/s of memory bandwidth, all within a single rack.
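The rack-level figures can be cross-checked against the per-GPU specifications quoted earlier. The following back-of-the-envelope calculation uses only numbers from the article, with three assumptions that the article does not state: that the rack total is a simple sum of per-device peak figures, that the "7.5" comparison with GB300 NVL72 is a plain ratio, and that units are decimal (1 exaFLOP = 1,000 petaFLOPs, 1 TB = 1,000 GB). The derived values are therefore estimates, not published specifications.

```python
# Figures quoted in the article.
CPX_PER_RACK = 144
CPX_PFLOPS_NVFP4 = 30          # per Rubin CPX GPU
CPX_GDDR7_GB = 128             # per Rubin CPX GPU
RACK_EXAFLOPS_NVFP4 = 8        # Vera Rubin NVL144 CPX
RATIO_VS_GB300_NVL72 = 7.5     # assumption: treated as a plain ratio

# Contribution of the Rubin CPX GPUs alone (assumes per-GPU peaks simply add up).
cpx_exaflops = CPX_PER_RACK * CPX_PFLOPS_NVFP4 / 1000
cpx_memory_tb = CPX_PER_RACK * CPX_GDDR7_GB / 1000

# What is left over for the rest of the rack, under the same assumption.
remaining_exaflops = RACK_EXAFLOPS_NVFP4 - cpx_exaflops

# GB300 NVL72 baseline implied by the quoted ratio.
implied_gb300_exaflops = RACK_EXAFLOPS_NVFP4 / RATIO_VS_GB300_NVL72

print(f"Rubin CPX compute:      {cpx_exaflops:.2f} exaFLOPs")
print(f"Rubin CPX GDDR7 memory: {cpx_memory_tb:.1f} TB")
print(f"Remaining compute:      {remaining_exaflops:.2f} exaFLOPs")
print(f"Implied GB300 NVL72:    {implied_gb300_exaflops:.2f} exaFLOPs")
```

On these assumptions, the 144 Rubin CPX GPUs account for roughly 4.3 of the 8 exaFLOPs and about 18 TB of the 100 TB of memory, with the remainder attributable to the other components in the rack.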

Using NVIDIA Quantum-X800 InfiniBand or Spectrum-X Ethernet, paired with NVIDIA ConnectX-9 SuperNICs and orchestrated by the Dynamo platform, the Vera Rubin NVL144 CPX is built to power the next wave of million-token context AI inference workloads, cutting inference costs and unlocking advanced capabilities for developers and creators worldwide.

At scale, the platform can deliver 30x to 50x return on investment, transl
LINK: https://developer.nvidia.com/blog/nvidia-rubin-cpx-accelerates-inferen...