
DeepSeek-R1 is an open model with state-of-the-art reasoning capabilities. Instead of offering direct responses, AI models like DeepSeek-R1 perform reasoning through the chain-of-thought method to generate the best answer.
Performing this sequence of inference passes - using reason to arrive at the best answer - is known as test-time scaling. DeepSeek-R1 is a perfect example of this scaling law, demonstrating why accelerated computing is critical for the demands of agentic AI inference.
As models are allowed to iteratively think through the problem, they create more output tokens and longer generation cycles, so model quality continues to scale. Significant test-time compute is critical to enable both real-time inference and higher-quality responses from reasoning models like DeepSeek-R1, requiring larger inference deployments.
R1 delivers leading accuracy for tasks demanding logical inference, reasoning, math, coding and language understanding while also delivering high inference efficiency.
To help developers securely experiment with these capabilities and build their own specialized agents, the 671-billion-parameter DeepSeek-R1 model is now available as an NVIDIA NIM microservice preview on build.nvidia.com. The DeepSeek-R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system.
Developers can test and experiment with the application programming interface (API), which is expected to be available soon as a downloadable NIM microservice, part of the NVIDIA AI Enterprise software platform.
The DeepSeek-R1 NIM microservice simplifies deployments with support for industry-standard APIs. Enterprises can maximize security and data privacy by running the NIM microservice on their preferred accelerated computing infrastructure. Using NVIDIA AI Foundry with NVIDIA NeMo software, enterprises will also be able to create customized DeepSeek-R1 NIM microservices for specialized AI agents.
DeepSeek-R1 - a Perfect Example of Test-Time Scaling DeepSeek-R1 is a large mixture-of-experts (MoE) model. It incorporates an impressive 671 billion parameters - 10x more than many other popular open-source LLMs - supporting a large input context length of 128,000 tokens. The model also uses an extreme number of experts per layer. Each layer of R1 has 256 experts, with each token routed to eight separate experts in parallel for evaluation.
Delivering real-time answers for R1 requires many GPUs with high compute performance, connected with high-bandwidth and low-latency communication to route prompt tokens to all the experts for inference. Combined with the software optimizations available in the NVIDIA NIM microservice, a single server with eight H200 GPUs connected using NVLink and NVLink Switch can run the full, 671-billion-parameter DeepSeek-R1 model at up to 3,872 tokens per second. This throughput is made possible by using the NVIDIA Hopper architecture's FP8 Transformer Engine at every layer - and the 900 GB/s of NVLink bandwidth for MoE expert communication.
Getting every floating point operation per second (FLOPS) of performance out of a GPU is critical for real-time inference. The next-generation NVIDIA Blackwell architecture will give test-time scaling on reasoning models like DeepSeek-R1 a giant boost with fifth-generation Tensor Cores that can deliver up to 20 petaflops of peak FP4 compute performance and a 72-GPU NVLink domain specifically optimized for inference.
Get Started Now With the DeepSeek-R1 NIM Microservice Developers can experience the DeepSeek-R1 NIM microservice, now available on build.nvidia.com. Watch how it works:
With NVIDIA NIM, enterprises can deploy DeepSeek-R1 with ease and ensure they get the high efficiency needed for agentic AI systems.
See notice regarding software product information.
Most recent headlines
18/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
18/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
18/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
18/12/2025
Long-term agreement includes the SES SCORE platform and hybrid distribution worldwide to deliver more than 5,000 hours of golf tournaments annually featuring th...
17/12/2025
Investigative journalists across the Western Balkans and T rkiye continue to con...
17/12/2025
Sports Broadcasting Hall of Fame Inducts 10 Industry Icons During Unforgettable ...
17/12/2025
ESPN to Debut MNF Playbook with Next Gen Stats, a New AI-Driven NFL Data-AltCastThe series, powered by Adrenaline TruPlay AI, launches Dec. 22 and runs through ...
17/12/2025
Inaugural Optum Golf Channel Games Debut Under the Lights' in Primetime on ...
17/12/2025
The right playlist is essential on New Year's Eve, building the energy as you get ready and keeping it high as you count down to midnight. This year, Spotif...
17/12/2025
eds3_5_jq(document).ready(function($) { $(#eds_sliderM519).chameleonSlider_2_1({...
17/12/2025
Audiences Watched Over 103 Billion Minutes of TV on Thanksgiving Day
NFL Games ...
17/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
17/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
17/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
17/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
17/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
17/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
17/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
17/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
17/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
17/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
17/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
17/12/2025
KPop Demon Hunters Stars Visit Berklee for Weeklong Celebration Andrew Choi and EJAE, who voiced the film's main characters and contributed to its soundtr...
17/12/2025
December 17 2025, 17:00 (PST) Dolby and LG Unveil a New Era of Home Audio With ...
17/12/2025
Wednesday 17 December 2025
Heated Rivalry will be coming to Sky and streaming service NOW on 10 JanuaryTurn on cookies to view this content. Go to Privacy opti...
17/12/2025
Back to All News
Inside The Unseen World Of Indian Customs: Netflix Reveals The...
17/12/2025
Back to All News
Netflix announces PAPARAZZI KING: the docu series coming to Ne...
17/12/2025
Back to All News
Netflix Unveils First Look at Jo Nesbo's Detective Hole Pr...
17/12/2025
Back to All News
Netflix Welcomes Warner Bros. Discovery Board Recommendation
Business
17 December 2025
Global
Link copied to clipboard
After Careful Revi...
17/12/2025
RT has announced that Kathy Fox has been appointed Commissioning Editor with re...
17/12/2025
The Hao AI Lab research team at the University of California San Diego - at the forefront of pioneering AI model innovation - recently received an NVIDIA DGX B...
17/12/2025
Editor's note: This post is part of Into the Omniverse, a series focused on ...
17/12/2025
With the new season of Dancing with the Stars shimmering in the not-too-distant future this New Year, the celebrity and dancer pairings of the twelve couples ha...
16/12/2025
Hawkins has landed on Spotify, just in time for Stranger Things Season 5, Volume...
16/12/2025
Wherever you are, your favorite music and audio content should go seamlessly with you. That's why Spotify has partnered with NAVER Corp, Korea's leading...
16/12/2025
2025 Wrapped arrived bigger and bolder than ever. This year's experience is designed to be ultra personal and shareable, with new features like Wrapped Part...
16/12/2025
Three 12-kilowatt Advanced Electric Propulsion System thrusters, supplied by L3Harris Technologies, form the core of Gateway's propulsion system. Pictured i...
16/12/2025
The challenge facing America's defense industrial base is not just about speed - its about rebuilding the foundation that makes speed possible. Our nations ...
16/12/2025
Share Share by:
Copy link
Facebook
X
Whatsapp
Pinterest
Flipboard...
16/12/2025
SEVILLE, Spain Canal Sur, the public broadcasting service for Andalusia, Spain, has completed a total technology refresh based on Pebble's resilient, softwa...
16/12/2025
NEW YORK Teleprompting hardware provider Telescript International has acquired all software code and intellectual property previously owned by Telescript West. ...
16/12/2025
As cable operators face increased competition from 5G fixed wireless access providers, a new report from Ookla Research finds that T-Mobile is the FWA speed lea...
16/12/2025
Apple has announced a major upgrade to the Apple TV app for device owners outside the Apple ecosystem with news that the Apple TV app for Android now supports G...
16/12/2025
Space42 grows Direct-to-Device partner ecosystem through a Memorandum of Underst...
16/12/2025
16 Dec 2025
VEON Announces Release Date for Full Year and Fourth Quarter 2025 R...
16/12/2025
16 Dec 2025
VEON's Kyivstar Invests in Renewable Energy in Ukraine with Acq...
16/12/2025
Back to All News
Emma Appleton, Fares Fares, Frida Gustavsson and Jakob Oftebro...
16/12/2025
Back to All News
Docu-reality My Korean Boyfriend Gets a Trailer and Premiere D...
16/12/2025
Harmonic's XOS Advanced Media Processor Improves Streaming Video Quality and Boosts Viewer Engagement SAN JOSE, Calif. - Dec. 16, 2025 - Harmonic (NASDAQ: ...