
Businesses seeking to harness the power of AI need customized models tailored to their specific industry needs.
NVIDIA AI Foundry is a service that enables enterprises to use data, accelerated computing and software tools to create and deploy custom models that can supercharge their generative AI initiatives.
Just as TSMC manufactures chips designed by other companies, NVIDIA AI Foundry provides the infrastructure and tools for other companies to develop and customize AI models - using DGX Cloud, foundation models, NVIDIA NeMo software, NVIDIA expertise, as well as ecosystem tools and support.
The key difference is the product: TSMC produces physical semiconductor chips, while NVIDIA AI Foundry helps create custom models. Both enable innovation and connect to a vast ecosystem of tools and partners.
Enterprises can use AI Foundry to customize NVIDIA and open community models, including the new Llama 3.1 collection, as well as NVIDIA Nemotron, CodeGemma by Google DeepMind, CodeLlama, Gemma by Google DeepMind, Mistral, Mixtral, Phi-3, StarCoder2 and others.
Industry Pioneers Drive AI Innovation Industry leaders Amdocs, Capital One, Getty Images, KT, Hyundai Motor Company, SAP, ServiceNow and Snowflake are among the first using NVIDIA AI Foundry. These pioneers are setting the stage for a new era of AI-driven innovation in enterprise software, technology, communications and media.
Organizations deploying AI can gain a competitive edge with custom models that incorporate industry and business knowledge, said Jeremy Barnes, vice president of AI Product at ServiceNow. ServiceNow is using NVIDIA AI Foundry to fine-tune and deploy models that can integrate easily within customers' existing workflows.
The Pillars of NVIDIA AI Foundry NVIDIA AI Foundry is supported by the key pillars of foundation models, enterprise software, accelerated computing, expert support and a broad partner ecosystem.
Its software includes AI foundation models from NVIDIA and the AI community as well as the complete NVIDIA NeMo software platform for fast-tracking model development.
The computing muscle of NVIDIA AI Foundry is NVIDIA DGX Cloud, a network of accelerated compute resources co-engineered with the world's leading public clouds - Amazon Web Services, Google Cloud and Oracle Cloud Infrastructure. With DGX Cloud, AI Foundry customers can develop and fine-tune custom generative AI applications with unprecedented ease and efficiency, and scale their AI initiatives as needed without significant upfront investments in hardware. This flexibility is crucial for businesses looking to stay agile in a rapidly changing market.
If an NVIDIA AI Foundry customer needs assistance, NVIDIA AI Enterprise experts are on hand to help. NVIDIA experts can walk customers through each of the steps required to build, fine-tune and deploy their models with proprietary data, ensuring the models tightly align with their business requirements.
NVIDIA AI Foundry customers have access to a global ecosystem of partners that can provide a full range of support. Accenture, Deloitte, Infosys, Tata Consultancy Services and Wipro are among the NVIDIA partners that offer AI Foundry consulting services that encompass design, implementation and management of AI-driven digital transformation projects. Accenture is first to offer its own AI Foundry-based offering for custom model development, the Accenture AI Refinery framework.
Additionally, service delivery partners such as Data Monsters, Quantiphi, Slalom and SoftServe help enterprises navigate the complexities of integrating AI into their existing IT landscapes, ensuring that AI applications are scalable, secure and aligned with business objectives.
Customers can develop NVIDIA AI Foundry models for production using AIOps and MLOps platforms from NVIDIA partners, including ActiveFence, AutoAlign, Cleanlab, DataDog, Dataiku, Dataloop, DataRobot, Deepchecks, Domino Data Lab, Fiddler AI, Giskard, New Relic, Scale, Tumeryk and Weights & Biases.
Customers can output their AI Foundry models as NVIDIA NIM inference microservices - which include the custom model, optimized engines and a standard API - to run on their preferred accelerated infrastructure.
Inferencing solutions like NVIDIA TensorRT-LLM deliver improved efficiency for Llama 3.1 models to minimize latency and maximize throughput. This enables enterprises to generate tokens faster while reducing total cost of running the models in production. Enterprise-grade support and security is provided by the NVIDIA AI Enterprise software suite.
NVIDIA NIM and TensorRT-LLM minimize inference latency and maximize throughput for Llama 3.1 models to generate tokens faster. The broad range of deployment options includes NVIDIA-Certified Systems from global server manufacturing partners including Cisco, Dell Technologies, Hewlett Packard Enterprise, Lenovo and Supermicro, as well as cloud instances from Amazon Web Services, Google Cloud and Oracle Cloud Infrastructure.
Additionally, Together AI, a leading AI acceleration cloud, today announced it will enable its ecosystem of over 100,000 developers and enterprises to use its NVIDIA GPU-accelerated inference stack to deploy Llama 3.1 endpoints and other open models on DGX Cloud.
Every enterprise running generative AI applications wants a faster user experience, with greater efficiency and lower cost, said Vipul Ved Prakash, founder and CEO of Together AI. Now, developers and enterprises using the Together Inference Engine can maximize performance, scalability and security on NVIDIA DGX Cloud.
NVIDIA NeMo Speeds and Simplifies Custom Model Development With NVIDIA NeMo integrated into AI Foundry, developers have at their fingertips the tools needed to curate data, customize foundation models and evaluate performance. NeMo technologies include:
NeMo Curator is a GPU-accelerated data
More from Nvidia
12/03/2026
Editor's note: This post is part of Into the Omniverse, a series focused on ...
12/03/2026
GeForce NOW is bringing the game to the Game Developers Conference (GDC), running this week in San Francisco. While developers build the future of gaming, GeFor...
11/03/2026
Launched today, NVIDIA Nemotron 3 Super is a 120 billion parameter open model with 12 billion active parameters designed to run complex agentic AI systems at sc...
10/03/2026
Game developers and artists are building cinematic worlds and iconic characters ...
10/03/2026
Game development teams are working across larger worlds, more complex pipelines and more distributed teams than ever. At the same time, many studios still rely ...
10/03/2026
The Cat 306 CR mini-excavator weighs just under eight tons and fits inside a standard shipping container. It's the machine a contractor rents when the job s...
10/03/2026
NVIDIA and Thinking Machines Lab announced today a multiyear strategic partnersh...
09/03/2026
AI is everywhere and accelerating everything - becoming essential infrastructure...
09/03/2026
ABB Robotics and NVIDIA today announced a breakthrough partnership that brings i...
05/03/2026
March is in full bloom, and that means a fresh wave of games heading to the cloud. 15 new titles are joining the GeForce NOW library this month.
Leading the Ma...
28/02/2026
AI-RAN is moving from lab to field, showing that a software-defined approach is ...
28/02/2026
Autonomous networks - intelligent, self-managing telecommunications operations -...
26/02/2026
GeForce NOW's anniversary celebration reaches a chilling crescendo as Capcom...
26/02/2026
GeForce NOW's anniversary celebration reaches a chilling crescendo as Capcom...
24/02/2026
AI is accelerating every aspect of healthcare - from radiology and drug discover...
23/02/2026
As technologies and systems become more digitalized and connected across the world, operational technology (OT) environments and industrial control systems (ICS...
19/02/2026
The GeForce NOW anniversary celebration keeps on rolling, and this week is all about the games that make it possible. With more than 4,500 titles supported in t...
19/02/2026
AI is accelerating the telecommunications industry's transformation, becomin...
17/02/2026
India is entering a new age of industrialization, as AI transforms how the world...
17/02/2026
Agentic AI is reshaping India's tech industry, delivering leaps in services ...
17/02/2026
India is the nexus of AI innovation this week as the host of the AI Impact Summit, which brings together global heads of state and industry to chart the future ...
16/02/2026
The NVIDIA Blackwell platform has been widely adopted by leading inference provi...
12/02/2026
At leading institutions across the globe, the NVIDIA DGX Spark desktop supercomputer is bringing data center class AI to lab benches, faculty offices and studen...
12/02/2026
A diagnostic insight in healthcare. A character's dialogue in an interactive...
12/02/2026
The GeForce NOW sixth-anniversary festivities roll on this February, continuing a monthlong celebration of NVIDIA's cloud gaming service.
This week brings ...
05/02/2026
Break out the cake and green sprinkles - GeForce NOW is turning six.
Since launch, members have streamed over 1 billion hours, and the party's just getting...
04/02/2026
Editor's note: This post is part of the Nemotron Labs blog series, which exp...
03/02/2026
At 3DEXPERIENCE World in Houston, NVIDIA founder and CEO Jensen Huang and Dassau...
29/01/2026
Mercedes-Benz is marking 140 years of automotive innovation with a new S-Class b...
29/01/2026
Editor's note: This post is part of Into the Omniverse, a series focused on ...
29/01/2026
Get ready to game - the native GeForce NOW app for Linux PCs is now available in beta, letting Linux desktops tap directly into GeForce RTX performance from the...
28/01/2026
Quantum technologies are rapidly emerging as foundational capabilities for economic competitiveness, national security and scientific leadership in the 21st cen...
22/01/2026
AI-powered driver assistance technologies are becoming standard equipment, funda...
22/01/2026
The wait is over, pilots. Flight control support - one of the most community-requested features for GeForce NOW - is live starting today, following its announce...
22/01/2026
AI has taken center stage in financial services, automating the research and exe...
22/01/2026
AI-powered content generation is now embedded in everyday tools like Adobe and Canva, with a slew of agencies and studios incorporating the technology into thei...
21/01/2026
From skilled trades to startups, AI's rapid expansion is the beginning of th...
21/01/2026
From skilled trades to startups, AI's rapid expansion is the beginning of th...
15/01/2026
NVIDIA kicked off the year at CES, where the crowd buzzed about the latest gaming announcements - including the native GeForce NOW app for Linux and Amazon Fire...
13/01/2026
NVIDIA and Lilly are putting together a blueprint for what is possible in the f...
09/01/2026
Every that was easy shopping moment is made possible by teams working to hit s...
08/01/2026
The next universal technology since the smartphone is on the horizon - and it ma...
08/01/2026
In the rolling hills of Berkeley, California, an AI agent is supporting high-stakes physics experiments at the Advanced Light Source (ALS) particle accelerator....
08/01/2026
NVIDIA is wrapping up a big week at the CES trade show with a set of GeForce NOW...
07/01/2026
AI has transformed retail and consumer packaged goods (CPG) operations, enhancin...
05/01/2026
At the CES trade show running this week in Las Vegas, NVIDIA announced that the ...
05/01/2026
Open-source AI is accelerating innovation across industries, and NVIDIA DGX Spar...
05/01/2026
NVIDIA DGX SuperPOD is paving the way for large-scale system deployments built on the NVIDIA Rubin platform - the next leap forward in AI computing.
At the CES...
05/01/2026
AI is powering breakthroughs across industries, helping enterprises operate with...
05/01/2026
NVIDIA founder and CEO Jensen Huang took the stage at the Fontainebleau Las Vega...