Hugging Face Offers Developers Inference-as-a-Service Powered by NVIDIA NIM
29/07/2024
New inference-as-a-service capabilities will enable developers to rapidly deploy leading large language models such as the Llama 3 family and Mistral AI models with optimization from NVIDIA NIM microservices running on NVIDIA DGX Cloud.
Announced today at the SIGGRAPH conference, the service will help developers quickly prototype with open-source AI models hosted on the Hugging Face Hub and deploy them in production. Enterprise Hub users can tap serverless inference for increased flexibility, minimal infrastructure overhead and optimized performance with NVIDIA NIM.
The inference service complements Train on DGX Cloud, an AI training service already available on Hugging Face.
Developers facing a growing number of open-source models can benefit from a hub where they can easily compare options. These training and inference tools give Hugging Face developers new ways to experiment with, test and deploy cutting-edge models on NVIDIA-accelerated infrastructure. They're made easily accessible using the Train and Deploy drop-down menus on Hugging Face model cards, letting users get started with just a few clicks.
Get started with inference-as-a-service powered by NVIDIA NIM.
Beyond a Token Gesture - NVIDIA NIM Brings Big Benefits
NVIDIA NIM is a collection of AI microservices - including NVIDIA AI foundation models and open-source community models - optimized for inference using industry-standard application programming interfaces, or APIs.
NIM offers users higher efficiency in processing tokens - the units of data used and generated by a language model. The optimized microservices also improve the efficiency of the underlying NVIDIA DGX Cloud infrastructure, which can increase the speed of critical AI applications.
This means developers see faster, more robust results from an AI model accessed as a NIM compared with other versions of the model. The 70-billion-parameter version of Llama 3, for example, delivers up to 5x higher throughput when accessed as a NIM compared with off-the-shelf deployment on NVIDIA H100 Tensor Core GPU-powered systems.
Near-Instant Access to DGX Cloud Provides Accessible AI Acceleration
The NVIDIA DGX Cloud platform is purpose-built for generative AI, offering developers easy access to reliable accelerated computing infrastructure that can help them bring production-ready applications to market faster.
The platform provides scalable GPU resources that support every step of AI development, from prototype to production, without requiring developers to make long-term AI infrastructure commitments.
Hugging Face inference-as-a-service on NVIDIA DGX Cloud powered by NIM microservices offers easy access to compute resources that are optimized for AI deployment, enabling users to experiment with the latest AI models in an enterprise-grade environment.
More on NVIDIA NIM at SIGGRAPH
At SIGGRAPH, NVIDIA also introduced generative AI models and NIM microservices for the OpenUSD framework to accelerate developers' abilities to build highly accurate virtual worlds for the next evolution of AI.
To experience more than 100 NVIDIA NIM microservices with applications across industries, visit ai.nvidia.com.
LINK: | https://blogs.nvidia.com/blog/hugging-face-inference-nim-microservices... |
See more stories from nvidia |
More from Nvidia
17/09/2024
New AI Innovation Hub in Tunisia Drives Technological Advancement Across Africa
A new AI innovation hub for developers across Tunisia launched today in Novation City, a technology park that's designed to cultivate a vibrant, innovation ...
17/09/2024
Upgrade Livestreams With Twitch Enhanced Broadcasting and the NVIDIA Encoder
At TwitchCon - a global convention for the Twitch livestreaming platform-livestreamers and content creators this week can experience the latest technologies for...
12/09/2024
GeForce NOW to Bring Dead Rising Deluxe Remaster' to the Cloud at Launch
Rise and shine - Capcom's latest action-adventure game, Dead Rising Deluxe Remaster, heads to the cloud at launch next week. It's part of nine new titl...
11/09/2024
AI on the Air: Behind the Scenes at IBC With Holoscan for Media
AI is transforming the broadcast industry by enhancing the way content is created, distributed and consumed - but integrating the technology can be challenging....
11/09/2024
NVIDIA and Oracle to Accelerate AI and Data Processing for Enterprises
Enterprises are looking for increasingly powerful compute to support their AI workloads and accelerate data processing. The efficiency gained can translate to b...
11/09/2024
Ready to Roll: Nuro to License Its Autonomous Driving System
To accelerate autonomous vehicle development and deployment timelines, Nuro announced today it will license its Nuro Driver autonomous driving system directly t...
09/09/2024
Live Media Reimagined: NVIDIA Holoscan for Media Now Available for Production
Companies in broadcast, sports and streaming are transitioning to software-defined infrastructure to benefit from flexible deployment and to more easily adopt t...
06/09/2024
How AI Is Personalizing Customer Service Experiences Across Industries
Customer service departments across industries are facing increased call volumes, high customer service agent turnover, talent shortages and shifting customer e...
05/09/2024
19 New Games to Drop for GeForce NOW in September
Fall will be here soon, so leaf it to GeForce NOW to bring the games, with 19 joining the cloud in September. Get started with the seven games available to str...
05/09/2024
Three Ways to Ride the Flywheel of Cybersecurity AI
The business transformations that generative AI brings come with risks that AI itself can help secure in a kind of flywheel of progress. Companies who were qui...
04/09/2024
Volvo Cars EX90 SUV Rolls Out, Built on NVIDIA Accelerated Computing and AI
Volvo Cars' new, fully electric EX90 is making its way from the automaker's assembly line in Charleston, South Carolina, to dealerships around the U.S. ...
04/09/2024
Do the Math: New RTX AI PC Hardware Delivers More AI, Faster
Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, softwa...
04/09/2024
Hammer Time: Machina Labs' Edward Mehr on Autonomous Blacksmith Bots and More
Edward Mehr works where AI meets the anvil. The company he cofounded, Machina L...
04/09/2024
Manufacturing Intelligence: Deltia AI Delivers Assembly Line Gains With NVIDIA Metropolis and Jetson
It all started at Berlin's Merantix venture studio in 2022, when Silviu Homo...
29/08/2024
From RAG to Richness: Startup Uplevels Retrieval-Augmented Generation for Enterprises
Well before OpenAI upended the technology industry with its release of ChatGPT i...
29/08/2024
Crystal-Clear Gaming: Visions of Mana' Sharpens on GeForce NOW
It's time to mana-fest the spirit of adventure with Square Enix's highly anticipated action role-playing game, Visions of Mana, launching today in the c...
28/08/2024
NVIDIA Blackwell Sets New Standard for Generative AI in MLPerf Inference Debut
As enterprises race to adopt generative AI and bring new services to market, the demands on data center infrastructure have never been greater. Training large l...
28/08/2024
More Than Fine: Multi-LoRA Support Now Available in NVIDIA RTX AI Toolkit
Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, softwa...
27/08/2024
From Prototype to Prompt: NVIDIA NIM Agent Blueprints Fast-Forward Next Wave of Enterprise Generative AI
The initial wave of generative AI was driven by its use in internet services tha...
27/08/2024
Better Molecules, Faster: NVIDIA NIM Agent Blueprint Redefines Hit Identification With Generative AI-Based Virtual Screening
Aiming at making the process faster and smarter, NVIDIA on Wednesday released th...
26/08/2024
NVIDIA Launches NIM Microservices for Generative AI in Japan, Taiwan
Nations around the world are pursuing sovereign AI to produce artificial intelligence using their own computing infrastructure, data, workforce and business net...
23/08/2024
NVIDIA to Present Innovations at Hot Chips That Boost Data Center Performance and Energy Efficiency
A deep technology conference for processor and system architects from industry a...
22/08/2024
Straight Out of Gamescom and Into Xbox PC Games, GeForce NOW Newly Supports Automatic Xbox Sign-In
Straight out of Gamescom, NVIDIA introduced GeForce NOW support for Xbox automat...
21/08/2024
How Snowflake Is Unlocking the Value of Data With Large Language Models
Snowflake is using AI to help enterprises transform data into insights and applications. In this episode of NVIDIA's AI Podcast, host Noah Kravitz and Baris...
21/08/2024
Lightweight Champ: NVIDIA Releases Small Language Model With State-of-the-Art Accuracy
Developers of generative AI typically face a tradeoff between model size and acc...
21/08/2024
SLMming Down Latency: How NVIDIA's First On-Device Small Language Model Makes Digital Humans More Lifelike
Editor's note: This post is part of the AI Decoded series, which demystifies...
20/08/2024
NVIDIA Showcases New AI Capabilities With ACE, RTX Games and More at Gamescom 2024
At Gamescom, the world's biggest gaming expo, NVIDIA has once again pushed t...
20/08/2024
High-Tech Highways: India Uses NVIDIA Accelerated Computing to Ease Tollbooth Traffic
India is home to the globe's second-largest road network, spanning nearly 4 ...
20/08/2024
Level Up: NVIDIA, MediaTek to Bring G-SYNC Display Technologies to More Gamers
Picture this: NVIDIA and MediaTek are working together to make the industry's best gaming display technologies more accessible to gamers globally. The comp...
20/08/2024
NVIDIA Announces First Digital Human Technologies On-Device Small Language Model, Improving Conversation for Game Characters
NVIDIA's first digital human technology small language model is being demons...
20/08/2024
At Gamescom 2024, GeForce NOW Brings Black Myth: Wukong' and FINAL FANTASY XVI Demo' to the Cloud
Each week, GeForce NOW elevates cloud gaming by bringing top PC games and new up...
19/08/2024
AI Chases the Storm: New NVIDIA Research Boosts Weather Prediction, Climate Simulation
As hurricanes, tornadoes and other extreme weather events occur with increased f...
15/08/2024
GeForce NOW and CurseForge Bring Mod Support to World of Warcraft: The War Within' in the Cloud
Time to be wowed: GeForce NOW members can now stream World of Warcraft on suppor...
14/08/2024
Decoding NVIDIA Edify - The Technology That Helps Developers Create Custom Models Trained on Their Data
Editor's note: This post is part of the AI Decoded series, which demystifies...
13/08/2024
Applications Now Open for $60,000 NVIDIA Graduate Fellowship Awards
Bringing together the world's brightest minds and the latest accelerated computing technology leads to powerful breakthroughs that help tackle some of the b...
09/08/2024
Golden Opportunities: California to Train Students, Educators in AI
The State of California today announced a first-of-its-kind AI education initiative with NVIDIA. The public-private collaboration supports the state's goal...
08/08/2024
GeForce NOW Celebrates 2,000 Games in the Cloud
Editor's note: This blog was updated on Aug. 9 to reflect changes to the availability of Warhammer 40,000: Speed Freeks.' This GFN Thursday marks 2,00...
08/08/2024
Figure Unveils Next-Gen Conversational Humanoid Robot With 3x AI Computing for Fully Autonomous Tasks
Silicon Valley's Figure has taken the wraps off of its next-generation Figur...
07/08/2024
Problem Solved: STEM Studies Supercharged With RTX and AI Technologies
Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible, and showcases new hardware, softwa...
07/08/2024
Recursion CEO Chris Gibson on Accelerating the Biopharmaceutical Industry With AI
Techbio is a field combining data, technology and biology to enhance scientific ...
06/08/2024
Meet the Maker: High School Student Develops Robot Guide Dogs With NVIDIA Jetson
High school student Selin Alara Ornek is looking ahead - using machine learning and the NVIDIA Jetson platform for edge AI and robotics to create robot guide do...
06/08/2024
Editor's Paradise: NVIDIA RTX-Powered Video Software CyberLink PowerDirector Gains High-Efficiency Video Coding Upgrades
Editor's note: This post is part of our In the NVIDIA Studio series, which c...
01/08/2024
August Adventures Await: 18 New Games Coming to GeForce NOW
Members can choose their own adventure with GeForce NOW bringing 18 new games to the cloud in August - including Square Enix's fantasy role-playing game Vis...
31/07/2024
Oracle Cloud Infrastructure Expands NVIDIA GPU-Accelerated Instances for AI, Digital Twins and More
Enterprises are rapidly adopting generative AI, large language models (LLMs), ad...
31/07/2024
NVIDIA Researchers Harness Real-Time Gen AI to Build Immersive Desert World
NVIDIA researchers used NVIDIA Edify, a multimodal architecture for visual generative AI, to build a detailed 3D desert landscape within a few minutes in a live...
31/07/2024
NVIDIA and Zoox Pave the Way for Autonomous Ride-Hailing
In celebration of Zoox's 10th anniversary, NVIDIA founder and CEO Jensen Huang recently joined the robotaxi company's CEO, Aicha Evans, and its cofounde...
31/07/2024
Taking AI to Warp Speed: Decoding How NVIDIA's Latest RTX-Powered Tools and Apps Help Developers Accelerate AI on PCs and Workstations
Editor's note: This post is part of the AI Decoded series, which demystifies...
29/07/2024
For Your Edification: Shutterstock Releases Generative 3D, Getty Images Upgrades Service Powered by NVIDIA
Designers and artists have new and improved ways to boost their productivity wit...
29/07/2024
AI Gets Physical: New NVIDIA NIM Microservices Bring Generative AI to Digital Environments
Millions of people already use generative AI to assist in writing and learning. ...