
Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible and showcases new hardware, software, tools and accelerations for NVIDIA RTX PC and workstation users.
The demand for tools to simplify and optimize generative AI development is skyrocketing. Applications based on retrieval-augmented generation (RAG) - a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from specified external sources - and customized models are enabling developers to tune AI models to their specific needs.
While such work may have required a complex setup in the past, new tools are making it easier than ever.
NVIDIA AI Workbench simplifies AI developer workflows by helping users build their own RAG projects, customize models and more. It's part of the RTX AI Toolkit - a suite of tools and software development kits for customizing, optimizing and deploying AI capabilities - launched at COMPUTEX earlier this month. AI Workbench removes the complexity of technical tasks that can derail experts and halt beginners.
What Is NVIDIA AI Workbench?
Available for free, NVIDIA AI Workbench enables users to develop, experiment with, test and prototype AI applications across GPU systems of their choice - from laptops and workstations to data center and cloud. It offers a new approach for creating, using and sharing GPU-enabled development environments across people and systems.
A simple installation gets users up and running with AI Workbench on a local or remote machine in just minutes. Users can then start a new project or replicate one from the examples on GitHub. Everything works through GitHub or GitLab, so users can easily collaborate and distribute work. Learn more about getting started with AI Workbench.
How AI Workbench Helps Address AI Project Challenges
Developing AI workloads can require manual, often complex processes, right from the start.
Setting up GPUs, updating drivers and managing versioning incompatibilities can be cumbersome. Reproducing projects across different systems can require replicating manual processes over and over. Inconsistencies when replicating projects, like issues with data fragmentation and version control, can hinder collaboration. Varied setup processes, moving credentials and secrets, and changes in the environment, data, models and file locations can all limit the portability of projects.
AI Workbench makes it easier for data scientists and developers to manage their work and collaborate across heterogeneous platforms. It integrates and automates various aspects of the development process, offering:
Ease of setup: AI Workbench streamlines the process of setting up a developer environment that's GPU-accelerated, even for users with limited technical knowledge.
Seamless collaboration: AI Workbench integrates with version-control and project-management tools like GitHub and GitLab, reducing friction when collaborating.
Consistency when scaling from local to cloud: AI Workbench ensures consistency across multiple environments, supporting scaling up or down from local workstations or PCs to data centers or the cloud.
RAG for Documents, Easier Than Ever
NVIDIA offers sample development Workbench Projects to help users get started with AI Workbench. The hybrid RAG Workbench Project is one example: It runs a custom, text-based RAG web application with a user's documents on their local workstation, PC or remote system.
Every Workbench Project runs in a container - software that includes all the necessary components to run the AI application. The hybrid RAG sample pairs a Gradio chat interface frontend on the host machine with a containerized RAG server - the backend that services a user's request and routes queries to and from the vector database and the selected large language model.
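The request path described above (query in, relevant snippets retrieved from a vector database, prompt sent to the selected LLM) can be sketched in a few lines of Python. This is a minimal illustration, not the hybrid RAG project's actual code: the bag-of-words `embed` function stands in for a real embedding model, and `ToyVectorStore`, `build_prompt`, `answer` and `echo_llm` are hypothetical names invented for this example.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (an assumption, not a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.items = []  # list of (chunk text, embedding) pairs

    def add(self, chunk):
        self.items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, snippets):
    """Place the retrieved snippets ahead of the user's question."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


def answer(query, store, llm):
    """Backend flow: retrieve snippets, build the prompt, call the chosen LLM."""
    snippets = store.search(query)
    return llm(build_prompt(query, snippets))


store = ToyVectorStore()
store.add("AI Workbench projects run inside a container")
store.add("Gradio provides the chat interface")
store.add("Bananas are rich in potassium")


def echo_llm(prompt):
    """Placeholder LLM that returns its prompt, so the flow runs offline."""
    return prompt


top = store.search("what does a workbench project run in", k=1)
reply = answer("what does a workbench project run in", store, echo_llm)
```

Passing the LLM in as a plain callable mirrors the idea that the inference backend can be swapped without touching the retrieval step.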
This Workbench Project supports a wide variety of LLMs available on NVIDIA's GitHub page. Plus, the hybrid nature of the project lets users select where to run inference.
Workbench Projects let users version the development environment and code. Developers can run the embedding model on the host machine and run inference locally on a Hugging Face Text Generation Inference server, on target cloud resources using NVIDIA inference endpoints like the NVIDIA API catalog, or with self-hosted microservices such as NVIDIA NIM or third-party services.
The hybrid RAG Workbench Project also includes:
Performance metrics: Users can evaluate how RAG- and non-RAG-based user queries perform across each inference mode. Tracked metrics include Retrieval Time, Time to First Token (TTFT) and Token Velocity.
Retrieval transparency: A panel shows the exact snippets of text - retrieved from the most contextually relevant content in the vector database - that are being fed into the LLM and improving the response's relevance to a user's query.
Response customization: Responses can be tweaked with a variety of parameters, such as maximum tokens to generate, temperature and frequency penalty.
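As a rough illustration of the tracked metrics, the sketch below derives retrieval time, time to first token and token velocity from timestamps recorded during one streamed response. The formulas are assumptions made for this example (in particular, velocity is computed over the generation window after the first token); the article does not specify how the project calculates them, and `StreamMetrics` and `summarize` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class StreamMetrics:
    retrieval_time_s: float  # request start until retrieval finished
    ttft_s: float            # request start until the first token arrived
    tokens_per_s: float      # generation rate after the first token


def summarize(request_start, retrieval_done, token_times):
    """Compute metrics from timestamps in seconds.

    token_times holds the arrival time of each generated token, in order.
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")
    retrieval_time = retrieval_done - request_start
    ttft = token_times[0] - request_start
    generation_span = token_times[-1] - token_times[0]
    # Assumed definition: tokens produced after the first one, divided by the
    # time they took. A single-token response has no window, so report 0.0.
    if generation_span > 0:
        velocity = (len(token_times) - 1) / generation_span
    else:
        velocity = 0.0
    return StreamMetrics(retrieval_time, ttft, velocity)


# Example: retrieval ends at 0.25 s, then five tokens arrive 0.5 s apart.
metrics = summarize(0.0, 0.25, [0.5, 1.0, 1.5, 2.0, 2.5])
```

Keeping time to first token separate from velocity matters because retrieval adds latency before the first token but should not affect the generation rate afterward.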
To get started with this project, simply install AI Workbench on a local system. The hybrid RAG Workbench Project can be brought from GitHub into the user's account and duplicated to the local system.
More resources are available in the AI Decoded user guide. In addition, community members provide helpful video tutorials, like the one from Joe Freeman.
Customize, Optimize, Deploy
Developers often seek to customize AI models for specific use cases. Fine-tuning, a technique that changes the model by training it with additional data, can be useful for style transfer or changing model behavior. AI Workbench helps with fine-tuning, as well.
The Llama-factory AI Workbench Project enables QLoRA, a fine-tuning method that minimizes memory requirements, for a variety of models, as well as