
Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible and showcases new hardware, software, tools and accelerations for NVIDIA RTX PC and workstation users.
The demand for tools to simplify and optimize generative AI development is skyrocketing. Applications based on retrieval-augmented generation (RAG) - a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from specified external sources - and customized models are enabling developers to tune AI models to their specific needs.
While such work may have required a complex setup in the past, new tools are making it easier than ever.
NVIDIA AI Workbench simplifies AI developer workflows by helping users build their own RAG projects, customize models and more. It's part of the RTX AI Toolkit - a suite of tools and software development kits for customizing, optimizing and deploying AI capabilities - launched at COMPUTEX earlier this month. AI Workbench removes the complexity of technical tasks that can derail experts and halt beginners.
What Is NVIDIA AI Workbench?
Available for free, NVIDIA AI Workbench enables users to develop, experiment with, test and prototype AI applications across GPU systems of their choice - from laptops and workstations to data center and cloud. It offers a new approach for creating, using and sharing GPU-enabled development environments across people and systems.
A simple installation gets users up and running with AI Workbench on a local or remote machine in just minutes. Users can then start a new project or replicate one from the examples on GitHub. Everything works through GitHub or GitLab, so users can easily collaborate and distribute work. Learn more about getting started with AI Workbench.
How AI Workbench Helps Address AI Project Challenges
Developing AI workloads can require manual, often complex processes, right from the start.
Setting up GPUs, updating drivers and managing versioning incompatibilities can be cumbersome. Reproducing projects across different systems can require replicating manual processes over and over. Inconsistencies when replicating projects, like issues with data fragmentation and version control, can hinder collaboration. Varied setup processes, moving credentials and secrets, and changes in the environment, data, models and file locations can all limit the portability of projects.
AI Workbench makes it easier for data scientists and developers to manage their work and collaborate across heterogeneous platforms. It integrates and automates various aspects of the development process, offering:
Ease of setup: AI Workbench streamlines the process of setting up a developer environment that's GPU-accelerated, even for users with limited technical knowledge.
Seamless collaboration: AI Workbench integrates with version-control and project-management tools like GitHub and GitLab, reducing friction when collaborating.
Consistency when scaling from local to cloud: AI Workbench ensures consistency across multiple environments, supporting scaling up or down from local workstations or PCs to data centers or the cloud.
RAG for Documents, Easier Than Ever
NVIDIA offers sample development Workbench Projects to help users get started with AI Workbench. The hybrid RAG Workbench Project is one example: It runs a custom, text-based RAG web application with a user's documents on their local workstation, PC or remote system.
Every Workbench Project runs in a container - software that includes all the necessary components to run the AI application. The hybrid RAG sample pairs a Gradio chat interface frontend on the host machine with a containerized RAG server - the backend that services a user's request and routes queries to and from the vector database and the selected large language model.
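The request path described above (query in, relevant snippets fetched from a vector database, snippets passed to the LLM) can be illustrated with a minimal Python sketch. This is not the hybrid RAG project's code: `embed`, `ToyVectorStore` and `build_prompt` are hypothetical names, and the word-count "embedding" is a toy stand-in for a real embedding model, used here only so the example runs without dependencies.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase word counts instead of a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, snippet: str) -> None:
        self._items.append((snippet, embed(snippet)))

    def search(self, query: str, k: int = 2) -> list[str]:
        """Return the k snippets most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [snippet for snippet, _ in ranked[:k]]


def build_prompt(query: str, snippets: list[str]) -> str:
    """Combine retrieved snippets and the user's question into one LLM prompt."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Use the context to answer.\nContext:\n{context}\nQuestion: {query}"


if __name__ == "__main__":
    store = ToyVectorStore()
    store.add("Update GPU drivers before installing the toolkit.")
    store.add("Projects are shared through Git repositories.")
    question = "How do I update GPU drivers?"
    print(build_prompt(question, store.search(question, k=1)))
```

In a real deployment the prompt would then be sent to whichever LLM backend the user selected; that call is omitted here because it depends on the chosen inference mode.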
This Workbench Project supports a wide variety of LLMs available on NVIDIA's GitHub page. Plus, the hybrid nature of the project lets users select where to run inference.
Workbench Projects let users version the development environment and code. Developers can run the embedding model on the host machine and run inference locally on a Hugging Face Text Generation Inference server, on target cloud resources using NVIDIA inference endpoints like the NVIDIA API catalog, or with self-hosting microservices such as NVIDIA NIM or third-party services.
The hybrid RAG Workbench Project also includes:
Performance metrics: Users can evaluate how RAG- and non-RAG-based user queries perform across each inference mode. Tracked metrics include Retrieval Time, Time to First Token (TTFT) and Token Velocity.
Retrieval transparency: A panel shows the exact snippets of text - retrieved from the most contextually relevant content in the vector database - that are being fed into the LLM and improving the response's relevance to a user's query.
Response customization: Responses can be tweaked with a variety of parameters, such as maximum tokens to generate, temperature and frequency penalty.
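To make the metrics listed above concrete, here is a small Python sketch of how Time to First Token and Token Velocity could be measured over a stream of generated tokens. It is an illustration under stated assumptions, not the project's implementation: `measure_stream` is a hypothetical helper, and Token Velocity is assumed here to mean tokens per second over the whole generation, which the project may define differently.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a token stream and report timing metrics.

    ttft_s:       seconds from the start until the first token arrives
    tokens:       number of tokens received
    tokens_per_s: tokens divided by total elapsed time (assumed definition)
    """
    start = clock()
    first_token_at: Optional[float] = None
    count = 0
    for _ in tokens:
        now = clock()
        if first_token_at is None:
            first_token_at = now  # first token seen: fixes TTFT
        count += 1
    end = clock()

    total = end - start
    return {
        "ttft_s": None if first_token_at is None else first_token_at - start,
        "tokens": count,
        "tokens_per_s": count / total if total > 0 else 0.0,
    }


if __name__ == "__main__":
    def fake_llm_stream():
        """Stand-in for a streaming LLM response (hypothetical)."""
        for word in ["RAG", "adds", "context", "to", "answers"]:
            time.sleep(0.01)
            yield word

    print(measure_stream(fake_llm_stream()))
```

Retrieval Time could be captured the same way, by reading the clock immediately before and after the vector database lookup. Passing the clock in as a parameter keeps the helper easy to test with fixed timestamps.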
To get started with this project, simply install AI Workbench on a local system. The hybrid RAG Workbench Project can be brought from GitHub into the user's account and duplicated to the local system.
More resources are available in the AI Decoded user guide. In addition, community members provide helpful video tutorials, such as one from Joe Freeman.
Customize, Optimize, Deploy
Developers often seek to customize AI models for specific use cases. Fine-tuning, a technique that changes the model by training it with additional data, can be useful for style transfer or changing model behavior. AI Workbench helps with fine-tuning, as well.
The Llama-factory AI Workbench Project enables QLoRA, a fine-tuning method that minimizes memory requirements, for a variety of models, as well as