Decoding How NVIDIA AI Workbench Powers App Development

19/06/2024

Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible and showcases new hardware, software, tools and accelerations for NVIDIA RTX PC and workstation users.

The demand for tools to simplify and optimize generative AI development is skyrocketing. Applications based on retrieval-augmented generation (RAG) - a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from specified external sources - and customized models are enabling developers to tune AI models to their specific needs.

While such work may have required a complex setup in the past, new tools are making it easier than ever.

NVIDIA AI Workbench simplifies AI developer workflows by helping users build their own RAG projects, customize models and more. It's part of the RTX AI Toolkit - a suite of tools and software development kits for customizing, optimizing and deploying AI capabilities - launched at COMPUTEX earlier this month. AI Workbench removes the complexity of technical tasks that can derail experts and halt beginners.

What Is NVIDIA AI Workbench?

Available for free, NVIDIA AI Workbench enables users to develop, experiment with, test and prototype AI applications across GPU systems of their choice - from laptops and workstations to data center and cloud. It offers a new approach for creating, using and sharing GPU-enabled development environments across people and systems.

A simple installation gets users up and running with AI Workbench on a local or remote machine in just minutes. Users can then start a new project or replicate one from the examples on GitHub. Everything works through GitHub or GitLab, so users can easily collaborate and distribute work. Learn more about getting started with AI Workbench.

How AI Workbench Helps Address AI Project Challenges

Developing AI workloads can require manual, often complex processes, right from the start.

Setting up GPUs, updating drivers and managing versioning incompatibilities can be cumbersome. Reproducing projects across different systems can require replicating manual processes over and over. Inconsistencies when replicating projects, like issues with data fragmentation and version control, can hinder collaboration. Varied setup processes, moving credentials and secrets, and changes in the environment, data, models and file locations can all limit the portability of projects.

AI Workbench makes it easier for data scientists and developers to manage their work and collaborate across heterogeneous platforms. It integrates and automates various aspects of the development process, offering:

Ease of setup: AI Workbench streamlines the process of setting up a developer environment that's GPU-accelerated, even for users with limited technical knowledge.

Seamless collaboration: AI Workbench integrates with version-control and project-management tools like GitHub and GitLab, reducing friction when collaborating.

Consistency when scaling from local to cloud: AI Workbench ensures consistency across multiple environments, supporting scaling up or down from local workstations or PCs to data centers or the cloud.

RAG for Documents, Easier Than Ever

NVIDIA offers sample development Workbench Projects to help users get started with AI Workbench. The hybrid RAG Workbench Project is one example: It runs a custom, text-based RAG web application with a user's documents on their local workstation, PC or remote system.

Every Workbench Project runs in a container - software that includes all the necessary components to run the AI application. The hybrid RAG sample pairs a Gradio chat interface frontend on the host machine with a containerized RAG server - the backend that services a user's request and routes queries to and from the vector database and the selected large language model.
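The retrieve-then-generate flow described above can be illustrated with a minimal sketch. This is not the code from the hybrid RAG project: the bag-of-words `embed` function, the `ToyVectorStore` class and the prompt wording are all assumptions made for illustration, standing in for a real embedding model, vector database and LLM call.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words term-count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for the vector database."""

    def __init__(self) -> None:
        self._entries = []  # list of (snippet, vector) pairs

    def add(self, snippet: str) -> None:
        self._entries.append((snippet, embed(snippet)))

    def search(self, query: str, k: int = 2) -> list:
        """Return the k snippets most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self._entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
        return [snippet for snippet, _ in ranked[:k]]


def build_prompt(query: str, snippets: list) -> str:
    """Combine retrieved snippets and the user's question into one prompt for the LLM."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


store = ToyVectorStore()
store.add("AI Workbench projects run inside containers.")
store.add("The chat interface is built with Gradio.")
store.add("Retrieved snippets are passed to the language model.")

question = "What is the chat interface built with?"
retrieved = store.search(question, k=1)
prompt = build_prompt(question, retrieved)
print(prompt)
```

In a real deployment the prompt would then be sent to the selected LLM; the sketch stops at prompt construction because the model call depends on the chosen inference backend.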

This Workbench Project supports a wide variety of LLMs available on NVIDIA's GitHub page. Plus, the hybrid nature of the project lets users select where to run inference.

Workbench Projects let users version the development environment and code. Developers can run the embedding model on the host machine and run inference locally on a Hugging Face Text Generation Inference server, on target cloud resources using NVIDIA inference endpoints like the NVIDIA API catalog, or with self-hosting microservices such as NVIDIA NIM or third-party services.
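One way to picture the choice of where inference runs is a small lookup from mode to endpoint. The sketch below is purely illustrative: the `InferenceMode` names, the placeholder URLs and the rule about which modes need an API key are assumptions invented for this example, not settings taken from the project.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InferenceMode(Enum):
    LOCAL = "local"                # a locally running inference server
    CLOUD = "cloud"                # a hosted inference endpoint
    MICROSERVICE = "microservice"  # a self-hosted microservice


@dataclass(frozen=True)
class InferenceTarget:
    base_url: str
    needs_api_key: bool


# Placeholder URLs and key requirements, invented for this sketch.
TARGETS = {
    InferenceMode.LOCAL: InferenceTarget("http://localhost:8080", needs_api_key=False),
    InferenceMode.CLOUD: InferenceTarget("https://cloud.example.invalid/v1", needs_api_key=True),
    InferenceMode.MICROSERVICE: InferenceTarget("http://microservice.example.invalid:8000/v1", needs_api_key=False),
}


def resolve_target(mode: InferenceMode, api_key: Optional[str] = None) -> InferenceTarget:
    """Pick the endpoint for a mode, failing early if a required key is missing."""
    target = TARGETS[mode]
    if target.needs_api_key and not api_key:
        raise ValueError(f"{mode.value} inference requires an API key")
    return target


print(resolve_target(InferenceMode.LOCAL).base_url)
```

Keeping the mode-to-endpoint mapping in one place means the rest of the application can stay unchanged when the user switches between local and remote inference.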

The hybrid RAG Workbench Project also includes:

Performance metrics: Users can evaluate how RAG- and non-RAG-based user queries perform across each inference mode. Tracked metrics include Retrieval Time, Time to First Token (TTFT) and Token Velocity.

Retrieval transparency: A panel shows the exact snippets of text - retrieved from the most contextually relevant content in the vector database - that are being fed into the LLM and improving the response's relevance to a user's query.

Response customization: Responses can be tweaked with a variety of parameters, such as maximum tokens to generate, temperature and frequency penalty.
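As a rough illustration of the performance metrics listed above, the sketch below times a simulated token stream. It assumes that Token Velocity means generated tokens per second, which the article does not define; `fake_token_stream` and `measure_stream` are hypothetical helpers, not part of the project. Retrieval Time could be measured the same way by timing the vector-database lookup.

```python
import time
from typing import Iterable, Iterator


def fake_token_stream(tokens: Iterable[str], delay_s: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streaming LLM response."""
    for token in tokens:
        time.sleep(delay_s)
        yield token


def measure_stream(stream: Iterator[str]) -> dict:
    """Consume a token stream and report TTFT and tokens per second."""
    start = time.perf_counter()
    first_token_at = None
    count = 0
    for _ in stream:
        if first_token_at is None:
            first_token_at = time.perf_counter()
        count += 1
    end = time.perf_counter()

    ttft = (first_token_at - start) if first_token_at is not None else None
    # Assumption: velocity is measured over the generation phase, after the first token.
    generation_time = (end - first_token_at) if first_token_at is not None else 0.0
    velocity = (count - 1) / generation_time if count > 1 and generation_time > 0 else None
    return {"tokens": count, "ttft_s": ttft, "tokens_per_s": velocity}


metrics = measure_stream(fake_token_stream(["Hello", ",", " world", "!"]))
print(metrics)
```

Separating TTFT from velocity matters because the first token includes prompt processing and any network latency, while velocity reflects steady-state generation speed.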

To get started with this project, simply install AI Workbench on a local system. The hybrid RAG Workbench Project can be brought from GitHub into the user's account and duplicated to the local system.

More resources are available in the AI Decoded user guide. In addition, community members provide helpful video tutorials, like the one from Joe Freeman.

Customize, Optimize, Deploy

Developers often seek to customize AI models for specific use cases. Fine-tuning, a technique that changes the model by training it with additional data, can be useful for style transfer or changing model behavior. AI Workbench helps with fine-tuning, as well.

The Llama-factory AI Workbench Project enables QLoRA, a fine-tuning method that minimizes memory requirements, for a variety of models, as well as
LINK: https://blogs.nvidia.com/blog/ai-decoded-workbench-hybrid-rag/...