
Decoding How NVIDIA AI Workbench Powers App Development

19/06/2024

Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible and showcases new hardware, software, tools and accelerations for NVIDIA RTX PC and workstation users.

The demand for tools to simplify and optimize generative AI development is skyrocketing. Applications based on retrieval-augmented generation (RAG) - a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from specified external sources - and customized models are enabling developers to tune AI models to their specific needs.

While such work may have required a complex setup in the past, new tools are making it easier than ever.

NVIDIA AI Workbench simplifies AI developer workflows by helping users build their own RAG projects, customize models and more. It's part of the RTX AI Toolkit - a suite of tools and software development kits for customizing, optimizing and deploying AI capabilities - launched at COMPUTEX earlier this month. AI Workbench removes the complexity of technical tasks that can derail experts and halt beginners.

What Is NVIDIA AI Workbench?

Available for free, NVIDIA AI Workbench enables users to develop, experiment with, test and prototype AI applications across GPU systems of their choice - from laptops and workstations to data center and cloud. It offers a new approach for creating, using and sharing GPU-enabled development environments across people and systems.

A simple installation gets users up and running with AI Workbench on a local or remote machine in just minutes. Users can then start a new project or replicate one from the examples on GitHub. Everything works through GitHub or GitLab, so users can easily collaborate and distribute work. Learn more about getting started with AI Workbench.

How AI Workbench Helps Address AI Project Challenges

Developing AI workloads can require manual, often complex processes, right from the start.

Setting up GPUs, updating drivers and managing versioning incompatibilities can be cumbersome. Reproducing projects across different systems can require replicating manual processes over and over. Inconsistencies when replicating projects, like issues with data fragmentation and version control, can hinder collaboration. Varied setup processes, moving credentials and secrets, and changes in the environment, data, models and file locations can all limit the portability of projects.

AI Workbench makes it easier for data scientists and developers to manage their work and collaborate across heterogeneous platforms. It integrates and automates various aspects of the development process, offering:

Ease of setup: AI Workbench streamlines the process of setting up a developer environment that's GPU-accelerated, even for users with limited technical knowledge.

Seamless collaboration: AI Workbench integrates with version-control and project-management tools like GitHub and GitLab, reducing friction when collaborating.

Consistency when scaling from local to cloud: AI Workbench ensures consistency across multiple environments, supporting scaling up or down from local workstations or PCs to data centers or the cloud.

RAG for Documents, Easier Than Ever

NVIDIA offers sample development Workbench Projects to help users get started with AI Workbench. The hybrid RAG Workbench Project is one example: It runs a custom, text-based RAG web application with a user's documents on their local workstation, PC or remote system.

Every Workbench Project runs in a container - software that includes all the necessary components to run the AI application. The hybrid RAG sample pairs a Gradio chat interface frontend on the host machine with a containerized RAG server - the backend that services a user's request and routes queries to and from the vector database and the selected large language model.
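To make the routing described above concrete, here is a minimal sketch of a RAG request path. It is not the hybrid RAG project's code: the bag-of-words `embed` function, the `ToyVectorStore` class and the `echo_llm` placeholder are all hypothetical stand-ins for a real embedding model, vector database and LLM, chosen so the example runs with the Python standard library alone.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words term count."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for the vector database."""

    def __init__(self) -> None:
        self._rows = []  # list of (chunk text, chunk vector) pairs

    def add(self, chunk: str) -> None:
        self._rows.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 2) -> list:
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self._rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query: str, snippets: list) -> str:
    """Combine the retrieved snippets and the question into one prompt."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Use only this context:\n{context}\n\nQuestion: {query}"


def answer(query: str, store: ToyVectorStore, llm) -> str:
    """Route a query: retrieve context, build the prompt, call the model."""
    return llm(build_prompt(query, store.search(query)))


store = ToyVectorStore()
store.add("AI Workbench projects run inside a container.")
store.add("Gradio provides the chat interface frontend.")
store.add("Bananas are rich in potassium.")


def echo_llm(prompt: str) -> str:
    """Placeholder model that returns its prompt; a real backend calls an LLM."""
    return prompt


result = answer("What provides the chat interface?", store, echo_llm)
```

Swapping `echo_llm` for a function that calls a local or remote inference endpoint leaves the rest of the flow unchanged, which is the general idea behind letting users select where inference runs.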

This Workbench Project supports a wide variety of LLMs available on NVIDIA's GitHub page. Plus, the hybrid nature of the project lets users select where to run inference.

Workbench Projects let users version the development environment and code. Developers can run the embedding model on the host machine and run inference locally on a Hugging Face Text Generation Inference server, on target cloud resources using NVIDIA inference endpoints like the NVIDIA API catalog, or with self-hosting microservices such as NVIDIA NIM or third-party services.

The hybrid RAG Workbench Project also includes:

Performance metrics: Users can evaluate how RAG- and non-RAG-based user queries perform across each inference mode. Tracked metrics include Retrieval Time, Time to First Token (TTFT) and Token Velocity.

Retrieval transparency: A panel shows the exact snippets of text - retrieved from the most contextually relevant content in the vector database - that are being fed into the LLM and improving the response's relevance to a user's query.

Response customization: Responses can be tweaked with a variety of parameters, such as maximum tokens to generate, temperature and frequency penalty.
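As an illustration of the performance metrics listed above, the sketch below times a stream of generated tokens. The article does not define how the project computes these values, so the definitions here are assumptions: Time to First Token is the delay between the request and the first token, and Token Velocity is taken to mean total tokens divided by total elapsed time. `StreamMetrics`, `measure_stream` and `fake_llm` are hypothetical names.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class StreamMetrics:
    ttft_s: float        # seconds from request start to the first token
    tokens: int          # number of tokens received
    tokens_per_s: float  # assumed definition of "token velocity"


def measure_stream(
    token_stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> StreamMetrics:
    """Consume a token stream and record when tokens arrive."""
    start = clock()
    first = None
    count = 0
    for _ in token_stream:
        now = clock()
        if first is None:
            first = now  # arrival time of the first token
        count += 1
    end = clock()
    if first is None:
        # Empty stream: there is no first token to time.
        return StreamMetrics(float("nan"), 0, 0.0)
    elapsed = end - start
    rate = count / elapsed if elapsed > 0 else 0.0
    return StreamMetrics(first - start, count, rate)


def fake_llm():
    """Hypothetical stand-in for a streaming inference endpoint."""
    for tok in ["Hello", ",", " world"]:
        yield tok


# A scripted clock makes the example deterministic; omit `clock` to use
# time.perf_counter against a real stream.
ticks = iter([0.0, 0.5, 1.0, 1.5, 2.0])
metrics = measure_stream(fake_llm(), clock=lambda: next(ticks))
```

Retrieval Time could be measured the same way by reading the clock before and after the vector database lookup.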

To get started with this project, simply install AI Workbench on a local system. The hybrid RAG Workbench Project can be brought from GitHub into the user's account and duplicated to the local system.

More resources are available in the AI Decoded user guide. In addition, community members provide helpful video tutorials, such as one from Joe Freeman.

Customize, Optimize, Deploy

Developers often seek to customize AI models for specific use cases. Fine-tuning, a technique that changes the model by training it with additional data, can be useful for style transfer or changing model behavior. AI Workbench helps with fine-tuning, as well.

The Llama-factory AI Workbench Project enables QLoRA, a fine-tuning method that minimizes memory requirements, for a variety of models, as well as
LINK: https://blogs.nvidia.com/blog/ai-decoded-workbench-hybrid-rag/...