Decoding How NVIDIA AI Workbench Powers App Development

19/06/2024

Editor's note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible and showcases new hardware, software, tools and accelerations for NVIDIA RTX PC and workstation users.

The demand for tools to simplify and optimize generative AI development is skyrocketing. Applications based on retrieval-augmented generation (RAG) - a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from specified external sources - and customized models are enabling developers to tune AI models to their specific needs.

While such work may have required a complex setup in the past, new tools are making it easier than ever.

NVIDIA AI Workbench simplifies AI developer workflows by helping users build their own RAG projects, customize models and more. It's part of the RTX AI Toolkit - a suite of tools and software development kits for customizing, optimizing and deploying AI capabilities - launched at COMPUTEX earlier this month. AI Workbench removes the complexity of technical tasks that can derail experts and halt beginners.

What Is NVIDIA AI Workbench?

Available for free, NVIDIA AI Workbench enables users to develop, experiment with, test and prototype AI applications across GPU systems of their choice - from laptops and workstations to data center and cloud. It offers a new approach for creating, using and sharing GPU-enabled development environments across people and systems.

A simple installation gets users up and running with AI Workbench on a local or remote machine in just minutes. Users can then start a new project or replicate one from the examples on GitHub. Everything works through GitHub or GitLab, so users can easily collaborate and distribute work. Learn more about getting started with AI Workbench.

How AI Workbench Helps Address AI Project Challenges

Developing AI workloads can require manual, often complex processes, right from the start.

Setting up GPUs, updating drivers and managing versioning incompatibilities can be cumbersome. Reproducing projects across different systems can require replicating manual processes over and over. Inconsistencies when replicating projects, like issues with data fragmentation and version control, can hinder collaboration. Varied setup processes, moving credentials and secrets, and changes in the environment, data, models and file locations can all limit the portability of projects.

AI Workbench makes it easier for data scientists and developers to manage their work and collaborate across heterogeneous platforms. It integrates and automates various aspects of the development process, offering:

Ease of setup: AI Workbench streamlines the process of setting up a developer environment that's GPU-accelerated, even for users with limited technical knowledge.

Seamless collaboration: AI Workbench integrates with version-control and project-management tools like GitHub and GitLab, reducing friction when collaborating.

Consistency when scaling from local to cloud: AI Workbench ensures consistency across multiple environments, supporting scaling up or down from local workstations or PCs to data centers or the cloud.

RAG for Documents, Easier Than Ever

NVIDIA offers sample development Workbench Projects to help users get started with AI Workbench. The hybrid RAG Workbench Project is one example: It runs a custom, text-based RAG web application with a user's documents on their local workstation, PC or remote system.

Every Workbench Project runs in a container - software that includes all the necessary components to run the AI application. The hybrid RAG sample pairs a Gradio chat interface frontend on the host machine with a containerized RAG server - the backend that services a user's request and routes queries to and from the vector database and the selected large language model.
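The request flow described above (query in, relevant snippets retrieved from a vector store, prompt assembled for the LLM) can be illustrated with a minimal sketch. This is not the hybrid RAG project's code: the bag-of-words `embed` function stands in for a real embedding model, and `ToyVectorStore` and `build_prompt` are hypothetical names invented for this illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: term counts (an assumed stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store; a real project would use a vector database."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, snippet: str) -> None:
        self._items.append((snippet, embed(snippet)))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Rank stored snippets by similarity to the query and keep the top k.
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [snippet for snippet, _ in ranked[:k]]


def build_prompt(query: str, snippets: list[str]) -> str:
    """Place the retrieved snippets ahead of the question for the LLM."""
    context = "\n".join(f"- {s}" for s in snippets)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


store = ToyVectorStore()
store.add("AI Workbench projects run in containers.")
store.add("Gradio provides the chat interface.")
store.add("The vector database stores document embeddings.")

query = "Where do Workbench projects run?"
top_snippets = store.search(query, k=1)
prompt = build_prompt(query, top_snippets)
print(prompt)
```

In a real deployment the assembled prompt would be sent to whichever LLM the user selected; that call is left out here because it depends on the chosen inference mode.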

This Workbench Project supports a wide variety of LLMs available on NVIDIA's GitHub page. Plus, the hybrid nature of the project lets users select where to run inference.

Workbench Projects let users version the development environment and code. Developers can run the embedding model on the host machine and run inference locally on a Hugging Face Text Generation Inference server, on target cloud resources using NVIDIA inference endpoints like the NVIDIA API catalog, or with self-hosting microservices such as NVIDIA NIM or third-party services.

The hybrid RAG Workbench Project also includes:

Performance metrics: Users can evaluate how RAG- and non-RAG-based user queries perform across each inference mode. Tracked metrics include Retrieval Time, Time to First Token (TTFT) and Token Velocity.

Retrieval transparency: A panel shows the exact snippets of text - retrieved from the most contextually relevant content in the vector database - that are being fed into the LLM and improving the response's relevance to a user's query.

Response customization: Responses can be tweaked with a variety of parameters, such as maximum tokens to generate, temperature and frequency penalty.
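As a rough illustration of the tracked metrics, the sketch below times a stream of tokens to derive Time to First Token and a tokens-per-second rate. It assumes Token Velocity means tokens generated divided by total elapsed time; the project may define it differently. `measure_stream` and `fake_clock` are hypothetical helpers, and the clock is injectable so the example is deterministic.

```python
import time
from typing import Callable, Iterable, Optional


def measure_stream(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a token stream and report TTFT and tokens per second."""
    start = clock()
    first_token_at: Optional[float] = None
    count = 0
    for _ in tokens:
        now = clock()
        if first_token_at is None:
            first_token_at = now  # moment the first token arrived
        count += 1
    end = clock()

    if first_token_at is None:
        # Empty stream: no first token, so no rate to report.
        return {"ttft_s": None, "tokens": 0, "tokens_per_s": 0.0}

    elapsed = end - start
    return {
        "ttft_s": first_token_at - start,
        "tokens": count,
        # Assumed definition: tokens divided by total elapsed time.
        "tokens_per_s": count / elapsed if elapsed > 0 else 0.0,
    }


def fake_clock(ticks: Iterable[float]) -> Callable[[], float]:
    """Return a clock that replays fixed timestamps (for repeatable examples)."""
    it = iter(ticks)
    return lambda: next(it)


# Three tokens arriving at 0.5 s, 0.75 s and 1.0 s after the request starts.
stats = measure_stream(
    ["Hello", " world", "!"],
    clock=fake_clock([0.0, 0.5, 0.75, 1.0, 1.0]),
)
print(stats)
```

With a real streaming client, the same function could wrap the response iterator and use the default `time.perf_counter` clock.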

To get started with this project, simply install AI Workbench on a local system. The hybrid RAG Workbench Project can be brought from GitHub into the user's account and duplicated to the local system.

More resources are available in the AI Decoded user guide. In addition, community members provide helpful video tutorials, such as one from Joe Freeman.

Customize, Optimize, Deploy

Developers often seek to customize AI models for specific use cases. Fine-tuning, a technique that changes the model by training it with additional data, can be useful for style transfer or changing model behavior. AI Workbench helps with fine-tuning, as well.

The Llama-factory AI Workbench Project enables QLoRA, a fine-tuning method that minimizes memory requirements, for a variety of models, as well as
LINK: https://blogs.nvidia.com/blog/ai-decoded-workbench-hybrid-rag/...