
To understand the latest advance in generative AI, imagine a courtroom.
Judges hear and decide cases based on their general understanding of the law. Sometimes a case - like a malpractice suit or a labor dispute - requires special expertise, so judges send court clerks to a law library, looking for precedents and specific cases they can cite.
Like a good judge, large language models (LLMs) can respond to a wide variety of human queries. But to deliver authoritative answers that cite sources, the model needs an assistant to do some research.
The court clerk of AI is a process called retrieval-augmented generation, or RAG for short.
The Story of the Name Patrick Lewis, lead author of the 2020 paper that coined the term, apologized for the unflattering acronym that now describes a growing family of methods across hundreds of papers and dozens of commercial services he believes represent the future of generative AI.
Patrick Lewis We definitely would have put more thought into the name had we known our work would become so widespread, Lewis said in an interview from Singapore, where he was sharing his ideas with a regional conference of database developers.
We always planned to have a nicer sounding name, but when it came time to write the paper, no one had a better idea, said Lewis, who now leads a RAG team at AI startup Cohere.
So, What Is Retrieval-Augmented Generation? Retrieval-augmented generation is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.
In other words, it fills a gap in how LLMs work. Under the hood, LLMs are neural networks, typically measured by how many parameters they contain. An LLM's parameters essentially represent the general patterns of how humans use words to form sentences.
That deep understanding, sometimes called parameterized knowledge, makes LLMs useful in responding to general prompts at light speed. However, it does not serve users who want a deeper dive into a current or more specific topic.
Combining Internal, External Resources Lewis and colleagues developed retrieval-augmented generation to link generative AI services to external resources, especially ones rich in the latest technical details.
The paper, with coauthors from the former Facebook AI Research (now Meta AI), University College London and New York University, called RAG a general-purpose fine-tuning recipe because it can be used by nearly any LLM to connect with practically any external resource.
Building User Trust Retrieval-augmented generation gives models sources they can cite, like footnotes in a research paper, so users can check any claims. That builds trust.
What's more, the technique can help models clear up ambiguity in a user query. It also reduces the possibility a model will make a wrong guess, a phenomenon sometimes called hallucination.
Another great advantage of RAG is it's relatively easy. A blog by Lewis and three of the paper's coauthors said developers can implement the process with as few as five lines of code.
That makes the method faster and less expensive than retraining a model with additional datasets. And it lets users hot-swap new sources on the fly.
How People Are Using Retrieval-Augmented Generation With retrieval-augmented generation, users can essentially have conversations with data repositories, opening up new kinds of experiences. This means the applications for RAG could be multiple times the number of available datasets.
For example, a generative AI model supplemented with a medical index could be a great assistant for a doctor or nurse. Financial analysts would benefit from an assistant linked to market data.
In fact, almost any business can turn its technical or policy manuals, videos or logs into resources called knowledge bases that can enhance LLMs. These sources can enable use cases such as customer or field support, employee training and developer productivity.
The broad potential is why companies including AWS, IBM, Glean, Google, Microsoft, NVIDIA, Oracle and Pinecone are adopting RAG.
Getting Started With Retrieval-Augmented Generation To help users get started, NVIDIA developed a reference architecture for retrieval-augmented generation. It includes a sample chatbot and the elements users need to create their own applications with this new method.
The workflow uses NVIDIA NeMo, a framework for developing and customizing generative AI models, as well as software like NVIDIA Triton Inference Server and NVIDIA TensorRT-LLM for running generative AI models in production.
The software components are all part of NVIDIA AI Enterprise, a software platform that accelerates development and deployment of production-ready AI with the security, support and stability businesses need.
Getting the best performance for RAG workflows requires massive amounts of memory and compute to move and process data. The NVIDIA GH200 Grace Hopper Superchip, with its 288GB of fast HBM3e memory and 8 petaflops of compute, is ideal - it can deliver a 150x speedup over using a CPU.
Once companies get familiar with RAG, they can combine a variety of off-the-shelf or custom LLMs with internal or external knowledge bases to create a wide range of assistants that help their employees and customers.
RAG doesn't require a data center. LLMs are debuting on Windows PCs, thanks to NVIDIA software that enables all sorts of applications users can access even on their laptops.
An example application for RAG on a PC. PCs equipped with NVIDIA RTX GPUs can now run some AI models locally. By using RAG on a PC, users can link to a private knowledge source - whether that be emails, notes or articles - to improve responses. The user can then feel confident that their data source, prompts and response all remain private and secure.
A recent blog provides an example of RAG acc
Most recent headlines
06/10/2025
France T l visions, France's leading broadcaster, has received the 2025 EBU ...
04/09/2025
Monumental Sports & Entertainment (MSE), in collaboration with Dalet, has been a...
15/06/2025
July 2025 in Dublin, Berlin, Amsterdam & London
Photo: Thea Martre
Music Production for Women (MPW) have announced that they will be running a series of fo...
15/06/2025
Composer/producer launches free virtual instruments
Sulcata Sound is the latest venture of Jason Graves, a two-time British Academy Award-winnning composer,...
14/06/2025
NEW YORK Pluto TV and the All Womens Sports Network have launched a free ad-supported streaming TV (FAST) AWSN channel in the U.S., Canada, the U.K. and the Nor...
14/06/2025
NEW YORK and CINCINNATI E.W. Scripps has announced a new, multiyear agreement with the WNBA that will continue Ions regular-season coverage of the league on Fri...
14/06/2025
WASHINGTON The National Association of Broadcasters highlighted the hidden importance of spectrum in the production of major sporting events and described wha...
14/06/2025
WASHINGTON Sunsetting ATSC 1.0, expanding business opportunities for NextGen Broadcast and increasing international adoption of the ATSC 3.0 standard were top o...
14/06/2025
SAN FRANCISCO Samba TV and Acxiom have announced that they will dramatically expand their longstanding relationship....
14/06/2025
July 2025 in Dublin, Berlin, Amsterdam & London
Photo: Thea Martre
Music Production for Women (MPW) have announced that they will be running a series of fo...
14/06/2025
San Francisco State University's School of Cinema Uses Blackmagic Design
Brie Clayton June 13, 2025
0 Comments
More than 40 Blackmagic Design came...
14/06/2025
Boris FX Mocha Pro Adds New AI Tools To Tackle VFX Tasks Fast
Jessie Electa Petrov June 13, 2025
0 Comments
The 2025.5 release helps artists work more...
14/06/2025
AJA Debuts DRM2-Plus Mini-Converter Frame at InfoComm 2025
Brie Clayton June 13, 2025
0 Comments
Next-gen frame addresses diverse rackmount needs wit...
13/06/2025
(L-R) Lindsay Utz, Michelle Walshe, and The Right Honourable Dame Jacinda Ardern attend the 2025 Sundance Film Festival premiere of Prime Minister at Eccles T...
13/06/2025
Photo credit: Atsushi Nishijima
If you're a true lover of rom-coms, chances...
13/06/2025
Pure Drama and Fierce Rivalries set to dominate the world's most iconic spor...
13/06/2025
Johannesburg, 12 June 2025 - The National Film and Video Foundation (NFVF), an a...
13/06/2025
ABILENE. Texas A severe storm knocked down the tower and severely damaged the news studio and main facility of Sinclair-owned KTXS here on Sunday, June 8....
13/06/2025
Berklee's Music Business/Management Department Recognized by the Music Biz A...
13/06/2025
WASHINGTON The ATSC, the Broadcast Standards Association, honored veteran technologist Aldo Cugnini and Clarence Hau, Senior Vice President of Standards, Policy...
13/06/2025
(Editor's note: The 2025 UFL Championship Game between the D.C. Defenders and Michigan Panthers kicks off Saturday, June 14, at 8 p.m. Eastern. The game wil...
13/06/2025
New iPad/iPhone synth App announced
Following on from last year's release of Gradient Synth - which reached #6 on the App Store's Paid Music charts ...
13/06/2025
LONDON Warner Bros. Discovery has announced that HBO Max will launch direct-to-consumer in multiple new countries this July as the streamer becomes available in...
13/06/2025
AI voice transcription and captioning platform Verbit has added a new feature to its Captivate ASR solution the ability to identify specific features in automat...
13/06/2025
WASHINGTON Federal Communications Commission member Anna Gomez has wrapped up two weeks in California visiting broadcasters, television studio executives, enter...
13/06/2025
WASHINGTON The U.S. House of Representatives voted mostly along party lines to approve a rescission package that would cancel $9.4 billion in previously approve...
13/06/2025
At InfoComm 2025, AJA Video Systems announced DRM2-Plus, an intuitive, high-capacity 3RU frame that can neatly house up to 24 AJA Mini-Converters. Tailored to s...
13/06/2025
Cinema advertising leader to leverage AOS and suite of AI-enabled solutions to optimize forecasting, yield management, and streamlined ad sales and operations a...
13/06/2025
Manfrotto has launched the ONE Hybrid Tripod, a new support system designed specifically for professional content creators working with mirrorless cameras acros...
13/06/2025
Leading video software provider, Synamedia, today announced that its Media Edge Gateway (MEG), an ATSC 3.0 software-based IRD, now supports Device Security requ...
13/06/2025
LiveU, the global leader in live IP-video contribution, production and distribution solutions, is deepening its commitment to the German-speaking market with th...
13/06/2025
Chaos, the leader in architectural visualisation software, today announces Chaos Corona 13, giving archviz designers new ways to add eye-catching style and flai...
13/06/2025
PALI's Nena Music Video Shot with Blackmagic Design
Brie Clayton June 12, 2025
0 Comments
Blackmagic Cinema Camera 6K and DaVinci Resolve Studio b...
13/06/2025
OddBeast Powers Up iRobot's Newest Roombas with Suite of CGI Launch Assets
Brie Clayton June 12, 2025
0 Comments
The motion design and production ...
13/06/2025
On Chick Coreas Birthday, a Newly Uncovered Archival Release The Visitors, composed by Corea and performed by vibraphonist Gary Burton and pianist Kirill Gers...
13/06/2025
In fulfilment of a recommendation by the Government's Expert Advisory Commit...
13/06/2025
SVG Sit-Down: Backblaze's Gleb Budman Talks Products, Partnerships, and the ...
13/06/2025
SVG Sit-Down: DAZN's Walker Jacobs Calls Streaming the FIFA Club World Cup ...
13/06/2025
New Sponsor Spotlight: Vecima Networks' Paul Strickland on How Improving QoE...
13/06/2025
Pitch Perspective: Where's Next for Specialty Cameras in Soccer? Leaders from Sky Austria and ACS discuss the possibilities of camera placement pitchside B...
13/06/2025
Premiership Rugby Final 2025: Vintage clash between Bath and Leicester gets full...
13/06/2025
Premiership Rugby Final 2025: TNT Sports gears up for Bath vs Leicester battle w...
13/06/2025
NCAA Men's College World Series: ESPN Adds Two-Point SupraCam, Invests in Ne...
13/06/2025
New FSWX signal and spectrum analyzer with novel architecture overcomes limits o...
13/06/2025
Apple today announced the addition of iPad to Self Service Repair, providing iPad owners with access to repair manuals, genuine Apple parts, Apple Diagnostics t...
13/06/2025
CUPERTINO, CALIFORNIA Apple today previewed iOS 26, a major update that brings a beautiful new design, intelligent experiences, and improvements to the apps use...
13/06/2025
At Apple's Worldwide Developers Conference (WWDC), Apple unveiled Apple Games, an all-new destination designed to help players jump back into the games they...
13/06/2025
Industrial AI isn't slowing down. Germany is ready.
Following London Tech Week and GTC Paris at VivaTech, NVIDIA founder and CEO Jensen Huang's Europea...
12/06/2025
In 2018, Spotify launched Heart & Soul, a mental health initiative developed to ...
12/06/2025
50 Years Strong: SBS and NITV Supercharge NAIDOC Week 2025 in a joint 50th celeb...