
To understand the latest advance in generative AI, imagine a courtroom.
Judges hear and decide cases based on their general understanding of the law. Sometimes a case - like a malpractice suit or a labor dispute - requires special expertise, so judges send court clerks to a law library, looking for precedents and specific cases they can cite.
Like a good judge, large language models (LLMs) can respond to a wide variety of human queries. But to deliver authoritative answers that cite sources, the model needs an assistant to do some research.
The court clerk of AI is a process called retrieval-augmented generation, or RAG for short.
The Story of the Name
Patrick Lewis, lead author of the 2020 paper that coined the term, apologized for the unflattering acronym that now describes a growing family of methods across hundreds of papers and dozens of commercial services he believes represent the future of generative AI.
"We definitely would have put more thought into the name had we known our work would become so widespread," Lewis said in an interview from Singapore, where he was sharing his ideas with a regional conference of database developers.
"We always planned to have a nicer-sounding name, but when it came time to write the paper, no one had a better idea," said Lewis, who now leads a RAG team at AI startup Cohere.
So, What Is Retrieval-Augmented Generation?
Retrieval-augmented generation is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.
In other words, it fills a gap in how LLMs work. Under the hood, LLMs are neural networks, typically measured by how many parameters they contain. An LLM's parameters essentially represent the general patterns of how humans use words to form sentences.
That deep understanding, sometimes called parameterized knowledge, makes LLMs useful in responding to general prompts at light speed. However, it does not serve users who want a deeper dive into a current or more specific topic.
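The idea of fetching facts and handing them to the model can be sketched in a few lines of Python. This is a minimal illustration, not the method from the paper: the word-count similarity, the sample documents, and the names `retrieve` and `build_prompt` are all assumptions made for the example, and the final call to an LLM is left out.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    """Augment the user query with the retrieved passages."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"

# Hypothetical external knowledge the model was never trained on.
DOCS = [
    "The warranty on the X100 router lasts 24 months from purchase.",
    "The X100 router supports Wi-Fi 6 and has four Ethernet ports.",
    "Our office is closed on public holidays.",
]

query = "How long is the X100 warranty?"
prompt = build_prompt(query, retrieve(query, DOCS, k=1))
# `prompt` would then be sent to whichever LLM the application uses.
```

Production systems typically replace the word-count similarity with learned embeddings and a vector index, but the shape of the pipeline (retrieve, then augment the prompt, then generate) stays the same.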
Combining Internal, External Resources
Lewis and colleagues developed retrieval-augmented generation to link generative AI services to external resources, especially ones rich in the latest technical details.
The paper, with coauthors from the former Facebook AI Research (now Meta AI), University College London and New York University, called RAG "a general-purpose fine-tuning recipe" because it can be used by nearly any LLM to connect with practically any external resource.
Building User Trust
Retrieval-augmented generation gives models sources they can cite, like footnotes in a research paper, so users can check any claims. That builds trust.
What's more, the technique can help models clear up ambiguity in a user query. It also reduces the possibility a model will make a wrong guess, a phenomenon sometimes called hallucination.
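One way to picture the footnote idea is to number each retrieved passage in the prompt and then map the markers in the answer back to their sources. The sketch below assumes the model has been asked to cite passages as `[n]`; the passages, the sample answer and the helper names are invented for illustration, and the answer string stands in for real model output.

```python
import re

def format_sources(passages):
    """Number each (title, text) passage so the model can cite it as [n]."""
    return "\n".join(
        f"[{i}] {title}: {text}" for i, (title, text) in enumerate(passages, start=1)
    )

def cited_sources(answer, passages):
    """Map citation markers such as [2] in an answer back to source titles."""
    ids = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    # Ignore markers that do not correspond to any supplied passage.
    return [passages[i - 1][0] for i in ids if 1 <= i <= len(passages)]

# Hypothetical retrieved passages.
passages = [
    ("Employee handbook, section 4", "Staff accrue 1.5 vacation days per month."),
    ("HR memo, March", "Unused vacation days roll over for one year."),
]

# Hand-written stand-in for what a model might return.
answer = "You accrue 1.5 days per month [1], and unused days roll over for a year [2]."

print(format_sources(passages))
print(cited_sources(answer, passages))
```

A marker that points at no supplied passage is dropped rather than trusted, which gives the application a simple check on whether a claim is actually backed by a retrieved source.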
Another great advantage of RAG is it's relatively easy. A blog by Lewis and three of the paper's coauthors said developers can implement the process with as few as five lines of code.
That makes the method faster and less expensive than retraining a model with additional datasets. And it lets users hot-swap new sources on the fly.
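Hot-swapping can be illustrated with a toy assistant whose knowledge source is just an attribute that can be replaced at run time. Everything here is hypothetical: `echo_model` is a placeholder for a real LLM call, and the word-overlap lookup stands in for a real retriever.

```python
class Assistant:
    """Pairs a fixed generator with a replaceable knowledge source."""

    def __init__(self, generate, source):
        self.generate = generate  # stand-in for an LLM call
        self.source = source      # list of text passages

    def swap_source(self, new_source):
        """Replace the knowledge source; the model itself is untouched."""
        self.source = new_source

    def ask(self, question):
        words = set(question.lower().split())
        # Pick the passage sharing the most words with the question.
        best = max(self.source, key=lambda p: len(words & set(p.lower().split())))
        return self.generate(question, best)

def echo_model(question, context):
    """Hypothetical placeholder for a real LLM: it just repeats the context."""
    return f"According to the source: {context}"

bot = Assistant(echo_model, ["The 2023 price list puts the basic plan at 10 dollars."])
before = bot.ask("What is the price of the basic plan?")

# Swap in newer data without retraining or reloading the model.
bot.swap_source(["The 2024 price list puts the basic plan at 12 dollars."])
after = bot.ask("What is the price of the basic plan?")
```

The answer changes from the 2023 figure to the 2024 one even though the generator was never modified, which is the property that makes this cheaper than retraining.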
How People Are Using Retrieval-Augmented Generation
With retrieval-augmented generation, users can essentially have conversations with data repositories, opening up new kinds of experiences. This means the applications for RAG could be multiple times the number of available datasets.
For example, a generative AI model supplemented with a medical index could be a great assistant for a doctor or nurse. Financial analysts would benefit from an assistant linked to market data.
In fact, almost any business can turn its technical or policy manuals, videos or logs into resources called knowledge bases that can enhance LLMs. These sources can enable use cases such as customer or field support, employee training and developer productivity.
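Turning a manual into a knowledge base usually starts by cutting the text into passages small enough to retrieve and fit into a prompt. The sketch below shows one common approach, overlapping word-based chunks. The chunk sizes and function names are assumptions for illustration, not something the article prescribes.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks

def build_knowledge_base(manuals, size=50, overlap=10):
    """Turn {title: text} manuals into a flat list of (title, chunk) records."""
    return [
        (title, chunk)
        for title, text in manuals.items()
        for chunk in chunk_words(text, size, overlap)
    ]

# Tiny demonstration with a hypothetical manual.
kb = build_knowledge_base({"Router manual": "a b c d e f g h i j"}, size=4, overlap=1)
```

Keeping the title next to each chunk lets a later retrieval step report which manual a passage came from.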
The broad potential is why companies including AWS, IBM, Glean, Google, Microsoft, NVIDIA, Oracle and Pinecone are adopting RAG.
Getting Started With Retrieval-Augmented Generation
To help users get started, NVIDIA developed a reference architecture for retrieval-augmented generation. It includes a sample chatbot and the elements users need to create their own applications with this new method.
The workflow uses NVIDIA NeMo, a framework for developing and customizing generative AI models, as well as software like NVIDIA Triton Inference Server and NVIDIA TensorRT-LLM for running generative AI models in production.
The software components are all part of NVIDIA AI Enterprise, a software platform that accelerates development and deployment of production-ready AI with the security, support and stability businesses need.
Getting the best performance for RAG workflows requires massive amounts of memory and compute to move and process data. The NVIDIA GH200 Grace Hopper Superchip, with its 288GB of fast HBM3e memory and 8 petaflops of compute, is ideal - it can deliver a 150x speedup over using a CPU.
Once companies get familiar with RAG, they can combine a variety of off-the-shelf or custom LLMs with internal or external knowledge bases to create a wide range of assistants that help their employees and customers.
RAG doesn't require a data center. LLMs are debuting on Windows PCs, thanks to NVIDIA software that enables all sorts of applications users can access even on their laptops.
Figure: An example application for RAG on a PC.
PCs equipped with NVIDIA RTX GPUs can now run some AI models locally. By using RAG on a PC, users can link to a private knowledge source - whether that be emails, notes or articles - to improve responses. The user can then feel confident that their data source, prompts and response all remain private and secure.
A recent blog provides an example of RAG acc