What Is Retrieval-Augmented Generation?

15/11/2023

To understand the latest advance in generative AI, imagine a courtroom.

Judges hear and decide cases based on their general understanding of the law. Sometimes a case - like a malpractice suit or a labor dispute - requires special expertise, so judges send court clerks to a law library, looking for precedents and specific cases they can cite.

Like a good judge, large language models (LLMs) can respond to a wide variety of human queries. But to deliver authoritative answers that cite sources, the model needs an assistant to do some research.

The court clerk of AI is a process called retrieval-augmented generation, or RAG for short.

The Story of the Name

Patrick Lewis, lead author of the 2020 paper that coined the term, apologized for the unflattering acronym that now describes a growing family of methods across hundreds of papers and dozens of commercial services he believes represent the future of generative AI.

"We definitely would have put more thought into the name had we known our work would become so widespread," Lewis said in an interview from Singapore, where he was sharing his ideas with a regional conference of database developers.

"We always planned to have a nicer sounding name, but when it came time to write the paper, no one had a better idea," said Lewis, who now leads a RAG team at AI startup Cohere.

So, What Is Retrieval-Augmented Generation?

Retrieval-augmented generation is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.

In other words, it fills a gap in how LLMs work. Under the hood, LLMs are neural networks, typically measured by how many parameters they contain. An LLM's parameters essentially represent the general patterns of how humans use words to form sentences.

That deep understanding, sometimes called parameterized knowledge, makes LLMs useful in responding to general prompts at light speed. However, it does not serve users who want a deeper dive into a current or more specific topic.
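To make the idea concrete, here is a minimal sketch of the two steps the name describes: retrieve relevant text, then augment the prompt with it before generation. Everything in it is an assumption made for illustration, not the method from the paper: the helper names (`tokenize`, `retrieve`, `build_prompt`), the word-overlap scoring and the sample documents. Real systems typically rank passages with learned embeddings rather than shared words.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(query, documents, k=2):
    """Return the k documents that share the most word occurrences with the query."""
    query_tokens = set(tokenize(query))

    def overlap(doc):
        counts = Counter(tokenize(doc))
        return sum(counts[token] for token in query_tokens)

    return sorted(documents, key=overlap, reverse=True)[:k]


def build_prompt(query, passages):
    """Prepend the retrieved passages to the question as context."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical sample documents standing in for an external source.
documents = [
    "The GH200 Grace Hopper Superchip has 288GB of HBM3e memory.",
    "Court clerks look up precedents in a law library.",
    "NeMo is a framework for developing generative AI models.",
]
query = "How much memory does the GH200 have?"
prompt = build_prompt(query, retrieve(query, documents, k=1))
```

The resulting `prompt` would then be sent to whichever LLM is in use. The model's parameters never change; only the text it is asked to read does.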

Combining Internal, External Resources

Lewis and colleagues developed retrieval-augmented generation to link generative AI services to external resources, especially ones rich in the latest technical details.

The paper, with coauthors from the former Facebook AI Research (now Meta AI), University College London and New York University, called RAG "a general-purpose fine-tuning recipe" because it can be used by nearly any LLM to connect with practically any external resource.

Building User Trust

Retrieval-augmented generation gives models sources they can cite, like footnotes in a research paper, so users can check any claims. That builds trust.

What's more, the technique can help models clear up ambiguity in a user query. It also reduces the possibility a model will make a wrong guess, a phenomenon sometimes called hallucination.
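One way the footnote idea could work in practice is to number each retrieved source in the prompt and then map the bracketed numbers in the model's answer back to the sources. The sketch below assumes that convention; the function names, the sample sources and the sample answer are all invented for illustration.

```python
import re


def format_sources(sources):
    """Number each source so the model can refer to it as [1], [2], ..."""
    return "\n".join(
        f"[{i}] {title}: {text}"
        for i, (title, text) in enumerate(sources, start=1)
    )


def cited_sources(answer, sources):
    """Map [n] markers in an answer back to source titles, ignoring invalid numbers."""
    numbers = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    return [sources[n - 1][0] for n in numbers if 1 <= n <= len(sources)]


# Hypothetical sources and a hypothetical model answer that follows the convention.
sources = [
    ("Policy manual", "Refunds are issued within 14 days."),
    ("Support log", "Most refund delays are caused by missing receipts."),
]
context = format_sources(sources)
answer = "Refunds take up to 14 days [1]. A missing receipt can slow things down [2][7]."
print(cited_sources(answer, sources))
```

Note that the marker `[7]` points at no real source and is dropped. A check like this lets an application flag citations the model made up instead of showing them to the user.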

Another great advantage of RAG is that it is relatively easy to implement. A blog by Lewis and three of the paper's coauthors said developers can implement the process with as few as five lines of code.

That makes the method faster and less expensive than retraining a model with additional datasets. And it lets users hot-swap new sources on the fly.
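The hot-swapping point can be illustrated with a small sketch in which the retriever is just a replaceable function. This is an assumed design, not the code from the blog mentioned above: the `Assistant` class, the `echo_model` placeholder and the two sample retrievers are hypothetical names chosen for the example.

```python
class Assistant:
    """Pairs a fixed generator with a retriever that can be replaced at runtime."""

    def __init__(self, generate, retriever):
        self.generate = generate      # stand-in for a call to an LLM
        self.retriever = retriever    # callable: query -> list of passages

    def swap_source(self, retriever):
        """Point the assistant at a new source; the model is left untouched."""
        self.retriever = retriever

    def ask(self, query):
        passages = self.retriever(query)
        prompt = "Context:\n" + "\n".join(passages) + f"\n\nQuestion: {query}"
        return self.generate(prompt)


def echo_model(prompt):
    """Placeholder generator that returns the first context line it was given."""
    return prompt.splitlines()[1]


def medical_index(query):
    """Hypothetical retriever over a medical source."""
    return ["Sample medical passage."]


def market_data(query):
    """Hypothetical retriever over a financial source."""
    return ["Sample market passage."]


assistant = Assistant(echo_model, medical_index)
first = assistant.ask("What does the source say?")
assistant.swap_source(market_data)
second = assistant.ask("What does the source say?")
```

The same question yields answers grounded in different sources, and no retraining step sits between the two calls.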

How People Are Using Retrieval-Augmented Generation

With retrieval-augmented generation, users can essentially have conversations with data repositories, opening up new kinds of experiences. This means the applications for RAG could be multiple times the number of available datasets.

For example, a generative AI model supplemented with a medical index could be a great assistant for a doctor or nurse. Financial analysts would benefit from an assistant linked to market data.

In fact, almost any business can turn its technical or policy manuals, videos or logs into resources called knowledge bases that can enhance LLMs. These sources can enable use cases such as customer or field support, employee training and developer productivity.
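A common first step in building such a knowledge base is splitting long documents into short, overlapping passages that can be retrieved individually. The sketch below shows one simple way to do that by word count; the function name, the default sizes and the sample text are assumptions for illustration, and real pipelines often split on sentences or document structure instead.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into passages of `size` words; neighbors share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical 12-word "manual" so the windows are easy to inspect.
manual = " ".join(f"word{i}" for i in range(12))
passages = chunk_words(manual, size=5, overlap=2)
```

The overlap means a sentence that straddles a boundary still appears whole in at least one passage, at the cost of storing some words twice.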

The broad potential is why companies including AWS, IBM, Glean, Google, Microsoft, NVIDIA, Oracle and Pinecone are adopting RAG.

Getting Started With Retrieval-Augmented Generation

To help users get started, NVIDIA developed a reference architecture for retrieval-augmented generation. It includes a sample chatbot and the elements users need to create their own applications with this new method.

The workflow uses NVIDIA NeMo, a framework for developing and customizing generative AI models, as well as software like NVIDIA Triton Inference Server and NVIDIA TensorRT-LLM for running generative AI models in production.

The software components are all part of NVIDIA AI Enterprise, a software platform that accelerates development and deployment of production-ready AI with the security, support and stability businesses need.

Getting the best performance for RAG workflows requires massive amounts of memory and compute to move and process data. The NVIDIA GH200 Grace Hopper Superchip, with its 288GB of fast HBM3e memory and 8 petaflops of compute, is ideal - it can deliver a 150x speedup over using a CPU.

Once companies get familiar with RAG, they can combine a variety of off-the-shelf or custom LLMs with internal or external knowledge bases to create a wide range of assistants that help their employees and customers.

RAG doesn't require a data center. LLMs are debuting on Windows PCs, thanks to NVIDIA software that enables all sorts of applications users can access even on their laptops.

[Image: An example application for RAG on a PC.]

PCs equipped with NVIDIA RTX GPUs can now run some AI models locally. By using RAG on a PC, users can link to a private knowledge source - whether that be emails, notes or articles - to improve responses. The user can then feel confident that their data source, prompts and response all remain private and secure.

A recent blog provides an example of RAG acc
LINK: https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/...