What Is Retrieval-Augmented Generation?

15/11/2023

To understand the latest advance in generative AI, imagine a courtroom.

Judges hear and decide cases based on their general understanding of the law. Sometimes a case - like a malpractice suit or a labor dispute - requires special expertise, so judges send court clerks to a law library, looking for precedents and specific cases they can cite.

Like a good judge, large language models (LLMs) can respond to a wide variety of human queries. But to deliver authoritative answers that cite sources, the model needs an assistant to do some research.

The court clerk of AI is a process called retrieval-augmented generation, or RAG for short.

The Story of the Name

Patrick Lewis, lead author of the 2020 paper that coined the term, apologized for the unflattering acronym that now describes a growing family of methods across hundreds of papers and dozens of commercial services he believes represent the future of generative AI.

"We definitely would have put more thought into the name had we known our work would become so widespread," Lewis said in an interview from Singapore, where he was sharing his ideas with a regional conference of database developers.

"We always planned to have a nicer-sounding name, but when it came time to write the paper, no one had a better idea," said Lewis, who now leads a RAG team at AI startup Cohere.

So, What Is Retrieval-Augmented Generation?

Retrieval-augmented generation is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources.

In other words, it fills a gap in how LLMs work. Under the hood, LLMs are neural networks, typically measured by how many parameters they contain. An LLM's parameters essentially represent the general patterns of how humans use words to form sentences.

That deep understanding, sometimes called parameterized knowledge, makes LLMs useful in responding to general prompts at light speed. However, it does not serve users who want a deeper dive into a current or more specific topic.
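
The idea can be sketched in a few lines of Python. This is only an illustration, not the method from the paper: it assumes a toy word-overlap retriever in place of a real embedding search, and the function names (`retrieve`, `build_prompt`) and the sample passages are invented for the example. The output is the augmented prompt that would be handed to an LLM.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def retrieve(query, documents, k=2):
    """Return the k documents sharing the most word occurrences with the query."""
    query_terms = Counter(tokenize(query))

    def overlap(doc):
        # Counter intersection keeps the minimum count of each shared word.
        return sum((query_terms & Counter(tokenize(doc))).values())

    return sorted(documents, key=overlap, reverse=True)[:k]

def build_prompt(query, passages):
    """Place the retrieved passages ahead of the user's question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Use only these facts:\n{context}\n\nQuestion: {query}"

# Hypothetical knowledge base, invented for illustration.
DOCS = [
    "The GH200 Grace Hopper Superchip has 288GB of HBM3e memory.",
    "Triton Inference Server runs AI models in production.",
    "Court clerks look up precedents in a law library.",
]

query = "How much memory does the GH200 have?"
prompt = build_prompt(query, retrieve(query, DOCS, k=1))
print(prompt)
```

A production system would typically replace the word-overlap scoring with a vector search, but the shape stays the same: fetch relevant text first, then let the model answer with that text in front of it.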

Combining Internal, External Resources

Lewis and colleagues developed retrieval-augmented generation to link generative AI services to external resources, especially ones rich in the latest technical details.

The paper, with coauthors from the former Facebook AI Research (now Meta AI), University College London and New York University, called RAG "a general-purpose fine-tuning recipe" because it can be used by nearly any LLM to connect with practically any external resource.

Building User Trust

Retrieval-augmented generation gives models sources they can cite, like footnotes in a research paper, so users can check any claims. That builds trust.

What's more, the technique can help models clear up ambiguity in a user query. It also reduces the possibility a model will make a wrong guess, a phenomenon sometimes called hallucination.
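
The footnote analogy can be made concrete with a small sketch. It assumes each generated sentence arrives already paired with the source it was drawn from; how a real system tracks that pairing is not described here, and the function name `answer_with_citations` and the sample claims are hypothetical. The code shows only the bookkeeping that turns those pairs into numbered citations.

```python
def answer_with_citations(claims):
    """Format (sentence, source) pairs as text with numbered footnotes."""
    sources = []  # unique sources, in the order they are first cited
    body = []
    for sentence, source in claims:
        if source not in sources:
            sources.append(source)
        # Reuse the same footnote number when a source is cited again.
        body.append(f"{sentence} [{sources.index(source) + 1}]")
    footnotes = [f"[{i}] {s}" for i, s in enumerate(sources, start=1)]
    return " ".join(body) + "\n\n" + "\n".join(footnotes)

# Hypothetical claims and source labels, for illustration only.
claims = [
    ("RAG was named in a 2020 paper.", "Lewis et al. 2020 paper"),
    ("NVIDIA offers a reference architecture for RAG.", "NVIDIA blog post"),
    ("The paper calls RAG a general-purpose fine-tuning recipe.", "Lewis et al. 2020 paper"),
]

result = answer_with_citations(claims)
print(result)
```

Because every sentence carries a footnote number, a reader can trace each claim back to the passage it came from.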

Another great advantage of RAG is it's relatively easy. A blog by Lewis and three of the paper's coauthors said developers can implement the process with as few as five lines of code.

That makes the method faster and less expensive than retraining a model with additional datasets. And it lets users hot-swap new sources on the fly.
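
A minimal sketch of what hot-swapping might look like, assuming the knowledge base is just a list of passages held next to a fixed generator. The `Assistant` class, the `echo_generator` stand-in for an LLM, and the refund-policy passages are all hypothetical. The point is that replacing the sources changes the answers without touching the model.

```python
class Assistant:
    """Pairs a fixed generator with a replaceable knowledge source."""

    def __init__(self, generate, knowledge_base):
        self.generate = generate
        self.knowledge_base = knowledge_base

    def swap_sources(self, new_knowledge_base):
        """Replace the knowledge base; the generator is left untouched."""
        self.knowledge_base = new_knowledge_base

    def ask(self, question):
        # Toy retrieval: keep passages that share at least one word with the question.
        words = set(question.lower().split())
        hits = [p for p in self.knowledge_base if words & set(p.lower().split())]
        return self.generate(question, hits)

def echo_generator(question, passages):
    """Stand-in for an LLM: returns the first supporting passage, if any."""
    return passages[0] if passages else "No supporting passage found."

# Hypothetical policy text, invented for illustration.
assistant = Assistant(echo_generator, ["Refunds are processed within 30 days."])
before = assistant.ask("How long do refunds take?")

assistant.swap_sources(["Refunds are processed within 14 days."])
after = assistant.ask("How long do refunds take?")

print(before)
print(after)
```

No retraining step appears anywhere in the sketch, which is the cost advantage the paragraph above describes.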

How People Are Using Retrieval-Augmented Generation

With retrieval-augmented generation, users can essentially have conversations with data repositories, opening up new kinds of experiences. This means the applications for RAG could be multiple times the number of available datasets.

For example, a generative AI model supplemented with a medical index could be a great assistant for a doctor or nurse. Financial analysts would benefit from an assistant linked to market data.

In fact, almost any business can turn its technical or policy manuals, videos or logs into resources called knowledge bases that can enhance LLMs. These sources can enable use cases such as customer or field support, employee training and developer productivity.

The broad potential is why companies including AWS, IBM, Glean, Google, Microsoft, NVIDIA, Oracle and Pinecone are adopting RAG.

Getting Started With Retrieval-Augmented Generation

To help users get started, NVIDIA developed a reference architecture for retrieval-augmented generation. It includes a sample chatbot and the elements users need to create their own applications with this new method.

The workflow uses NVIDIA NeMo, a framework for developing and customizing generative AI models, as well as software like NVIDIA Triton Inference Server and NVIDIA TensorRT-LLM for running generative AI models in production.

The software components are all part of NVIDIA AI Enterprise, a software platform that accelerates development and deployment of production-ready AI with the security, support and stability businesses need.

Getting the best performance for RAG workflows requires massive amounts of memory and compute to move and process data. The NVIDIA GH200 Grace Hopper Superchip, with its 288GB of fast HBM3e memory and 8 petaflops of compute, is ideal - it can deliver a 150x speedup over using a CPU.

Once companies get familiar with RAG, they can combine a variety of off-the-shelf or custom LLMs with internal or external knowledge bases to create a wide range of assistants that help their employees and customers.

RAG doesn't require a data center. LLMs are debuting on Windows PCs, thanks to NVIDIA software that enables all sorts of applications users can access even on their laptops.

An example application for RAG on a PC.

PCs equipped with NVIDIA RTX GPUs can now run some AI models locally. By using RAG on a PC, users can link to a private knowledge source, whether that be emails, notes or articles, to improve responses. The user can then feel confident that their data source, prompts and response all remain private and secure.

A recent blog provides an example of RAG acc
LINK: https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/...