Seamless in Seattle: NVIDIA Research Showcases Advancements in Visual Generative AI at CVPR

17/06/2024

NVIDIA researchers are at the forefront of the rapidly advancing field of visual generative AI, developing new techniques to create and interpret images, videos and 3D environments.

More than 50 of these projects will be showcased at the Computer Vision and Pattern Recognition (CVPR) conference, taking place June 17-21 in Seattle. Two of the papers - one on the training dynamics of diffusion models and another on high-definition maps for autonomous vehicles - are finalists for CVPR's Best Paper Awards.

NVIDIA is also the winner of the CVPR Autonomous Grand Challenge's End-to-End Driving at Scale track - a significant milestone that demonstrates the company's use of generative AI for comprehensive self-driving models. The winning submission, which outperformed more than 450 entries worldwide, also received CVPR's Innovation Award.

NVIDIA's research at CVPR includes a text-to-image model that can be easily customized to depict a specific object or character, a new model for object pose estimation, a technique to edit neural radiance fields (NeRFs) and a visual language model that can understand memes. Additional papers introduce domain-specific innovations for industries including automotive, healthcare and robotics.

Collectively, the work introduces powerful AI models that could enable creators to more quickly bring their artistic visions to life, accelerate the training of autonomous robots for manufacturing, and support healthcare professionals by helping process radiology reports.

"Artificial intelligence, and generative AI in particular, represents a pivotal technological advancement," said Jan Kautz, vice president of learning and perception research at NVIDIA. "At CVPR, NVIDIA Research is sharing how we're pushing the boundaries of what's possible - from powerful image generation models that could supercharge professional creators to autonomous driving software that could help enable next-generation self-driving cars."

At CVPR, NVIDIA also announced NVIDIA Omniverse Cloud Sensor RTX, a set of microservices that enable physically accurate sensor simulation to accelerate the development of fully autonomous machines of every kind.

Forget Fine-Tuning: JeDi Simplifies Custom Image Generation

Creators harnessing diffusion models, the most popular method for generating images based on text prompts, often have a specific character or object in mind - they may, for example, be developing a storyboard around an animated mouse or brainstorming an ad campaign for a specific toy.

Prior research has enabled these creators to personalize the output of diffusion models to focus on a specific subject using fine-tuning - where a user trains the model on a custom dataset - but the process can be time-consuming and inaccessible for general users.

JeDi, a paper by researchers from Johns Hopkins University, Toyota Technological Institute at Chicago and NVIDIA, proposes a new technique that allows users to easily personalize the output of a diffusion model within a couple of seconds using reference images. The team found that the model achieves state-of-the-art quality, significantly outperforming existing fine-tuning-based and fine-tuning-free methods.

JeDi can also be combined with retrieval-augmented generation, or RAG, to generate visuals specific to a database, such as a brand's product catalog.

https://blogs.nvidia.com/wp-content/uploads/2024/06/JeDi-cow-sculpture.mp4

New Foundation Model Perfects the Pose

NVIDIA researchers at CVPR are also presenting FoundationPose, a foundation model for object pose estimation and tracking that can be instantly applied to new objects during inference, without the need for fine-tuning.

The model, which set a new record on a popular benchmark for object pose estimation, uses either a small set of reference images or a 3D representation of an object to understand its shape. It can then identify and track how that object moves and rotates in 3D across a video, even in poor lighting conditions or complex scenes with visual obstructions.
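
To make "how that object moves and rotates in 3D" concrete, an object pose is commonly written as a rotation plus a translation. The sketch below is illustrative only and is not FoundationPose's method or API; the function names `rotation_z` and `apply_pose` are hypothetical, and the rotation is restricted to the z axis to keep the example short.

```python
import math

def rotation_z(theta):
    """3x3 rotation matrix for a turn of theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_pose(R, t, p):
    """Map a point p from the object's frame into the camera frame: R @ p + t."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))

# Hypothetical pose: the object is turned 90 degrees and shifted 1 unit along x.
R = rotation_z(math.pi / 2)
t = (1.0, 0.0, 0.0)

# A corner of the object at (1, 0, 0) in its own frame, as seen from the camera.
corner_in_camera = apply_pose(R, t, (1.0, 0.0, 0.0))
```

Tracking an object across a video then amounts to estimating a new `R` and `t` for each frame.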

FoundationPose could be used in industrial applications to help autonomous robots identify and track the objects they interact with. It could also be used in augmented reality applications where an AI model is used to overlay visuals on a live scene.

NeRFDeformer Transforms 3D Scenes With a Single Snapshot

A NeRF is an AI model that can render a 3D scene based on a series of 2D images taken from different positions in the environment. In fields like robotics, NeRFs can be used to generate immersive 3D renders of complex real-world scenes, such as a cluttered room or a construction site. However, to make any changes, developers would need to manually define how the scene has transformed - or remake the NeRF entirely.

Researchers from the University of Illinois Urbana-Champaign and NVIDIA have simplified the process with NeRFDeformer. The method, being presented at CVPR, can successfully transform an existing NeRF using a single RGB-D image, which is a combination of a normal photo and a depth map that captures how far each object in a scene is from the camera.
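
As a rough illustration of what an RGB-D image holds, the sketch below stores a color and a depth value per pixel and back-projects each pixel to a 3D point using the standard pinhole camera model. It is not part of NeRFDeformer; the intrinsics `FX`, `FY`, `CX`, `CY`, the tiny 2x2 image and the function names are all assumptions made for the example.

```python
# Hypothetical pinhole-camera intrinsics, in pixels (assumed for this example).
FX, FY = 500.0, 500.0   # focal lengths
CX, CY = 1.0, 1.0       # principal point

def backproject(u, v, depth_m):
    """Turn pixel (u, v) with depth in metres into a 3D point in the camera frame."""
    x = (u - CX) * depth_m / FX
    y = (v - CY) * depth_m / FY
    return (x, y, depth_m)

# A tiny 2x2 RGB-D image: each pixel is (R, G, B, depth in metres).
rgbd = [
    [(255, 0, 0, 2.0), (0, 255, 0, 2.0)],
    [(0, 0, 255, 4.0), (255, 255, 255, 4.0)],
]

def to_point_cloud(image):
    """Pair every pixel's 3D position with its color."""
    points = []
    for v, row in enumerate(image):
        for u, (r, g, b, d) in enumerate(row):
            points.append((backproject(u, v, d), (r, g, b)))
    return points

cloud = to_point_cloud(rgbd)
```

The depth channel is what lets a single snapshot carry 3D information about the scene, which a plain photo does not.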

VILA Visual Language Model Gets the Picture

A CVPR research collaboration between NVIDIA and the Massachusetts Institute of Technology is advancing the state of the art for vision language models, which are generative AI models that can process videos, images and text.

The group developed VILA, a family of open-source visual language models that outperforms prior neural networks on key benchmarks that test how well AI models answer questions about images. VILA's unique pretraining process unlocked new model capabilities, including enhanced world knowledge, stronger in-context learning and the ability to reason across multiple images.

VILA can understand memes and reason based on multiple images or video frames. The VILA model fa
LINK: https://blogs.nvidia.com/blog/visual-generative-ai-cvpr-research/...