
Seamless in Seattle: NVIDIA Research Showcases Advancements in Visual Generative AI at CVPR

17/06/2024

NVIDIA researchers are at the forefront of the rapidly advancing field of visual generative AI, developing new techniques to create and interpret images, videos and 3D environments.

More than 50 of these projects will be showcased at the Computer Vision and Pattern Recognition (CVPR) conference, taking place June 17-21 in Seattle. Two of the papers - one on the training dynamics of diffusion models and another on high-definition maps for autonomous vehicles - are finalists for CVPR's Best Paper Awards.

NVIDIA is also the winner of the CVPR Autonomous Grand Challenge's End-to-End Driving at Scale track - a significant milestone that demonstrates the company's use of generative AI for comprehensive self-driving models. The winning submission, which outperformed more than 450 entries worldwide, also received CVPR's Innovation Award.

NVIDIA's research at CVPR includes a text-to-image model that can be easily customized to depict a specific object or character, a new model for object pose estimation, a technique to edit neural radiance fields (NeRFs) and a visual language model that can understand memes. Additional papers introduce domain-specific innovations for industries including automotive, healthcare and robotics.

Collectively, the work introduces powerful AI models that could enable creators to more quickly bring their artistic visions to life, accelerate the training of autonomous robots for manufacturing, and support healthcare professionals by helping process radiology reports.

"Artificial intelligence, and generative AI in particular, represents a pivotal technological advancement," said Jan Kautz, vice president of learning and perception research at NVIDIA. "At CVPR, NVIDIA Research is sharing how we're pushing the boundaries of what's possible - from powerful image generation models that could supercharge professional creators to autonomous driving software that could help enable next-generation self-driving cars."

At CVPR, NVIDIA also announced NVIDIA Omniverse Cloud Sensor RTX, a set of microservices that enable physically accurate sensor simulation to accelerate the development of fully autonomous machines of every kind.

Forget Fine-Tuning: JeDi Simplifies Custom Image Generation

Creators harnessing diffusion models, the most popular method for generating images based on text prompts, often have a specific character or object in mind - they may, for example, be developing a storyboard around an animated mouse or brainstorming an ad campaign for a specific toy.

Prior research has enabled these creators to personalize the output of diffusion models to focus on a specific subject using fine-tuning - where a user trains the model on a custom dataset - but the process can be time-consuming and inaccessible for general users.

JeDi, a paper by researchers from Johns Hopkins University, Toyota Technological Institute at Chicago and NVIDIA, proposes a new technique that allows users to easily personalize the output of a diffusion model within a couple of seconds using reference images. The team found that the model achieves state-of-the-art quality, significantly outperforming existing fine-tuning-based and fine-tuning-free methods.

JeDi can also be combined with retrieval-augmented generation, or RAG, to generate visuals specific to a database, such as a brand's product catalog.


New Foundation Model Perfects the Pose

NVIDIA researchers at CVPR are also presenting FoundationPose, a foundation model for object pose estimation and tracking that can be instantly applied to new objects during inference, without the need for fine-tuning.

The model, which set a new record on a popular benchmark for object pose estimation, uses either a small set of reference images or a 3D representation of an object to understand its shape. It can then identify and track how that object moves and rotates in 3D across a video, even in poor lighting conditions or complex scenes with visual obstructions.

FoundationPose could be used in industrial applications to help autonomous robots identify and track the objects they interact with. It could also be used in augmented reality applications where an AI model is used to overlay visuals on a live scene.

NeRFDeformer Transforms 3D Scenes With a Single Snapshot

A NeRF is an AI model that can render a 3D scene based on a series of 2D images taken from different positions in the environment. In fields like robotics, NeRFs can be used to generate immersive 3D renders of complex real-world scenes, such as a cluttered room or a construction site. However, to make any changes, developers would need to manually define how the scene has transformed - or remake the NeRF entirely.

Researchers from the University of Illinois Urbana-Champaign and NVIDIA have simplified the process with NeRFDeformer. The method, being presented at CVPR, can successfully transform an existing NeRF using a single RGB-D image, which is a combination of a normal photo and a depth map that captures how far each object in a scene is from the camera.

VILA Visual Language Model Gets the Picture

A CVPR research collaboration between NVIDIA and the Massachusetts Institute of Technology is advancing the state of the art for vision language models, which are generative AI models that can process videos, images and text.

The group developed VILA, a family of open-source visual language models that outperforms prior neural networks on key benchmarks that test how well AI models answer questions about images. VILA's unique pretraining process unlocked new model capabilities, including enhanced world knowledge, stronger in-context learning and the ability to reason across multiple images.

VILA can understand memes and reason based on multiple images or video frames. The VILA model fa
LINK: https://blogs.nvidia.com/blog/visual-generative-ai-cvpr-research/...