Seamless in Seattle: NVIDIA Research Showcases Advancements in Visual Generative AI at CVPR

17/06/2024

NVIDIA researchers are at the forefront of the rapidly advancing field of visual generative AI, developing new techniques to create and interpret images, videos and 3D environments.

More than 50 of these projects will be showcased at the Computer Vision and Pattern Recognition (CVPR) conference, taking place June 17-21 in Seattle. Two of the papers - one on the training dynamics of diffusion models and another on high-definition maps for autonomous vehicles - are finalists for CVPR's Best Paper Awards.

NVIDIA is also the winner of the CVPR Autonomous Grand Challenge's End-to-End Driving at Scale track - a significant milestone that demonstrates the company's use of generative AI for comprehensive self-driving models. The winning submission, which outperformed more than 450 entries worldwide, also received CVPR's Innovation Award.

NVIDIA's research at CVPR includes a text-to-image model that can be easily customized to depict a specific object or character, a new model for object pose estimation, a technique to edit neural radiance fields (NeRFs) and a visual language model that can understand memes. Additional papers introduce domain-specific innovations for industries including automotive, healthcare and robotics.

Collectively, the work introduces powerful AI models that could enable creators to more quickly bring their artistic visions to life, accelerate the training of autonomous robots for manufacturing, and support healthcare professionals by helping process radiology reports.

"Artificial intelligence, and generative AI in particular, represents a pivotal technological advancement," said Jan Kautz, vice president of learning and perception research at NVIDIA. "At CVPR, NVIDIA Research is sharing how we're pushing the boundaries of what's possible - from powerful image generation models that could supercharge professional creators to autonomous driving software that could help enable next-generation self-driving cars."

At CVPR, NVIDIA also announced NVIDIA Omniverse Cloud Sensor RTX, a set of microservices that enable physically accurate sensor simulation to accelerate the development of fully autonomous machines of every kind.

Forget Fine-Tuning: JeDi Simplifies Custom Image Generation

Creators harnessing diffusion models, the most popular method for generating images based on text prompts, often have a specific character or object in mind - they may, for example, be developing a storyboard around an animated mouse or brainstorming an ad campaign for a specific toy.

Prior research has enabled these creators to personalize the output of diffusion models to focus on a specific subject using fine-tuning - where a user trains the model on a custom dataset - but the process can be time-consuming and inaccessible for general users.

JeDi, a paper by researchers from Johns Hopkins University, Toyota Technological Institute at Chicago and NVIDIA, proposes a new technique that allows users to easily personalize the output of a diffusion model within a couple of seconds using reference images. The team found that the model achieves state-of-the-art quality, significantly outperforming existing fine-tuning-based and fine-tuning-free methods.

JeDi can also be combined with retrieval-augmented generation, or RAG, to generate visuals specific to a database, such as a brand's product catalog.

https://blogs.nvidia.com/wp-content/uploads/2024/06/JeDi-cow-sculpture.mp4

New Foundation Model Perfects the Pose

NVIDIA researchers at CVPR are also presenting FoundationPose, a foundation model for object pose estimation and tracking that can be instantly applied to new objects during inference, without the need for fine-tuning.

The model, which set a new record on a popular benchmark for object pose estimation, uses either a small set of reference images or a 3D representation of an object to understand its shape. It can then identify and track how that object moves and rotates in 3D across a video, even in poor lighting conditions or complex scenes with visual obstructions.

FoundationPose could be used in industrial applications to help autonomous robots identify and track the objects they interact with. It could also be used in augmented reality applications where an AI model is used to overlay visuals on a live scene.
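To make "how that object moves and rotates in 3D" concrete, an object's pose is commonly described as a rotation plus a translation applied to points on the object. The sketch below is only an illustration of that idea, not FoundationPose's method or API: the function name `apply_pose` is hypothetical, and the rotation is simplified to a single angle about the vertical axis, whereas a full pose allows rotation about all three axes.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def apply_pose(point: Vec3, yaw_rad: float, translation: Vec3) -> Vec3:
    """Rotate a point about the z-axis by yaw_rad, then translate it.

    Simplifying assumption: rotation is restricted to one axis (yaw).
    """
    x, y, z = point
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    # Standard 2D rotation in the x-y plane; z is unchanged by a yaw rotation.
    xr, yr = c * x - s * y, s * x + c * y
    tx, ty, tz = translation
    return (xr + tx, yr + ty, z + tz)


# A point on the object, before the object turns a quarter turn and rises 2 units.
corner = (1.0, 0.0, 0.0)
moved = apply_pose(corner, math.pi / 2, (0.0, 0.0, 2.0))
```

Tracking an object across a video then amounts to estimating a rotation and translation of this kind for every frame.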

NeRFDeformer Transforms 3D Scenes With a Single Snapshot

A NeRF is an AI model that can render a 3D scene based on a series of 2D images taken from different positions in the environment. In fields like robotics, NeRFs can be used to generate immersive 3D renders of complex real-world scenes, such as a cluttered room or a construction site. However, to make any changes, developers would need to manually define how the scene has transformed - or remake the NeRF entirely.

Researchers from the University of Illinois Urbana-Champaign and NVIDIA have simplified the process with NeRFDeformer. The method, being presented at CVPR, can successfully transform an existing NeRF using a single RGB-D image, which is a combination of a normal photo and a depth map that captures how far each object in a scene is from the camera.
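As a rough illustration of what an RGB-D image contains, the sketch below pairs each color pixel with a depth value. It is a minimal example under stated assumptions, not code from NeRFDeformer: the function name `make_rgbd`, the nested-list image layout and the use of metres for depth are all hypothetical choices made for clarity.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
RGBD = Tuple[int, int, int, float]


def make_rgbd(color: List[List[RGB]], depth_m: List[List[float]]) -> List[List[RGBD]]:
    """Attach a per-pixel depth (assumed to be metres from the camera) to each color pixel."""
    same_rows = len(color) == len(depth_m)
    if not same_rows or any(len(c) != len(d) for c, d in zip(color, depth_m)):
        raise ValueError("color image and depth map must have the same size")
    return [
        [(r, g, b, z) for (r, g, b), z in zip(color_row, depth_row)]
        for color_row, depth_row in zip(color, depth_m)
    ]


# A 1x2 image: a red pixel 0.5 m away and a blue pixel 2.0 m away.
color = [[(255, 0, 0), (0, 0, 255)]]
depth_m = [[0.5, 2.0]]
rgbd = make_rgbd(color, depth_m)
```

Each resulting pixel carries four values: red, green, blue and distance from the camera.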

VILA Visual Language Model Gets the Picture

A CVPR research collaboration between NVIDIA and the Massachusetts Institute of Technology is advancing the state of the art for vision language models, which are generative AI models that can process videos, images and text.

The group developed VILA, a family of open-source visual language models that outperforms prior neural networks on key benchmarks that test how well AI models answer questions about images. VILA's unique pretraining process unlocked new model capabilities, including enhanced world knowledge, stronger in-context learning and the ability to reason across multiple images.

VILA can understand memes and reason based on multiple images or video frames. The VILA model fa
LINK: https://blogs.nvidia.com/blog/visual-generative-ai-cvpr-research/...