Seamless in Seattle: NVIDIA Research Showcases Advancements in Visual Generative AI at CVPR

17/06/2024

NVIDIA researchers are at the forefront of the rapidly advancing field of visual generative AI, developing new techniques to create and interpret images, videos and 3D environments.

More than 50 of these projects will be showcased at the Computer Vision and Pattern Recognition (CVPR) conference, taking place June 17-21 in Seattle. Two of the papers - one on the training dynamics of diffusion models and another on high-definition maps for autonomous vehicles - are finalists for CVPR's Best Paper Awards.

NVIDIA is also the winner of the CVPR Autonomous Grand Challenge's End-to-End Driving at Scale track - a significant milestone that demonstrates the company's use of generative AI for comprehensive self-driving models. The winning submission, which outperformed more than 450 entries worldwide, also received CVPR's Innovation Award.

NVIDIA's research at CVPR includes a text-to-image model that can be easily customized to depict a specific object or character, a new model for object pose estimation, a technique to edit neural radiance fields (NeRFs) and a visual language model that can understand memes. Additional papers introduce domain-specific innovations for industries including automotive, healthcare and robotics.

Collectively, the work introduces powerful AI models that could enable creators to more quickly bring their artistic visions to life, accelerate the training of autonomous robots for manufacturing, and support healthcare professionals by helping process radiology reports.

"Artificial intelligence, and generative AI in particular, represents a pivotal technological advancement," said Jan Kautz, vice president of learning and perception research at NVIDIA. "At CVPR, NVIDIA Research is sharing how we're pushing the boundaries of what's possible - from powerful image generation models that could supercharge professional creators to autonomous driving software that could help enable next-generation self-driving cars."

At CVPR, NVIDIA also announced NVIDIA Omniverse Cloud Sensor RTX, a set of microservices that enable physically accurate sensor simulation to accelerate the development of fully autonomous machines of every kind.

Forget Fine-Tuning: JeDi Simplifies Custom Image Generation

Creators harnessing diffusion models, the most popular method for generating images based on text prompts, often have a specific character or object in mind - they may, for example, be developing a storyboard around an animated mouse or brainstorming an ad campaign for a specific toy.

Prior research has enabled these creators to personalize the output of diffusion models to focus on a specific subject using fine-tuning - where a user trains the model on a custom dataset - but the process can be time-consuming and inaccessible for general users.

JeDi, a paper by researchers from Johns Hopkins University, Toyota Technological Institute at Chicago and NVIDIA, proposes a new technique that allows users to easily personalize the output of a diffusion model within a couple of seconds using reference images. The team found that the model achieves state-of-the-art quality, significantly outperforming existing fine-tuning-based and fine-tuning-free methods.

JeDi can also be combined with retrieval-augmented generation, or RAG, to generate visuals specific to a database, such as a brand's product catalog.

New Foundation Model Perfects the Pose

NVIDIA researchers at CVPR are also presenting FoundationPose, a foundation model for object pose estimation and tracking that can be instantly applied to new objects during inference, without the need for fine-tuning.

The model, which set a new record on a popular benchmark for object pose estimation, uses either a small set of reference images or a 3D representation of an object to understand its shape. It can then identify and track how that object moves and rotates in 3D across a video, even in poor lighting conditions or complex scenes with visual obstructions.

FoundationPose could be used in industrial applications to help autonomous robots identify and track the objects they interact with. It could also be used in augmented reality applications where an AI model is used to overlay visuals on a live scene.

NeRFDeformer Transforms 3D Scenes With a Single Snapshot

A NeRF is an AI model that can render a 3D scene based on a series of 2D images taken from different positions in the environment. In fields like robotics, NeRFs can be used to generate immersive 3D renders of complex real-world scenes, such as a cluttered room or a construction site. However, to make any changes, developers would need to manually define how the scene has transformed - or remake the NeRF entirely.

Researchers from the University of Illinois Urbana-Champaign and NVIDIA have simplified the process with NeRFDeformer. The method, being presented at CVPR, can successfully transform an existing NeRF using a single RGB-D image, which is a combination of a normal photo and a depth map that captures how far each object in a scene is from the camera.

VILA Visual Language Model Gets the Picture

A CVPR research collaboration between NVIDIA and the Massachusetts Institute of Technology is advancing the state of the art for vision language models, which are generative AI models that can process videos, images and text.

The group developed VILA, a family of open-source visual language models that outperforms prior neural networks on key benchmarks that test how well AI models answer questions about images. VILA's unique pretraining process unlocked new model capabilities, including enhanced world knowledge, stronger in-context learning and the ability to reason across multiple images.

VILA can understand memes and reason based on multiple images or video frames. The VILA model fa
LINK: https://blogs.nvidia.com/blog/visual-generative-ai-cvpr-research/...
North America Stories

19/07/2024

L3Harris launches PilotApp - redefining flight data intelligence for aviation safety and efficiency

L3Harris Commercial Aviation has launched PilotApp, designed to empower pilots w...

19/07/2024

L3Harris Secures Avionics Contract with Air India for Next-Generation Voice and Data Recorders

L3Harris Technologies announce a landmark agreement with Air India to become lea...

19/07/2024

Comcast to Offer 'Enhanced 4K' Coverage of Paris Olympics on USA Network

PHILADELPHIA - Comcast has unveiled new details of its plans for offering enhanced 4K from Xfinity and said that its first enhanced 4K feeds will be available as...

19/07/2024

Sky News UK Among Global Broadcasters Hit by IT Outage

Sky News UK and Sky Sports News have been taken off air by what's being described as 'the biggest IT outage of all time'....

19/07/2024

FCC Adopts R&O To Make Closed Captioning Settings Easy To Access

WASHINGTON, D.C. - Watching television for those with hearing impairments will become a bit easier following adoption July 18 of a Federal Communications Commissi...

19/07/2024

Perifery Launches AI+ 2.0 to Revitalize and Monetize Media Content

Advanced AI Software Suite Enables Users to L...

19/07/2024

Comcast NBCU Will Provide Free Olympics Stream for Military Community

Comcast NBCUniversal said it is working with the Army & Air Force Exchange Service to provide military community members with free streaming of NBCU's cover...

19/07/2024

Tom Fenton, 'Dean of American Foreign Correspondents,' Has Died

Tom Fenton, former CBS News correspondent, died July 16 in Novato, California. He was 94....

19/07/2024

Local News Close-Up: Capital Gains in Albany, New York

The lawmakers who make up the state legislature in Albany, New York, headed home at the end of June, but there's still plenty going on in the Capital Region...

19/07/2024

'Access Hollywood' Heads to Paris

Access Hollywood and Access Daily with Mario & Kit will start previewing the 2024 Paris Summer Olympics starting Monday, July 22 with a week of special programm...

19/07/2024

NBCU Transforms Rockefeller Center Into Paris-Themed Olympic Hub

NBCUniversal is turning Rockefeller Center, its base in midtown Manhattan, into what it calls a hub for Team USA fans during the Olympics. That includes Parisia...

19/07/2024

Anchor Tom Garris Jumps From WTAE Pittsburgh to WMUR Manchester (NH)

Tom Garris, weekend morning anchor and weekday reporter at WTAE Pittsburgh, is moving to WMUR in Manchester, New Hampshire. Both are part of Hearst Television a...

19/07/2024

A Smart Z-Finder Review: Transforming the small screen of a smartphone into a powerful tool

The Zacuto Smart Z-Finder is an innovative viewfinder designed for smartphone fi...

19/07/2024

Altice USA Launches $30 a Month 'Entertainment TV'

NEW YORK - Altice USA's Optimum is launching Entertainment TV, a new internet TV package of 80 plus channels for $30 a month that is available exclusively on ...

19/07/2024

Calrec Celebrates Diamond Jubilee with New Audio Tech at IBC 2024

As it celebrates its diamond jubilee this year, Calrec has announced that it will be pushing the boundaries of audio broadcasting at IBC 2024 with a full range ...

19/07/2024

CBS Sports Inks Multi-Platform Rights Deal with English Football League

NEW YORK - CBS Sports and the English Football League (EFL) have announced an exclusive, multi-year, multi-platform rights agreement that will see CBS Sports offe...

19/07/2024

Netflix Subs Hit 277.6M as Revenue and Profits Spike

LOS GATOS, Calif. - Netflix posted very strong Q2 2024 financials, with global Netflix subs growing 16.5% to 277.65 million, revenue up 17% and operating income s...

19/07/2024

ALIBI Music Sets the Tone for True Crime with New Underscores and Drones

These seven production music albums rat...

19/07/2024

Christmas is Coming to Ting Park and Guy Stadium This Weekend

Only a Few More Chances to see the Salamanders and Yard Gnomes This Year The Holly Springs Salamanders have another big weekend coming up at Ting Park! Join t...

19/07/2024

Paramount Advertising Launches Self-Service Platform for Smaller Businesses

Paramount Advertising said it launched its self-serve ad buying platform, designed to attract more ad dollars from small and mid-sized businesses and other mark...

19/07/2024

Brian Lesser Returns To GroupM as Global CEO

Brian Lesser was named Global CEO of GroupM, the big media buying company that is part of WPP....

19/07/2024

KBLR Las Vegas Names Katia Gutiérrez News Anchor and Multimedia Journalist

Katia Gutiérrez has been promoted to news anchor and multimedia journalist at KBLR Las Vegas, focused on Noticiero Telemundo at the station. She starts in the n...

19/07/2024

Season 2 of 'Frasier' Sees Crane Back at Seattle Radio Station

Comedy Frasier returns for Season 2 on Thursday, September 19 on Paramount Plus. Two episodes are out that day, before they drop weekly on Thursdays....

19/07/2024

Total TV Ad Impressions Down 3.73% in First Half: iSpot Report

Total TV ad impressions on streaming and linear TV dipped 3.73% to 4.23 trillion in the first half of 2024, according to a new report from iSpot.tv....

19/07/2024

Insurer Progressive Had Most National and Local Ad Impressions, AdImpact Reports

Insurance company Progressive led all advertisers in national and local impressions in June and July, according to AdImpact's new TV intelligence platform, ...

19/07/2024

Francis Ford Coppola, Grateful Dead, Bonnie Raitt Get Kennedy Center Honors

Filmmaker Francis Ford Coppola, jam band the Grateful Dead, singer-songwriter Bonnie Raitt, jazz performer Arturo Sandoval and New York theater The Apollo will ...

19/07/2024

Google Signs Up With NBCU as Search Sponsor for U.S. Olympic Team

NBCUniversal said that Google has signed up as the official search AI partner for the U.S. Olympic Team and will be a part of NBCU's coverage of the Paris O...

19/07/2024

FOR-A Showcases Software-Defined IP Solutions and Hybrid Production at IBC2024

Company to Demonstrate Flexible Workflows for Remote Production, ST 2110 Transition, and XR Applications...

19/07/2024

Ross Video Brings Global Innovation and Local Expertise to SET Expo 2024

Explore cutting-edge video production solutions with Ross Video and Alliance Technologies at Brazil's largest broadcast tradeshow. Ottawa, Canada - July 1...

19/07/2024

A Dive Into Inside-Out Mixing

By Craig Anderton There's no right or wrong way to mix. For example, many successful engineers adjust individual tracks, and then mix groups of tracks....

19/07/2024

Magnetic Marvels: NVIDIA’s Supercomputers Spin a Quantum Tale

Research published earlier this month in the science journal Nature used NVIDIA-powered supercomputers to validate a pathway toward the commercialization of qua...

18/07/2024

2024 Trans Possibilities Intensive Fellows Announced

Today the nonprofit Sundance Institute announced the artists selected to participate in the Sundance Institute Trans Possibilities Intensive, a three-part event...

18/07/2024

L3Harris Delivers Open-Systems Expertise for the Team Lynx Next-Gen Combat Vehicle

We're outfitting Team Lynx's XM30 with digitally engineered mission syst...

18/07/2024

L3Harris and Epirus Working Collaboratively to Maximize Tactical Radio Efficiency

MELBOURNE, Fla., July 18, 2024 - L3Harris Technologies (NYSE:LHX) is working col...

18/07/2024

L3Harris supports Air Astana's growing training capacity with second Reality7e A320neo Full Flight Simulator

L3Harris is pleased to announce that it has secured a contract with Air Astana J...

18/07/2024

Comcast to Offer Enhanced 4K of Paris Olympics Coverage on USA Network

PHILADELPHIA - Comcast has unveiled new details of its plans for offering enhanced 4K from Xfinity and said that its first enhanced 4K feeds will be available as...

18/07/2024

ThinkAnalytics, TMT Insights To Highlight New Partnership at IBC 2024

GLASGOW, U.K. AND DALLAS - Artificial Intelligence-based content discovery, audience insight and targeted advertising expert ThinkAnalytics has formed a strategic...

18/07/2024

A Good Movie Is a Good Movie No Matter Where It's Seen

The life we enjoy is very much worth the sacrifice. ...

18/07/2024

Asahi Broadcasting Television Uses URSA Broadcast G2 Cameras to Capture Its Streaming Programming

Asahi Broadcasting Television Uses URSA Broadcast G2 Cameras to Capture Its Stre...

18/07/2024

Faculty Notes: Spring/Summer 2024

Recent accomplishments, releases, and events by Berklee faculty. July 8, 2024 Boston Conservatory at Berklee Assistant Pro...

18/07/2024

Jim McNeely and Miguel Zenón Named the 2024-2025 Ken Pullig Visiting Scholars in Jazz Studies

Jim McNeely and Miguel Zenón Named the 2024-2025 Ken Pullig Visiting Scholars in...

18/07/2024

Jim Lucchese Named Berklee's Fifth President

A 20-year music industry veteran and a committed artist advocate with deep Berklee roots, Lucchese sets the t...

18/07/2024

farmerswife and Cirkus to Showcase Advancements to Manage...

In the lead up to IBC2024, farmerswife, a leading provider of tailored project management solutions for the media sector, has today announced what to expect fro...

18/07/2024

Calrec IBC Preview 2024

Featured products at IBC 2024 Celebrating its diamond jubilee this year, Calrec has been putting sound in the picture for six decades and counting, and is stil...

18/07/2024

ioMoVo Unveils Advanced Digital Asset Management System f...

ioMoVo Corp, an AI-driven Digital Asset Management (DAM) developer, will unveil its innovative suite of asset management solutions tailored for the Media & Ente...

18/07/2024

LiveU Celebrates 18th Birthday at IBC2024 with Dedicated...

In celebration of working with remarkable customers for the last 18 years, LiveU is set to unveil its radically efficient and nimble cloud live production solut...

18/07/2024

ZTransform Recruits Broadcast Industry Veteran to Bolster...

Former TEGNA Executive Director of Technology/Broadcast, Reed Wilson brings valuable end-user perspective to Seattle-based systems and solutions provider ZTra...

18/07/2024

SipRadius Redefines Intercom with SipVault for Secure Rem...

Exhibiting alongside MistServer (stand 5.F43, RAI Amsterdam, 13-16 September), SipRadius will be debuting SipVault, a brand-new and innovative approach to int...

18/07/2024

Lightware Announces Advanced Voice Tracking Solution for...

Lightware Visual Engineering, a leading manufacturer of connectivity solutions for the professional integrated systems market and a pioneer in signal management...