Next-Gen Neural Networks: NVIDIA Research Announces Array of AI Advancements at NeurIPS

25/10/2023

NVIDIA researchers are collaborating with academic centers worldwide to advance generative AI, robotics and the natural sciences - and more than a dozen of these projects will be shared at NeurIPS, one of the world's top AI conferences.

Set for Dec. 10-16 in New Orleans, NeurIPS brings together experts in generative AI, machine learning, computer vision and more. Among the innovations NVIDIA Research will present are new techniques for transforming text to images, photos to 3D avatars, and specialized robots into multi-talented machines.

"NVIDIA Research continues to drive progress across the field - including generative AI models that transform text to images or speech, autonomous AI agents that learn new tasks faster, and neural networks that calculate complex physics," said Jan Kautz, vice president of learning and perception research at NVIDIA. "These projects, often done in collaboration with leading minds in academia, will help accelerate developers of virtual worlds, simulations and autonomous machines."

Picture This: Improving Text-to-Image Diffusion Models

Diffusion models have become the most popular type of generative AI models to turn text into realistic imagery. NVIDIA researchers have collaborated with universities on multiple projects advancing diffusion models that will be presented at NeurIPS.

A paper accepted as an oral presentation focuses on improving generative AI models' ability to understand the link between modifier words and main entities in text prompts. While existing text-to-image models asked to depict a yellow tomato and a red lemon may incorrectly generate images of yellow lemons and red tomatoes, the new model analyzes the syntax of a user's prompt, encouraging a bond between an entity and its modifiers to deliver a more faithful visual depiction of the prompt.

SceneScape, a new framework using diffusion models to create long videos of 3D scenes from text prompts, will be presented as a poster. The project combines a text-to-image model with a depth prediction model that helps the videos maintain plausible-looking scenes with consistency between the frames - generating videos of art museums, haunted houses and ice castles.

Another poster describes work that improves how text-to-image models generate concepts rarely seen in training data. Attempts to generate such images usually result in low-quality visuals that aren't an exact match to the user's prompt. The new method uses a small set of example images that help the model identify good seeds - random number sequences that guide the AI to generate images from the specified rare classes.

A third poster shows how a text-to-image diffusion model can use the text description of an incomplete point cloud to generate missing parts and create a complete 3D model of the object. This could help complete point cloud data collected by lidar scanners and other depth sensors for robotics and autonomous vehicle AI applications. Collected imagery is often incomplete because objects are scanned from a specific angle - for example, a lidar sensor mounted to a vehicle would only scan one side of each building as the car drives down a street.

Character Development: Advancements in AI Avatars

AI avatars combine multiple generative AI models to create and animate virtual characters, produce text and convert it to speech. Two NVIDIA posters at NeurIPS present new ways to make these tasks more efficient.

One poster describes a new method to turn a single portrait image into a 3D head avatar while capturing details including hairstyles and accessories. Unlike current methods that require multiple images and a time-consuming optimization process, this model achieves high-fidelity 3D reconstruction without additional optimization during inference. The avatars can be animated either with blendshapes, which are 3D meshes that represent different facial expressions, or with a reference video clip whose facial expressions and motion are applied to the avatar.

Another poster by NVIDIA researchers and university collaborators advances zero-shot text-to-speech synthesis with P-Flow, a generative AI model that can rapidly synthesize high-quality personalized speech given a three-second reference prompt. P-Flow features better pronunciation, human likeness and speaker similarity compared to recent state-of-the-art counterparts. The model can near-instantly convert text to speech on a single NVIDIA A100 Tensor Core GPU.

Research Breakthroughs in Reinforcement Learning, Robotics

In the fields of reinforcement learning and robotics, NVIDIA researchers will present two posters highlighting innovations that improve the generalizability of AI across different tasks and environments.

The first proposes a framework for developing reinforcement learning algorithms that can adapt to new tasks while avoiding the common pitfalls of gradient bias and data inefficiency. The researchers showed that their method - which features a novel meta-algorithm that can create a robust version of any meta-reinforcement learning model - performed well on multiple benchmark tasks.

The second, by an NVIDIA researcher and university collaborators, tackles the challenge of object manipulation in robotics. Prior AI models that help robotic hands pick up and interact with objects can handle specific shapes but struggle with objects unseen in the training data. The researchers introduce a new framework that estimates how objects across different categories are geometrically alike - such as drawers and pot lids that have similar handles - enabling the model to more quickly generalize to new shapes.

Supercharging Science: AI-Accelerated Physics, Climate, Healthcare

NVIDIA researchers at NeurIPS will also present papers across the natural sciences - covering physics simulations, climate models and AI fo

LINK:	https://blogs.nvidia.com/blog/2023/10/25/neurips-ai-research/...