
Pinterest has engineered a way to serve its photo-sharing community more of the images they love.
The social-image service, with more than 400 million monthly active users, has trained bigger recommender models for improved accuracy at predicting people's interests.
Pinterest handles hundreds of millions of user requests an hour on any given day. For each person, it must also narrow roughly 300 billion images on the site down to roughly 50 relevant ones.
The last step - ranking the most relevant and engaging content for everyone using Pinterest - required a leap in acceleration to run heftier models, with minimal latency, for better predictions.
Pinterest has improved the accuracy of its recommender models powering people's home feeds and other areas, increasing engagement by as much as 16%.
The leap was enabled by switching from CPUs to NVIDIA GPUs, an approach that could readily be applied next to other areas, including advertising images, according to Pinterest.
"Normally we would be happy with a 2% increase, and 16% is just a beginning for home feeds. We see additional gains - it opens a lot of doors for opportunities," said Pong Eksombatchai, a software engineer at Pinterest.
Transformer models capable of better predictions are shaking up industries from retail to entertainment and advertising. But their performance gains over the past few years have come with a need to serve models that are some 100x bigger, as parameter counts and computations skyrocket.
Huge Inference Gains, Same Infrastructure Cost
Like many, Pinterest engineers wanted to tap into state-of-the-art recommender models to increase engagement. But serving these massive models on CPUs presented a 100x increase in cost and latency. That would not have preserved the magical user experience of fresh, more appealing images delivered within a fraction of a second.
"If that latency happened, then obviously our users wouldn't like that very much because they would have to wait forever," said Eksombatchai. "We are pretty close to the limit of what we can do on CPU basically."
The challenge was to serve these hundredfold larger recommender models within the same cost and latency constraints.
Working with NVIDIA, Pinterest engineers began architectural changes to optimize their inference pipeline and recommender models to enable the transition from CPU to GPU cloud instances. The technology transition began late last year and required major changes to how the company manages workloads. The result is a 100x gain in inference efficiency on the same IT budget, meeting their goals.
"We are starting to use really, really big models now. And that is where the GPU comes in - to help make these models possible," Eksombatchai said.
Tapping Into cuCollections
Switching from CPUs to GPUs required Pinterest to rethink its inference system architecture. Among other issues, engineers had to change how they send workloads to their inference servers. Fortunately, tools exist to make the transition easier.
The Pinterest inference server built for CPUs had to be altered because it was set up to send smaller batch sizes to its servers. GPUs can handle much larger workloads, so it's necessary to set up larger batch requests to increase efficiency.
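The batching idea can be illustrated with a minimal Python sketch. It is not Pinterest's server code: the function name `merge_requests`, the `max_batch` parameter and the integer candidate IDs are all hypothetical, chosen only to show how many small requests might be packed into fewer, larger batches before they are sent to a GPU.

```python
from typing import List, Sequence


def merge_requests(requests: Sequence[Sequence[int]], max_batch: int) -> List[List[int]]:
    """Pack items from many small requests into batches of at most max_batch items.

    Hypothetical helper for illustration; a real server would also track which
    request each item came from so results can be routed back.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    for request in requests:
        for item in request:
            current.append(item)
            if len(current) == max_batch:
                batches.append(current)
                current = []
    if current:  # flush the final, possibly partial, batch
        batches.append(current)
    return batches


small_requests = [[1, 2], [3], [4, 5, 6], [7, 8]]
batches = merge_requests(small_requests, max_batch=4)
print(len(small_requests), "requests ->", len(batches), "batches")
```

Here four small requests collapse into two full batches, so the accelerator is invoked half as often for the same amount of work.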
One area where this comes into play is the embedding table lookup module. Embedding tables are used to track interactions between various context-specific features and the interests in user profiles. They can track where people navigate, what they Pin or share on Pinterest, and numerous other actions, helping refine predictions of what users might like to click on next.
They are used to incrementally learn user preferences from context in order to make better content recommendations to those using Pinterest. The embedding table lookup module required two computation steps, repeated hundreds of times because of the number of features tracked.
Pinterest engineers greatly reduced this number of operations using a GPU-accelerated concurrent hash table from NVIDIA cuCollections. And they set up a custom consolidated embedding lookup module so they could merge requests into a single lookup. Better results were seen immediately.
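A rough sketch of the consolidation idea, in plain Python, may help. It does not use the cuCollections API, and an ordinary `dict` stands in for the GPU hash table; the feature names, IDs, vectors and the helper functions `consolidate` and `lookup_all` are hypothetical. The point is only that keying one table by (feature, id) lets every feature be resolved in a single lookup pass instead of one pass per feature.

```python
from typing import Dict, List, Sequence, Tuple

Embedding = List[float]
Key = Tuple[str, int]

# Hypothetical per-feature tables: feature name -> {id -> embedding vector}.
per_feature_tables: Dict[str, Dict[int, Embedding]] = {
    "board": {10: [0.1, 0.2], 11: [0.3, 0.4]},
    "topic": {10: [0.5, 0.6], 42: [0.7, 0.8]},
}


def consolidate(tables: Dict[str, Dict[int, Embedding]]) -> Dict[Key, Embedding]:
    """Merge all per-feature tables into one table keyed by (feature, id)."""
    return {
        (feature, key): vec
        for feature, table in tables.items()
        for key, vec in table.items()
    }


def lookup_all(table: Dict[Key, Embedding], queries: Sequence[Key], dim: int) -> List[Embedding]:
    """Resolve every (feature, id) query in one pass; unknown keys map to zeros."""
    zero = [0.0] * dim
    return [table.get(query, zero) for query in queries]


merged = consolidate(per_feature_tables)
queries = [("board", 10), ("topic", 10), ("topic", 99)]
vectors = lookup_all(merged, queries, dim=2)
print(vectors)
```

Including the feature name in the key keeps the same ID in different features from colliding, which is what makes a single shared table possible in this sketch.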
"Using cuCollections helped us to remove bottlenecks," said Eksombatchai.
Enlisting CUDA Graphs
Pinterest relied on CUDA Graphs to eliminate the remaining small-batch operations, further optimizing its inference models.
CUDA Graphs help reduce CPU interactions when launching work on GPUs. They're designed to enable workloads to be defined as graphs rather than single operations. They provide a mechanism to launch multiple GPU operations through a single CPU operation, reducing CPU overhead.
Pinterest enlisted CUDA Graphs to represent the model inference process as a static graph of operations rather than as individually scheduled operations. This enabled the computation to be handled as a single unit without any kernel launching overhead.
The company now supports CUDA Graphs as a new backend of its model server. When a model is first loaded, the model server runs the model inference once to build the graph instance. This graph can then be run repeatedly in inference to show content on its app or site.
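The build-once, replay-many pattern can be modeled conceptually in plain Python. This is an analogy only: it calls no CUDA API and runs on the CPU, and the `RecordedGraph` class, the `run_eagerly` helper and the three toy operations are hypothetical. It contrasts submitting each operation separately with submitting a fixed sequence as one unit.

```python
from typing import Callable, List, Sequence, Tuple

Op = Callable[[float], float]


class RecordedGraph:
    """Toy stand-in for a captured graph: a fixed op sequence replayed as one unit."""

    def __init__(self, ops: Sequence[Op]) -> None:
        self._ops: List[Op] = list(ops)  # frozen at build time
        self.submissions = 0

    def replay(self, x: float) -> float:
        self.submissions += 1  # one submission covers every op in the graph
        for op in self._ops:
            x = op(x)
        return x


def run_eagerly(ops: Sequence[Op], x: float) -> Tuple[float, int]:
    """Baseline: each op is submitted separately; returns (result, submissions)."""
    submissions = 0
    for op in ops:
        submissions += 1
        x = op(x)
    return x, submissions


# Hypothetical three-step "model": scale, shift, clamp at zero.
model_ops: List[Op] = [lambda v: v * 2.0, lambda v: v + 1.0, lambda v: max(v, 0.0)]

graph = RecordedGraph(model_ops)  # built once, as at model load
outputs = [graph.replay(x) for x in (1.0, -3.0, 0.5)]
print(outputs, graph.submissions)
```

For three inputs, the recorded version makes three submissions, while the eager baseline would make nine, one per operation per input. In a real deployment, the saving comes from the GPU runtime launching the whole captured graph at once rather than from a Python loop.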
Implementing CUDA Graphs helped Pinterest to significantly reduce inference latency of its recommender models, according to its engineers.
GPUs have enabled Pinterest to do something that was impossible with CPUs on the same budget, and by doing this they can make changes that have a direct impact on various business metrics.
Learn about Pinterest's GPU-driven inference and optimizations at its GTC session, Serving 100x Bigger Recommender Models, and in the Pinterest Engineering blog.
Register for GTC, running Sept. 19-22, for free to attend sessions with NVIDIA and dozens of industry leaders.