
Pinterest has engineered a way to serve its photo-sharing community more of the images they love.
The social-image service, with more than 400 million monthly active users, has trained bigger recommender models for improved accuracy at predicting people's interests.
Pinterest handles hundreds of millions of user requests an hour on any given day. And it must also narrow down relevant images from roughly 300 billion images on the site to roughly 50 for each person.
The last step - ranking the most relevant and engaging content for everyone using Pinterest - required a leap in acceleration to run heftier models, with minimal latency, for better predictions.
Pinterest has improved the accuracy of its recommender models powering people's home feeds and other areas, increasing engagement by as much as 16%.
The leap was enabled by switching from CPUs to NVIDIA GPUs, an approach that could readily be applied next to other areas, including advertising images, according to Pinterest.
"Normally we would be happy with a 2% increase, and 16% is just a beginning for home feeds. We see additional gains - it opens a lot of doors for opportunities," said Pong Eksombatchai, a software engineer at Pinterest.
Transformer models capable of better predictions are shaking up industries from retail to entertainment and advertising. But their performance leaps of the past few years have come with a need to serve models that are some 100x bigger, as their parameter counts and computations skyrocket.
Huge Inference Gains, Same Infrastructure Cost
Like many, Pinterest engineers wanted to tap into state-of-the-art recommender models to increase engagement. But serving these massive models on CPUs presented a 100x increase in cost and latency. That would not have preserved its magical user experience of fresh, more appealing images delivered within a fraction of a second.
"If that latency happened, then obviously our users wouldn't like that very much because they would have to wait forever," said Eksombatchai. "We are pretty close to the limit of what we can do on CPU basically."
The challenge was to serve these hundredfold larger recommender models within the same cost and latency constraints.
Working with NVIDIA, Pinterest engineers began architectural changes to optimize their inference pipeline and recommender models to enable the transition from CPU to GPU cloud instances. The technology transition began late last year and required major changes to how the company manages workloads. The result is a 100x gain in inference efficiency on the same IT budget, meeting their goals.
"We are starting to use really, really big models now. And that is where the GPU comes in - to help make these models possible," Eksombatchai said.
Tapping Into cuCollections
Switching from CPUs to GPUs required rethinking its inference systems architecture. Among other issues, engineers had to change how they send workloads to their inference servers. Fortunately, there are tools to assist in making the transition easier.
The Pinterest inference server built for CPUs had to be altered because it was set up to send smaller batch sizes to its servers. GPUs can handle much larger workloads, so it's necessary to set up larger batch requests to increase efficiency.
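The batching change described above can be sketched in a few lines of Python. This is a minimal illustration, not Pinterest's server code: the function name `score_in_batches`, the `score_batch` callback standing in for a model call, and the `max_batch_size` parameter are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def score_in_batches(
    requests: Sequence[Sequence[T]],
    score_batch: Callable[[List[T]], List[float]],  # hypothetical model call
    max_batch_size: int,
) -> List[List[float]]:
    """Merge many small requests into larger batches, then split the scores back."""
    if max_batch_size <= 0:
        raise ValueError("max_batch_size must be positive")

    # Flatten every request into one list, remembering each request's size.
    flat = [item for request in requests for item in request]
    sizes = [len(request) for request in requests]

    # Call the model once per large chunk instead of once per small request.
    scores: List[float] = []
    for start in range(0, len(flat), max_batch_size):
        scores.extend(score_batch(flat[start:start + max_batch_size]))

    # Hand each request back exactly the scores for its own items.
    results: List[List[float]] = []
    position = 0
    for size in sizes:
        results.append(scores[position:position + size])
        position += size
    return results
```

With three requests of sizes 2, 1 and 3 and a `max_batch_size` of 4, the model callback runs twice (on 4 items, then 2) rather than three times, and each caller still receives only its own scores.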
One area where this comes into play is with its embedding table lookup module. Embedding tables are used to track interactions between various context-specific features and interests of user profiles. They can track where users navigate, what they Pin or share on Pinterest, and numerous other actions, helping refine predictions of what users might like to click on next.
They are used to incrementally learn user preference based on context in order to make better content recommendations to those using Pinterest. Its embedding table lookup module required two computation steps repeated hundreds of times because of the number of features tracked.
Pinterest engineers greatly reduced this number of operations using a GPU-accelerated concurrent hash table from NVIDIA cuCollections. And they set up a custom consolidated embedding lookup module so they could merge requests into a single lookup. Better results were seen immediately.
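As a rough sketch of the consolidation idea, the example below folds several per-feature embedding tables into one keyed table so a request is served by a single lookup pass. It rests on assumptions: the article does not spell out the two computation steps, so they are taken here to be an id-to-row hash lookup followed by a row gather; a plain Python `dict` stands in for the GPU hash table; and the class name `ConsolidatedEmbeddingTable` and the zero-vector fallback for unknown ids are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Key = Tuple[str, Hashable]  # (feature name, raw id)


class ConsolidatedEmbeddingTable:
    """One table for all features, so lookups need a single pass (illustrative only)."""

    def __init__(self, tables: Dict[str, Dict[Hashable, List[float]]]) -> None:
        self.index: Dict[Key, int] = {}    # stand-in for the GPU hash table
        self.rows: List[List[float]] = []  # all embedding vectors, stored together
        for feature, table in tables.items():
            for raw_id, vector in table.items():
                self.index[(feature, raw_id)] = len(self.rows)
                self.rows.append(vector)
        # Assumption: every vector has the same width; unknown keys map to zeros.
        width = len(self.rows[0]) if self.rows else 0
        self.default = [0.0] * width

    def lookup(self, keys: Sequence[Key]) -> List[List[float]]:
        # Step 1: resolve every (feature, id) key to a row slot in one pass.
        slots = [self.index.get(key, -1) for key in keys]
        # Step 2: gather the rows for all features at once.
        return [self.rows[slot] if slot >= 0 else self.default for slot in slots]


tables = {"user": {1: [0.1, 0.2]}, "pin": {7: [1.0, 2.0]}}
table = ConsolidatedEmbeddingTable(tables)
vectors = table.lookup([("pin", 7), ("user", 1), ("pin", 99)])
```

The point of the design is that the number of lookup calls no longer grows with the number of features: keys from every feature travel together in one request.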
"Using cuCollections helped us to remove bottlenecks," said Eksombatchai.
Enlisting CUDA Graphs
Pinterest relied on CUDA Graphs to eliminate the remaining small-batch operations, further optimizing its inference models.
CUDA Graphs help reduce CPU interactions when launching work on GPUs. They're designed to enable workloads to be defined as graphs rather than single operations. They provide a mechanism to launch multiple GPU operations through a single CPU operation, reducing CPU overheads.
Pinterest enlisted CUDA Graphs to represent the model inference process as a static graph of operations instead of as individually scheduled ones. This enabled the computation to be handled as a single unit without any kernel launching overhead.
The company now supports CUDA Graph as a new backend of its model server. When a model is first loaded, the model server runs the model inference once to build the graph instance. This graph can then be run repeatedly in inference to show content on its app or site.
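The load-once, replay-many pattern described above can be modeled in plain Python. This is a toy analogy only: it does not call the CUDA Graphs API, and the class `ToyGraphBackend`, its methods, and the `launches` counter (which simulates CPU-side launch calls) are all hypothetical names chosen for the illustration.

```python
from typing import Callable, List, Optional

Op = Callable[[float], float]


class ToyGraphBackend:
    """Toy model of capture-once, replay-many execution (not the CUDA API)."""

    def __init__(self, ops: List[Op]) -> None:
        self.ops = ops
        self.graph: Optional[Op] = None
        self.launches = 0  # simulated count of CPU-side launch calls

    def run_eager(self, x: float) -> float:
        # Without a graph, every operation costs its own launch.
        for op in self.ops:
            self.launches += 1
            x = op(x)
        return x

    def load(self, sample: float) -> None:
        # At load time, run inference once and record the operation sequence.
        recorded: List[Op] = []
        x = sample
        for op in self.ops:
            recorded.append(op)
            x = op(x)

        def replay(value: float) -> float:
            for op in recorded:
                value = op(value)
            return value

        self.graph = replay

    def run_graph(self, x: float) -> float:
        # With a graph, the whole recorded sequence costs a single launch.
        if self.graph is None:
            raise RuntimeError("call load() before run_graph()")
        self.launches += 1
        return self.graph(x)
```

For a three-operation model, `run_eager` adds three simulated launches per request while `run_graph` adds one, which mirrors the reduction in CPU overhead the article attributes to CUDA Graphs.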
Implementing CUDA Graphs helped Pinterest to significantly reduce inference latency of its recommender models, according to its engineers.
GPUs have enabled Pinterest to do something that was impossible with CPUs on the same budget, and by doing this they can make changes that have a direct impact on various business metrics.
Learn about Pinterest's GPU-driven inference and optimizations at its GTC session, Serving 100x Bigger Recommender Models, and in the Pinterest Engineering blog.
Register for GTC, running Sept. 19-22, for free to attend sessions with NVIDIA and dozens of industry leaders.