
AI Esperanto: Large Language Models Read Data With NVIDIA Triton

05/10/2022

Julien Salinas wears many hats. He's an entrepreneur, software developer and, until lately, a volunteer fireman in his mountain village an hour's drive from Grenoble, a tech hub in southeast France.

He's nurturing a two-year-old startup, NLP Cloud, that's already profitable, employs about a dozen people and serves customers around the globe. It's one of many companies worldwide using NVIDIA software to deploy some of today's most complex and powerful AI models.

NLP Cloud is an AI-powered software service for text data. A major European airline uses it to summarize internet news for its employees. A small healthcare company employs it to parse patient requests for prescription refills. An online app uses it to let kids talk to their favorite cartoon characters.

Large Language Models Speak Volumes

It's all part of the magic of natural language processing (NLP), a popular form of AI that's spawning some of the planet's biggest neural networks, called large language models. Trained with huge datasets on powerful systems, LLMs can handle all sorts of jobs such as recognizing and generating text with amazing accuracy.

NLP Cloud uses about 25 LLMs today; the largest has 20 billion parameters, a key measure of a model's sophistication. And now it's implementing BLOOM, an LLM with a whopping 176 billion parameters.

Running these massive models in production efficiently across multiple cloud services is hard work. That's why Salinas turns to NVIDIA Triton Inference Server.

High Throughput, Low Latency

"Very quickly the main challenge we faced was server costs," Salinas said, proud his self-funded startup has not taken any outside backing to date.

"Triton turned out to be a great way to make full use of the GPUs at our disposal," he said.

For example, NVIDIA A100 Tensor Core GPUs can process as many as 10 requests at a time - twice the throughput of alternative software - thanks to FasterTransformer, a part of Triton that automates complex jobs like splitting up models across many GPUs.

FasterTransformer also helps NLP Cloud spread jobs that require more memory across multiple NVIDIA T4 GPUs while shaving the response time for the task.

Customers who demand the fastest response times can process 50 tokens - text elements like words or punctuation marks - in as little as half a second with Triton on an A100 GPU, about a third of the response time without Triton.
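As a rough back-of-the-envelope check, the figures above can be converted into tokens per second. The sketch below assumes that "about a third of the response time" means the same 50 tokens took roughly 1.5 seconds without Triton; that baseline is an inference from the article's wording, not a measured number, and the function name is purely illustrative.

```python
def tokens_per_second(tokens: int, seconds: float) -> float:
    """Return generation throughput in tokens per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


TOKENS = 50
WITH_TRITON_S = 0.5                    # figure quoted in the article
WITHOUT_TRITON_S = WITH_TRITON_S * 3   # assumption: "about a third" read as 3x slower

with_triton = tokens_per_second(TOKENS, WITH_TRITON_S)
without_triton = tokens_per_second(TOKENS, WITHOUT_TRITON_S)
speedup = with_triton / without_triton

print(f"With Triton:    {with_triton:.0f} tokens/s")
print(f"Without Triton: {without_triton:.0f} tokens/s (assumed baseline)")
print(f"Speedup:        {speedup:.1f}x")
```

Under that assumption, the quoted latency works out to about 100 tokens per second with Triton versus roughly 33 without it.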

"That's very cool," said Salinas, who's reviewed dozens of software tools on his personal blog.

Touring Triton's Users

Around the globe, other startups and established giants are using Triton to get the most out of LLMs.

Microsoft's Translate service helped disaster workers understand Haitian Creole while responding to a 7.0 earthquake. It was one of many use cases for the service that got a 27x speedup using Triton to run inference on models with up to 5 billion parameters.

NLP provider Cohere was founded by one of the AI researchers who wrote the seminal paper that defined transformer models. It's getting up to 4x speedups on inference using Triton on its custom LLMs, so users of customer support chatbots, for example, get swift responses to their queries.

NLP Cloud and Cohere are among many members of the NVIDIA Inception program, which nurtures cutting-edge startups. Several other Inception startups also use Triton for AI inference on LLMs.

Tokyo-based rinna created chatbots used by millions in Japan, as well as tools to let developers build custom chatbots and AI-powered characters. Triton helped the company achieve inference latency of less than two seconds on GPUs.

In Tel Aviv, Tabnine runs a service that's automated up to 30% of the code written by a million developers globally (see a demo below). Its service runs multiple LLMs on A100 GPUs with Triton to handle more than 20 programming languages and 15 code editors.

https://blogs.nvidia.com/wp-content/uploads/2022/10/Tabnine.mp4

Twitter uses the LLM service of Writer, based in San Francisco. It ensures the social network's employees write in a voice that adheres to the company's style guide. Writer's service achieves 3x lower latency and up to 4x greater throughput using Triton compared to prior software.

If you want to put a face to those words, Inception member Ex-human, just down the street from Writer, helps users create realistic avatars for games, chatbots and virtual reality applications. With Triton, it delivers response times of less than a second on an LLM with 6 billion parameters while reducing GPU memory consumption by a third.

It's another example of how LLMs are expanding AI's horizons.

Triton is widely used, in part, because it's versatile. The software works with any style of inference and any AI framework - and it runs on CPUs as well as NVIDIA GPUs and other accelerators.

A Full-Stack Platform

Back in France, NLP Cloud is now using other elements of the NVIDIA AI platform.

For inference on models running on a single GPU, it's adopting NVIDIA TensorRT software to minimize latency. "We're getting blazing-fast performance with it, and latency is really going down," Salinas said.

The company has also started training custom versions of LLMs to support more languages and enhance efficiency. For that work, it's adopting NVIDIA NeMo Megatron, an end-to-end framework for training and deploying LLMs with trillions of parameters.

The 35-year-old Salinas has the energy of a 20-something for coding and growing his business. He describes plans to build private infrastructure to complement the four public cloud services the startup uses, as well as to expand into LLMs that handle speech and text-to-image to address applications like semantic search.

I always loved coding, but being a good developer is not enough: You have to understand your customers...
LINK: https://blogs.nvidia.com/blog/2022/10/05/ai-large-language-models-trit...