
Julien Salinas wears many hats. He's an entrepreneur, software developer and, until lately, a volunteer fireman in his mountain village an hour's drive from Grenoble, a tech hub in southeast France.
He's nurturing a two-year-old startup, NLP Cloud, that's already profitable, employs about a dozen people and serves customers around the globe. It's one of many companies worldwide using NVIDIA software to deploy some of today's most complex and powerful AI models.
NLP Cloud is an AI-powered software service for text data. A major European airline uses it to summarize internet news for its employees. A small healthcare company employs it to parse patient requests for prescription refills. An online app uses it to let kids talk to their favorite cartoon characters.
Large Language Models Speak Volumes
It's all part of the magic of natural language processing (NLP), a popular form of AI that's spawning some of the planet's biggest neural networks called large language models. Trained with huge datasets on powerful systems, LLMs can handle all sorts of jobs such as recognizing and generating text with amazing accuracy.
NLP Cloud uses about 25 LLMs today; the largest has 20 billion parameters, a key measure of a model's sophistication. And now it's implementing BLOOM, an LLM with a whopping 176 billion parameters.
Running these massive models in production efficiently across multiple cloud services is hard work. That's why Salinas turns to NVIDIA Triton Inference Server.
High Throughput, Low Latency
"Very quickly the main challenge we faced was server costs," Salinas said, proud his self-funded startup has not taken any outside backing to date.
"Triton turned out to be a great way to make full use of the GPUs at our disposal," he said.
For example, NVIDIA A100 Tensor Core GPUs can process as many as 10 requests at a time - twice the throughput of alternative software - thanks to FasterTransformer, a part of Triton that automates complex jobs like splitting up models across many GPUs.
FasterTransformer also helps NLP Cloud spread jobs that require more memory across multiple NVIDIA T4 GPUs while shaving the response time for the task.
Customers who demand the fastest response times can process 50 tokens - text elements like words or punctuation marks - in as little as half a second with Triton on an A100 GPU, about a third of the response time without Triton.
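The figures in that last claim can be turned into rough per-request throughput numbers. The sketch below is only back-of-the-envelope arithmetic: it takes the 50 tokens and half-second latency quoted above, and it assumes that "about a third of the response time" means exactly one third, which the article does not state. The function and variable names are illustrative.

```python
def tokens_per_second(tokens: int, seconds: float) -> float:
    """Return generation throughput for a single request."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return tokens / seconds


# Figures quoted in the article.
TOKENS = 50
LATENCY_WITH_TRITON_S = 0.5

# Assumption: "about a third of the response time" is read as exactly 1/3,
# so the latency without Triton is three times the latency with it.
ASSUMED_RATIO = 3.0
latency_without_triton_s = LATENCY_WITH_TRITON_S * ASSUMED_RATIO

with_triton = tokens_per_second(TOKENS, LATENCY_WITH_TRITON_S)
without_triton = tokens_per_second(TOKENS, latency_without_triton_s)

print(f"with Triton:    {LATENCY_WITH_TRITON_S:.1f} s, {with_triton:.0f} tokens/s")
print(f"without Triton: {latency_without_triton_s:.1f} s, {without_triton:.0f} tokens/s")
```

Under that assumption, a single request goes from roughly 33 tokens per second to 100 tokens per second. These are per-request figures and say nothing about how many requests a GPU serves concurrently.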
"That's very cool," said Salinas, who's reviewed dozens of software tools on his personal blog.
Touring Triton's Users
Around the globe, other startups and established giants are using Triton to get the most out of LLMs.
Microsoft's Translate service helped disaster workers understand Haitian Creole while responding to a 7.0 earthquake. It was one of many use cases for the service that got a 27x speedup using Triton to run inference on models with up to 5 billion parameters.
NLP provider Cohere was founded by one of the AI researchers who wrote the seminal paper that defined transformer models. It's getting up to 4x speedups on inference using Triton on its custom LLMs, so users of customer support chatbots, for example, get swift responses to their queries.
NLP Cloud and Cohere are among many members of the NVIDIA Inception program, which nurtures cutting-edge startups. Several other Inception startups also use Triton for AI inference on LLMs.
Tokyo-based rinna created chatbots used by millions in Japan, as well as tools to let developers build custom chatbots and AI-powered characters. Triton helped the company achieve inference latency of less than two seconds on GPUs.
In Tel Aviv, Tabnine runs a service that's automated up to 30% of the code written by a million developers globally (see a demo below). Its service runs multiple LLMs on A100 GPUs with Triton to handle more than 20 programming languages and 15 code editors.
https://blogs.nvidia.com/wp-content/uploads/2022/10/Tabnine.mp4
Twitter uses the LLM service of Writer, based in San Francisco. It ensures the social network's employees write in a voice that adheres to the company's style guide. Writer's service achieves 3x lower latency and up to 4x greater throughput using Triton compared to prior software.
If you want to put a face to those words, Inception member Ex-human, just down the street from Writer, helps users create realistic avatars for games, chatbots and virtual reality applications. With Triton, it delivers response times of less than a second on an LLM with 6 billion parameters while reducing GPU memory consumption by a third.
It's another example of how LLMs are expanding AI's horizons.
Triton is widely used, in part, because it's versatile. The software works with any style of inference and any AI framework - and it runs on CPUs as well as NVIDIA GPUs and other accelerators.
A Full-Stack Platform
Back in France, NLP Cloud is now using other elements of the NVIDIA AI platform.
For inference on models running on a single GPU, it's adopting NVIDIA TensorRT software to minimize latency. "We're getting blazing-fast performance with it, and latency is really going down," Salinas said.
The company has also started training custom versions of LLMs to support more languages and enhance efficiency. For that work, it's adopting NVIDIA NeMo Megatron, an end-to-end framework for training and deploying LLMs with trillions of parameters.
The 35-year-old Salinas has the energy of a 20-something for coding and growing his business. He describes plans to build private infrastructure to complement the four public cloud services the startup uses, as well as to expand into LLMs that handle speech and text-to-image to address applications like semantic search.
I always loved coding, but being a good developer is not enough: You have to understand your customers