AI Esperanto: Large Language Models Read Data With NVIDIA Triton

05/10/2022

Julien Salinas wears many hats. He's an entrepreneur, software developer and, until lately, a volunteer fireman in his mountain village an hour's drive from Grenoble, a tech hub in southeast France.

He's nurturing a two-year-old startup, NLP Cloud, that's already profitable, employs about a dozen people and serves customers around the globe. It's one of many companies worldwide using NVIDIA software to deploy some of today's most complex and powerful AI models.

NLP Cloud is an AI-powered software service for text data. A major European airline uses it to summarize internet news for its employees. A small healthcare company employs it to parse patient requests for prescription refills. An online app uses it to let kids talk to their favorite cartoon characters.

Large Language Models Speak Volumes

It's all part of the magic of natural language processing (NLP), a popular form of AI that's spawning some of the planet's biggest neural networks, called large language models (LLMs). Trained with huge datasets on powerful systems, LLMs can handle all sorts of jobs such as recognizing and generating text with amazing accuracy.

NLP Cloud uses about 25 LLMs today. The largest has 20 billion parameters, a key measure of the sophistication of a model. And now it's implementing BLOOM, an LLM with a whopping 176 billion parameters.

Running these massive models in production efficiently across multiple cloud services is hard work. That's why Salinas turns to NVIDIA Triton Inference Server.
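
A rough back-of-the-envelope sketch gives a feel for why models of this size are hard to serve. The Python below estimates the memory needed just to hold a model's weights. The 2 bytes per parameter (16-bit weights) and the 80 GB of memory per GPU are illustrative assumptions, not figures from NLP Cloud, and the function names are hypothetical. Real deployments also need memory for activations and caches, so the result is only a lower bound.

```python
import math

BYTES_PER_PARAM = 2   # assumption: weights stored in a 16-bit format
GPU_MEMORY_GB = 80    # assumption: memory available on a single GPU


def weight_memory_gb(num_params: int, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Memory needed to store the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


def min_gpus_for_weights(num_params: int, gpu_memory_gb: float = GPU_MEMORY_GB) -> int:
    """Lower bound on the GPUs needed to hold the weights (ignores activations)."""
    return math.ceil(weight_memory_gb(num_params) / gpu_memory_gb)


# Parameter counts taken from the article: 20 billion and 176 billion (BLOOM).
for name, params in [("20B model", 20_000_000_000), ("BLOOM 176B", 176_000_000_000)]:
    print(f"{name}: ~{weight_memory_gb(params):.0f} GB of weights, "
          f"at least {min_gpus_for_weights(params)} GPU(s)")
```

Under these assumptions, the 176-billion-parameter model cannot fit on one GPU, which is why splitting a model across several GPUs matters.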

High Throughput, Low Latency

"Very quickly the main challenge we faced was server costs," Salinas said, proud his self-funded startup has not taken any outside backing to date.

"Triton turned out to be a great way to make full use of the GPUs at our disposal," he said.

For example, NVIDIA A100 Tensor Core GPUs can process as many as 10 requests at a time - twice the throughput of alternative software - thanks to FasterTransformer, a part of Triton that automates complex jobs like splitting up models across many GPUs.

FasterTransformer also helps NLP Cloud spread jobs that require more memory across multiple NVIDIA T4 GPUs while shaving the response time for the task.

Customers who demand the fastest response times can process 50 tokens - text elements like words or punctuation marks - in as little as half a second with Triton on an A100 GPU, about a third of the response time without Triton.
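
The figures in this paragraph imply some simple arithmetic, sketched below in Python. The function names are hypothetical, and "about a third" is treated as an exact factor of three purely for illustration, so the baseline latency is an estimate rather than a reported measurement.

```python
def tokens_per_second(tokens: int, seconds: float) -> float:
    """Generation rate implied by producing `tokens` in `seconds`."""
    return tokens / seconds


def implied_baseline_latency(latency_s: float, speedup_factor: float) -> float:
    """Latency before the speedup, given the improved latency and the factor."""
    return latency_s * speedup_factor


TOKENS = 50               # from the article
LATENCY_WITH_TRITON = 0.5 # seconds, from the article ("as little as half a second")
SPEEDUP_FACTOR = 3        # assumption: "about a third" taken as exactly 1/3

rate = tokens_per_second(TOKENS, LATENCY_WITH_TRITON)
baseline = implied_baseline_latency(LATENCY_WITH_TRITON, SPEEDUP_FACTOR)

print(f"Rate with Triton: {rate:.0f} tokens per second")
print(f"Implied latency without Triton: about {baseline:.1f} seconds")
```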

"That's very cool," said Salinas, who's reviewed dozens of software tools on his personal blog.

Touring Triton's Users

Around the globe, other startups and established giants are using Triton to get the most out of LLMs.

Microsoft's Translate service helped disaster workers understand Haitian Creole while responding to a 7.0 earthquake. It was one of many use cases for the service that got a 27x speedup using Triton to run inference on models with up to 5 billion parameters.

NLP provider Cohere was founded by one of the AI researchers who wrote the seminal paper that defined transformer models. It's getting up to 4x speedups on inference using Triton on its custom LLMs, so users of customer support chatbots, for example, get swift responses to their queries.

NLP Cloud and Cohere are among many members of the NVIDIA Inception program, which nurtures cutting-edge startups. Several other Inception startups also use Triton for AI inference on LLMs.

Tokyo-based rinna created chatbots used by millions in Japan, as well as tools to let developers build custom chatbots and AI-powered characters. Triton helped the company achieve inference latency of less than two seconds on GPUs.

In Tel Aviv, Tabnine runs a service that's automated up to 30% of the code written by a million developers globally (see a demo below). Its service runs multiple LLMs on A100 GPUs with Triton to handle more than 20 programming languages and 15 code editors.

https://blogs.nvidia.com/wp-content/uploads/2022/10/Tabnine.mp4

Twitter uses the LLM service of Writer, based in San Francisco. It ensures the social network's employees write in a voice that adheres to the company's style guide. Writer's service achieves 3x lower latency and up to 4x greater throughput using Triton compared to prior software.

If you want to put a face to those words, Inception member Ex-human, just down the street from Writer, helps users create realistic avatars for games, chatbots and virtual reality applications. With Triton, it delivers response times of less than a second on an LLM with 6 billion parameters while reducing GPU memory consumption by a third.

It's another example of how LLMs are expanding AI's horizons.

Triton is widely used, in part, because it's versatile. The software works with any style of inference and any AI framework - and it runs on CPUs as well as NVIDIA GPUs and other accelerators.

A Full-Stack Platform

Back in France, NLP Cloud is now using other elements of the NVIDIA AI platform.

For inference on models running on a single GPU, it's adopting NVIDIA TensorRT software to minimize latency. "We're getting blazing-fast performance with it, and latency is really going down," Salinas said.

The company also started training custom versions of LLMs to support more languages and enhance efficiency. For that work, it's adopting NVIDIA NeMo Megatron, an end-to-end framework for training and deploying LLMs with trillions of parameters.

The 35-year-old Salinas has the energy of a 20-something for coding and growing his business. He describes plans to build private infrastructure to complement the four public cloud services the startup uses, as well as to expand into LLMs that handle speech and text-to-image to address applications like semantic search.

"I always loved coding, but being a good developer is not enough: You have to understand your customers...
LINK: https://blogs.nvidia.com/blog/2022/10/05/ai-large-language-models-trit...