
By Jove, It's No Myth: NVIDIA Triton Speeds Inference on Oracle Cloud

02/01/2024

An avid cyclist, Thomas Park knows the value of having lots of gears to maintain a smooth, fast ride.

So, when the software architect designed an AI inference platform to serve predictions for Oracle Cloud Infrastructure's (OCI) Vision AI service, he picked NVIDIA Triton Inference Server. That's because it can shift up, down or sideways to handle virtually any AI model, framework, hardware and operating mode, quickly and efficiently.

"The NVIDIA AI inference platform gives our worldwide cloud services customers tremendous flexibility in how they build and run their AI applications," said Park, a Zurich-based computer engineer and competitive cyclist who's worked for four of the world's largest cloud services providers.

Specifically, Triton reduced OCI's total cost of ownership by 10%, increased prediction throughput up to 76% and reduced inference latency up to 51% for OCI Vision and Document Understanding Service models that were migrated to Triton. The services run globally across more than 45 regional data centers, according to an Oracle blog Park and a colleague posted earlier this year.

Computer Vision Accelerates Insights

Customers rely on OCI Vision AI for a wide variety of object detection and image classification jobs. For instance, a U.S.-based transit agency uses it to automatically detect the number of vehicle axles passing by to calculate and bill bridge tolls, sparing busy truckers wait time at toll booths.

OCI AI is also available in Oracle NetSuite, a set of business applications used by more than 37,000 organizations worldwide. It's used, for example, to automate invoice recognition.

Thanks to Park's work, Triton is now being adopted across other OCI services, too.

A Triton-Aware Data Service

"Our AI platform is Triton-aware for the benefit of our customers," said Tzvi Keisar, a director of product management for OCI's Data Science service, which handles machine learning for Oracle's internal and external users.

"If customers want to use Triton, they don't have to worry about the configuration because it will be done automatically by the service, launching a Triton-powered inference endpoint for them," said Keisar.

Triton is included in NVIDIA AI Enterprise, a platform that provides the full security and support businesses need, and it's available on OCI Marketplace.

A Massive SaaS Platform

OCI's Data Science service is the machine learning platform for both Oracle NetSuite and Oracle Fusion Applications.

"These business application suites are massive, with tens of thousands of customers who are also building their frameworks on top of our service," he said.

It's a wide swath of mainly enterprise users in manufacturing, retail, transportation and other industries. They're building and using AI models of nearly every shape and size.

Inference was one of the group's first services, and Triton came on the team's radar not long after its launch.

A Best-in-Class Inference Framework

"We saw Triton pick up in popularity as a best-in-class serving framework, so we started experimenting with it," Keisar said. "We saw really good performance, and it closed a gap in our existing offerings, especially on multi-model inference - it's the most versatile and advanced inferencing framework out there."

Launched on OCI in March, Triton has already attracted the attention of many internal teams at Oracle hoping to use it for inference jobs that require serving predictions from multiple AI models running concurrently.

"Triton has a very good track record and performance on multiple models deployed on a single endpoint," he said.

Accelerating the Future

Looking ahead, Keisar's team is evaluating NVIDIA TensorRT-LLM software to supercharge inference on the complex large language models (LLMs) that have captured the imagination of many users.

An active blogger, Keisar detailed in his latest article quantization techniques for running a Llama 2 LLM with a whopping 70 billion parameters on NVIDIA A10 Tensor Core GPUs.

"Even down to four-bit parameters, the quality of model outputs is still quite good," he said. "Deploying on NVIDIA GPUs gives us the flexibility to find a good balance in latency, throughput and cost."
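To see why quantization matters for a model of this size, a rough back-of-the-envelope calculation helps. The sketch below estimates only the memory needed to hold the weights of a 70-billion-parameter model at different precisions. It is an illustration, not the method described in Keisar's article: the function names are hypothetical, the 24 GB per-GPU figure for the A10 is an assumption that does not come from this story, and the estimate ignores activations, the key-value cache and runtime overhead, so real deployments need more headroom.

```python
import math


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


def min_gpus_for_weights(weights_gb: float, gpu_memory_gb: float) -> int:
    """Lower bound on GPU count if the weights are split evenly across devices."""
    return math.ceil(weights_gb / gpu_memory_gb)


LLAMA2_70B_PARAMS = 70e9  # parameter count given in the article
A10_MEMORY_GB = 24        # ASSUMPTION: per-GPU memory, not stated in the article

for bits in (16, 8, 4):
    gb = weight_memory_gb(LLAMA2_70B_PARAMS, bits)
    gpus = min_gpus_for_weights(gb, A10_MEMORY_GB)
    print(f"{bits:>2}-bit weights: {gb:6.1f} GB, at least {gpus} GPU(s)")
```

Under these assumptions, four-bit weights take about 35 GB versus about 140 GB at 16 bits, which is the kind of reduction that makes it practical to weigh latency, throughput and cost on smaller GPUs.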

After announcements this fall that Oracle is deploying the latest NVIDIA H100 Tensor Core GPUs, H200 GPUs, L40S GPUs and Grace Hopper Superchips, it's just the start of many accelerated efforts to come.
LINK: https://blogs.nvidia.com/blog/ai-inference-oci-triton/...