Netint Joins Ampere’s AI Platform Alliance


Video processing unit (VPU) maker Netint has joined the AI Platform Alliance, an Ampere-led initiative whose members plan to work together to validate joint AI solutions for the market.

The aim of the alliance is to get deployable solutions out into the market, Ampere chief product officer Jeff Wittich told EE Times.

Jeff Wittich (Source: Ampere)

“The issue today is that one of the big bottlenecks for [AI accelerator] companies is there isn’t a deployable solution for their accelerator, there isn’t an OEM server that you can plug their accelerator into,” he said. “What you’d ideally want is to buy the server with that card already in it—you can’t do that today. You could buy that card, but you’re still going to have to put it in there and hope it works. Because it wasn’t part of the qualified list of components that could go into that server.”

While AI accelerator companies have PCIe cards available, there is a big difference between reference platforms and deployable systems, Wittich said.


“What end users really want is to go to HPE or Supermicro and buy a server that has a CPU in it and one of these cards in it, and it works on day one, and that’s not what’s available today,” he said.

Ampere already has CPUs in qualified servers from a variety of server makers. Wittich said the company plans to use its relationships with server makers to work towards qualified Ampere CPU plus AI accelerator-equipped servers.

“One of [Ampere’s] underestimated strengths is we’ve been able to build up a really good ecosystem in places that people don’t usually pay attention, that’s been a big strength of ours,” he said. “There are 50 or 60 Ampere OEM/ODM [qualified servers] out there, so there’s a ton of places people have invested in building out solutions and building out server hardware that uses our CPU, and that is a huge barrier to entry for other startups.”

The AI Platform Alliance has nine AI accelerator makers among its members and the plan is to enable a diverse set of solutions for different use cases. One of the biggest challenges to widespread AI deployment was always going to be the ecosystem, Wittich said.

“To some extent, it’s about providing a distribution model or go-to-market model for [AI accelerator makers] so people can actually consume these technologies,” he said. “The number one short term thing we’re able to provide is access to the market in a way that’s actually deployable, so that users have a good experience.”

The alliance members will work together to validate joint solutions, including tuning CPU-accelerator systems for the best performance. From there, the companies can build on that base.

“Over time there can obviously be much more extensive solutions that we build as we partner with these types of companies,” Wittich said. “It doesn’t end with just getting a bunch of boxes out there, but if we don’t do that, we can’t go to the next step, we can’t build anything more complex.”

Ampere also promotes its CPUs for AI inference, but Wittich is clear that CPU-only solutions will not suit all inference applications.

“There’s no one-size-fits-all solution for all models—models are going to continue to evolve, and there are places where having other types of hardware solutions is going to be really beneficial, as long as the solutions are efficient and we’re providing flexibility to end users,” he said. “The way we’ve chosen to provide this flexibility is at the platform level.”

Newest alliance member Netint makes VPUs—hardware accelerators for video transcoding—which are designed for high-density live video streaming but also target security and surveillance applications.

Netint CMO Mark Donnigan told EE Times that the AI Platform Alliance gives the company the ability to plug into host platforms, extending its reach and visibility as a small company.

“Data centers are running out of energy,” he said. “Compute requirements are continuing to increase, which means the ecosystem needs to be built, and needs to have a voice, because it requires that people think a bit differently about how they build systems.”

Joining the alliance has already resulted in a qualified Ampere plus Netint server solution from Supermicro. The use case for this server would be to have the VPU perform efficient video transcoding—converting video streams to other resolutions and formats—with the Ampere CPU handling AI inference tasks, such as video analytics or subtitling.

Donnigan said that video analytics applications rely on fast decode and encode capabilities to maximize utilization of AI accelerators or GPUs, preventing bottlenecks.

“In this context, we’re like a gateway that can optimize video into a resolution and a format so that inferencing systems can work efficiently and do more work,” he said, noting that even in the biggest video analytics applications like retail stores and complicated logistics and port environments, it still comes down to cost. “So if it requires two or three or more servers, or more power to run because they don’t have as much video processing capability, that can make the difference between making the system viable or not,” he said.

Netint and Ampere demonstrated captioning on a video stream using Whisper running on the Ampere Altra CPU at NAB. (Source: Netint)

At the recent NAB (National Association of Broadcasters) conference, Netint and Ampere demonstrated their qualified Supermicro server running AI inference on a transcoded video stream. The demo system uses a Netint VPU for video transcoding and a 96-core Ampere Altra CPU for AI inference—in this case, running the Whisper speech-to-text model to generate subtitles for the video stream. Previously, subtitling was done offline or required very heavy compute, Donnigan said.

“We’re now able to run this on the Ampere CPU, so the server that is doing both the transcoding and video processing is now also doing subtitling,” he said. “That hasn’t been possible to do with the density [we provide]—you might have had a small handful of streams, now you can get dozens of streams out with subtitling on a single box, and in a very efficient way.”
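The subtitling pipeline described here has two halves: Whisper produces timestamped text segments from the stream's audio, and those segments are rendered as subtitles alongside the video. The sketch below illustrates the second half in Python. The segment format mirrors the `segments` list that the open-source `whisper` package returns from `model.transcribe()`; the `to_srt` helper and sample segments are illustrative, not Netint's or Ampere's actual code.

```python
# Sketch: turn Whisper-style transcription segments into SRT subtitles.
# The segment dicts mimic the "segments" list that openai-whisper's
# model.transcribe() returns; in a real pipeline the audio would be
# demuxed from the live stream first (file names here are hypothetical).

def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """Render a list of {start, end, text} segments as an SRT document."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(seg['start'])} --> "
            f"{srt_timestamp(seg['end'])}\n{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)

# In the real demo the segments would come from something like:
#   import whisper
#   result = whisper.load_model("base").transcribe("stream_audio.wav")
#   segments = result["segments"]
segments = [
    {"start": 0.0, "end": 2.4, "text": " Welcome back to the broadcast."},
    {"start": 2.4, "end": 5.1, "text": " Here are today's top stories."},
]
print(to_srt(segments))
```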

The NAB demo shows a single stream, but Donnigan said Netint thinks around 20 live video streams with adaptive bitrate ladders should be possible (around 100 separate live encodes, since each of the 20 channels would typically be transcoded to create five versions in different resolutions for different target devices). The Ampere Altra CPU runs Whisper inference and other management functions for video streaming.
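The encode count behind that estimate is straightforward ladder arithmetic: each live channel is transcoded into a fixed set of renditions, so the box must sustain channels times ladder rungs simultaneous encodes. A minimal sketch (the rendition names are illustrative, not a Netint spec):

```python
# Sketch of the adaptive-bitrate ladder arithmetic behind the estimate:
# each live channel is transcoded into several renditions (a "ladder"),
# so total simultaneous encodes = channels * rungs per channel.
# The rung list below is illustrative, not an actual Netint configuration.

ladder = ["1080p", "720p", "480p", "360p", "240p"]  # 5 rungs per channel
channels = 20

total_encodes = channels * len(ladder)
print(total_encodes)  # 20 channels * 5 rungs = 100 live encodes
```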

Other members of the AI Platform Alliance include Cerebras, Furiosa, Graphcore, Kalray, Kinara, Luminous, Neuchips, Rebellions and Sapeon.
