Broadcast audio has undergone a dedicated development process over the years, progressing from mono to stereo, analog to digital and baseband to Audio over IP (AoIP). Despite this, it continued to live in the shadow of broadcast video and was almost subsumed when embedded audio was introduced in the 1970s. Today, with dedicated standards and protocols such as AES67, Dante and Ravenna, the position of audio in TV has grown in stature while also coming under the aegis of the more video-oriented SMPTE ST 2110.

Sitting under a video-oriented standard, however, does not appear to be hindering the ongoing evolution of TV/streaming audio, which is now firmly established on an AoIP foundation and looking towards the continuing implementation of Next Generation Audio (NGA), with AI promising more practical benefits for processing and control.

Building on the Backbone

The advent of ST 2110 enabled the discrete transport of audio and video, giving operators the freedom and flexibility to send either stream to different destinations. This is part of the “second wave of IP,” which Christian Struck, senior product manager for audio expert infrastructure at Lawo, predicted will be in evidence at the 2026 NAB Show, April 18-22 in Las Vegas.

“It’s fair to say that AoIP is the de facto standard now, but I think what we will see at the NAB Show will build on that as the backbone architecture [for systems],” he said. “This means using the abstraction layer or the generic infrastructure that is capable of dealing with audio and video in an equal manner…to abstract further away from bespoke hardware and move the offering towards software values.” He also highlighted the EBU’s recommendations for the Dynamic Media Facility [DMF] as “a clear trend” towards generic backbone architecture that will “allow operators to define how to utilize these facilities for whatever purpose and on what scale.”

Costa Nikols, executive team strategy adviser for media and entertainment (M&E) at Telos Alliance, picked up on the concept of the DMF, adding that ST 2110, NMOS (Networked Media Open Specifications) and virtualized audio engines are now the “baseline” for new builds and major refurbishments.

“MXL [Media eXchange Layer] is [also] becoming predominant in the broadcast space, allowing for different virtual machines and processes as well as memory share, with ST 2110 for signal flow and a bunch of transport standards like SRT and NDI and, in the audio world, Dante,” he said. “As well as those, there are software-as-a-service [SaaS] platforms from companies including Grass Valley, Vizrt and Matrox, [all of which can enable] virtualization like the DMF, configuring different rooms via software depending on the function you need to have that specific day.”

Going Full 2110

The Dante AoIP protocol is now firmly established in broadcast and its developer, Audinate, has expanded further into the market with video functionality. “Operationalizing IP and software workflows at scale” for broadcast audio is a growing trend, Principal Product Manager Will Waters said.

“New facilities are being brought about with IP at the core, with ST 2110 as the backbone,” he said. “But it’s not just about the [audio] quality, which is there on the digital side. It’s more about how you operationalize that, scale the network and ensure you don’t get into challenges because we really need a deterministic behavior in the audio, where networks are inherently packetized, so they’re meant to be agnostic to what’s flying over them. The trend I see is operationalizing that alignment with PTP [Precision Time Protocol] across the facility to get better control of multicast flows.”

Jason Waufle, global strategic business development manager at Shure, observed that Dante and AES67 are now being backhauled into ST 2110 networks as part of OB truck integrations.

“The industry consensus is a full ST 2110 or IP infrastructure is the way to go,” he said. “You’ve got significantly more flexibility, control and interoperability of audio and video and transport. Shure has been making Dante and AES67 native products for nearly a decade, and advancing our onboard DSP [to] make our array microphones essentially an edge device for a network, capturing more audio data than a single analog mic could.”

Phil Owens, senior sales engineer at Wheatstone, said there are also advancements in routing and distributing AoIP in live broadcast.

“The preferred method of operation now is to be able to grab any source or signal and send it and use it anywhere you want,” he said. “That kind of distribution and sharing is right in the IP wheelhouse, whether it’s long-distance transport with AoIP or just within the same building.” Right now, Owens added, Wheatstone is focused on server-based systems that can run in the cloud, but he sees on-prem setups as more practical. “Server-based apps can provide scalability and redundancy that bespoke hardware is lacking,” he said.

Immersion and Personalization

While AoIP is now either being implemented or planned by broadcasters across the sector, NGA, immersive formats such as Dolby Atmos and personalized audio for alternative languages and commentaries remain largely confined to streaming services.

“Amazon and Netflix are overtaking the conventional linear TV broadcasters [when it comes to] Dolby Atmos and binaural,” Struck said. “It has always been a significant challenge to make the infrastructure and complete signal chain immersive-capable. It’s much more common to the large streaming services than it is for broadcasters, where only exceptional and selected things are transmitted in immersive.”

Nikols is more positive, saying Telos is seeing immersive and personalized audio “showing up in strategic plans” for new projects and installations. “They’re no longer just nice to have,” he said. There is now a “software-first virtualization mentality” in broadcast, Nikols continued, coupled with AI-assisted workflows. “It can assist in the creative process, [through] loudness control, compliance monitoring, speech enhancement and noise reduction.”

Another assistive application is in audio mixing, with Shure and others using AI to develop systems that can handle the more mundane aspects of a production. “We came up with the capability of developing DSP that, from an AI standpoint, allows the engineers not to have to worry about the audience mix because it does it automatically,” Scott Sullivan, vice president of strategy and innovation, explained. “But the things that are more in the hands of the professionals, that’s going to be significantly down the road. Right now, I think AI is a complementary tool.”

No doubt AI in all its forms will be prominent at the 2026 NAB Show, but it could be the less-showy aspects of audio, such as the second wave of IP, that really point the way ahead for this ever-evolving sector.