Future of TV Audio: Looking Into the Crystal Ball

Dennis Baxter is the new columnist for Inside Audio

The beginning of a new year is a good time to reflect on the past and look down the road to the future. Jay Yeary, my friend and colleague, is taking some time off. I will attempt to fill his big shoes by continuing to present concepts and ideas about audio for broadcast and broadband and how to navigate through the sound minefield to success. TV Technology is known for presenting the latest in technology and innovation, and I plan to continue Jay’s excellent work with a yearlong exploration of the technology and production practices that will make possible the transition into a new world of sound.

(Image credit: Thinkstock/allanswart)

Technology makes possible advances in production practices and value. For example, advances in technology resulted in stereo and surround sound, and today we have the technology to transmit between 16 and 24 full-fidelity audio channels, and many more channels by adjusting the encoding parameters.

But now the question is: what will we do with all those channels of audio? Immersive sound and interactivity come quickly to mind.


For several years we have all read in TV Technology and other publications about ATSC 3.0, a set of voluntary broadcast standards for the production of advanced audio and video, and at least one major U.S. network, NBC, has been moving forward with testing immersive audio.

NBC has successfully completed the audio testing by capturing and producing immersive sound using Olympic and college football content and distributing the production through its partner DirecTV. At an AES presentation, I heard a senior-level NBC audio executive acknowledge it was better than he expected.

But there can be confusion. There have been a variety of audio papers and publications espousing a multiplicity of opinions about what immersive sound is, so it is beneficial to hypothesize a set of discussion parameters. Fundamentally, immersive sound includes some sound reproduction above the listener: either two front speakers for height, denoted as 5.1.2, or four speakers for height, represented as 5.1.4. The 5.1 is the typical surround sound speaker setup and the .2 or .4 is the number of sound source channels above the listener. I say “sound source channels” because the sound above the final listener will likely be from side-firing or up-firing speakers or soundbars.
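To make the notation concrete, here is a minimal Python sketch that splits a layout label into its groups and totals the channels. The function name `parse_layout` and the dictionary keys are illustrative assumptions, not part of any standard or product; the only thing taken from the column is the reading of 5.1.2 and 5.1.4 as ear-level speakers, one low-frequency channel, and two or four height channels.

```python
def parse_layout(layout: str) -> dict:
    """Split a layout label such as '5.1.4' into its channel groups.

    Hypothetical helper for illustration: the first number is the
    ear-level speaker count, the second the low-frequency (subwoofer)
    channel count, and the optional third the height channel count.
    """
    parts = layout.split(".")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected layout label: {layout!r}")
    counts = [int(p) for p in parts]
    ear_level, lfe = counts[0], counts[1]
    height = counts[2] if len(counts) == 3 else 0  # plain 5.1 has no height layer
    return {
        "ear_level": ear_level,
        "lfe": lfe,
        "height": height,
        "total": ear_level + lfe + height,
    }


if __name__ == "__main__":
    for label in ("5.1", "5.1.2", "5.1.4"):
        print(label, parse_layout(label))
```

Read this way, moving from 5.1 to 5.1.4 means the mix carries four more sound source channels than conventional surround, which is the gap the production workflow has to fill.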

The question now is how does the audio practitioner get from 5.1 surround sound production to some sort of immersive sound?

There is little information and practical experience on how to produce immersive sound live, and I plan to share the experience of my fellow sound designers and mixers as well as my own practice producing immersive sound in my 11.1 Genelec studio over the last six years. I assure you the solution is more than a “black box.”

Over a series of columns, I will examine the technology that will make immersive sound possible for the audio mixer, sound designer and content producer. Additionally, we will discuss production practices and possibilities to inform the audio practitioners who have to craft immersive sound.

In the last decade, audio has seen the development and implementation of digital infrastructure technology, with digital networks using MADI and IP, and it seems that IP systems have touched virtually every signal path and flow that an audio person deals with. This may be the death of analog audio except for the conversion of sound with transducers such as microphones and speakers.

Microphone reviews show up in the pages of TV Technology, but I plan to look at new approaches to conventional analog capsule microphones, such as MEMS-based technology and computer-controlled microphones that can beamform into a variety of patterns.

All the microphone manufacturers are experimenting with MEMS technology, but I suggest they get away from conventional microphone design. A MEMS-based shotgun microphone would probably improve durability, but I see new capture methods using beam steering as the future. Now marry those technologies. I have listened to a 19-capsule MEMS microphone and was surprised at the cost and sonic performance. We will discuss these topics later this year.
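For readers who want a feel for what beam steering means in practice, the sketch below shows delay-and-sum, one of the simplest textbook beamforming techniques. It is an illustration under stated assumptions, not a description of how any particular MEMS microphone works: the evenly spaced line array, the capsule spacing, the function names and the whole-sample delays are all hypothetical choices made to keep the example short.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air at room temperature


def steering_delays(num_capsules, spacing_m, angle_deg):
    """Per-capsule delays in seconds for a hypothetical evenly spaced line array.

    The angle is measured from broadside (0 = straight ahead). A positive
    angle points toward the higher-numbered capsules, which hear the sound
    first and therefore need the most delay to line up with the others.
    """
    theta = math.radians(angle_deg)
    raw = [i * spacing_m * math.sin(theta) / SPEED_OF_SOUND_M_S
           for i in range(num_capsules)]
    offset = min(raw)  # shift so that no delay is negative
    return [d - offset for d in raw]


def delay_and_sum(signals, delays_s, sample_rate):
    """Delay each capsule signal by a whole number of samples, then average."""
    shifts = [round(d * sample_rate) for d in delays_s]
    length = len(signals[0])
    out = [0.0] * length
    for sig, shift in zip(signals, shifts):
        for n in range(shift, length):
            out[n] += sig[n - shift]
    return [v / len(signals) for v in out]


if __name__ == "__main__":
    # Assumed values: 4 capsules, 10 cm apart, steered 30 degrees off axis.
    delays = steering_delays(4, 0.10, 30.0)
    print([round(d * 48000) for d in delays])  # delays expressed in samples at 48 kHz
```

Sound arriving from the steered direction adds up in phase after the delays, while sound from other directions is partially cancelled. Changing `angle_deg` re-aims the pickup without moving the microphone, which is the appeal of a computer-controlled array.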


Fundamental to immersive sound production is vertical panning. The mixing consoles used for live broadcast and broadband are different from the virtual mixing desks of the post-production world. Mixing immersive sound in a studio is firmly entrenched in film and drama because vertical panning is possible with Dolby Atmos, Auro 3D, THX and many other lesser-known (cheaper) programs that have vertical panners. However, 3D panners for live production of sports and entertainment have only recently seen the light of day, with 2018-2019 offerings from SSL, Calrec, LAWO and Stagetec, to name several. This year I will take the time to explore the possibilities and requirements of next generation mixing consoles.

I believe if we sound practitioners/mixers can hear it, we can mix it. Immersive sound monitoring in the confined space of the OB van is going to be a significant challenge. Test productions have been conducted in some existing OB vans, and OB suppliers and designers are considering ways to accommodate immersive sound. Additionally, speaker manufacturers such as Genelec have a range of speaker sizes that seem to be a temporary solution. I would like to suggest that speaker manufacturers come up with alternative designs that would fit better in small spaces. I suggest there could be mathematically placed speakers to create some sort of virtual sound reproduction of any format from stereo to 11.1 or even a 22.1 immersive soundfield that would reside in the OB van. Master control would monitor in a proper acoustical environment with a variety of speaker configurations, including soundbars, and advise accordingly. Remember, essentially all consumers will listen on some configuration/variation of a soundbar, making a virtual sound reproduction platform at the capture point useful.


Immersive sound is coming quicker than expected because there are standards, encoders and processors, but most importantly, the consumer can experience a better sound product over soundbars and headphones. Reproduction quality is only getting better. Is it perfect immersive sound? Good question, but probably not. Does it sound better to the consumer? Probably. But the consumer will get the ear of the content producers and drive the adoption of immersive sound.

For the audio practitioner the question is how do we produce sound for the future? Practice, test, learn and question.

The question that concerns me the most is how does the audio practitioner design sound for a wide range of playback options? Last month’s CES showed that there are plenty of consumer playback options… but soundbars are at the top.

Audio production will suffer from a range of reproduction options which are probably not capable of precisely reproducing the sound designer’s aural vision. The consolation is that it probably sounds better than what the average consumer was listening to and that is the starting point.

Over the course of the next year, we will take a look into that crystal ball for the future of audio, and most importantly, how you can best prepare for it.

Dennis Baxter has spent more than 30 years as a sound designer. He is the author of “A Practical Guide to Television Sound Engineering,” and is working on a book about immersive sound practices and production. He can be reached at dbaxter@dennisbaxtersound.com.

Dennis Baxter

Dennis Baxter has spent over 35 years in live broadcasting contributing to hundreds of live events including sound design for nine Olympic Games. He has earned multiple Emmy Awards and is the author of “A Practical Guide to Television Sound Engineering,” published in both English and Chinese. His current book about immersive sound practices and production will be available in 2022. He can be reached at dbaxter@dennisbaxtersound.com or at www.dennisbaxtersound.com.