Immersive Content and VR for Audio at Avid Connect 2018

Since 2013, my first NAB Show stop has been at Avid Connect. This year I was extremely intrigued to see how the event would differ from previous years, since Jeff Rosica had taken the helm of the company following the still-fresh ouster of the previous CEO — who originated the event — along with the Avid Customer Association. It didn’t take long to find out.

Avid Connect has always tended to be product-heavy from start to finish, but this one simply did not feel that way. There was plenty of discussion of Media Composer, Pro Tools and the Media Central Platform — Avid’s business is to make and sell products after all — but the overall event, and the breakout sessions in particular, were far more interesting and educational than prior years, covering a wide range of topics that were not specific to the company and its products.

TECH TRENDS

One surprisingly engrossing session was “Technology Trends Musicians Can Utilize for Tracking Royalties.” It covered, among other topics, the pros and cons of streaming.

Panelist Jordan “DJ Swivel” Young made the case that, for those who have a dedicated fan base, streaming can actually be a greater source of revenue than CD or song purchases because the artist gets paid every time the song is played, not just when a purchase is made.

“Technology Trends” panel from left: Ray Thompson (moderator), Jordan “DJ Swivel” Young, Henry Frecon, Benji Rogers.

A different technological solution to the issue was presented by Benji Rogers, CEO of Dot Blockchain Media, who claims that current audio file formats are antiquated — WAV is 27 years old; MP3 is 24; and FLAC is 17 — and are part of the problem because they don’t contain persistent metadata.

Rogers proposes a new format and wrapper with blockchain as part of the fundamental framework so that ownership can be tracked from inception through all iterations, sampling and reuse.

Whether or not you’re an artist trying to get paid, the ability to encode persistent, unmodifiable metadata will soon become an essential element of every single piece of media as the ability to replace original audio and video content with completely believable digital substitutes becomes even more mainstream than it already is.

There were several sessions on immersive content and VR, both key elements of Next Gen Audio, and these sessions were far more informative than any I attended at NAB.

“Creating Spectacular Experiences with Dolby Atmos” panel from left: Curt Behlmer (moderator), Ron Bartlett, Doug Hemphill, Tim Hoogenakker.

The “Creating Spectacular Experiences With Dolby Atmos” panel included re-recording engineers Doug Hemphill, Ron Bartlett, and Tim Hoogenakker. During the discussion on mixing immersive content, Hemphill pointed out that as sound density increases, imaging and intelligibility decrease, so films with busier soundscapes are more difficult to mix and it becomes harder to make individual sounds stand out. In fact, he sometimes pulls back the density of the sound mix when there are really intense visuals to give people’s brains a break.

When asked what it was like to be able to place sound in so many locations, Bartlett said that they went a little overboard with the overhead speakers at first, but realized that this actually made the film sound more mono and less immersive, so they now use the overheads more judiciously.

Everyone felt that the soundscape in Atmos is more akin to listening in the natural world than standard surround mixes, though every so often the choices made in the surround mix are the right ones compared to the Atmos mix.

While they do verify downmixes as they mix, Hoogenakker said that he specifically checks television downmixes on a soundbar since more people than ever are listening on them at home.

Something that helps keep costs down while speeding up workflow is being able to work in 7.1.4 in smaller rooms to prepare audio for the Atmos mix room. Hemphill brought us all back to reality by reminding us that even though the ability to create amazing immersive soundscapes is now in everyone’s grasp, dialog remains king. Always.

CHALLENGES OF AUDIO IN VR

In the first session on audio for VR, “Techniques for Mixing Audio in VR Content,” the panelists covered aspects of VR production I had not even contemplated. Benedict Green of Ecco VR stressed the importance of preproduction and the fact that, when shooting content, the crew, recording equipment, cables and anything else that might take people out of the experience must be concealed, which is a tad difficult when shooting live actors while trying to capture and play back audio.

Green and Varuna Nair of Facebook both noted that ambisonics is the preferred format for VR audio and that different tools exist for working with different orders of ambisonics.

They also discussed some of the challenges of working with audio for VR, including never being able to use a boom mic on set and the difficulty of using ambisonics microphones, which cannot move without changing the soundfield. Managing loudness and keeping audio from clipping is a huge challenge, as is working with non-audio people who may reorder or mangle the ambisonics tracks. According to Nair, when thinking about VR, audio engineers should “think of it as another tool in the toolbox.”
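For readers wondering why track order matters so much, here is a minimal Python sketch that was not part of the session. It relies on two widely documented conventions: a full-sphere ambisonics stream of order N carries (N + 1)² channels, and first-order material comes in two common layouts, FuMa (W, X, Y, Z) and AmbiX (W, Y, Z, X, with W 3 dB hotter than in FuMa). The function names are hypothetical and chosen only for illustration.

```python
import math

def ambisonic_channel_count(order: int) -> int:
    """Channels in a full-sphere ambisonics stream of the given order: (N + 1)^2."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2

# First-order channel layouts for the two common conventions.
FUMA_LAYOUT = ["W", "X", "Y", "Z"]    # Furse-Malham ordering
AMBIX_LAYOUT = ["W", "Y", "Z", "X"]   # ACN ordering, as used by AmbiX

def fuma_to_ambix_first_order(frame):
    """Convert one first-order FuMa sample frame (W, X, Y, Z) to AmbiX.

    Hypothetical helper: it reorders the channels to ACN order and
    raises W by sqrt(2), since FuMa stores W 3 dB lower than SN3D does.
    Higher orders need additional per-channel gains not shown here.
    """
    if len(frame) != 4:
        raise ValueError("a first-order frame has exactly 4 channels")
    w, x, y, z = frame
    return [w * math.sqrt(2.0), y, z, x]
```

If someone drags the four tracks into a different order in an editing timeline, the decoder will still treat them as if they were in the expected layout, and the soundfield rotates or collapses accordingly. That is the kind of mangling the panelists described.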

In the session “Creating a Compelling Immersive Mix for VR Content,” Scott Gershin made the case that realism is not always the best approach, since the purpose of sound is to help tell the story in whatever way necessary.

Like Green and Nair, he also mentioned things I had not really considered. For instance, if you move a sound in the soundscape, people will pay attention to it, so don’t move it if you don’t want it noticed. Gershin does not move backgrounds, but he does group and move all sounds for given characters together to the same place in the soundscape because that is where the character exists at that moment.

Since we don’t really hear things behind us, he feels that a 270-degree soundfield is wide enough. As for where this is all heading, Gershin thinks that VR will be commonplace in five years and boring in 10, but that augmented reality is really where we’re heading anyway.

Jay Yeary is a television engineer who specializes in audio. He is a member of AES, SBE, SMPTE and TAB. He can be contacted through TV Technology magazine or at transientaudiolabs.com.
