HD Tips & Techniques: Building an Audio Infrastructure for HD


I have often heard it jokingly remarked that “HD” must stand for “Hard-to-Do.” Inevitably this comes from people who have just been bitten by one of the numerous challenges or problems posed by migrating to the HD world.

While it’s certainly true that HD is theoretically complicated, my experience is that these newly frustrated professionals have likely been thwarted by the practical difficulties involved in doing seemingly “easy” things. Invariably, these practical difficulties have a lot to do with trying to make television with a given set of equipment in a technical environment.

My starting point for this article, then, is the suggestion that facilities design is more crucial to the successful creation of television than it was in the old NTSC world. Given that most of us on this continent have spent our careers working within the narrow and comfortable limits of the NTSC format (and its digital cousin, SMPTE 259M-C) with stereo audio, the almost overwhelming flexibility and complexity of the ATSC specification inescapably leads to frustration. This frustration isn’t specific to either audio or video. It can involve either discipline, but for my money, it’s usually the notion of multichannel audio that causes everything to grind to a halt.

I don’t mean to suggest that video is easy. There is a lot of choice when it comes to planning the video side of the equation: video formats, frame rates, and codecs. All of these things influence how pictures get made and delivered in HD, but the difference between any one choice and another is quite slender for the folks down in the trenches. The process by which they execute the picture portion of a TV show remains virtually the same, regardless of technical choices.


Production/operations personnel and engineering staff can be equally affected by these frustrations, but the issue can rapidly become a crisis when these two groups are learning and growing independently.

New technology and a deluge of new buzzwords aside, the day-to-day struggle between what is possible and what is practical is still the cornerstone of most television production. The difference in the HD world is that we are frequently confronted by a third variable that can be described by the question “What is allowable?”

This question has practical ramifications in both the strategic and the tactical sense. The strategic question concerns the expectations for HD in terms of production quality and whether a given broadcaster is going to make demands, like requiring that all HD programming be delivered in 5.1 Surround. The tactical question concerns the engineering response to a broadcaster’s strategic vision (i.e., the demands of production). This is one of the key questions that need to be answered when beginning to design an HD broadcast infrastructure.

There are countless ways to build an audio infrastructure for HD. Embedded HD-SDI, discrete AES3, Dolby E, VANC-embedding, metadata, loudness management, Dolby AC-3... the minefield of issues that engineering staff need to navigate is challenging, to say the least.

No solution or approach is necessarily better than the next, save for the crucial fact that the long lead times needed to design and build HD plants mean that engineering decisions are generally taken prior to production decisions. This simple fact creates a situation where engineering staff often need to guess what the production people are ultimately going to want or need.

Think of it this way: the engineering folks often can’t imagine why a certain thing might be desirable, and the production folks probably don’t have a clear enough grasp of what should be possible and therefore have no means to ask for what they want in advance.

The result is that production and operations often have to make do with what they’re given. In a sense, the scope and capabilities of an HDTV facility tend to represent handcuffs on production and operations. This is because the outer limits of what is possible have been predetermined by engineering designs.

Let me explain.


Imagine a typical news control room that is being rebuilt for HD. Let’s also imagine that this control room has six VTRs (or server channels), four channels of CG (Deko, Chyron, etc.) and a fairly common complement of three music playbacks (Digicarts, etc.), in addition to the usual fare of studio microphones and receive lines. Confronted by the task of rebuilding the audio infrastructure for this room, the engineering team is immediately faced with a dramatic increase in the possible number of inputs needed by the audio console.

In the old SD control room, each of these devices (VTRs, CGs, Digicarts) would typically only be wired for two channels. If the engineering team is looking at a 40-input console that needs to be replaced, and if each of those two-channel sources could conceivably be replaced by 6-, 8-, or even 12-channel HD equivalents, that simple little 40-input desk needs to be replaced by a 200+ input monster!
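The arithmetic behind that jump can be sketched in a few lines of Python. This is only a rough illustration: it assumes, for simplicity, that all 40 inputs on the old desk belong to two-channel sources (20 sources in total), and the function name console_inputs is made up for this example.

```python
def console_inputs(sources, channels_per_source):
    """Total console inputs needed if every source carries the same channel count."""
    return sources * channels_per_source

# Assumption: the old 40-input desk is fed entirely by two-channel sources.
stereo_sources = 40 // 2

for channels in (2, 6, 8, 12):
    total = console_inputs(stereo_sources, channels)
    print(f"{channels:>2} channels per source: {total} inputs")
```

Under that assumption, 6-channel sources call for 120 inputs, 8-channel sources for 160, and 12-channel sources for 240, which is where the 200+ figure comes from. A real desk also carries mono sources such as microphones, so the actual total would land somewhat lower.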

Do you want to guess how the average engineer deals with this conundrum? He decides that not all of the inputs are needed!

In the realm of seemingly innocuous engineering decisions, something as simple as deciding that only two of the six VTRs need to be able to play out multichannel audio is a major problem waiting to happen. In the old control room, production could structure their shows such that all VTRs were identical in their performance. Tape traffic in the old days was a matter of personal preference and the needs of a given show.

But, in this new environment, if engineering has predetermined that only two of the machines can play back surround material, production has a significant limitation on how they can structure and execute their show.
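To make that limitation concrete, here is a small Python sketch that checks a show rundown against the wiring. Everything in it is hypothetical: the machine names, the rundown items, and the helper unplayable_items are invented for illustration and assume a plant where only VTR1 and VTR2 are wired for multichannel audio.

```python
# Hypothetical plant: six VTRs, only two of them wired for multichannel audio.
SURROUND_CAPABLE = {"VTR1", "VTR2"}

def unplayable_items(rundown):
    """Return titles of 5.1 items assigned to a machine wired only for stereo."""
    return [
        title
        for title, audio, vtr in rundown
        if audio == "5.1" and vtr not in SURROUND_CAPABLE
    ]

# Hypothetical rundown: (item title, audio format, assigned machine).
rundown = [
    ("Cold open", "5.1", "VTR1"),
    ("Top story", "stereo", "VTR3"),
    ("Feature", "5.1", "VTR4"),
    ("Sports package", "5.1", "VTR2"),
    ("Closer", "5.1", "VTR5"),
]

print(unplayable_items(rundown))
```

In this made-up rundown, the "Feature" and "Closer" items are flagged, because they carry 5.1 audio but sit on machines wired only for stereo. Production would have to reshuffle its tape traffic around the two capable machines, which is exactly the kind of constraint the old control room never imposed.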


The list of potential problems in this area precisely matches the number of times engineering makes a format-limiting decision. Deciding that multichannel-capable music playback devices aren’t necessary because “all of their music is in stereo anyways” starts to be a real difficulty when production decides to have fancy new 5.1 theme music commissioned.

Not planning for, and wiring for, the audio I/O on the groovy new HD character generator because “the font never made noise before” becomes a problem when the sound designer builds 5.1 sound effects to accompany the animated fonts, and there’s suddenly no way to accommodate them. As you might imagine, this list of “oh, by the ways” is quite lengthy and accounts for most of the “Hard-to-Do” sentiment.

The solution? It’s easy and hard all at the same time. The simplest way to build an HD infrastructure that is truly future-proof (to say nothing of production-proof) is to imagine that everything and anything that was two-channel is going to be replaced by a multichannel equivalent.

In other words, make no assumptions about how your new facility will ultimately be used. A VTR that can handle 12 channels of audio is perfectly happy to deal with just stereo, and these days, an audio console that has 200 inputs is no more unwieldy than one that only has 80 inputs.

It’s undoubtedly a hard thing to sell at budget time, but doing the minimum now almost certainly means you’ll be doing it again soon, and usually at a higher total cost. Building an infrastructure that is audio-format agnostic is simply the best decision you can take. It will seem a little absurd at times, but I guarantee it will pay dividends when producers and directors start prowling around asking for the world.