Mixing 5.1 surround audio for HDTV production is more than just moving faders.
Together with my business partner Ian Rosam, I was busy most weeks this past year mixing live sports for 5.1 broadcast. In the summer, we were heavily involved in broadcast-quality control for the London Games, and for the ninth year running, I have also been responsible in the autumn for the weekly live broadcast mix of Simon Cowell’s “The X Factor” on UK network ITV. “The X Factor” is produced by talkbackTHAMES (part of the FremantleMedia Group) and SyCo Tv. The program franchise is licensed by FremantleMedia Enterprises and is produced in 25 territories around the world.
The purpose of this article is to discuss some of the surround mixing techniques I have developed through this work and other projects. Relatively speaking, though, the fader-moving part of live mixing for broadcast is often the easy part of the process; the more involved part by far is the planning and preparation.
Carefully thought out mic placement, system and signal path design, and attention to mixer layout are essential prerequisites for any 21st-century live broadcast mix. Time permitting, there’s no substitute for sitting down with a pen and paper before a job and doing a bit of planning. What does the client want? What can I deliver, and does it match that? How much time will I get for troubleshooting?
If possible, time should always be set aside to carry out tests on the mics, checking, for example, that they’re all equally sensitive and adjusting input gains if they aren’t, and looking for crackles or other artifacts. Even physical tests are helpful, like shaking a windshield to make sure that a mic hasn’t fallen out of its mount.
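To make the sensitivity check concrete, here is a minimal Python sketch of how matching input gains could be worked out once each mic has been measured against the same source. The mic names, the measured levels and the helper names `rms_dbfs` and `gain_trims` are hypothetical, chosen only for illustration; in practice this check is usually done by ear and by meter at the desk.

```python
import math

def rms_dbfs(samples):
    """RMS level of a block of float samples (full scale = 1.0), in dBFS."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square <= 0.0:
        return float("-inf")
    return 20.0 * math.log10(math.sqrt(mean_square))

def gain_trims(measured_dbfs, reference):
    """Trim in dB that brings each mic's measured level to the reference."""
    return {name: round(reference - level, 1)
            for name, level in measured_dbfs.items()}

# Hypothetical readings taken with the same source in front of each mic.
levels = {"crowd_left": -20.0, "crowd_right": -18.5, "goal_north": -23.0}
trims = gain_trims(levels, reference=-20.0)
# crowd_right is 1.5 dB hot and needs pulling down; goal_north needs 3 dB more.
```

A mic that needs a much larger trim than its neighbors is worth a physical inspection before the gain is simply turned up.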
It might sound old-fashioned, but sending test tones through the groups and paths of the system can often flush out problems. For example, if a side-chain has not been properly set up, you might find that your compressors aren’t working as expected. If you have a synchronization problem (and they still happen all the time with digital cameras and systems), a simple clapperboard test makes it immediately apparent. Ideally, all of this should be done before you even think about mixing.
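A line-up tone is easy to generate and check in software. The Python sketch below assumes a 1 kHz tone at -18 dBFS and a 48 kHz sample rate; these are common line-up values but are not stated in this article, and the function names are hypothetical. It compares what comes back from a path with what was sent, so an unexpected gain change (for instance, a compressor acting when it should not) shows up as a number.

```python
import math

def sine_tone(freq_hz=1000.0, level_dbfs=-18.0, sample_rate=48000, seconds=1.0):
    """Generate a sine tone as float samples (full scale = 1.0)."""
    amplitude = 10.0 ** (level_dbfs / 20.0)  # peak amplitude
    count = int(sample_rate * seconds)
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(count)]

def peak_dbfs(samples):
    """Peak level of a block of samples in dBFS."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(peak) if peak > 0.0 else float("-inf")

def path_gain_db(sent, returned):
    """Gain through a signal path, from the peak levels at each end."""
    return peak_dbfs(returned) - peak_dbfs(sent)

tone = sine_tone()
# Stand-in for a real path: here the "path" simply halves the signal.
returned = [0.5 * s for s in tone]
gain = path_gain_db(tone, returned)  # roughly -6 dB
```

A path that should be at unity gain but reports several dB of loss is a strong hint that something in the chain is not set up as intended.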
Sports vs. light entertainment
Although rigging a stadium for a World Cup game and placing mics in a live audience for a light entertainment show might seem different tasks, they share many common aspects. In both situations, we create our coverage so that it’s 5.1/stereo-compatible, as far as possible; we have to ensure that downmixed stereo from our 5.1 mixes will still work for viewers on older stereo or even mono equipment.
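The compatibility requirement can be illustrated with a small Python sketch of a 5.1-to-stereo fold-down. It assumes the widely used ITU-R BS.775 style coefficients (center and surrounds added at about -3 dB) and drops the LFE; the coefficients a given broadcaster or desk applies may differ, and the function names are hypothetical. The two examples at the end show why polarity matters: ambience that is out of polarity between left and right vanishes for a mono listener.

```python
import math

MINUS_3_DB = math.sqrt(0.5)  # about 0.7071

def downmix_to_stereo(l, r, c, lfe, ls, rs,
                      center_gain=MINUS_3_DB, surround_gain=MINUS_3_DB):
    """Fold one 5.1 sample frame down to (Lo, Ro). The LFE is dropped (assumption)."""
    lo = l + center_gain * c + surround_gain * ls
    ro = r + center_gain * c + surround_gain * rs
    return lo, ro

def downmix_to_mono(lo, ro):
    """Sum a stereo frame to mono at half gain per side."""
    return 0.5 * (lo + ro)

# Coherent ambience: same polarity in both front channels survives in mono.
in_phase = downmix_to_mono(*downmix_to_stereo(0.3, 0.3, 0.0, 0.0, 0.0, 0.0))

# Out-of-polarity ambience cancels completely in mono.
out_of_phase = downmix_to_mono(*downmix_to_stereo(0.3, -0.3, 0.0, 0.0, 0.0, 0.0))
```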
Our approach is similar for both types of work: We use a SoundField DSF-2 surround mic mounted high above the crowd or audience to give us the basic ambience of the location in a phase-coherent, downmix-compatible form — we say it’s like the glue that holds the rest of the sound together — and then we supplement that with spot mics. In sports, the exact number varies by the type of game we’re covering. If we’re doing a particularly important match, we might decide to put an extra stereo pair of 416s at each end looking at the crowd from behind the goals, to better reflect how detailed we think the visual coverage is going to be. On “The X Factor,” we have more spot microphones so we can capture more close-up detail, like individual laughs.
There are plenty of other ways to achieve similar coverage. You can put four discrete mics in a stadium and get a 5.1 effect. A lot of people like that approach; you will certainly be able to hear different sources coming out of different speakers. But overall, we find that 5.1 with a strong element of coherence is a more pleasant thing to listen to. In evolutionary terms, we’re programmed to respond to sudden audio impulses from the rear because that’s a direction we can be attacked from more easily. That’s why films use the rear channels for sounds that are designed to scare or unsettle the viewer. When your audio features constant rear-channel sounds, it becomes uncomfortable to listen to after a while.
Also, in a domestic environment, people don’t listen near the middle of the speaker array, in the “sweet spot.” In a typical European living room, people tend to sit towards the rear channels; that’s where the sofa usually resides. So our view is that you don’t need to use the rear channels much; they should really be there to give you a sense of the space in which the event is happening.
Another common aspect is the audience or crowd. From a mixing perspective, the challenge is the huge dynamic range of a crowd. On a football pitch, you have to capture anything from a few shouts and the odd bit of gentle applause right up to 60,000 people screaming their heads off because someone’s just scored in the last minute of the match. The same is true in light entertainment, so you have to have a good compression strategy.
Miking the crowd
Of course, there are many mixing challenges unique to each type of work. In a stadium, the elements make a big difference. If it’s a humid day, with lots of water in the air, or if it’s foggy, that noticeably changes the high-frequency content of the sound we’re capturing, and we EQ the mics to compensate. For sports, there’s also debate as to where to put the commentary in a 5.1 mix. Do you spread it across the front three channels to create a phantom center, or use the dedicated center channel only? We choose the latter. From a post-production point of view, and also if you’re doing world feeds, it’s convenient to be able to remove the center channel, and thus the commentator in the country of origin, and replace it with something else. In the center channel, behind the commentary, we also put some effects from the 5.1. A commentator’s mic is often cut off completely when he or she is not talking, and it does sound odd if the center channel suddenly goes dead.
Mixing live football is also less predictable than light entertainment. Obviously, you never know where a football is going to be kicked, so you’re always chasing it around the pitch, crossfading from one pitch mic or stereo pair to the next, and trying to maintain a consistent level as you fade from one mic channel to the next. You soon learn to pull channels down pretty quickly once the ball is kicked, as players often swear profusely after taking a shot!
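On a live desk this crossfading is done by hand, but the idea of holding a consistent level through the fade can be sketched in Python with an equal-power law. This is an illustrative assumption, not a description of how any particular console behaves: it suits largely uncorrelated sources such as two pitch mics some distance apart, and the function names are hypothetical.

```python
import math

def equal_power_gains(position):
    """Gains for (outgoing, incoming) mics; position runs from 0.0 to 1.0."""
    position = min(1.0, max(0.0, position))
    angle = position * math.pi / 2.0
    return math.cos(angle), math.sin(angle)

def crossfade(outgoing, incoming, position):
    """Mix two equal-length sample blocks at the given fade position."""
    g_out, g_in = equal_power_gains(position)
    return [g_out * a + g_in * b for a, b in zip(outgoing, incoming)]

# Halfway through the fade both mics sit at about 0.707 (-3 dB), so the
# summed power of two uncorrelated sources stays roughly constant.
g_out, g_in = equal_power_gains(0.5)
```

With a plain linear fade both gains would be 0.5 at the midpoint, which gives an audible dip of about 3 dB in the middle of the move for uncorrelated sources.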
Singer’s microphone technique
The technical challenges with a show like “The X Factor” are different. You have people with a wide range of abilities singing into microphones, some with good technique and others with none. On the night of the show, they can be twice as loud as they were in rehearsal because they’re excited or half as loud because they’re nervous.
I now mix “The X Factor” on a Calrec Apollo. Before that, we needed so many simultaneous inputs that I used multiple consoles. Somewhere, there’s a picture of me working with five sub-desks. With more than 1,000 inputs, I had to find a way of laying everything out that covers all the unexpected things that can happen during a live show. It’s down to planning again.
I’ve evolved certain standard ways of organizing things over time. There’s a lot of layer capability in the console, up to 12, and A and B options in every layer. One of the first things I did was decide not to use any of the B layers, so I asked the manufacturer to write a software switch for me. I asked for locking as well, so that certain key presenter channels remained on certain faders irrespective of the active layer. I also asked for them to implement layer faders, where you put the outputs of everything on one layer through a single fader. If you’ve got a noisy mic on a layer somewhere and you need to take it out in a hurry, there’s no time to worry about how to access it. You need a single fader you can reach for. Layer 6 is always where the master faders are for the groups, and I have some emergency mics on hand on layer 4, for example, in case the radio mics break down, or I have to use cabled mics because the radio rack has died.
Compression and limiting are important on “The X Factor” because no one wants the large dynamic range of the audience on an in-your-face pop show like this. I use a tried-and-trusted external compressor/limiter across a group that I make up of all the individual audience mics, excluding the surround microphone. All of the spot mics are compressed separately, and then I add the surround microphone on top, and I ride the gain on the composite. Some fairly severe limiting goes on for the applause, and compression too, at a 3:1 or 4:1 ratio.
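To show what a 3:1 ratio followed by a limiter does to audience levels, here is a minimal Python sketch of the static input/output curve. The threshold and ceiling values are hypothetical, as are the function names; the sketch ignores attack and release times and does not model the external unit used on the show.

```python
def compressed_level_db(input_db, threshold_db, ratio):
    """Output level of a hard-knee compressor for a steady input level."""
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio

def limited_level_db(level_db, ceiling_db):
    """A limiter simply refuses to let the level exceed the ceiling."""
    return min(level_db, ceiling_db)

def audience_chain_db(input_db, threshold_db=-24.0, ratio=3.0, ceiling_db=-18.0):
    """Compressor followed by limiter (hypothetical settings)."""
    compressed = compressed_level_db(input_db, threshold_db, ratio)
    return limited_level_db(compressed, ceiling_db)

quiet = audience_chain_db(-30.0)     # below threshold, passes unchanged
cheering = audience_chain_db(-12.0)  # 12 dB over threshold becomes 4 dB over
roar = audience_chain_db(0.0)        # compressor gives -16 dB, limiter holds -18 dB
```

With these example settings, 30 dB of input range between the quiet and loudest cases is squeezed into 12 dB at the output.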
I use the desk’s built-in dynamics processing for control of the final 5.1 mix, but by that stage, the really unpredictable element, the audience, has been dealt with by the external compressor/limiter. A downmix is done automatically within the desk to stereo for live transmission. I also generate auxiliary feeds for the foldback desks and PA desks, which are separate, and a multichannel MADI stream of the week’s backing tracks and vocals for another team to remix and send to iTunes. Within a few minutes of the show’s end, the tracks can be sold online.
Plan for breakdowns
Be sure to plan redundant systems into your setup. Our SpotOn systems have never crashed on-air, but just in case, we mirror them so that when we trigger one, a second one is triggered in sync. We even have a third in reserve in case a mains bump takes out the two primaries. Other people might say that’s overkill; to me, it’s just job preservation. Similarly, with the microphones, we have lots of spares, and the presenters and judges have dedicated spares.
I don’t worry about mixing the show live any more. It’s live TV, so something can always go wrong, but I hope that we’ve done enough planning to get us out of most things that could go wrong.
—Robert Edwards is Sound Director at Video Sound Services Ltd, a UK-based broadcast sound and mixing consultancy providing services to international broadcasters including BSkyB & HBS. He has been the sound supervisor on ITV’s “The X Factor” since the show’s inception in 2004.