One of the more astonishing audio demos I experienced at the recent AES didn’t happen on the show floor. In the W Hotel, just across 3rd Street from the Moscone Center, a company called GenAudio was showing off something they call AstoundSound, said to have the ability to deliver accurate spatial cues, including height, from a typical two-speaker audio setup.
Needless to say, I was skeptical.
In the suite, I was regaled by GenAudio chairman and CEO Jerry Mahabub, a compact dynamo of a man with a background as a child genius and an insatiable intellectual curiosity. Entering Rensselaer Polytechnic Institute at age 13, Mahabub became an integral part of the development team for magnetic resonance imaging (MRI) at Intermagnetics General Corporation, and leveraged that knowledge to decode how the human brain perceives audio spatial cues.
“So I got handed the keys to this MRI laboratory at the age of 16,” he said. “I would go in after midnight and scan my own brain using a 10 Tesla coil, which is incredibly powerful and requires a cryogenic system to bring it down to a temperature of 70 Kelvin. I’d image my brain and started noticing how different areas were lighting up in response to sounds, and I started noticing a pattern.”
Mahabub realized that if he combined that data with electrical activity data taken simultaneously (EEG), the resulting database could be analyzed numerically. Over the course of 12 years and more than 8,000 brain scans, he developed a mathematical model of how the brain perceives sound in space. This was then transferred into a set of digital audio filters that are, in essence, a totally new way of interpreting head-related transfer functions that he calls Human Brain Response Functions (HBRF).
“Essentially, HBRF is a measurement of the auditory cortex. The cochlea of the ear, where frequencies are broken down, is tonotopically mapped to the auditory cortex on the left and right side. What we did was to measure both the frequency breakdown and magnitude response, along with internal time delay. By looking at the time the areas of the brain fire up, you get an accurate measure of internal time delay. In essence, when you combine all this together, you can create a filter that gives you localization, yet keeps everything in phase.
“The next step was to learn how to control perception of azimuth and elevation,” Mahabub continued. “That involved putting a fishbowl on the head and measuring those things, then translating that into our filter set. We ended up with a set of 7337 filters that, used properly, can create accurate and predictable sound localization of any sound source, except mono, for both azimuth and elevation, plus distance and movement. It’s pretty cool stuff.”
The result of all of this R&D is AstoundSound, a technology that leverages HBRF to create extremely realistic perception of location and movement of sound. Unlike other systems, it uses an encode-only methodology. The difference is that where existing systems rely on panning, phase and multiple speakers to convey spatial cues, AstoundSound lets the listener’s brain do the work.
“That’s a pretty simple way of putting it, of course,” Mahabub said. “The real advantage is the ability to move sound around, including elevation changes. So when the sound source is moving, say in a movie or TV show, even listeners with two-speaker sound systems can perceive it. In 3-D, when the visual cue is an object pushing out of the screen toward you, AstoundSurround gives the visual effect a lot more impact, because the sound jumps off the screen with the visual effect. That’s especially important in movie theaters, many of which are still using old speaker systems. The advantage of AstoundSound is that it works with any of them: Lt/Rt, 5.1 discrete, Dolby Digital, 7.1, even straight stereo.”
The proof, of course, is in the pudding. The GenAudio team played a series of demos that had been mixed in AstoundSound and were played back through a straightforward, two-speaker stereo system. Most impressive was a CD by Kitaro. The production technique was to take the 5.1 audio straight from the SACD disc, transcode it to PCM and remix through the AstoundSound filter set to create spatial cues.
The reproduction did, in fact, sound very natural and 3-D. The soundstage had definite depth, creating the experience of the instruments being arrayed around me. But most impressive was the elevation effect. Suddenly, one instrument played a staccato run while its location was in motion, starting on the front right, distinctly high in the air (I actually looked up) and flying past me to the back of the room. It was a very distinct and realistic effect, and one that I had never heard before.
I was amazed, and I am not easily amazed. I asked to hear the track again.
The effect was equally evident no matter where I stood or sat: front and center, along the side wall, in the back of the room. The surround effect was as impressive as any 5.1 system, but the very distinct elevation change was the stunner. I’ve never heard anything so convincing, and it’s amazing to think that this was accomplished using a two-channel folddown through a standard stereo playback system.
The final steps for GenAudio are to find a practical application for this technology and get it into the market. “Our objective was to enable professional audio mix engineers to interact with the filter set in a way that didn’t inhibit their workflow,” Mahabub said. “Our first product is a set of plug-ins called Professional Audio Tools, which we are offering for licensing in two implementations: one is mono in, anything out; the other is stereo in, anything out. This is where we get to AstoundSound, which can be integrated directly into the surround mix and turns it into surround sound on steroids.”
In practice, AstoundSound fits easily into existing workflows and is fully compatible with all encoding and delivery technologies, including Dolby, DTS and SDDS. The plug-ins control the filter set and work equally well in any production context, including Final Cut Pro, Pro Tools and conventional console mix sessions.
GenAudio is promoting AstoundSound through its West L.A. recording studio, where demos can be created and played back. Recent projects included a remix of six songs for Interscope artist Robin Thicke and a repurposing of the “Hellboy II” DVD (Universal Pictures) for home theater release.
Overall, perhaps the most interesting aspect of AstoundSound is the fact that it does not require more than two speakers for effective surround playback and seems equally effective regardless of the size and shape of the room. “To me, that really shows how advanced this technology is,” Mahabub said. “AstoundSound will literally work in any space. Why? Because it’s based on the brain and the way it works. Present the proper audio cues, and the human brain knows what to do. That’s where the real processing happens, and that’s what we’ve tapped into.”