Audio compression in 2004

The use of digital audio data compression and bit-rate reduction technologies is diverging. On the one hand, there are the proponents of psychoacoustic techniques, who squeeze stereo audio over cellular and POTS lines at compression ratios approaching 20:1. On the other hand, there are the proponents of traditional PCM replacement technology, who use Adaptive Differential Pulse Code Modulation (ADPCM) techniques with modest compression ratios.

But why settle for 16-bit PCM quality when the recording/professional market has moved on to 24-bit technology? If recorded audio is the starting point, and if so much care is being taken at source to ensure the best possible quality, then why accept any audio process that lessens the creative results of artists? So what can bit-compression do to help the process?

Figure 1. The projected continual rise in broadband E1/T1 chip sales and links in coming years. Courtesy of Infineon Technologies.

Let’s do the math: Assume we have a Pro Tools file of a 20-minute stereo spot sampled at 48kHz and employing 24-bit PCM. Two channels of 24-bit samples at 48kHz amount to 2.304Mb/s, so the 20-minute file holds roughly 2.76 gigabits (about 346MB). This means it will take at least 180 minutes to transmit the file over a 256Kb/s link. If we instead use advanced compression technologies with 24-bit word resolution to deliver the same file, that time drops sharply. Typically, these advanced compression solutions are almost lossless in terms of audio quality and can deliver the same 20-minute spot in 45 minutes (same assumptions apply), which corresponds to a compression ratio of about 4:1, with all the original content and quality.

If we have access to a 512Kb/s ADSL link, delivery time is further reduced to about 23 minutes, which is close to real time. In high-end applications, bit rate is a secondary consideration when compared to attributes such as quality, end-to-end delay and delivery times. Compression technology is changing. Studio and broadcast professionals are no longer limited to links that demand high compression ratios, but rather can choose from a selection of different technologies that better match the desired function.
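As a quick check of the delivery times quoted above, the following Python sketch recomputes them. It is an illustration under stated assumptions, not a model of any particular codec: the 4:1 compression ratio is inferred from the 180-minute and 45-minute figures, the link rates are taken as 256,000 and 512,000 bits per second with no protocol overhead, and the function names are hypothetical.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def transfer_minutes(duration_min, audio_bps, link_bps, compression_ratio=1.0):
    """Minutes needed to send the audio over a link of link_bps bits/s.

    Assumes the link is fully used and ignores protocol overhead.
    """
    total_bits = duration_min * 60 * audio_bps / compression_ratio
    return total_bits / link_bps / 60


# 20-minute stereo spot, 48kHz, 24-bit PCM
audio_bps = pcm_bit_rate(48_000, 24, 2)  # 2,304,000 bits/s

# Assumed 4:1 ratio, inferred from the 180 -> 45 minute figures
ASSUMED_RATIO = 4.0

uncompressed_256 = transfer_minutes(20, audio_bps, 256_000)
compressed_256 = transfer_minutes(20, audio_bps, 256_000, ASSUMED_RATIO)
compressed_512 = transfer_minutes(20, audio_bps, 512_000, ASSUMED_RATIO)

print(f"Uncompressed over 256Kb/s: {uncompressed_256:.1f} min")  # 180.0
print(f"4:1 over 256Kb/s:          {compressed_256:.1f} min")    # 45.0
print(f"4:1 over 512Kb/s:          {compressed_512:.1f} min")    # 22.5
```

Changing the ratio or link rate shows how quickly delivery time approaches the 20-minute running length of the spot; under these assumptions, a 4:1 ratio over a link of roughly 576Kb/s would deliver in real time.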

If recording engineers, for example, need to get approval for their creative work from a remote site, compromising that work by sending it over a poor-quality audio link isn’t a good choice. Broadcasters, on the other hand, are focused more on limiting listener fatigue and retaining listenership.

The best way to avoid the dreaded tune-out factor is to make sure the listening experience is enjoyable. This doesn’t just mean compiling a great playlist or employing famous talent. It also means paying careful attention to the audio transmission chain, both inside and outside the studio.

Compression algorithms are not about boastful compression ratios but are intended for the delivery of high-quality content in its original form. Figure 1, from chip manufacturer Infineon Technologies, shows the projected continual rise in broadband E1/T1 chip sales and links in coming years. This presumably will lead to cheaper high bit-rate synchronous and IP networks, which in turn may lessen the need for high compression ratios. The result will be a wider selection of delivery link options, often through less expensive channels.

So what are some of today’s goals? First, 24-bit PCM sampling at 48kHz with low delay is key. This requires moving away from 14-bit companding-based technologies.

Second, ISDN costs, which used to be a determining factor in compression ratio selection, can still result in poor-quality audio when they push broadcasters toward highly compressed MP3 signals. However, as link costs fall, broadcasters no longer have to make such compromises. IP connectivity can provide a low-cost communications infrastructure, but still has the problem of delay. This is particularly relevant with live broadcasts, where delays cannot exceed 20ms. Therefore, it makes sense to combine higher-quality 24-bit audio and low delay (sub-10ms) with the low-cost IP + E1/T1 infrastructure.

Noel McKenna is the managing director for APT.