Time Warner's sound

Time Warner Cable currently serves more than 8.5 million digital video subscribers and provides an enormous amount of programming to its customers. As the cable industry continues to evolve and new digital tools become available, companies such as Time Warner Cable need to refine the way they ingest audio content. To this end, the cable company recently completed a study on how to best maintain consistency of audio levels across programming, commercials, local ad spots and other inserted content.

Time Warner Cable Media Sales' Southwest Region Operations Center teamed with Dolby in conducting a test focused on its ingestion of digital content. This particular operations center supports more than 230 insertion servers, with five cable multiple system operator (MSO) partners. It also maintains more than 25 physical insertion locations, ingests about 1000 pieces of content weekly, and inserts and monitors content on more than 2450 channels. It manages designated market areas (DMAs) in Colorado, Kansas, Nebraska and Texas, which comprised the testing region.

The study targeted digital program ad insertion applications, which total more than a quarter million ad inserts daily. The study focused on the proper use of dialog normalization to improve subscriber experience, and both real-time and non-real-time (file-based) system components were considered. First, the provisioning practices of local (headend) digital simulcast encoders were investigated by analyzing content over which local control was possible (via the dialog normalization value and incoming loudness levels with baseband signals). Also considered were loudness levels of pass-through services (for which local control is not possible).

A snapshot of ingest content

In the study, 18,695 commercial ad spot files were analyzed, and among these files (all of which were stereo audio), 500,000 data points were collected. The file-based ad spot content used MPEG-1 LII audio almost exclusively at the acquisition point. Unlike Dolby Digital content, MPEG-1 LII does not contain audio metadata (and therefore, does not contain a dialog normalization metadata parameter that the decoder can use to maintain consistent loudness levels among programs, ad spots, and so on).

Working with Dolby, the Southwest Region Operations Center refined its existing file-based approach, which included the deployment of the Dolby DP600 program optimizer. Simple workflow adjustments to Time Warner Cable's approach have improved consistency and predictability for its viewers, the ultimate goal. In addition, the process is scalable, and therefore relevant to all distribution stages, from program creation to networks and local stations. Overall, the results of this study provide a practical blueprint applicable to any broadcaster wishing for consistent, repeatable and predictable results in quality control procedures and the proper setting of dialog normalization parameters.

Because the source files lacked dialog normalization metadata, Time Warner Cable's ad spots had previously been transcoded blind, using a commercially available transcoder with a default set of Dolby Digital metadata settings. The spots were then quality-controlled using an LM100 and processed so that the dialog normalization value matched the content loudness. Specifically, all were transcoded to a static dialnorm value prior to final QA.

Steps to improve subscriber experience

Blind file-based transcoding with a static (default) dialog normalization value leads to unnecessary level shifts away from the Dolby Digital decoder reference level (in the set-top box or A/V receiver). The master control center used a Dolby DP600 program optimizer to analyze the success rate of blind transcoding and then to correct the files nondestructively, using the system's adaptive and automated speech-based ITU-R BS.1770 measurement method.
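The level shift caused by a static dialog normalization value can be illustrated with a little arithmetic. The sketch below is not part of the study. It assumes the usual Dolby Digital convention that the decoder attenuates by the difference between the dialnorm value and its -31 reference level. The function names and the example values (a static dialnorm of -27 and a spot measuring -24) are hypothetical.

```python
# Illustrative only: names and example values are assumptions, not study data.
DECODER_REFERENCE_DB = -31  # Dolby Digital decoder reference level for dialog


def playback_dialog_level(measured_loudness_db: float, dialnorm_db: int) -> float:
    """Return the dialog level after the decoder applies dialnorm attenuation."""
    if not -31 <= dialnorm_db <= -1:
        raise ValueError("dialnorm must be between -31 and -1")
    # The decoder attenuates by the gap between dialnorm and the reference level.
    attenuation_db = dialnorm_db - DECODER_REFERENCE_DB  # 0 to 30 dB
    return measured_loudness_db - attenuation_db


def level_shift(measured_loudness_db: float, dialnorm_db: int) -> float:
    """Return how far playback lands from the reference level (positive = too loud)."""
    return playback_dialog_level(measured_loudness_db, dialnorm_db) - DECODER_REFERENCE_DB


if __name__ == "__main__":
    # A spot measuring -24 that was blindly tagged with a static dialnorm of -27.
    print(level_shift(-24.0, -27))
    # The same spot tagged with a dialnorm that matches its measured loudness.
    print(level_shift(-24.0, -24))
```

When the dialnorm value matches the measured dialog loudness, the shift is zero and the spot plays back at the same level as correctly tagged program content.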

The analyzed (and subsequently corrected) ad spots were divided into two groups for analysis. Group I contained locally produced content, primarily ingested via FTP directly from ad agencies. Group II comprised ad spots delivered via a popular ad content aggregator.

For Group I, only 34 percent of material had correct dialog normalization (after the blind transcode). (See Figure 1.) Group II did not fare nearly as well, with only 6.7 percent success. (See Figure 2.)

In both groups, the incorrectly set values were dispersed over a fairly wide range. Group II fared even worse: although the majority of its content was spread over a smaller loudness range, the distribution of individual loudness values was centered approximately 3 dB above the correct dialog normalization value.
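The kind of summary described above (a success rate plus the center of the loudness distribution) can be sketched as follows. This is a minimal illustration, not the study's analysis method. The 0.5 dB tolerance, the function name and the sample values are all assumptions.

```python
# Illustrative only: tolerance, names and sample data are assumptions.
from statistics import median


def summarize_blind_transcode(measured_db, static_dialnorm_db, tolerance_db=0.5):
    """Summarize how well a static dialnorm value matches measured loudness values."""
    if not measured_db:
        raise ValueError("at least one measurement is required")
    # Offset of each spot's measured loudness from the static dialnorm value.
    offsets = [m - static_dialnorm_db for m in measured_db]
    correct = sum(1 for o in offsets if abs(o) <= tolerance_db)
    return {
        "success_pct": 100.0 * correct / len(offsets),
        "median_offset_db": median(offsets),
    }


if __name__ == "__main__":
    # Hypothetical measurements for five spots, all tagged with dialnorm -27.
    sample = [-27.0, -24.0, -23.5, -24.5, -20.0]
    print(summarize_blind_transcode(sample, -27))
```

A positive median offset indicates that the content is, on balance, louder than its dialnorm value claims, which is the pattern reported for Group II.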

Subsequently, the automated analysis and correction engine on the DP600 was used to correct the dialog normalization value within each of the ad spots. This process does not require a decode/re-encode cycle. It recomputes dynamic range control metadata parameters (i.e., it does not impact program dynamics), in addition to correcting the dialog normalization value, in faster than real time.
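The idea of a metadata-only correction can be sketched in a few lines. This is a conceptual illustration under stated assumptions, not the DP600's implementation: the `AdSpot` type and both function names are hypothetical, and the recomputation of dynamic range control parameters is omitted. Rounding to the nearest whole decibel uses Python's built-in `round`, which is an assumption about rounding behavior.

```python
# Illustrative only: AdSpot and these helpers are hypothetical, not a Dolby API.
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AdSpot:
    name: str
    audio_payload: bytes  # encoded audio, never modified by the correction
    dialnorm_db: int      # metadata value, -31 to -1


def to_dialnorm(measured_loudness_db: float) -> int:
    """Round a measured dialog loudness to a whole number and clamp to -31..-1."""
    return max(-31, min(-1, round(measured_loudness_db)))


def correct_dialnorm(spot: AdSpot, measured_loudness_db: float) -> AdSpot:
    """Return a copy of the spot with corrected metadata and the same audio payload."""
    return replace(spot, dialnorm_db=to_dialnorm(measured_loudness_db))


if __name__ == "__main__":
    spot = AdSpot("example_spot", b"\x00\x01", -27)
    fixed = correct_dialnorm(spot, -23.6)
    print(fixed.dialnorm_db, fixed.audio_payload == spot.audio_payload)
```

Because only the metadata field changes, no decode/re-encode cycle is involved and the encoded audio is carried through untouched.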

The difference between blind-transcoded files and files corrected using the DP600 was dramatic. For corrected files, the level was much more in balance with the surrounding program content. A comparison of corrected and uncorrected ad spot audio levels as they relate to the surrounding program audio levels can be seen in Figures 3 and 4.

Lastly, the study also required ensuring that the dialog normalization value on every digital simulcast encoder was correctly provisioned. The correct dialog normalization value for each digital simulcast service was derived via long-term dialog measurements using a Dolby LM100 broadcast loudness meter.

Checklist for success

Overall, the results of this study provide a practical blueprint applicable to many broadcasters wishing for consistent, repeatable and predictable results during the quality control process, as well as for ensuring the proper setting of the dialog normalization parameter for every piece of content.

There is an increasing amount of documentation supporting proper use of Dolby Digital metadata and dialog normalization as a means to improve subscriber satisfaction. Loudness issues that have hampered the consistency of cable broadcast audio can be easily addressed if the industry moves forward together with the proper use of dialog normalization (or by a local decode/re-encode stage at the headend). If content providers, post-production houses and manufacturers throughout the industry continue to work with networks and programmers, together we can ensure proper provisioning of dialog normalization, consistent levels for all programming, reduced complaints and better viewing for all.

Ivan Larsen is regional technical services manager with Time Warner, and Jeffrey Riedmiller is technology architect, broadcast, for Dolby Laboratories.