A highly integrated HD/SD processing infrastructure is becoming mandatory, while tight cost constraints remain a focal point as the HD “business case” continues to be validated. Tools that facilitate this transition for broadcasters include signal conversion, aspect ratio conversion and signaling, multichannel audio processing using audio metadata, and audio-to-video timing test and measurement.

In addition, issues beyond conversion of the essence of baseband digital video and audio need to be addressed. An HDTV infrastructure introduces an entirely new level of data and metadata handling requirements. In the analog and SD era, data and metadata tended to be the domain of vertical blanking interval (VBI)-based information and its digital equivalent. In this new hybrid era, data and metadata elements go beyond closed captioning to maximize the consumer viewing and listening experience. The list now includes active format descriptor (AFD) for signaling aspect ratio and the audio metadata required to signal stereo-to-multichannel transitions.

Significant challenges face broadcasters that attempt to operate an HD/SD TV processing infrastructure in which digital video, audio data and metadata elements interoperate seamlessly. Addressing these challenges is now critical because the approaching analog shutoff and increased DTV awareness are forcing many broadcasters to rapidly implement a DTV transition plan. An integral component of this plan is to design a facility that can accommodate HDTV and SDTV signal formats in a single-operation workflow environment.

Signal format conversion

A fundamental challenge in any facility transitioning to an HDTV infrastructure is the requirement to transparently handle multiple new signal formats. In addition to handling analog video and audio signals, a broadcast facility may be required to handle HDTV formats, including 720p, 1080i and 1080p signals of various frame rates.

Handling multiple signal types complicates signal conversion, that is, converting the ingested material to the facility's native format. Ingested material can be a mix of analog, HDTV and SDTV content. The standard solution is to use standalone HDTV up- and downconversion within the workflow. There are two issues to consider when incorporating HDTV conversion within the workflow: format detection and the intended application.

Ideally, an HDTV upconverter automatically detects the incoming signal type and reconfigures as needed to output the desired signal format. This functionality does not require external triggering when the input format changes, which minimizes the opportunity for error. However, if the upconverter does not incorporate automatic signal format detection, external triggering may be required. The preferred method for triggering upconversion is via the broadcast facility's automation system. In that case, the facility's traffic system would ideally track the content's signal format so that the automation system could trigger the upconverter as required.
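The detect-first, trigger-as-fallback logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `VideoFormat`, `resolve_input_format`, `plan_conversion`, the 1080i house format and the action labels are hypothetical names chosen for the example, not any vendor's control interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VideoFormat:
    lines: int        # active lines, e.g. 480, 720 or 1080
    interlaced: bool  # True for interlaced, False for progressive


# Assumed house format for this example only: 1080i.
HOUSE_FORMAT = VideoFormat(lines=1080, interlaced=True)


def resolve_input_format(detected: Optional[VideoFormat],
                         scheduled: Optional[VideoFormat]) -> VideoFormat:
    """Prefer the converter's own detection; otherwise fall back to the
    format supplied by automation (a hypothetical external trigger)."""
    if detected is not None:
        return detected
    if scheduled is not None:
        return scheduled
    raise ValueError("input format unknown: no detection and no trigger")


def plan_conversion(source: VideoFormat,
                    house: VideoFormat = HOUSE_FORMAT) -> str:
    """Return an illustrative action label for the converter."""
    if source == house:
        return "bypass"
    if source.lines < house.lines:
        return "upconvert"
    if source.lines > house.lines:
        return "downconvert"
    return "rescan"  # same line count, different scan type


if __name__ == "__main__":
    sd_input = VideoFormat(lines=480, interlaced=True)
    print(plan_conversion(resolve_input_format(sd_input, None)))
```

Raising an error when neither source of format information is available reflects the point made above: a converter without automatic detection depends entirely on the external trigger being correct.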

HDTV upconverters are typically offered in different varieties, each suited for different applications. Upconversion for on-air playout and transmission typically incorporates high-quality, motion-adaptive temporal HDTV scaling technology. Monitoring requirements within the workflow can be handled using lower cost HDTV upconverters, which use less expensive spatial upconversion technology.

Aspect ratio management

SD places minimal demands on aspect ratio management. HD content, however, introduces additional complexity to aspect ratio selection. HDTV content can be a mix of upconverted 4:3 SDTV content and native 16:9 HDTV content, so multiple aspect ratios must be handled within the workflow. Standalone aspect ratio converters or aspect ratio converters embedded within other devices (e.g., within an HDTV upconverter or server system) require remote signaling or triggering to ensure correct aspect ratio configuration.
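To make the geometry concrete, the following Python sketch computes where the active picture sits when content of one aspect ratio is placed in a raster of another without distortion (pillarbox or letterbox). The function name `fit_active_picture` and the raster sizes in the usage lines are assumptions for illustration; real converters also offer crop and stretch modes that are not modeled here.

```python
from fractions import Fraction


def fit_active_picture(src_aspect: Fraction, raster_w: int, raster_h: int,
                       raster_aspect: Fraction):
    """Fit a source picture inside a raster while preserving its shape.

    Returns (active_w, active_h, side_bar_w, top_bar_h) in raster pixels
    and lines. Aspect ratios are display aspect ratios, so the result also
    holds for rasters with non-square pixels.
    """
    if src_aspect < raster_aspect:
        # Source is narrower than the raster: bars left and right (pillarbox).
        active_w = int(raster_w * src_aspect / raster_aspect)
        active_h = raster_h
    else:
        # Source is as wide as or wider than the raster: bars top and bottom.
        active_w = raster_w
        active_h = int(raster_h * raster_aspect / src_aspect)
    return (active_w, active_h,
            (raster_w - active_w) // 2, (raster_h - active_h) // 2)


if __name__ == "__main__":
    # Upconverted 4:3 content in a 1920x1080 (16:9) raster.
    print(fit_active_picture(Fraction(4, 3), 1920, 1080, Fraction(16, 9)))
    # Downconverted 16:9 content in a 720x480 (4:3) raster.
    print(fit_active_picture(Fraction(16, 9), 720, 480, Fraction(4, 3)))
```

In the first case the 4:3 picture occupies 1440 of the 1920 pixels per line, leaving a 240-pixel bar on each side; in the second, the 16:9 picture occupies 360 of the 480 lines, leaving 60-line bars above and below.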

Many aspect ratio converters feature triggering via a General Purpose Interface (GPI). Aspect ratio triggering cues can be incorporated within the facility's traffic system, and triggering can occur via the automation system. (See Figure 1A.) GPI triggering, however, has significant limitations. A GPI does not provide feedback to confirm a successful aspect ratio change. In addition, a GPI offers access to only a limited number of aspect ratio and control parameters, limiting the overall functionality of the aspect ratio converter.

A more comprehensive approach involves using aspect ratio signaling information embedded within the vertical ancillary data space (VANC) of the HDTV content. (See Figure 1B.) AFD metadata carries information regarding the aspect ratio of the active picture and can be used to trigger aspect ratio conversion devices.

Other existing aspect ratio signaling technologies, such as widescreen signaling (WSS), carry limited information about the active picture and are used in the SD domain only. AFD will facilitate the transparent aspect ratio conversion of the various content within the HDTV workflow. An aspect ratio converter should be aware of both WSS (for legacy content) and AFD metadata to be used effectively in a hybrid facility.

Audio considerations

There are many types of audio employed in a hybrid facility today. It is important to understand the differences between two-channel (stereo) and multiple-channel (surround sound) audio processing.

A signal that sounds like ordinary stereo may carry inaudible surround sound information encoded within it (e.g., Dolby Pro Logic II or Neural Surround). Surround sound mixes can have four to eight channels, depending on the format. Typically, stereo and 5.1 mixes are used.

Additionally, audio content may be moved around a facility as separate, embedded or compressed signals. This adds a level of complexity similar to the processing of HD and SD video signals.

When embedding a compressed audio stream into an SDI signal, it is critical to ensure alignment of the compressed audio header with the SDI frame boundary. When encoding baseband audio for contribution purposes using Dolby E in a compressed audio stream, the audio content is delayed by one video frame.

For compressed audio embedding, the audio content delay varies depending on the alignment of the compressed audio. When the compressed audio packet header leads the video switching line by less than 10 ms, it is delayed by one video frame plus the delay required to place the packet header at the appropriate location in the video. When the compressed audio packet header leads the video switching line by more than 10 ms, it is delayed only by the time required to place the packet header at the appropriate location in the video (no video frame delay is added). Therefore, the embedded audio content delay may vary from about 1/3 to 1 1/3 frames.
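The rule above can be written as a small Python model. It is a sketch under stated assumptions, not a description of any embedder's internals: the function name, the 29.97 frames-per-second rate, and the treatment of a lead of exactly 10 ms (no extra frame) are choices made for the example, and the placement delay is supplied by the caller because the article does not give a formula for it.

```python
# Assumed frame rate for this example: 30000/1001 (about 29.97) frames/s.
FRAME_RATE_HZ = 30000 / 1001
FRAME_MS = 1000 / FRAME_RATE_HZ  # duration of one video frame in ms


def embedded_audio_delay_ms(header_lead_ms: float,
                            placement_delay_ms: float,
                            frame_ms: float = FRAME_MS,
                            threshold_ms: float = 10.0) -> float:
    """Total delay applied to a compressed audio packet when embedding.

    header_lead_ms: how far the packet header leads the video switching line.
    placement_delay_ms: delay needed to move the header to its proper
        location in the video (an input here, not computed).
    """
    # A header that leads by less than the threshold incurs one extra frame.
    extra_frame_ms = frame_ms if header_lead_ms < threshold_ms else 0.0
    return placement_delay_ms + extra_frame_ms


def ms_to_frames(delay_ms: float, frame_ms: float = FRAME_MS) -> float:
    """Express a delay in video frames."""
    return delay_ms / frame_ms


if __name__ == "__main__":
    for lead in (5.0, 20.0):
        total = embedded_audio_delay_ms(lead, placement_delay_ms=11.0)
        print(f"lead {lead} ms -> {ms_to_frames(total):.2f} frames")
```

At this frame rate, the 10 ms threshold corresponds to roughly three-tenths of a frame, which is consistent with the one-frame spread between the shortest and longest delays quoted above.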

One approach to ease the transition to a hybrid stereo and surround sound facility is to use audio metadata to identify the associated audio as stereo or 5.1. When upconverted SD and native HD signals are switched or mixed through master control, the audio metadata is used to signal the Dolby Digital (AC-3) encoder. Another approach is to process the audio just before the AC-3 encoder by either passing 5.1 or upmixing stereo, matrixed (Dolby) or watermarked (Neural Surround) audio signals into a 5.1 signal.
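The two approaches differ in where the decision is made, which a short Python sketch can make explicit. The `AudioKind` labels and both function names are hypothetical, and the return values are illustrative labels rather than actual encoder settings.

```python
from enum import Enum


class AudioKind(Enum):
    STEREO = "stereo"
    MATRIXED = "matrixed stereo"        # surround matrix-encoded into two channels
    WATERMARKED = "watermarked stereo"  # surround cues carried as a watermark
    SURROUND_5_1 = "5.1"


def encoder_channels_from_metadata(kind: AudioKind) -> int:
    """Approach 1: metadata tells the encoder how many channels to encode.

    5.1 audio is six channels (five full-range plus LFE); every other
    kind in this sketch is carried as two channels.
    """
    return 6 if kind is AudioKind.SURROUND_5_1 else 2


def pre_encoder_action(kind: AudioKind) -> str:
    """Approach 2: the encoder always receives 5.1, so two-channel
    sources are upmixed ahead of it and 5.1 sources pass through."""
    return "pass" if kind is AudioKind.SURROUND_5_1 else "upmix to 5.1"


if __name__ == "__main__":
    for kind in AudioKind:
        print(kind.value, encoder_channels_from_metadata(kind),
              pre_encoder_action(kind))
```

With the first approach the encoder changes mode as the program changes; with the second, the encoder stays in one mode and the processing ahead of it adapts.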

Audio to video timing

Asynchronous processing delays in today's HDTV workflow have introduced the possibility of audio and video becoming misaligned with respect to each other, a problem commonly referred to as lip sync error. The ideal strategy for dealing with this potential issue is to have incremental A/V synchronization integrity checking throughout the workflow. This minimizes the possibility that any single component within the signal path can contribute to A/V synchronization issues.

The use of embedded audio has minimized A/V synchronization errors but has not eliminated them entirely. Despite careful systemic attention to A/V synchronization, lip sync issues may still occur. Incoming content with asynchronous transmission paths may arrive with A/V synchronization errors. A/V synchronization issues can be addressed either online or with offline test signals. (See Figure 1C.)

Offline A/V synchronization correction can be accomplished through test signal generators with synchronized audio and video events. This synchronized audio and video test signal can be used by a downstream device to calculate relative A/V timing of the signal. The primary advantage of an offline A/V synchronizer is robustness. The offline test signal can be applied anywhere in the workflow and withstand any conversion or processing within the signal path. The main limitation of an offline A/V synchronization signal is that it can only be used in a signal path during a maintenance window when carriage of content is not required.
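The downstream calculation can be as simple as comparing the arrival times of paired events. The Python sketch below assumes the measuring device has already detected each video event (for example, a flash) and its matching audio event (for example, a tone burst) and recorded their times in milliseconds; the function names, the sign convention and the default frame rate are assumptions for the example.

```python
from typing import Sequence


def av_offset_ms(video_event_ms: Sequence[float],
                 audio_event_ms: Sequence[float]) -> float:
    """Mean audio-minus-video offset over paired test-signal events.

    A positive result means audio arrives after video (audio late);
    a negative result means audio arrives first (audio early).
    """
    if not video_event_ms or len(video_event_ms) != len(audio_event_ms):
        raise ValueError("need the same non-zero number of video and audio events")
    diffs = [a - v for v, a in zip(video_event_ms, audio_event_ms)]
    return sum(diffs) / len(diffs)


def offset_in_frames(offset_ms: float,
                     frame_rate_hz: float = 30000 / 1001) -> float:
    """Express the offset in video frames at the given frame rate."""
    return offset_ms * frame_rate_hz / 1000


if __name__ == "__main__":
    video = [0.0, 1000.0, 2000.0]   # detected flash times
    audio = [40.0, 1040.0, 2040.0]  # detected tone-burst times
    offset = av_offset_ms(video, audio)
    print(f"audio is {offset:.1f} ms late ({offset_in_frames(offset):.2f} frames)")
```

Averaging over several events is a design choice that smooths out detection jitter; a device could equally report the offset of each event individually.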

A/V synchronization can also be accomplished with online testing. This requires insertion of A/V markers within the program content. These markers must be invisible and usually take the form of watermarking. Online A/V synchronization is less robust, and various types of processing tasks may adversely affect online A/V synchronization markers. These include noise reduction, signal compression and image scaling.


Transitioning a facility to hybrid operations introduces many new operational requirements to a processing infrastructure. A careful understanding of the requirements is essential to optimize workflow efficiency.

Bob Fung is product manager for Harris Video Processing and Distribution.