What About Metadata?

One of the opportunities created by the digitization of television is the ability of digital media and transports to carry metadata in addition to video and audio. Metadata is information about the program payload data. This information can be as basic as the program’s title and the date and time it was created. It can also include specific details about the video and audio material itself: technical details such as scanning format, colorimetry and audio parameters; archive details such as dates, times and versions; artistic details; and a host of other things.

Closed-captioning data is a special case that, strictly speaking, might not be considered metadata at all, although it is carried in the same way. Metadata can be created and inserted into the program signal at a number of places along the DTV production, postproduction and broadcast chain.

Some metadata elements are created at the point of audio/video capture, some are added in the postproduction process and some are added at the point of network distribution or broadcast emission. Some pre-existing metadata elements must be changed or updated along the way.

From a technical standpoint, metadata falls into two categories: elements used to perform functions or set parameters at various stages of the end-to-end production-to-broadcast chain, and elements intended for use by the home receiver.

The metadata that may be used by the receiver includes such information as aspect ratio, scanning parameters, A-frame identification for material with 3:2 pulldown, colorimetry, audio metadata such as dialnorm, PSIP, content advisory information, closed captioning and audience measurement data. Metadata that can be used at interim stages of the production, postproduction and broadcast process, but which is not to be transmitted over the air, includes such categories as the universal program identifier (UPID), time of day, broadcaster logo triggers and A/V synchronization cues.

ATSC DTV encoders can multiplex metadata into the transport stream they emit. Upstream of the ATSC encoder, there are as yet no standardized methods to carry metadata in the production and postproduction stages or in the bitstreams routed around television broadcast plants, but progress is being made in this area.

The manufacturers of the two HDTV VTRs most commonly used in the industry will soon introduce models capable of recording metadata. SMPTE has recently trial-published 334M, a proposed standard for mapping metadata into both the 292M HD-SDI interface and the 259M SDI interface, so that metadata can be embedded in the serial digital interfaces alongside video and audio.
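The packets that 334M builds on follow the general ancillary data packet structure defined in SMPTE 291M: an ancillary data flag, a data ID (DID) and secondary data ID (SDID), a data count, the user data words and a checksum, all formatted as 10-bit words. A rough sketch of how such a packet might be assembled; the DID/SDID values and payload here are illustrative placeholders, not values assigned by any standard:

```python
def with_parity(b):
    """Extend an 8-bit value to the 10-bit word format used for
    ancillary data: bit 8 is even parity of bits 0-7, and bit 9 is
    the inverse of bit 8."""
    p = bin(b & 0xFF).count("1") & 1           # even-parity bit
    return (b & 0xFF) | (p << 8) | ((p ^ 1) << 9)

def build_anc_packet(did, sdid, payload):
    """Assemble an ancillary data packet as a list of 10-bit words."""
    words = [0x000, 0x3FF, 0x3FF]              # ancillary data flag (ADF)
    body = [with_parity(did), with_parity(sdid), with_parity(len(payload))]
    body += [with_parity(b) for b in payload]
    words += body
    # Checksum: nine LSBs of the sum of DID through the last user
    # data word, with bit 9 set to the inverse of bit 8.
    checksum = sum(w & 0x1FF for w in body) & 0x1FF
    checksum |= (((checksum >> 8) & 1) ^ 1) << 9
    words.append(checksum)
    return words

# Hypothetical DID/SDID and payload bytes, purely for illustration:
pkt = build_anc_packet(0x45, 0x01, [0x10, 0x20, 0x30])
```

In a real interface these words would then be inserted into the ancillary data space of the serial stream; the sketch stops at packet formation.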

Both the 292M and the 259M interfaces contain ancillary data space in the horizontal and vertical dimensions. Horizontal ancillary data space, or HANC, occupies the portion of each scanning line outside the active picture area, and it is where embedded audio is carried in these interfaces.

Vertical ancillary data space, or VANC, is the data space corresponding to the analog vertical blanking interval. It offers considerably more room than HANC, amounting to several Mbps in the HD-SDI, and it is where metadata will be embedded.
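As a back-of-envelope illustration of that "several Mbps" figure, the sketch below estimates VANC throughput assuming a 1080-line raster at a nominal 30 frames/s, 10-bit words carried in the luma channel only, and a hypothetical count of usable VANC lines; the lines actually available depend on the format and on what else occupies the blanking interval:

```python
# Rough VANC capacity estimate. All figures are assumptions for
# illustration, not values taken from any standard.
WORDS_PER_LINE = 1920      # luma samples across the active line width
BITS_PER_WORD = 10         # ancillary data is formatted as 10-bit words
FRAME_RATE = 30            # nominal frames per second (30/1.001 in practice)

def vanc_capacity_mbps(usable_lines):
    """Megabits per second of VANC space for a given number of lines."""
    bits_per_frame = usable_lines * WORDS_PER_LINE * BITS_PER_WORD
    return bits_per_frame * FRAME_RATE / 1e6

# Even a handful of VANC lines yields several Mbps:
print(vanc_capacity_mbps(5))   # 5 lines -> 2.88 Mbps
```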

The bottleneck for metadata flow in the HD television plant and the production/postproduction process is the VTR itself. Both of the VTRs previously mentioned were designed to accommodate SD video signals. They store high-definition signals by compressing them and mapping them into formats that emulate the SD signals the recorders were designed to record.

Both of these recorders use most of the data storage capacity of their respective media to accomplish this, in order to keep the compression ratio as low as possible and thereby produce acceptable pictures over a number of generations of re-recording. This means that the amount of ancillary data storage space on the media is necessarily limited.

In a process similar to that used with embedded audio, the recorder must de-embed the incoming metadata from the HD-SDI and pack it into available data spaces in the signal that is recorded onto tape. The inverse of this process must occur when the tape is played back, with the metadata being re-embedded into the outgoing HD-SDI bitstream.
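The de-embedding step can be sketched as a scan of the 10-bit word stream for the ancillary data flag (000h, 3FFh, 3FFh), followed by extraction and checksum verification of the packet, per the general SMPTE 291M layout; the word values in the example stream below are made up for illustration:

```python
ADF = [0x000, 0x3FF, 0x3FF]   # ancillary data flag

def extract_anc_packet(words):
    """Return (did, sdid, payload) for the first valid packet found
    in a list of 10-bit words, or None if none is found intact."""
    for i in range(len(words) - 2):
        if words[i:i + 3] != ADF:
            continue
        did, sdid, dc = (w & 0xFF for w in words[i + 3:i + 6])
        start = i + 6
        if start + dc >= len(words):
            return None                        # truncated packet
        payload_words = words[start:start + dc]
        checksum = words[start + dc]
        body = words[i + 3:start + dc]         # DID .. last user data word
        if sum(w & 0x1FF for w in body) & 0x1FF != checksum & 0x1FF:
            return None                        # corrupted packet
        return did, sdid, [w & 0xFF for w in payload_words]
    return None

# Illustrative word stream: two filler words, then one packet with a
# hypothetical DID/SDID of 45h/01h carrying three payload bytes.
stream = [0x200, 0x110,
          0x000, 0x3FF, 0x3FF, 0x145, 0x101, 0x203,
          0x110, 0x120, 0x230, 0x2A9]
result = extract_anc_packet(stream)
```

Re-embedding on playback is the inverse: the recovered packet is reformed as 10-bit words and written back into the ancillary space of the outgoing stream.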

If the metadata is to be reauthored or changed in any way, it must be de-embedded before such changes are done and re-embedded afterward. Likewise, when the HD-SDI signal reaches the ATSC or network distribution encoder, the metadata it contains that is to be passed through must be de-embedded and multiplexed into the MPEG or ATSC transport stream.

The ability to store, transport and manipulate metadata constitutes another big step in the DTV implementation process.

Randy Hoffner