Metadata

Because digital transmission systems distribute data, one can say that metadata is "data about the data." However, because this column is usually concerned with the broadcasting of program content, a better description would be “information about the content.” Metadata is defined, handled and relayed in many different ways, and this month we’ll examine a few, from ATSC to DVB and on to the Internet.

Mobile receivers

One thing metadata provides is program information to fixed and mobile receivers. The well-known Program and System Information Protocol (PSIP), defined in ATSC A/65, has long been the prime carrier of metadata in broadcasts using the ATSC A/53 standard. PSIP carries several important data tables, including timekeeping and channel information, and the Event Information Table (EIT), which supplies titles and program guide data for each program event associated with a virtual channel. Each EIT is limited, however, to a period of three hours, so broadcasters must routinely update the EITs when new or more accurate information becomes available.
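The three-hour EIT windowing can be illustrated with a short sketch. It assumes that the windows are aligned to three-hour UTC boundaries (00:00, 03:00 and so on) and that EIT-0 is the window containing the current time; the function name eit_index is purely illustrative and is not part of any ATSC API.

```python
from datetime import datetime, timezone

EIT_WINDOW_SECONDS = 3 * 60 * 60  # each EIT spans three hours


def eit_index(now: datetime, event_start: datetime) -> int:
    """Return k such that EIT-k would list an event starting at event_start.

    Assumption: windows are aligned to 3-hour UTC boundaries and EIT-0 is
    the window that contains `now`. Both arguments should be timezone-aware;
    naive datetimes would be interpreted as local time by timestamp().
    """
    now_slot = int(now.timestamp()) // EIT_WINDOW_SECONDS
    event_slot = int(event_start.timestamp()) // EIT_WINDOW_SECONDS
    return event_slot - now_slot


now = datetime(2024, 1, 1, 13, 30, tzinfo=timezone.utc)
tonight = datetime(2024, 1, 1, 20, 0, tzinfo=timezone.utc)
print(eit_index(now, tonight))  # window 18:00-21:00 is two windows ahead
```

A negative result would mean the event started before the current window, which is one reason the tables need routine updating.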

In the ATSC Mobile Standard (A/153), more extensive program metadata can be carried within various components: Signaling, Announcement and Non-real-time (NRT) file transfer. The Transport Signaling System is a signaling layer that uses the Fast Information Channel (FIC), in combination with the Service Signaling Channel (SSC), to deliver critical information. This information allows for rapid program acquisition by the receiver. Although they do not provide detailed program descriptions, the FIC and SSC provide enough structural information to allow video and audio decoding to initialize in the receiver. The information carried within the SSC is similar to the high-level PSIP information carried within ATSC A/53.

Announcement, which is optional, provides the framework for an Electronic Service Guide (ESG), using components from the Open Mobile Alliance (OMA) Broadcast Services Enabler Suite (OMA BCAST). An ESG is delivered as a file consisting of several XML sections, using File Delivery over Unidirectional Transport (FLUTE), a scheme that ensures quality of file transfer over the potentially lossy, one-way broadcast medium.

ESGs can carry information for upcoming as well as current programs, including start times and duration of events, channel icons, program titles and descriptions, genre, and ratings. Multiple ESGs can be carried simultaneously, and an aggregated ESG across providers could be downloaded via an out-of-band (interactive) channel, such as by 3G or Wi-Fi.
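To make the idea of an XML-based guide concrete, here is a minimal sketch of how a receiver application might read event listings from a guide fragment. The element and attribute names (Guide, Event, Title, Genre, Rating, channel, start, duration) are hypothetical and greatly simplified; they are not the actual OMA BCAST schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified guide fragment (not the real OMA BCAST schema).
ESG_SAMPLE = """\
<Guide>
  <Event channel="5.1" start="2011-03-01T20:00:00Z" duration="PT30M">
    <Title>Evening News</Title>
    <Genre>News</Genre>
    <Rating>TV-G</Rating>
  </Event>
  <Event channel="5.1" start="2011-03-01T20:30:00Z" duration="PT60M">
    <Title>Feature Drama</Title>
    <Genre>Drama</Genre>
    <Rating>TV-PG</Rating>
  </Event>
</Guide>
"""


def list_events(xml_text: str) -> list:
    """Return one dict per Event element found in the guide fragment."""
    root = ET.fromstring(xml_text)
    events = []
    for ev in root.findall("Event"):
        events.append({
            "channel": ev.get("channel"),
            "start": ev.get("start"),
            "duration": ev.get("duration"),
            "title": ev.findtext("Title"),
            "genre": ev.findtext("Genre"),
            "rating": ev.findtext("Rating"),
        })
    return events


for event in list_events(ESG_SAMPLE):
    print(event["start"], event["title"], event["rating"])
```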

The concept of metadata can be extended hierarchically, using what is called an XML schema, a document that describes the structure of other XML documents. In that respect, an XML schema can be considered “data about the metadata.” An XML schema can be used to define extensions that add ATSC NRT-specific metadata; one example would be a schema that groups files into content items within the FLUTE sessions used for fixed or mobile NRT content delivery. Standard methods for constructing an XML schema include the document content description (DCD).

Metadata requires an infrastructure through which it can be generated, processed and delivered to the transmission point. For this purpose, there is an ATSC standard (A/76B) called Programming Metadata Communication Protocol (PMCP). PMCP is specifically defined to provide a consistent means to handle metadata in systems and equipment supporting the production and delivery of PSIP and Announcement tables. Target applications include traffic, PSIP generation and automation systems at broadcast centers and program listing services.

Using PMCP, advance program information as well as last-minute scheduling changes can be quickly delivered to the universe of digital receivers. Examples of elements within PMCP include the “ShowData” and “PsipEvent” elements, which can communicate metadata about a program, independently of its scheduled broadcast air time, affecting both current and future ESG information. Figure 1 shows one such example.

Figure 1. PMCP uses XML to provide program metadata.
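As a rough illustration of the separation between program description and schedule, the sketch below builds a small XML message in which one “ShowData” element describes a program and two “PsipEvent” elements schedule it. Only the two element names come from PMCP as discussed here; the root element, child elements, attribute names and the id-based linking are assumptions made for the example and should not be read as the A/76B schema.

```python
import xml.etree.ElementTree as ET


def build_message(show_id, title, description, airings):
    """Build an illustrative PMCP-style message.

    `airings` is a list of (start_time, duration) strings. All names other
    than ShowData and PsipEvent are hypothetical.
    """
    msg = ET.Element("PmcpMessage")  # root name is an assumption
    # Program description, independent of when it airs.
    show = ET.SubElement(msg, "ShowData", {"id": show_id})
    ET.SubElement(show, "Name").text = title
    ET.SubElement(show, "Description").text = description
    # One scheduled event per airing, each pointing back at the show.
    for start_time, duration in airings:
        ET.SubElement(msg, "PsipEvent", {
            "showRef": show_id,
            "startTime": start_time,
            "duration": duration,
        })
    return ET.tostring(msg, encoding="unicode")


xml_text = build_message(
    "show-001",
    "Evening News",
    "Local and national headlines.",
    [("2011-03-01T20:00:00Z", "PT30M"), ("2011-03-02T02:00:00Z", "PT30M")],
)
print(xml_text)
```

With this arrangement, a last-minute schedule change touches only a PsipEvent entry, while a corrected title or description is made once in ShowData.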

Audio also carries associated metadata, including several factors that can be controlled for each program, such as Bitstream Mode, Dialogue level (Dialnorm), Dynamic range control (DRC) and Downmixing. Bitstream Mode defines the arrangement of discrete or associated services. The most commonly used are: Complete Main (CM), which supports from one to 5.1 channels of audio; Main M&E, which is similar to CM but omits the dialogue channel, which can be carried separately as Associated Dialogue (D); Associated Visually Impaired (VI) and Associated Hearing Impaired (HI), which carry descriptive audio and increased-intelligibility audio, respectively; and Associated Emergency (E), which can be sent to override all other audio.
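The service types can be represented as a simple lookup. The sketch below uses the three-bit bsmod values from the AC-3 bitstream syntax; the Commentary (C) and Voice-Over (VO) entries are not discussed in this column and are included only to complete the table. The helper name is_main_service is illustrative.

```python
from enum import IntEnum


class BitstreamMode(IntEnum):
    """AC-3 bsmod codes (3-bit field)."""
    CM = 0  # main service: Complete Main
    ME = 1  # main service: Music and Effects
    VI = 2  # associated service: Visually Impaired
    HI = 3  # associated service: Hearing Impaired
    D = 4   # associated service: Dialogue
    C = 5   # associated service: Commentary (not covered in the text)
    E = 6   # associated service: Emergency
    VO = 7  # Voice-Over; the spec reuses this code for karaoke in multichannel modes


def is_main_service(mode: BitstreamMode) -> bool:
    """Main services can stand alone; associated services accompany them."""
    return mode in (BitstreamMode.CM, BitstreamMode.ME)


for code in range(8):
    mode = BitstreamMode(code)
    kind = "main" if is_main_service(mode) else "associated"
    print(code, mode.name, kind)
```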

Dialogue level sets the average level of speech in the program audio at playback time, referenced to a known sound level. This parameter can now be used to help assure compliance with the CALM Act. Dynamic range control allows the user to optimize the dynamic range of the content, essentially setting it to a pre-calibrated compression curve. Downmixing allows for appropriate reproduction in the home environment, so that every user can enjoy a compatible experience regardless of the presence or absence of multiple speakers.
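A brief sketch shows how a decoder might act on two of these parameters. It assumes the AC-3 convention that a dialnorm value of 1 to 31 indicates dialogue at -1 to -31 dBFS and that the decoder attenuates by (31 - dialnorm) dB; treating the reserved value 0 as 31 is an assumption here. The downmix uses coefficients of about -3 dB for the center and surround channels, a common choice rather than the only one, and omits the LFE channel and any overall normalization.

```python
import math


def dialnorm_gain(dialnorm: int) -> float:
    """Linear gain that brings dialogue to the -31 dBFS reference level."""
    if not 0 <= dialnorm <= 31:
        raise ValueError("dialnorm is a 5-bit value (0..31)")
    if dialnorm == 0:
        dialnorm = 31  # reserved value; handled as -31 dBFS (assumption)
    attenuation_db = 31 - dialnorm
    return 10 ** (-attenuation_db / 20)


MINUS_3_DB = math.sqrt(0.5)  # about 0.707


def downmix_to_stereo(l, r, c, ls, rs, clev=MINUS_3_DB, slev=MINUS_3_DB):
    """Fold one 5-channel sample down to a left/right pair (LFE omitted)."""
    left = l + clev * c + slev * ls
    right = r + clev * c + slev * rs
    return left, right


print(round(dialnorm_gain(25), 3))           # 6 dB of attenuation
print(downmix_to_stereo(0.0, 0.0, 1.0, 0.0, 0.0))  # center lands equally in both
```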

Room Type is another audio parameter, which describes the equalization used during the final production mixing session. The “Large room” parameter emulates a dubbing stage with the industry-standard X-curve equalization; the “Small room” has flat equalization. This parameter allows a home audio system to be set to the same equalization.

Although various audio metadata parameters are reserved for professional use, the ones mentioned here could all be provided to the consumer. Audio metadata in broadcast and production facilities has its own carriage interface, either by SMPTE RDD 06-2008 over an RS-485 serial connection or via an HD-SDI connection.

Service information over DVB also provides extensible metadata. Like ATSC, DVB carries metadata for video, audio and program information, supported through various structures and XML schemas for live, on-demand and file-based content; DVB-SI (Service Information) is codified in ETSI EN 300 468 and ETSI TR 101 211. DVB now includes a metadata profile defined by the TV-Anytime Forum (ETSI TS 102 323 and others), with XML schemas and profiles adapted for enhanced PVRs. A program guide for broadband (Internet) content is also in the works, together with associated metadata definitions.

Support for screen formats

The transition to digital broadcasting has required support for new display formats, and content authored in the 4:3 and 16:9 aspect ratios must be presented adequately on displays of either shape. This has created a need for supporting metadata. To fill this need, various organizations have developed the Active Format Description (AFD), a standard set of codes defining the aspect ratio and active picture characteristics of a video program.

Using AFD together with optional bar data (indicating the size of top, bottom, left and right bars), broadcasters can instruct DTV receivers to crop, letterbox, pillarbox, or pan and scan images for best viewing compatibility. ATSC, ETSI (for DVB) and SMPTE have all published similar but non-identical versions of AFD codes, in A/53, TS 101 154 and 2016-1-2007, respectively.
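The arithmetic behind letterboxing and pillarboxing is simple enough to sketch. The function below computes bar sizes when content of one aspect ratio is fitted, without cropping, onto a display of another. It assumes square pixels and does not decode actual AFD codes or bar data; the name fit_bars and the returned dictionary layout are illustrative.

```python
from fractions import Fraction


def fit_bars(display_w: int, display_h: int, content_aspect: Fraction) -> dict:
    """Return the bar sizes (in pixels) needed to show the whole picture."""
    display_aspect = Fraction(display_w, display_h)
    if content_aspect > display_aspect:
        # Content is wider than the display: bars at top and bottom.
        active_h = int(display_w / content_aspect)
        top = (display_h - active_h) // 2
        return {"mode": "letterbox", "top": top,
                "bottom": display_h - active_h - top, "left": 0, "right": 0}
    if content_aspect < display_aspect:
        # Content is narrower than the display: bars at left and right.
        active_w = int(display_h * content_aspect)
        left = (display_w - active_w) // 2
        return {"mode": "pillarbox", "top": 0, "bottom": 0,
                "left": left, "right": display_w - active_w - left}
    return {"mode": "full", "top": 0, "bottom": 0, "left": 0, "right": 0}


print(fit_bars(640, 480, Fraction(16, 9)))    # 16:9 content on a 4:3 display
print(fit_bars(1920, 1080, Fraction(4, 3)))   # 4:3 content on a 16:9 display
```

Cropping or pan and scan would instead discard part of the picture to fill the screen, which is where the receiver needs the active-area information carried by AFD and bar data.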

Aldo Cugnini is a consultant in the digital television industry.