Captioning systems

Although the FCC established rules for the carriage of closed captioning quite some time ago, the process of generating, encoding and transmitting captions has had to adapt to the evolution of digital broadcasting. In the analog world, EIA-608 (now CEA-608) defined how captions should be carried on line 21 of an NTSC transmission. With the onset of digital transmission, EIA-708 (now CEA-708) emerged, defining how to carry captions over a DTV transmission.

Captioning is complex

Within the broadcast plant, the distribution of DTV closed captioning (DTVCC) is based on the use of SMPTE 334-1, which defines a method of embedding DTVCC and other data services in the vertical ancillary (VANC) data space of HD-SDI and SDI signals. Carriage over these two interfaces is defined by SMPTE 292M High-Definition Serial Digital Interface and SMPTE 259M Serial Digital Interface, respectively. The 334 standard defines a caption distribution packet (CDP), the basic unit of data that is carried through the DTVCC distribution chain. The CDP consists of a specific sequence of bytes that can carry CEA-708 DTV caption data, CEA-608 caption data, caption service information and SMPTE 12M-1 time code. All SMPTE 334-1-compliant distribution equipment handling HD-SDI and SDI signals should pass DTVCC when properly configured.

A standardized protocol for carrying CDPs over an EIA/TIA-232 (formerly RS-232) serial interface from a caption encoder to an MPEG encoder is defined by SMPTE 333. As with many devices using a serial interface, this protocol uses software (as opposed to hardware) “handshaking” for flow control and synchronization between the two devices. The serial protocol described in SMPTE RP2007, by contrast, does not use flow control; the caption data is “pushed” from the caption encoder to the MPEG encoder. DTVCC data can also be carried in an AES3 digital audio data stream, as specified in SMPTE 337M. Such an application could be used, for example, in place of DTVCC carried in the VANC of an SDI interface.

Caption authoring starts with the creation of captioning intentions: a high-level, usually text-based, description of how and when the captions should appear on consumer equipment. While there have been attempts to establish an agreed CEA-708 caption intentions format, there is currently no such standard. Without a standard, the creation, handling and conversion of caption intentions are often proprietary processes. Caption encoders and VANC embedders consequently may or may not have built-in functionality to translate the caption intentions into CDPs.

Selection of video processing equipment should also take into account how that equipment handles captions when the video is delayed, frames are repeated or dropped, etc. When a processor cannot handle captions appropriately, bridging processors can be used to bypass the problematic system elements. The reader is directed to SMPTE Engineering Guideline EG 43:2009 for further information on the carriage of captions in the broadcast plant.

Broadcast carriage required

Both CEA-608 and CEA-708 captioning must be sent on broadcast transmissions for all nonexempt captioned programming. CEA-708 data also includes “legacy” CEA-608 captions, so that analog receivers can use them to generate captions, e.g., when the signal is provided through a DTV converter. While DTV receivers can use 608 data when 708 data is not available, this functionality is not mandatory in the receiver.

The carriage of ATSC closed captions is defined in A/53, Part 4. (See Figure 1.) Up to 16 services can be announced in a caption service descriptor (CSD), which identifies each service as “digital” 708 or “line-21” 608-type captions. With NTSC, CEA-608 allowed each field to carry only two characters (2 bytes, or 16 bits); at a nominal 60 fields per second, the data rate was thus 2 × 60 = 120 characters per second, or 960b/s. The CEA-708 data rate, however, is constrained only by the particular transport (or transmission) method. ATSC A/53 allows a data rate of 9600b/s, i.e., 1200 characters per second.
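
The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted above (2 bytes per field, a nominal 60 fields per second, and 9600b/s for A/53); the function names are illustrative and do not come from any standard, and one byte is treated as one character, as the text does.

```python
# Back-of-the-envelope caption data rates, using the figures quoted in the text.
# Function names are illustrative only; one byte is counted as one character.

BITS_PER_BYTE = 8


def cea608_rate(bytes_per_field=2, fields_per_second=60):
    """Return (bytes per second, bits per second) for line-21 captions."""
    byte_rate = bytes_per_field * fields_per_second
    return byte_rate, byte_rate * BITS_PER_BYTE


def bytes_per_second(bit_rate):
    """Convert a bit rate in b/s to whole bytes (characters) per second."""
    return bit_rate // BITS_PER_BYTE


cea608_bytes, cea608_bits = cea608_rate()
dtvcc_bits = 9600                       # A/53 caption data rate from the text
dtvcc_bytes = bytes_per_second(dtvcc_bits)

print(cea608_bytes, cea608_bits)        # 120 960
print(dtvcc_bytes)                      # 1200
print(dtvcc_bits // cea608_bits)        # 10
```

On these numbers, the A/53 caption channel offers 10 times the capacity of line 21.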

ATSC A/72 Part 1 specifies CEA-708 when using AVC; cable systems incorporate it as well for both AVC (ANSI/SCTE 128) and SMPTE VC-1 (SCTE 157). Digital Video Broadcasting (DVB) similarly defines CEA-708 closed captions when using MPEG-2, AVC and VC-1 video coding.

Broadcasters and MVPDs must ensure compliance with FCC regulations on correct carriage of closed captions, and cable plants retransmitting broadcast programming must correctly make captions available to subscribers. In the past, an all-analog cable plant would simply pass the entire NTSC signal from source to subscriber, keeping the line-21 captions intact. Today's situation, with digital sources and mixed analog-digital cable plants, is more complex. CEA-608 captions can be re-encoded onto line 21 in an STB, but CEA-708 (DTVCC) captions require different processing, as the rendered captions are essentially a superset of CEA-608. TVs receiving their video from STBs, whether from cable, satellite or even DTV converters, have the additional complication that the captions could be rendered by the STB or within the TV itself. (See Figure 2.)

Mobile broadcasting has its own adaptation of captioning, too. Because the mobile multiplex is intended to be decoded independently of the main program, mobile captions must be embedded separately within the mobile transmission, even when the main program is being simulcast. At present, there is no FCC mandate to carry captions within a mobile transmission.

Captions transmitted in the ATSC mobile standard ATSC A/153, as described in CEA-708D, can be carried using descriptors as defined in ATSC A/65, constrained as per ATSC A/72 (i.e., as for AVC coding). The captions are listed, together with video, audio and other services, in the service map table, or SMT-M/H. One key difference from A/72 is that variable bit rates, not to exceed 9600b/s, are permitted for the closed caption payload. (That is, packing bytes need not be used, and when captions are not present, no bandwidth allocation is needed.) This was an intentional difference from older versions of CEA-708, which required the fixed allocation of 9600b/s for DTVCC. Another difference is that closed captioning, AFD and Bar Data are not carried in the SVC enhancement layer, when that is used. Receivers decoding the SVC enhancement layer are expected to use the information present in the SVC base layer, i.e., the AVC “compatibility” stream.
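
As a rough illustration of the variable-rate rule, the sketch below checks a series of caption payload sizes against the 9600b/s ceiling. The one-second measurement interval and the function names are assumptions made for this example; the article does not state how the standard defines the averaging window.

```python
# Hypothetical check of a variable-rate caption payload against the 9600b/s
# ceiling. The one-second interval is an assumption for illustration only.

MAX_CAPTION_BPS = 9600  # ceiling quoted in the text


def caption_bit_rate(payload_bytes, interval_seconds=1.0):
    """Bit rate implied by payload_bytes sent during one interval."""
    return payload_bytes * 8 / interval_seconds


def within_caption_budget(payload_sizes, interval_seconds=1.0):
    """True if every interval's payload stays at or below the ceiling."""
    return all(
        caption_bit_rate(size, interval_seconds) <= MAX_CAPTION_BPS
        for size in payload_sizes
    )


# Bytes of caption payload per interval; 0 means no captions were present,
# so no bandwidth is consumed in that interval.
samples = [0, 300, 1200, 0]

print(within_caption_budget(samples))   # True
print(within_caption_budget([1201]))    # False
```

Intervals with no captions simply contribute zero bytes, which mirrors the point that no bandwidth allocation is needed when captions are absent.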

3-D closed captioning

The Television Data Systems Subcommittee of CEA is considering how CEA-708 caption services can be rendered with stereoscopic 3-D program content, and how basic 3-D coding capability might be added to CEA-608. CEA is welcoming participants to join the appropriate workgroups.

Aldo Cugnini is a consultant in the digital television industry.

Send questions and comments to: aldo.cugnini@penton.com