Multilingual subtitle creation

A prominent feature of today's global media world is the increasingly important role played by subtitling. More broadcasters are achieving global media distribution through the use of tapeless post production, storage and playout, and relying on the latest subtitling technologies to make content accessible in regions where dubbing is too costly an alternative and only suitable for premium content such as blockbuster movies. Legislation creates an additional requirement for subtitling for the hearing impaired, although all content providers (including advertisers) are keen to reach the widest audience possible, regardless of accessibility issues. With the proliferation of Internet-based video, lawmakers are starting to propose that streamed and downloaded content should also carry subtitles. And in public places such as airports, gyms and other high-traffic areas where sound is muted or ambient sound levels are high, subtitles are becoming more widely used. Overall, the demand for subtitling is at an all-time high. (See Figure 1.)

But there are key issues to overcome to achieve the successful and efficient playout of subtitles to an ever-growing array of platforms. There is a very diverse range of video and wrapper formats to be supported, and there are significant limitations with some playout technologies. Although wrapper formats such as Material eXchange Format (MXF) and QuickTime are sophisticated containers for transporting and storing media, broadcast equipment does not typically support the storage of multilanguage subtitles in a format suitable for repurposing.

Although there are several newer formats that promise to be the new all-encompassing standard for the future, the massive number of legacy files in the field means any system must cope well with the import, repurposing and export of a huge number of formats. Broadcasters and content producers have simply invested too much to discard legacy files, with the vast majority being proprietary and often very guarded, vendor-specific formats. So in any subtitling process, great care must be taken when crossconverting formats to ensure the preservation of metadata and other essential information about the subtitles.

Subtitle-processing software can take advantage of existing technologies to embed data, enabling multilingual subtitles in a single version of an asset. This can be achieved by using VBI or VANC tracks associated with the media asset. The embedded subtitle data can be converted downstream to a variety of formats, including burnt-in subtitles or DVB bitmap subtitles.

Wrapping for diversity and repurposing

Like any aspect of broadcasting, subtitling can be a slow, labor-intensive process if it's done with outdated methods. The key to using subtitling extensively and effectively is an efficient workflow that allows broadcasters to shorten the creation and playout cycle and keep costs at a manageable level. All broadcasters have different requirements and processes and need individually tailored workflow solutions. The subtitling component needs to be integrated into the broadcaster's overall solution, preferably during the initial design of the system. With the goal of supporting a multitude of output video formats, the focus is shifting away from the traditional production system and transmission chain toward digital asset management (DAM) systems.

To aid in format and resolution conversion for diverse distribution formats, content is increasingly stored as a single, common “mezzanine” format representing the highest-quality version, and all subsequent broadcast and streaming versions are derived from that. This can be wrapped as a universal format for easier exchange, and, to further aid repurposing, subtitle data should also be stored in a highly generalized form suitable for repurposing at transmission time.

There are two methodologies broadcasters employ when taking this approach. One relies on the creation of an “übersubtitle,” which carries as much information as possible about the subtitle and from which less sophisticated subtitles can be derived. This “mezzanine subtitle” will often rely on a professional subtitler to make informed choices for presentational aspects such as font, color, position and alignment, drop shadow, and character edging, sometimes following prescriptive formats or house styles defined by the agency or broadcaster. Such “über” formats can support the media asset for its lifetime, allowing for elegant, effective and highly automated translation to various output distribution formats.
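As a rough illustration of deriving a simpler subtitle from a rich one, the following Python sketch reduces a hypothetical rich cue to a basic caption-style cue. The class names, field names and color palette are assumptions made for this example and are not taken from any subtitle standard or product; the 32-character default reflects the row width of EIA-608 captions and should be treated as configurable.

```python
import textwrap
from dataclasses import dataclass
from typing import List

# Hypothetical rich cue: field names are illustrative only.
@dataclass
class RichCue:
    start: float              # seconds from start of program
    end: float
    text: str
    font: str = "Arial"
    color: str = "white"
    align: str = "center"
    drop_shadow: bool = False

# Hypothetical basic cue carrying only what a simple caption format can show.
@dataclass
class BasicCue:
    start: float
    end: float
    lines: List[str]
    color: str

# Assumed limited palette for the simpler target format.
BASIC_COLORS = {"white", "yellow", "cyan", "green", "magenta", "red", "blue"}

def derive_basic(cue: RichCue, max_chars: int = 32) -> BasicCue:
    """Drop presentation detail the target cannot express and re-wrap the text."""
    lines = textwrap.wrap(cue.text, width=max_chars)
    # Fall back to white when the rich color has no counterpart in the palette.
    color = cue.color if cue.color in BASIC_COLORS else "white"
    return BasicCue(cue.start, cue.end, lines, color)
```

The derivation is one-way: font, alignment and drop shadow are discarded, which is why the richer form is the one worth archiving.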

Alternatively, broadcasters may use a transcode method that relies on a lowest-common-denominator file (such as World System Teletext or U.S.-style EIA-608-compliant captions) being created and then upconverted to the target format. It's a quick and easy approach, but it does not take advantage of the sophisticated options available within higher-end standards (such as DVB subtitling to ETS 300 743, DVD bitmap or EIA-708).

Many broadcasters choose to adopt a hybrid of the two approaches, implementing some of the capabilities while limiting the overall time dedicated to creating the subtitle by constraining and automating some options. In any case, it is well worth noting that different standards offer varying levels of control and sophistication.

The good news is that with effective content management, the subtitle data contained in the media wrapper can enhance the asset metadata, providing a rich, searchable source of content-related information. This can further enhance efficiency for logging and research as well as enable monetization of content, thus allowing the subtitling process to deliver value throughout the future of the asset.
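One simple way to picture subtitle text as searchable metadata is an inverted index from words to the times at which they are spoken. The sketch below is a minimal illustration only; the function names and the (start time, text) cue layout are assumptions, and a real asset management system would use its own search engine.

```python
import re
from collections import defaultdict

def build_index(cues):
    """Map each word to the sorted start times (seconds) of cues containing it.

    cues: iterable of (start_seconds, text) pairs, a hypothetical layout.
    """
    index = defaultdict(list)
    for start, text in cues:
        # set() so a word repeated within one cue is recorded once.
        for word in set(re.findall(r"[a-z0-9']+", text.lower())):
            index[word].append(start)
    return {word: sorted(times) for word, times in index.items()}

def search(index, word):
    """Return the start times at which the word appears, or an empty list."""
    return index.get(word.lower(), [])
```

A researcher looking for every mention of a term could then jump straight to the matching times in the asset rather than viewing it in full.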

Early, late or live binding for delivery

Once the created subtitle file is signed off for delivery, it must be bound to the content to enable it to be presented to viewers when they watch the programming.

This binding can be done at one of three stages in the process:

Early binding: The preprepared file is matched to the programming well ahead of transmission.
Late binding: Late binding occurs near to airtime and is only possible thanks to faster-than-real-time encoding techniques.
Live binding: Live binding is employed for either live content or preprepared content that only becomes available very close to airing, making it impossible to prepare subtitles in advance. (See Figure 2.)

The traditional method for subtitling preprepared content was early binding by creating a submaster tape with the subtitles encoded into the VBI space on the tape by inserting them into baseband video. Although this was an appropriate method when linear, single-channel broadcast TV was the only form of output (it is still possible to do subtitling this way), it is now outmoded because it's so time- and labor-intensive. Instead, files are now either sent for time-of-air transmission (live binding) or are transcoded into a file-based video asset (during early or late binding). (See Figure 3.)

Time-of-air transmission generally involves systems that integrate with the automated workflow of a master control facility, with the subtitle playout system approving files in advance, and then airing the correct file at the right time automatically, either with or without time code. The time-of-air system can also be used as a gatekeeper for real-time subtitling, with the system authenticating the subtitler and work slot prior to allowing pass-through to air. This system of checks is useful given the distributed and freelance nature of real-time subtitling.
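The gatekeeper role can be sketched as a check that the subtitler is authenticated and holds a work slot covering the current time on the channel in question. This is an assumed, simplified model; the WorkSlot fields and the may_pass_through function are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical booking record for a real-time subtitling session.
@dataclass
class WorkSlot:
    subtitler_id: str
    channel: str
    start: datetime
    end: datetime

def may_pass_through(subtitler_id, channel, now, slots, authenticated_ids):
    """Allow live subtitle data to air only for an authenticated subtitler
    who has a booked slot on this channel that covers the current time."""
    if subtitler_id not in authenticated_ids:
        return False
    return any(
        slot.subtitler_id == subtitler_id
        and slot.channel == channel
        and slot.start <= now < slot.end
        for slot in slots
    )
```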

A hybrid of ingest and time-of-air methods has become increasingly popular and results in ingest of the video asset whenever possible, with time-of-air playout where appropriate. The time-of-air subtitle system receives the playlist information directly from the automation system. Issues such as missing files, missing time code and media lacking encoded subtitles or metadata can be reflected directly back to the automation system to ensure master control staff take immediate remedial action. Where the automation playlist indicates that the video asset already contains subtitle data, the time-of-air subtitle system can check that it is complete and QC the subtitles, flagging any errors appropriately. The time-of-air subtitle system can also provide interfaces to other ancillary data signals and XDS information such as widescreen signaling, V-chip parental controls, broadcast flag information, DRM controls such as CGMS-A data, DPI data and more.
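The playlist checks described above might look something like the following sketch, which scans upcoming events and collects problems to report back to automation. The PlaylistEvent fields and the problem strings are assumptions for illustration; real automation interfaces differ by vendor.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical playlist entry as received from an automation system.
@dataclass
class PlaylistEvent:
    event_id: str
    subtitle_file: Optional[str]   # None when subtitles should be embedded in the media
    has_timecode: bool
    embedded_subtitles: bool

def check_event(event, available_files):
    """Return a list of problems found for one playlist event."""
    problems = []
    if event.subtitle_file is not None and event.subtitle_file not in available_files:
        problems.append("missing subtitle file")
    if event.subtitle_file is None and not event.embedded_subtitles:
        problems.append("no embedded subtitles")
    if not event.has_timecode:
        problems.append("missing time code")
    return problems

def check_playlist(events, available_files):
    """Map event IDs to their problems, omitting events that pass every check."""
    report = {}
    for event in events:
        problems = check_event(event, available_files)
        if problems:
            report[event.event_id] = problems
    return report
```

Running the check well ahead of each event's air time gives master control staff time to act on the report.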

Broadcast equipment technologies continue to roll out at a rapid pace, and it is important that ancillary data (such as subtitle data) is supported in all equipment in the transmission chain. Systems have to remain interoperable and ideally include APIs developed to assist in the exchange of data around the modern workflow.

The MXF format provides an open-standard wrapper for broadcast media including ancillary data, and many equipment vendors support it because it enables seamless workflows. MXF is an extremely flexible specification, and most broadcasters who use the standard design a profile or shim to remove some of the ambiguity from this comprehensive and far-reaching standard. Any profile used should define the way subtitling data is supported.

Adapting to moving targets through subtitle transcoding

When broadcasters want to generate multiple versions of the same content for alternative nonbroadcast distribution, such as for online video, distribution methods must employ a variety of subtitling technologies, all of which must be supported in a holistic way by the subtitling system.

When a mezzanine video format is implemented and includes the übersubtitle file, the subtitle data component can be transcoded appropriately at the same time as the video, ensuring quality equal to or better than that of the broadcast version. In addition to supporting different output distribution formats, workflows often need to provide for reversioned outputs, where the timing of the video asset is manipulated or the asset is split into different program segments. This reversioning process is often carried out on a nonlinear editing (NLE) system and can effectively destroy the subtitling data, but modern transcoding solutions can avoid this by using the edit decision list (EDL) from the NLE to bridge the original and final versions of the subtitle data.
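To show how an edit decision list can carry subtitles across a reversion, the sketch below maps each subtitle cue from source time to record time, trimming cues at cut points and dropping those that fall in removed material. It is a simplified model under stated assumptions: times are plain seconds rather than time code, and the EdlEvent and Cue structures are hypothetical rather than a real EDL file format.

```python
from dataclasses import dataclass

# Hypothetical EDL event: source range [src_in, src_out) placed at rec_in.
@dataclass
class EdlEvent:
    src_in: float
    src_out: float
    rec_in: float

@dataclass
class Cue:
    start: float
    end: float
    text: str

def retime(cues, edl):
    """Map cues from the original timeline onto the edited timeline."""
    out = []
    for event in edl:
        offset = event.rec_in - event.src_in
        for cue in cues:
            # Keep only the part of the cue that lies inside this event.
            start = max(cue.start, event.src_in)
            end = min(cue.end, event.src_out)
            if start < end:
                out.append(Cue(start + offset, end + offset, cue.text))
    return sorted(out, key=lambda c: c.start)
```

For example, with an edit that keeps source seconds 0 to 10 and then 50 to 60, a cue at 58 to 63 seconds is trimmed at the cut and lands at 18 to 20 seconds in the new version, while a cue at 20 to 24 seconds is dropped because its material was removed.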

Again, the embedded data concept plays an important part here. The mezzanine format can be either a rich or a lowest-common-denominator format, but it is vital that it is unbreakable, to reduce risk and ensure that the subtitles reach the viewer correctly. The wrapper provides a means to store subtitle data in a way that makes it transparent to the video and allows multiple languages to be stored within the same asset. This further reduces risk and saves time and storage space through the use of a single-asset version.

By using the VBI or VANC tracks that are available in wrapper formats such as MXF, subtitle data can be stored without having to be encoded into the video essence. In addition to the previously mentioned advantages, this ensures that the video remains clean and is not degraded by additional transcoding processes.

By implementing an integrated subtitle creation/repurposing, encoding and transmission solution, the subtitle data embedded in the media asset can then be seamlessly transcoded to the desired format at time-of-air, or even completely bypassed. Viewers will enjoy the highest quality of subtitles however they choose to consume their content, and content providers will reach the widest audience.

Darren Forster is chief technology officer at Softel.