Multilanguage broadcasting

As multichannel satellite and cable reaches out to every corner of the world, many of the prime channels are global entertainment brands. Although national broadcasters make productions in their local languages, much of the programming on multichannel television will be sourced from content factories of the global brands, notably from Los Angeles.

In the days of videotape, the process of repurposing content for local languages, and ensuring compliance with local regulations, moved very slowly. It could take well over a year for a series out of Hollywood to propagate across the globe. The rise of piracy, especially file sharing, meant that pirate copies could be viewed a year before a program was shown in a local language version.

One way to address this is to accelerate the release cycle. Leading media companies have now set up processes allowing a French or German version to be available within days of the U.S. airing. What enables this change? One factor has been the shift from tape to file distribution. Duplicating the master, formerly a tape dub, becomes a simple file copy, and transcoders can quickly create browse files for subtitlers and other editorial processes.

Repurposing

Two operations are needed to re-version content for different markets: compliance editing and translation. Many countries have different or stricter regulations on program content. Material of a sexual or violent nature may need to be edited in order to comply with local laws. Similar editing may be necessary for airline versions or for family viewing.

Once a compliant version is available, the next step is to translate the soundtrack. This can be by dubbing a new language track or by subtitling.

In some territories, access services may also be required: closed captions and audio description in the new language. For some programming, it may be necessary to create graphics in the new language, although for titles and credits, a subtitle usually suffices.

The cost of translation will vary widely. The least expensive is the translation of existing same-language closed captions to the new language to use as subtitles. This is a text-to-text translation without sight of the original video. The most expensive is a full dub with well-known actors voicing the parts.

Processes

Videotape has stood the industry well. However, as the demands of multichannel, multiformat delivery add to the necessary processes of repurposing, the costs of using tape are proving a barrier for broadcasters looking to exploit the new opportunities. (See Figure 1 on page 8.)

Subtitling

Translation subtitling, as opposed to same-language closed captions, is a cost-effective alternative to dubbing. Subtitles were originally keyed over the video — open subtitles. The development of teletext allowed the subtitles to be carried in the VBI of the PAL system, and receiver design evolved to support teletext and closed captioning. This same technology can also be used to carry closed subtitles, with up to four languages per VBI line.

DTV provides new technologies; the DVB Project defines two systems for delivering subtitles. DVB Teletext (EN 300 472) allows EBU teletext to be carried in a DVB bitstream. This retains backwards compatibility with analog VBI carriage. The second system is DVB subtitling (EN 300 743), a more efficient and comprehensive system specifically for adding captions. A DVB subtitling stream can contain several services to support a number of languages simultaneously. The viewer selects the appropriate language at the receiver. The European trade association DIGITALEUROPE (formerly EICTA) and many national bodies recommend that digital TV receivers should support EN 300 743 subtitles and EN 300 472 Teletext.

For satellite delivery, DVB subtitles have the added advantage that multiple subtitle languages can be packaged in the same transport stream as the video, so one channel can serve many language regions. Contrast the older burnt-in “open” subtitles that require one video channel per language. Ten to 16 closed subtitles can easily be accommodated within a single video channel.

To create subtitles, the first step is to create a dialog script, either from the shooting script or by transcription. To lower costs, this may use speech-to-text software, with the transcriber respeaking the dialog. By using a single speaker, the software achieves much better accuracy.

Dubbing

The choice of subtitling or dubbing is largely cultural. Viewers in Germany, France, Italy and Spain expect foreign-language productions to be dubbed. Other countries are happy with subtitles. Dubbing is a much more expensive process and can cost 10 times as much as subtitling, or more. If that is what viewers want, however, then broadcasters are obliged to provide the service.

Some East European countries use an alternative that costs less than dubbing. In lectoring, a narrator voices over the full mix in the original language, describing the original dialog.

The quality of dubbing varies widely. At the low end, a single narrator may speak all the parts, male and female. At the high end, five or six actors read the parts. It is conventional to pair each famous Hollywood star with one local actor, so viewers come to associate that local actor's voice with the foreign star.

Dubbing again starts with the dialog script, which is first translated to the final language. Good dubbing will attempt lip-sync. The result of the dub is generally a full mix delivered in stereo or 5.1 as a BWAV file.

Technology

Tape-based systems were naturally inefficient, as every stage required dubbing to new tapes. This not only takes time and human resources, but also adds extra QC stages as well as the inevitable quality losses with each generation of copying.

Consider a stereo SD program. The master tape carries four audio tracks: L & R full mix, and L & R music and effects (M&E). Language versions require a tape for every one or two stereo language final mixes. In a multilingual playout center, that adds up to a lot of tape movements.
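As a rough illustration of how the tape count scales, the following sketch assumes that each language-version tape holds at most two stereo final mixes on its four audio tracks. That is one reading of the description above, and the function name and example language counts are hypothetical.

```python
import math


def language_tapes_needed(num_languages: int, mixes_per_tape: int = 2) -> int:
    """Tapes needed for the language versions of one program.

    Assumption: a tape has four audio tracks, so it can hold up to
    `mixes_per_tape` stereo final mixes (two by default).
    """
    if num_languages < 0 or mixes_per_tape < 1:
        raise ValueError("counts must be non-negative, with at least one mix per tape")
    return math.ceil(num_languages / mixes_per_tape)


# Hypothetical playout center carrying 10 languages for one program.
tapes_for_ten = language_tapes_needed(10)
# An odd number of languages leaves one tape half used.
tapes_for_five = language_tapes_needed(5)
```

Every one of those tapes has to be dubbed, checked, shipped and loaded for each program, which is where the tape movements come from.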

Using files in the workflow, it is possible to add extra languages and subtitles without the need to transcode or modify the video track in any way. The advantages are much simpler QC, as well as the elimination of all the physical handling stages needed for tape dubbing.

A popular file format for playout is the QuickTime movie. The video and audio components are stored in the QuickTime container as separate files. When additional audio tracks become available from the dubbing process, they can be added into the container.

More recently, MXF has become a favored format. This family of SMPTE standards was created to meet the needs of the film and television sector. Although MXF provides great flexibility, the most used operational patterns are OP Atom (SMPTE 390M) and OP1a (SMPTE 378M). Both find application in acquisition, and OP1a also serves as a general replacement for videotape in the file domain. However, neither of these fits the needs of re-versioning.

OP1a interleaves the content, so adding extra audio tracks for new languages means the files must be rebuilt. OP Atom defines only a single essence track, and needs some external means to synchronize video and audio tracks.

A new application specification of MXF is being developed specifically as a wrapper for multiversion applications: AMWA's AS-02 for MXF Versioning. This will be suited not only to multilingual use cases, but also to any case where different versions of an asset are needed. Examples include airline edits, family versions or versions tailored to a consumer device like a mobile phone.

AS-02 carries several metadata files, specifically the asset file and the manifest file. (See Figure 2.) It can optionally carry a shim. The asset or version file is an MXF file with external essence in OP1b, OP2b or OP3b format. It references the separate OP1a essence tracks in the media folder and provides sync information. Additional language tracks can be added to the media folder without any need to process the existing video and audio tracks. The only operation needed is to update the asset file. The manifest file is like a packing list or directory, and it lists all the files in the asset bundle.
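The relationship between the version file, the media folder and the manifest can be sketched with a toy model. This is not the AS-02 on-disk layout or any real MXF library API; the class names, file names and revision counter are hypothetical and serve only to show that adding a language leaves the existing essence untouched.

```python
from dataclasses import dataclass, field


@dataclass
class VersionFile:
    """Stand-in for the asset (version) file: it references essence, holds none."""
    name: str
    track_refs: list = field(default_factory=list)  # names of files in media/


@dataclass
class Bundle:
    """Toy model of an asset bundle (hypothetical, not the real specification)."""
    version: VersionFile
    media: dict = field(default_factory=dict)  # essence file name -> times written

    def add_essence(self, filename: str) -> None:
        if filename in self.media:
            raise ValueError(f"{filename} is already in the media folder")
        # The new essence file is written once; existing entries are not touched.
        self.media[filename] = 1
        # The only existing file that changes is the version file.
        self.version.track_refs.append(filename)

    def manifest(self) -> list:
        # Packing list: the version file plus every essence file.
        return [self.version.name] + sorted("media/" + f for f in self.media)


bundle = Bundle(VersionFile("episode_v0.mxf"))
for essence in ("video.mxf", "audio_en.mxf"):
    bundle.add_essence(essence)

before = dict(bundle.media)          # snapshot of the original essence
bundle.add_essence("audio_fr.mxf")   # French dub returns from the dubbing house
```

After the French track is added, the entries for the original video and English audio are unchanged, and only the version file's reference list and the manifest have grown.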

The specification is not prescriptive, so broadcasters can adapt the use of the asset bundle to fit their workflow. It is designed to be flexible, but broadcasters can define a “shim” that constrains the specification to meet their needs.

AS-02 allows additional, non-MXF files to be wrapped as part of the asset. (See Figure 3.) These could be QC reports, caption files and scripts, possibly as Microsoft Word files. These extra files are human readable, so back office staff can open and edit them without the need for an NLE, which would be needed to open media files.

An asset bundle could represent all versions of a program or could be used for version subsets. For example, a broadcaster could have a Los Angeles asset, a London asset and a Singapore asset. The Singapore asset would wrap all the languages used in transmission to the Pacific Rim, while the London asset would wrap all the European languages. Multiple asset files can access the same media essence files, which could lead to savings in storage requirements.
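The potential storage saving from shared essence can be estimated with a few lines of arithmetic. The track names, the regional groupings and the sizes in gigabytes below are invented for illustration only.

```python
# Hypothetical essence sizes in GB; video dominates, audio tracks are small.
essence_gb = {
    "video": 40,
    "audio_en": 1, "audio_fr": 1, "audio_de": 1,
    "audio_zh": 1, "audio_ja": 1,
}

# Hypothetical regional assets and the essence each version file references.
versions = {
    "los_angeles": {"video", "audio_en"},
    "london": {"video", "audio_en", "audio_fr", "audio_de"},
    "singapore": {"video", "audio_zh", "audio_ja"},
}


def duplicated_storage(versions: dict, sizes: dict) -> int:
    """Storage if every regional asset keeps its own copy of each track."""
    return sum(sizes[track] for tracks in versions.values() for track in tracks)


def shared_storage(versions: dict, sizes: dict) -> int:
    """Storage if all version files reference a single copy of each track."""
    referenced = set().union(*versions.values())
    return sum(sizes[track] for track in referenced)


separate = duplicated_storage(versions, essence_gb)
shared = shared_storage(versions, essence_gb)
```

With these made-up figures, most of the difference comes from storing the video essence once rather than three times.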

Management

The creation of foreign language versions is a complex operation that requires management of assets and resources. A DAM can manage the association of subtitles and language tracks with the original content on return from the subtitling and dubbing processes. The DAM gives transmission controllers a complete view of the status of the extra language components.
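A minimal sketch of the kind of status view described above follows. It does not represent any particular DAM product; the function name, language codes and component labels are hypothetical.

```python
def missing_components(required: dict, received: dict) -> dict:
    """Return {language: [missing component types]} for incomplete languages."""
    report = {}
    for lang, needed in required.items():
        gaps = sorted(set(needed) - received.get(lang, set()))
        if gaps:
            report[lang] = gaps
    return report


# What the schedule calls for, per language (hypothetical).
required = {"fr": {"dub", "subtitles"}, "de": {"dub"}, "nl": {"subtitles"}}
# What has come back from the dubbing and subtitling houses so far.
received = {"fr": {"dub"}, "de": {"dub"}}

status = missing_components(required, received)
```

A transmission controller reading `status` would see that French and Dutch subtitles are still outstanding, while the German version is complete.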

Formats like AS-02 can maintain the association of all the resulting files as a bundle following subtitling and dubbing, which should help avoid missing-material incidents in transmission.

Resource and workflow management systems can ensure the allocation of tasks to internal staff or outworkers, and supervise the scheduling. As turnaround times for foreign language versions become ever shorter, slick management is key to controlling costs.

Summary

Aside from prime networks, the revenue for channels is being squeezed as audiences disperse across more and more outlets to view content. This is forcing content providers to become “media factories.” This means adopting the methodology of mass production, much like the auto industry. Operating in multiple languages only adds to the need to control cost.

File-based operations contribute to cost-saving, but due regard must also be given to interoperability between systems. Subtitling, video servers, DAM, playout automation and management systems must all interoperate seamlessly to achieve the goal of lowering costs. Current operations often involve incompatible file formats, dictating rewrapping or conversions. Metadata must often be rekeyed — a process that can introduce errors. Unnecessary use of videotape to transport content adds QC stages.

To achieve efficient operations in multilingual content distribution, many broadcasters will have to redesign their process flows. The technology exists, and several major names in the United States are already operating digital end-to-end systems based on the latest implementations of the MXF standards. The result is cost savings and the flexibility to adapt the business to changing delivery formats.