Multilingual broadcasting

In today's multicultural broadcasting environment, the demand for program material in multiple language versions is increasing. Broadcasters seeking to build wider audiences for their content can do so by making programs viewable in the appropriate local languages for each target region. By providing this service, broadcasters not only increase their potential audience size, but also foster greater viewer loyalty and realize new opportunities for increasing advertising revenues through highly targeted campaigns.

While the addition of language tracks to the broadcast environment offers distinct benefits, it also adds greater complexity, which in turn can introduce new operational issues. The dubbing of additional languages to the original program can take up valuable time, and management of content becomes more difficult as staff members work with multiple versions of the same program, each in a different language. Given these challenges, broadcasters today need a faster, more manageable way to provide multilanguage broadcasting.

The trouble with tapes

Broadcasters that serve multilanguage markets frequently need to play out the same program concurrently in two or more languages. To provide this service for their audiences, broadcasters must add a different audio track for each language to the program. Historically, to add more languages, broadcasters often had to send a copy of the program to a dubbing house, where a new audio track was created for each required language. Back at the station, operators then had to insert the new track into the program as an additional audio track (space permitting), or make a separate copy of the program containing the new language track if they exceeded the number of audio tracks that they could physically get on one tape.

In instances where multiple languages were stored on tapes, automation systems had to understand which language was on which track, and the broadcaster had to establish clear operational practices describing the stack of audio tracks. However, in many operations, content originated from multiple sources, often resulting in the various languages being on the wrong track. This led to a requirement for the automation system to shuffle the tracks for playout by controlling audio routing at transmission or via audio mixing capabilities in the master control switcher.

Managing multiple audio tracks required multiple passes through the tape creation workflow, with different languages being added to a tape at different times. To a large degree, this operational overhead has persisted as systems have migrated from tape to server-based playout.

The addition of captions or subtitles brought another set of challenges to multilanguage broadcasting. Typically there were two options for adding the caption information: Insert the captions and subtitles during ingest from tape to server, or insert them live during playout. In either case, the workflows were linear operations, and required management processes and intelligence in the automation system to ensure the correct captions were combined with the correct video and audio essence.

Solving the problem in files

Today there are solutions available that reduce the overhead of managing multilanguage playout. The approach is to facilitate working with video, audio and captions as discrete pieces of essence that can be ingested and processed at different times and then combined as a complete asset for playout.

Within this streamlined workflow, the system enables operators to classify language tracks with an identification code that is understood by the automation system and server. This track tagging technology allows audio files for language tracks to be combined in a wrapper with the video essence file and ancillary data such as captioning.

For example, a broadcaster sends a video clip with English audio on track 1 to the translation service provider with a request to add a Spanish-language audio track. Instead of laying down the new track in real time against the video clip, the translation service records a new language track and sends only the audio track back to the broadcaster. (See Figure 1.) Using this approach, the broadcast facility can quickly add the Spanish audio to the video clip as track 2.
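As a rough illustration of the example above, the following minimal Python sketch models a clip as a video file plus a list of separately delivered, language-tagged audio files. The class names, file names and two-letter language tags are assumptions made for illustration; they do not describe any particular server or automation product.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AudioTrack:
    language: str      # language tag such as "en" or "es" (tag format is an assumption)
    essence_file: str  # hypothetical name of the discrete audio file


@dataclass
class Asset:
    video_file: str
    audio_tracks: List[AudioTrack] = field(default_factory=list)

    def add_audio(self, language: str, essence_file: str) -> int:
        """Attach a discrete audio file and return its 1-based track number."""
        if any(t.language == language for t in self.audio_tracks):
            raise ValueError(f"asset already has a {language!r} track")
        self.audio_tracks.append(AudioTrack(language, essence_file))
        return len(self.audio_tracks)


# The original clip arrives with English audio on track 1.
clip = Asset(video_file="clip_video.mxf")
english_track = clip.add_audio("en", "clip_audio_en.wav")

# Later, only the Spanish audio file comes back from the translation service.
spanish_track = clip.add_audio("es", "clip_audio_es.wav")
print(english_track, spanish_track)
```

The video is never touched when the second language arrives; only a new audio entry is appended to the asset.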

In addition to introducing operational efficiencies during ingest and preparation, the track tagging functionality also solves the problem of audio shuffling during playout. Instead of requiring routing systems or master control switchers to shuffle audio tracks, the automation systems and server use the track ID system to ensure the correct audio tracks are played out on the correct channels.
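The routing idea can be sketched in a few lines of Python. This is a simplified model, not a vendor implementation: the function name, the channel names and the language tags are all hypothetical, and a real system would carry the tags in the file metadata rather than in a plain list.

```python
def route_by_language(track_tags, channel_plan):
    """Return {output_channel: source_track_number} using language tags.

    track_tags:   language tag of each stored track, in stored order
    channel_plan: {output_channel: required_language}
    """
    # Map each tag to its 1-based track number. If a tag appears twice,
    # the later track wins in this simple sketch.
    track_number = {tag: i for i, tag in enumerate(track_tags, start=1)}
    routing = {}
    for channel, language in channel_plan.items():
        if language not in track_number:
            raise LookupError(f"no track tagged {language!r} for {channel}")
        routing[channel] = track_number[language]
    return routing


plan = {"channel_uk": "en", "channel_es": "es"}

# Two clips from different sources carry the same languages in a different order.
clip_a = route_by_language(["en", "es"], plan)
clip_b = route_by_language(["es", "en"], plan)
print(clip_a)
print(clip_b)
```

Because the lookup is by tag rather than by track position, the two clips play out correctly without any per-clip routing rules.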

This more efficient and highly flexible model of multilanguage playout represents a significant improvement for broadcasters, but today's marketplace demands even greater versatility in localizing programming and playout.

Audiences are growing more complex, and in order to deliver the expected level of service to customers and to capitalize on revenue generation opportunities, broadcasters need to be innovative and flexible in their approach to program distribution in multiple languages.

In some instances, getting the right language version to the right channel at the right time is no longer enough. Increased demands for localized versions that take into account advertising break structures in different countries, edits of popular shows with different parental guideline ratings, and versions appropriate to cultural standards and religious sensitivities must all be accommodated by broadcasters.

The same statement can take more or less time to say in different languages, so different video edit points may also be needed. Multiply this variation in clip length across many clips, and the difference can be significant. For example, the English sentence “The same sentence in different languages can have different durations” is translated into German as “Der gleiche Satz kann in anderen Sprachen unterschiedliche Länge haben.”

In addition to managing program audio, broadcasters also must take into account the need to deliver different credit sequences and promotional material to various regions. All of these tasks add up to many versions, and potentially to a significant operational and storage overhead.

Greater flexibility

The MXF standard supports wrapping multiple essence and data streams in a single file. There are a variety of operational patterns that support varying degrees of modularity in how the components can be combined within a wrapper and subsequently accessed.

The more sophisticated patterns can support multiple sets of essence and data within the file, along with metadata that describes how they are combined for playback; in effect, edit decision lists (EDLs) within the wrapper. (See Figure 2.) This flexibility means that we can play different language variants of a program from a single file, and that each variant may have different video edits in addition to its own audio tracks and captioning data.

Exploiting the flexibility of MXF requires a new approach to the automation, asset management and associated systems that implement and manage the file workflows. These systems now need to work with compound, hierarchical file structures and support multiple workflows for discrete essence and data files.

To illustrate the potential and problems, consider the challenge of creating a file that can be played out in two languages, with each language version having different opening and closing credits. We could start with a straightforward ingest of a program (e.g. English) into an MXF wrapper with the video and audio essence stored as separate files within it. The ingest process also needs to create metadata that defines the playback of the files. For simplicity, let's consider in and out points for the video file. Typically, after ingest, the file would be moved to an archive system.

As a subsequent process, we want to create the second language version (e.g. Spanish) of the program. To do this, we need only the video files for the new opening and closing credits and a new audio track. Once created, these essence files need to be added to the MXF file in the archive, along with new metadata that describes how to play the new essence out with components of the existing video essence to create the Spanish program.
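The two-version scenario can be modeled with a short Python sketch. It is a conceptual illustration only: it does not read or write MXF, and the class names, essence keys, file names and frame numbers are all hypothetical. Each version is simply a list of in and out points over the essence items held in one wrapper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Segment:
    essence: str    # key of a video essence item in the wrapper
    in_frame: int   # first frame played (inclusive)
    out_frame: int  # frame at which playback stops (exclusive)

    @property
    def duration(self) -> int:
        return self.out_frame - self.in_frame


@dataclass
class Version:
    audio: str            # key of the audio essence for this language
    video: List[Segment]  # ordered segments, in effect a small EDL

    @property
    def duration(self) -> int:
        return sum(segment.duration for segment in self.video)


class Wrapper:
    """Toy stand-in for a compound file holding essence plus playback metadata."""

    def __init__(self) -> None:
        self.essence: Dict[str, str] = {}        # essence key -> file name
        self.versions: Dict[str, Version] = {}   # version name -> playback metadata

    def add_essence(self, key: str, file_name: str) -> None:
        self.essence[key] = file_name

    def add_version(self, name: str, version: Version) -> None:
        needed = [version.audio] + [s.essence for s in version.video]
        missing = [key for key in needed if key not in self.essence]
        if missing:
            raise KeyError(f"essence not in wrapper: {missing}")
        self.versions[name] = version


# Initial ingest: the English program, played from its first to its last frame.
wrapper = Wrapper()
wrapper.add_essence("video_en", "program_video.mxf")
wrapper.add_essence("audio_en", "program_audio_en.wav")
wrapper.add_version("en", Version("audio_en", [Segment("video_en", 0, 45000)]))

# Later: add only the Spanish credits and audio, then describe the new version.
wrapper.add_essence("open_es", "open_credits_es.mxf")
wrapper.add_essence("close_es", "close_credits_es.mxf")
wrapper.add_essence("audio_es", "program_audio_es.wav")
wrapper.add_version("es", Version("audio_es", [
    Segment("open_es", 0, 750),         # new opening credits
    Segment("video_en", 750, 44250),    # program body reused from the English video
    Segment("close_es", 0, 900),        # new, slightly longer closing credits
]))

print(wrapper.versions["en"].duration, wrapper.versions["es"].duration)
```

In this sketch the program body is stored once and referenced by both versions, so adding Spanish costs only the new credits, the new audio and a few lines of metadata.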

While conceptually simple, the above approach introduces some demands on transmission infrastructure, for example servers and archive systems that can combine discrete files into single wrappers, as well as create and manage the metadata that describes the playout of different versions.

Some marketing campaigns would have us believe that end-to-end workflow solutions are already ubiquitous, but in reality the full potential of MXF has yet to be exploited. However, the concepts have been proven and demonstrated in multivendor systems. Several companies within the Advanced Media Workflow Association have cooperated to show workflows such as this multilanguage example and then to work on standardization within the MXF framework to facilitate better interoperability.

Efficiencies within reach

Bringing efficiencies to multilanguage operations is just one of the challenges faced by broadcasters today. From readily available solutions such as track tagging through to the more advanced approaches that MXF can facilitate, it is possible to take significant time and cost out of the processes while at the same time achieving more sophisticated results on-air.

Brian Kane is the technical marketing manager for Snell.