3-D broadcast production is struggling to find its way.

Although the jury is still out on whether 3-D entertainment delivery has reached a mainstream threshold, there is no doubt that electronic distribution of 3-D content, in a form compatible with today’s digital transmission systems, is required to keep all business paths viable. Several initiatives now under way are together forming a new 3-D ecosystem.

3-D file and content delivery is becoming formalized. On the production side, X3D (ISO/IEC 19775) is an XML-based file format that has been developed to integrate network-enabled 3-D graphics and multimedia. Each X3D application is a 3-D, time-based space that contains graphic and aural objects that can be loaded over a network and dynamically modified through a variety of mechanisms. X3D does not define physical devices, transformations or parameters, such as screen resolution and input devices, but rather it provides for the interpretation and implementation of 3-D functionality.

Each X3D application establishes a definition and composition of a set of 3-D and multimedia objects, and a coordinate space for those objects. X3D can specify hyperlinks to other files and applications, can define programmatic or data-driven object behaviors, and can connect to external modules or applications via programming and scripting languages. X3D was designed to be broadcast-ready, supporting all manner of fixed and mobile devices.
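
To make the XML structure concrete, here is a minimal sketch that builds a one-object X3D scene using Python's standard library. The element and attribute names (X3D, Scene, Transform, Shape, Appearance, Material, Box) come from the X3D standard; the specific values (a red 2 x 2 x 2 box, the "Interchange" profile, version 3.3) are illustrative assumptions, not taken from any particular production workflow.

```python
import xml.etree.ElementTree as ET


def build_scene() -> ET.Element:
    """Build a tiny X3D scene: one colored box placed in the coordinate space."""
    # Root element; profile and version values are illustrative choices.
    root = ET.Element("X3D", {"profile": "Interchange", "version": "3.3"})
    scene = ET.SubElement(root, "Scene")

    # A Transform positions its children within the scene's coordinate space.
    transform = ET.SubElement(scene, "Transform", {"translation": "0 1 -5"})

    # A Shape pairs an appearance (material) with a geometry (a box).
    shape = ET.SubElement(transform, "Shape")
    appearance = ET.SubElement(shape, "Appearance")
    ET.SubElement(appearance, "Material", {"diffuseColor": "0.8 0.1 0.1"})
    ET.SubElement(shape, "Box", {"size": "2 2 2"})
    return root


scene_xml = ET.tostring(build_scene(), encoding="unicode")
print(scene_xml)
```

Because the scene is ordinary XML, it can be generated, transmitted and modified with the same tools used for any other XML document.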

Transmission compatibility

Both the ATSC and DVB are developing “frame-compatible” add-ons to their respective DTV systems. In order to be transmission-compatible (i.e., fit within an existing broadcast signal), a decimated version of the left and right pictures is transmitted in a manner that fits into a 2-D broadcast. This results in several possible 3-D structures:

  • Top-and-Bottom is composed of two stereoscopic pictures, left and right, each sub-sampled to one-half vertical resolution and stacked one above the other.
  • Side-by-Side is composed of two stereoscopic pictures, left and right, each sub-sampled to one-half horizontal resolution and placed next to each other.

Of course, although these structures are backwards-compatible in the broadcast sense, they cannot be properly decoded and rendered by a legacy 2-D receiver, which would show the two half-resolution images side by side or stacked on the screen rather than separating them into left and right components.
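
The two packing structures can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: frames are plain lists of pixel rows, and decimation simply drops every other row or column. A real encoder would low-pass filter before sub-sampling, and the function names here are hypothetical.

```python
def side_by_side(left, right):
    """Halve each view horizontally, then place left and right next to each other."""
    return [l_row[::2] + r_row[::2] for l_row, r_row in zip(left, right)]


def top_and_bottom(left, right):
    """Halve each view vertically, then stack left above right."""
    return left[::2] + right[::2]


def unpack_side_by_side(frame):
    """What a 3-D-aware receiver does: split the frame back into two half-width views."""
    half = len(frame[0]) // 2
    return [row[:half] for row in frame], [row[half:] for row in frame]


# Toy 4 x 4 frames whose "pixels" are labels showing their view, row and column.
left = [[f"L{r}{c}" for c in range(4)] for r in range(4)]
right = [[f"R{r}{c}" for c in range(4)] for r in range(4)]

sbs = side_by_side(left, right)
tab = top_and_bottom(left, right)

for row in sbs:
    print(" ".join(row))
```

Both packed frames have the same dimensions as a single 2-D frame, which is why they pass through an existing broadcast chain unchanged. A legacy receiver never calls the unpacking step, so it displays the packed frame as-is.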

The ATSC is developing a service-compatible hybrid-coded (SCHC) 3-D system, which is one particular case of service-compatible real-time delivery (SCRT). The stereoscopic 3-D video is transmitted as two independent video elementary streams, where one of them is compatible with the legacy 2-D TV service.

The ATSC 3-D Specialist Group T3-S12 is also working on hybrid delivery of additional view video by broadband or non-real-time (NRT) broadcast. Because the RF channel has limited bandwidth, this method uses a separate broadband channel to deliver additional view video to minimize the impact to the on-air service. This part also includes the use of NRT for pre-download of additional view video as an NRT object via an RF channel or broadband side channel (e.g., Internet). S12 is also studying hybrid delivery of additional view video by M/H, which proposes the use of video delivered over ATSC-M/H (Mobile DTV) as a secondary view video.

The DVB has published a set of commercial requirements addressing the ability to provide a 3-DTV service utilizing the existing HDTV infrastructure. The DVB 3-DTV specification requires the L and R images to be arranged in a “spatial multiplex” (frame-compatible format), so that the resulting signal is backwards-compatible with receivers that process conventional HDTV signals. Allowed formats include Side-by-Side and Top-and-Bottom. Numerous progressive and interlaced formats are included in the specification.

Research has shown that reducing the quality of one of the left/right images (to some degree) does not cause eye discomfort, a postulate borne out by some contact lens wearers who prefer to correct their vision unequally in each eye, with one for distance and one for reading. But because not all viewers have the same balance of acuities, the DVB recommends providing equal-quality images to each eye as the approach that best serves the broadest public interest. Because some researchers have proposed backwards-compatible systems that use different paths for the left and right signals, with different bandwidths and therefore different left/right image resolutions, a number of different approaches may emerge in different regions.

Playback environment

Receivers and the playback environment are getting more sophisticated. On the receiver side, new interfaces have been defined to carry the 3-D signals over HDMI to a display. In the 1.4a version of the specification, the 3-D video format is indicated using a Video Identification Code (VIC) in the AVI InfoFrame (indicating the video format of one of the 2-D pictures), in conjunction with a 3D_Structure field in the HDMI Vendor-Specific InfoFrame. Top-and-Bottom and Side-by-Side are two of the supported HDMI 3-D video format structures; others include L + depth and L + depth + graphics + graphics-depth. Additional 3-D video formats may be specified in a future version.
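
The pairing of a VIC with a structure code can be modeled as a simple lookup. The sketch below is an assumption-laden illustration: the numeric codes are believed to match the publicly released 3-D portion of the HDMI 1.4a specification but should be verified against it before use, and the class and function names are hypothetical. No InfoFrame byte layout is parsed here.

```python
from enum import IntEnum


class Structure3D(IntEnum):
    """4-bit structure codes (assumed values; verify against the HDMI specification)."""
    FRAME_PACKING = 0b0000
    FIELD_ALTERNATIVE = 0b0001
    LINE_ALTERNATIVE = 0b0010
    SIDE_BY_SIDE_FULL = 0b0011
    L_DEPTH = 0b0100
    L_DEPTH_GRAPHICS_GRAPHICS_DEPTH = 0b0101
    TOP_AND_BOTTOM = 0b0110
    SIDE_BY_SIDE_HALF = 0b1000


def describe(vic: int, structure_code: int) -> str:
    """Combine the 2-D picture format (VIC) with the 3-D structure code."""
    try:
        name = Structure3D(structure_code).name
    except ValueError:
        # Codes not listed above are treated as reserved in this sketch.
        name = "RESERVED"
    return f"VIC {vic} with 3-D structure {name}"


print(describe(16, 0b0110))
print(describe(5, 0b1000))
```

The display needs both values: the VIC tells it the raster of each 2-D picture, and the structure code tells it how the left and right pictures are arranged within the transmitted frame.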

As for audio, several companies are now working on enhanced audio reproduction that goes beyond 5.1 channels. With the advent of 3-D video, developers want a new “virtual” sound placement to augment the 3-D experience. In order to reproduce a sound-image placement at an arbitrary position in a room, a speaker layout is needed in three dimensions: height, width and depth. This means that the 3-D sound-space requires using at least eight speakers, positioned at the eight corners of a solid. When incorporating a center and LFE (low-frequency effects) channel, the smallest full 3-D sound speaker layout is a 9.1-channel configuration. Higher numbers of speakers have been proposed, too, such as 11.1 channels and even 22.2.

Although it may be impractical for most home viewers to support more than 5.1 channels, larger systems are already appearing in commercial theaters, and we should expect them to show up in exotic home theaters as well. In order to provide maximum compatibility and re-purposing, each of these formats can be downmixed to a more-traditional 5.1-channel package. Different production and downmix approaches have been developed to account for arbitrary speaker placement in the final user environment, too.
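
As a rough illustration of such a downmix, the sketch below folds a 9.1-channel sample (eight corner speakers plus center and LFE) into 5.1 by adding each upper speaker to the lower speaker beneath it. The channel labels, the fold mapping and the -3 dB height gain are all assumptions chosen for the example; actual downmix coefficients depend on the format and the production approach.

```python
import math

# Assumed gain applied to height channels when folding down (about -3 dB).
HEIGHT_GAIN = 1 / math.sqrt(2)

# Hypothetical labels: each upper-corner speaker folds into the lower one beneath it.
FOLD = {"TFL": "FL", "TFR": "FR", "TBL": "SL", "TBR": "SR"}

CHANNELS_5_1 = ("FL", "FR", "C", "LFE", "SL", "SR")


def downmix_9_1_to_5_1(sample: dict) -> dict:
    """Fold one 9.1 sample (dict of channel -> amplitude) into a 5.1 sample."""
    # Start from the channels that exist in both layouts.
    out = {ch: sample[ch] for ch in CHANNELS_5_1}
    # Add each attenuated height channel to its lower counterpart.
    for top, bottom in FOLD.items():
        out[bottom] += HEIGHT_GAIN * sample[top]
    return out


sample = {"FL": 1.0, "FR": 0.0, "C": 0.5, "LFE": 0.2, "SL": 0.0, "SR": 0.0,
          "TFL": 1.0, "TFR": 0.0, "TBL": 0.0, "TBR": 0.4}
print(downmix_9_1_to_5_1(sample))
```

A production downmix would also manage headroom so that summed channels do not clip, which this sketch omits.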

Figure 1. Based on the work of researchers at NHK, international standardization of a 22.2 multichannel audio format is under way.

Research engineers at NHK years ago proposed a “High-Presence Audio Format,” intended to be used with Super Hi-Vision (now known as Ultra HD). Based on their work, international standardization of a 22.2 multichannel audio format is under way (see Figure 1). The researchers enumerated various requirements for such a system. It must:

  • be able to localize an audio image anywhere on the screen;
  • be able to reproduce sound coming from all directions surrounding the viewing position;
  • be able to reproduce a natural, high-quality 3-D acoustic space;
  • have an enlarged optimal listening zone;
  • be compatible with existing multichannel audio formats; and
  • support live recording and live broadcasting.

Coming attractions

The next step for 3-D technologies will be auto-stereoscopy, i.e., 3-D displays that do not require special viewing glasses. The current state of the art produces such a display, but with a narrow audience viewing angle. By some accounts, this is the limiting factor that prevents mass acceptance of the technology, but researchers are working on solutions, so we may not have to wait long, given sufficient demand, enough content and the right business models.

Aldo Cugnini is a consultant in the digital television industry.