Beyond HD

Now that HD is commonplace, where to next? NHK has long been a pioneer in future technology. Back when broadcast engineers were getting used to the new digital interfaces, NHK was looking at HDTV systems, which were defined as having about twice the number of lines of the 525-line system they were using at the time. It took something like 20 years before HD took off. Around 2005, the advent of the flat-screen display as a replacement for the CRT finally helped propel the adoption of HDTV in large numbers by consumers.

NHK's current project is for an Ultra High Definition Television (UHDTV) format dubbed Super Hi-Vision (SHV). Just as HDTV was a long-term project, this is expected to continue over the next decade. It has a raster of 7680 × 4320 pixels — 16 times the size of the common image format of 1920 × 1080 — and it is also known as 8K.

In 2007, SMPTE issued Standard 2036, Ultra High Definition Television: Image Parameter Values for Program Production. This defined two formats, UHDTV1 and UHDTV2, commonly called 4K and 8K. UHDTV1 has twice the number of pixels of 1080 HD in each dimension, or four times the total, and UHDTV2 has four times the number in each dimension, or 16 times the total. (See Table 1.)
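The pixel counts in Table 1 are easy to check. The short Python sketch below uses only the raster sizes quoted in this article; the function and dictionary names are illustrative. It shows that doubling the pixels in each dimension quadruples the total.

```python
# Raster sizes taken from Table 1 (width, height in pixels).
FORMATS = {
    "HD 1080": (1920, 1080),
    "UHDTV1 (4K)": (3840, 2160),
    "UHDTV2 (8K)": (7680, 4320),
}


def pixel_count(width, height):
    """Total number of pixels in one frame."""
    return width * height


def ratio_to_hd(name):
    """How many 1920 x 1080 frames fit into one frame of the named format."""
    return pixel_count(*FORMATS[name]) // pixel_count(*FORMATS["HD 1080"])


for name, (w, h) in FORMATS.items():
    print(f"{name}: {pixel_count(w, h):,} pixels, {ratio_to_hd(name)}x HD")
```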

UHDTV2 has an uncompressed bit rate of 24Gb/s and can be carried by 16 HD-SDI 1.5Gb/s circuits. Current technology does not have a means to deliver this in a compressed form to the home. Trials have used 320Mb/s links, more than can be carried by existing terrestrial or satellite broadcast systems. However, MPEG video encoder development continues, and who knows what will be possible by 2020?
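To put those figures in proportion, the sketch below adds up the capacity of the 16 links and works out the compression ratio the 320Mb/s trials imply. It assumes the nominal HD-SDI serial rate of 1.485Gb/s for each "1.5Gb/s" circuit and ignores blanking and ancillary data, so treat the results as rough.

```python
HD_SDI_GBPS = 1.485     # assumed nominal HD-SDI rate per circuit
LINKS = 16              # number of circuits quoted in the text
TRIAL_LINK_MBPS = 320   # compressed rate used in the trials


def aggregate_gbps(links, per_link_gbps):
    """Total capacity of a bundle of identical links, in Gb/s."""
    return links * per_link_gbps


def compression_ratio(source_gbps, channel_mbps):
    """Ratio of the uncompressed rate to the compressed channel rate."""
    return source_gbps * 1000 / channel_mbps


total = aggregate_gbps(LINKS, HD_SDI_GBPS)
print(f"Aggregate link capacity: {total:.2f} Gb/s")
print(f"Compression needed for the trial links: "
      f"about {compression_ratio(total, TRIAL_LINK_MBPS):.0f}:1")
```

On these assumptions, the trial links already represent compression of roughly 75:1, and a broadcast channel would need considerably more.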

Viewing angle

The resolution of SDTV is such that for the line structure not to be visible, the subtended viewing angle of the screen from the viewer should be less than 10 degrees. One of NHK's goals for HDTV was to have a display of such resolution that the viewing angle would be 30 degrees, giving a wider window on the world. Human vision can perceive close to 180 degrees in the horizontal plane, although resolution falls off steeply outside the fovea (the optical center). (See Figure 1.)

Although HDTV achieves the aim of high resolution in the area of detailed vision, it does little for peripheral vision. Increasing the system resolution to the UHDTV2 format fills more of the peripheral field, and with that comes a heightened sense of “being there.” UHDTV2 is designed to deliver a viewing angle of 100 degrees. If the viewer sits farther away, the resolution of the system is wasted.
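The viewing angles translate directly into viewing distances. As a minimal sketch, assume a flat 16:9 screen viewed on-axis; simple trigonometry then gives the distance, in multiples of picture height (H), at which the screen subtends a given horizontal angle. The function name and the 16:9 assumption are illustrative and not taken from the NHK work.

```python
import math

ASPECT = 16 / 9  # width / height, assumed for both HDTV and UHDTV


def distance_in_picture_heights(horizontal_angle_deg, aspect=ASPECT):
    """Distance, in picture heights, at which a flat screen viewed on-axis
    subtends the given horizontal angle."""
    half_angle = math.radians(horizontal_angle_deg) / 2
    half_width = aspect / 2  # screen half-width when the height is 1
    return half_width / math.tan(half_angle)


for label, angle in [("HDTV", 30), ("UHDTV2", 100)]:
    d = distance_in_picture_heights(angle)
    print(f"{label}: {angle} degrees -> {d:.2f} H")
```

On these assumptions, the 30-degree HDTV target corresponds to sitting a little over three picture heights away, and the 100-degree UHDTV2 target to about three-quarters of a picture height.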

Such high resolution can be used for other applications like 3-D television, where spatial resolution is traded for depth perception.

3-D imaging

Plano-stereoscopic 3-D (S3D) is just the start of the journey in the development of television systems that can enhance reality through the addition of depth perception to a scene. S3D has advanced little beyond the Victorian stereoscopic toys, save for the use of digital processing.

S3D has been simple to implement, requiring just the transmission of two regular television rasters to a display device that presents the two images in the same plane in such a way that each is delivered separately to one eye. S3D takes its depth cues from binocular parallax and convergence. But you cannot see around objects with S3D; the cue it lacks is called motion parallax. If you move your head to one side, you just get a slanted view of the same thing.

The next steps beyond S3D are multiview coding including integral photography. Beyond that, there are techniques such as holography.

Research groups around the world are looking beyond S3D to systems that can provide full-parallax 3-D independent of the viewer's position. Candidates include integral photography, which NHK is researching, and holography, which the 3D Vivant project is researching with funds from the European Commission.

S3D's drawback is that the interaxial spacing and convergence of the two cameras must be set to maintain the depth budget for comfortable viewing. But these settings vary according to display dimensions. Is it meant for a cinema screen or a small display at home? True 3-D would not have that limitation; a single data stream could be post-processed for stereo, multiview or even holographic display.
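A rough sketch shows why the settings depend on display size. A disparity that is fixed in pixels becomes a physical separation on the screen that grows with screen width. A commonly cited comfort limit is that the separation for objects behind the screen should not exceed the distance between the viewer's eyes, or the eyes are forced to diverge. The 65mm eye separation, the 20-pixel disparity and the two screen widths below are illustrative assumptions, not figures from any standard.

```python
EYE_SEPARATION_MM = 65.0  # assumed typical adult eye separation


def screen_parallax_mm(disparity_px, image_width_px, screen_width_mm):
    """Physical left/right image separation on a screen of the given width."""
    return disparity_px / image_width_px * screen_width_mm


def forces_divergence(parallax_mm, eye_separation_mm=EYE_SEPARATION_MM):
    """True if background parallax is wider than the viewer's eye separation."""
    return parallax_mm > eye_separation_mm


# Hypothetical example: the same 1920-pixel-wide master, with 20 pixels of
# background disparity, shown on two very different screens.
for name, width_mm in [("1m-wide home display", 1000),
                       ("10m-wide cinema screen", 10000)]:
    p = screen_parallax_mm(20, 1920, width_mm)
    verdict = "too wide" if forces_divergence(p) else "comfortable"
    print(f"{name}: {p:.1f} mm parallax ({verdict})")
```

The same master that sits comfortably on the home display exceeds the limit in the cinema, which is why a stereo pair has to be shot or adjusted for its target screen.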

Compression

To cope with the very high data rates from UHDTV and 3-D, more efficient codecs are needed. The ISO/IEC working group, MPEG, is active in this field. The initial driver was to develop a format for 3-D Blu-ray discs using an extension to H.264 called multiview coding, or MVC.

A 2-D plus depth encoder called MPEG-C part 3 that is backwards compatible with 2-D is also being developed. This is a candidate for OTA transmission of a 2-D/3-D service.

MPEG has announced plans for high-performance video coding (HVC) to deliver increased efficiency over AVC (MPEG-4 part 10), which would be suitable for encoding systems like UHDTV.

Plenoptic cameras

Stereo and multiview 3-D can be captured with a camera for each view, and for stereo, a pair of left and right cameras is the norm. Much research is going into alternative ways of capturing the scene, specifically as a light field rather than as an image map in a 2-D focal plane. To capture a light field, the direction of each ray through the imaging plane must be recorded as well as its intensity. (See Figure 2.)

One such device is the plenoptic camera. Reminiscent of the compound eye in insects, these devices use an array of microlenses between the taking lens and the sensor. Rather than imaging at the focal plane as in a conventional camera, the microlenses are placed well before the focal point so that pixels on the image plane sample light rays not only from different points but also from different directions. From the image data, the original light field can be reconstructed.

It is common to see microlenses directly above the photosites in a CMOS or CCD sensor, which increases sensitivity. In the plenoptic camera, the microlenses are spaced away from the sensor. The data from the plenoptic camera can be processed to form a multiview datastream for binocular viewing. It also allows the point of focus to be adjusted by manipulating the image data, so that is one more thing that can be “fixed in post.”
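One well-known way to refocus light-field data is shift-and-add: each directional view is shifted in proportion to its angle, and the views are averaged. The toy sketch below is not the processing used by any particular camera. It assumes the sensor data has already been sorted into one view per direction, works in one dimension, uses whole-pixel shifts and wraps at the edges for simplicity.

```python
def refocus(views, shift_per_view):
    """Shift-and-add refocus of a toy 1-D light field.

    views: list of equal-length rows, one per viewing direction, ordered
        from the leftmost direction to the rightmost.
    shift_per_view: whole-pixel shift applied per step away from the
        central view; changing it moves the synthetic focal plane.
    """
    n_views = len(views)
    width = len(views[0])
    centre = n_views // 2
    out = [0.0] * width
    for i, row in enumerate(views):
        shift = (i - centre) * shift_per_view
        for x in range(width):
            out[x] += row[(x + shift) % width]  # wrap at the edges
    return [value / n_views for value in out]


# A single bright point that moves by one pixel from view to view,
# as a point away from the focal plane would.
views = [
    [0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0],
]
blurred = refocus(views, 0)  # no shift: the point smears over three pixels
sharp = refocus(views, 1)    # matching shift: the point comes into focus
print(blurred)
print(sharp)
```

Choosing a different shift brings objects at a different depth into focus, which is the sense in which focus can be adjusted after the shot.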


Viewing 3-D

The initial systems for viewing 3-D television have adopted stereoscopic methods from the cinema, using passive glasses or, for switching displays, active glasses. The use of glasses is considered to be an impediment to the acceptance of 3-D in the home. Beyond the eyewear issue, S3D has many other drawbacks when it comes to delivering a fatigue-free, realistic 3-D experience.

Many other types of display exist, but their uses have been in other sectors like design and medical research to view 3-D models. The next step beyond stereo methods is the autostereoscopic display. These don't require glasses and use methods such as lenticular screens to create multiview displays. They are generally limited to around 15 viewing angles. Beyond that are true 3-D techniques such as holography. This has been a popular visual effect in science fiction movies — “Star Wars” amongst many others — but it is just an effect.

Holographic displays can create focal points behind and in front of the imaging plane, avoiding the conflicting parallax/convergence cues of an S3D display. One system uses an array of microdisplays illuminating a diffuse screen, in either rear or front projection. Akin to the plenoptic camera, but in reverse, this creates diverging rays from the virtual objects, and the image can be viewed from any position in front of the screen. It also gives the brain vital cues such as motion parallax for a more realistic 3-D effect. (See Figure 3.)

Summary

It could be argued that we have reached the end of phase one in the development of television. Since the NTSC defined a color television system, it has been improved to the HDTV we have today. The next steps present more challenges, not only technical but also social and psychovisual.

The cinema has had widescreen displays, from the three-camera process of Cinerama in the 1950s to IMAX today. The cinema has enthusiastically adopted S3D, and there the necessary eyewear and the single view are not great drawbacks. We sit in one place for a viewing, and it lasts 120 minutes in a darkened room. Viewing very wide screens and/or 3-D at home is a different matter.

UHDTV, multiview and light field all promise a more immersive audio-visual experience in the future, but we won't be seeing products at the next CES. These developments are expected to evolve over a decade.

There is another side to this. Take a typical room with a ceiling height of 2.4m. Using a full room-height UHDTV display, 4.26m × 2.4m, the recommended viewing distance would be 1.8m. That would need a display with a screen diagonal of 4.9m or 192in. The 3-D surround sound is typically realized with 36 small speakers arranged in three layers: ceiling, mid and floor, plus two LFE woofers.
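These room figures can be checked with basic geometry. The sketch below assumes a flat 16:9 screen, consistent with the 7680 × 4320 raster, and a viewer on the center axis; the function names are illustrative.

```python
import math


def screen_geometry(height_m, aspect=16 / 9):
    """Width and diagonal, in meters, of a screen of the given height."""
    width = height_m * aspect
    diagonal = math.hypot(width, height_m)
    return width, diagonal


def subtended_angle_deg(width_m, distance_m):
    """Horizontal angle subtended by a flat screen viewed on-axis."""
    return math.degrees(2 * math.atan((width_m / 2) / distance_m))


width, diagonal = screen_geometry(2.4)
angle = subtended_angle_deg(width, 1.8)
print(f"Width {width:.2f} m, diagonal {diagonal:.2f} m "
      f"({diagonal / 0.0254:.0f} in)")
print(f"Viewing angle at 1.8 m: {angle:.1f} degrees")
```

At 1.8m, the full-height screen subtends very nearly the 100-degree viewing angle that UHDTV2 is designed for.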

This raises many issues: The display would dominate the smaller living rooms common in Europe and Asia. It could be difficult to deliver the display through doorways and stairs. And what of locating all those speakers?

It could well be that 4K, UHDTV1, is more suited to the domestic environment, and 8K for public displays. Because 4K has a quarter of the data rate, this also eases the issues around content distribution.

Would viewers really want to feel as if they were in the set rather than viewing through a window onto the set? Could they feel uncomfortable? These practical and psychological issues could make the systems of minority appeal; only time will tell.

Plano-stereoscopic 3-D is only really the beginning, a stop-gap until better technologies can be perfected. The next step beyond S3D is multiview, where more than two views of the subject are used. The display could typically create eight to 16 views of the subject. In principle, the data rate scales with the number of views, but there is a great deal of redundancy in a multiview system, which can be used to advantage by image compression schemes.

To go beyond multiple planar views, it is necessary to encode the light field, rather than just the intensity of light across the 2-D focal plane of the camera. A light field defines the direction of rays from the object as well as position and color value. This adds huge amounts of data, so current research is looking at ways to use this principle but at manageable data rates. Much relies on optical methods rather than data processing. Many of the ideas are not new; some date back a hundred years. But the implementation of a high-resolution system suitable for television requires modern sensors. Technologies like integral imaging and holography may well lie at the center of future consumer products, but there is much work to do before we see them in the home.

For broadcast engineers, there is a future of continually evolving technology. For content creators, it will mean learning a whole new way to shoot and post-produce. But at the end of the day, just as with mobile TV, it is the business model that will determine when and if we see UHDTV and true 3-D as a mainstream service delivering content to viewers.

| Category | Image format | Pixel count | Frame rates | Scanning | Aka |
|---|---|---|---|---|---|
| HDTV | 1280 × 720 and 1920 × 1080 | 921,600 and 2,073,600 | 23.98, 24, 25, 29.97, 30, 50, 59.94, 60 | Interlaced or progressive | HD |
| DCDM† | 2048 × 1080 | 2,211,840 | 24 or 48 | Progressive | 2K |
| DCDM† | 4096 × 2160 | 8,847,360 | 24 | Progressive | 4K |
| UHDTV1 | 3840 × 2160 | 8,294,400 | 50, 59.94, 60 | Progressive | 4K |
| UHDTV2 | 7680 × 4320 | 33,177,600 | 50, 59.94, 60 | Progressive | 8K |

† Digital Cinema Distribution Master, SMPTE 428.

Table 1. An overview of video formats beyond HD