Beyond intuition

As I write this article, I'm 33,000ft over Kansas en route home from the Hollywood Post Alliance Technical Retreat, the second technical conference I've attended in two weeks. Writing for a magazine like Broadcast Engineering requires a conscientious effort to see the forest for the trees. After two weeks of immersion in digital intermediate, IPTV, advanced compression and TCP/IP theory, I find it difficult to distill the future of our industry into a single topic. So the task of writing about scan and format conversion looked daunting. Luckily, a company that participated in the conference offered a fresh view of the fundamentals of scan and format conversion.

Seeing frames anew

FrameFree is a technology company new to most of us, but its image-research roots in Japan go back 17 years. The company sees video not as a series of frames but as the movement of content across the image plane. Much as motion compensation seeks matches from frame to frame, FrameFree ties critical points in an image to the corresponding points in the succeeding frames.

The simplest analogy you may be familiar with is morphing, in which one face is slowly converted into another by a gradual process of change. Corresponding points, the eyes for example, stay in the same position but gradually change into those of the second face. Over the space of many frames, the image slowly becomes the new face, and all similar features (mouth, hairline, etc.) convert to those of the target image.
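To make the analogy concrete, here's a toy Python sketch of the geometric half of a morph, assuming the matched feature points in the two faces have already been found (finding them is the hard part; the coordinates below are made up for illustration):

```python
import numpy as np

def morph_points(points_a, points_b, t):
    """Linearly interpolate matched feature points; t runs from
    0.0 (all first face) to 1.0 (all second face)."""
    return (1.0 - t) * points_a + t * points_b

# Hypothetical matched eye-corner positions in the two faces.
eyes_a = np.array([[120.0, 200.0], [180.0, 200.0]])
eyes_b = np.array([[130.0, 190.0], [190.0, 195.0]])

# Halfway through the morph, each feature sits midway between its
# two positions; pixel colors are cross-faded with the same weight.
print(morph_points(eyes_a, eyes_b, 0.5))
```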

But if you cease to view a TV signal as a series of pictures and instead view it as samples in time, it is natural to look at changes in the scene as a morphing of reality.

Interpolated reconstruction

Consider what would happen if you took some of the time samples, the frames, away. By relating the remaining samples to each other, finding critical points and figuring out where they have moved, you might be able to replace the missing frames with an interpolation of the two end-point pictures. This is similar to motion-compensated standards conversion, but it is subtly and fundamentally different.
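To make that concrete, here is a crude Python sketch that rebuilds a missing middle frame from its two neighbors, assuming a per-pixel motion field between them is already known. It illustrates the principle only; it is not FrameFree's actual algorithm:

```python
import numpy as np

def reconstruct_midpoint(frame_a, frame_b, flow):
    """Rebuild the frame halfway between frame_a and frame_b.

    frame_a, frame_b: (H, W) grayscale images.
    flow: (H, W, 2) per-pixel (dx, dy) motion from frame_a to frame_b,
    treated here as valid at the midpoint too (a crude shortcut).
    """
    h, w = frame_a.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # Step half the motion vector back into frame_a and half forward
    # into frame_b, then average the two motion-compensated samples.
    xa = np.clip(np.round(xs - 0.5 * flow[..., 0]), 0, w - 1).astype(int)
    ya = np.clip(np.round(ys - 0.5 * flow[..., 1]), 0, h - 1).astype(int)
    xb = np.clip(np.round(xs + 0.5 * flow[..., 0]), 0, w - 1).astype(int)
    yb = np.clip(np.round(ys + 0.5 * flow[..., 1]), 0, h - 1).astype(int)
    return 0.5 * frame_a[ya, xa] + 0.5 * frame_b[yb, xb]

a = np.zeros((4, 4)); a[1, 1] = 1.0      # a bright dot...
b = np.zeros((4, 4)); b[1, 3] = 1.0      # ...that moved two pixels right
flow = np.zeros((4, 4, 2)); flow[..., 0] = 2.0

# Dot lands at column 2 (the crude edge clipping leaves a small artifact).
print(reconstruct_midpoint(a, b, flow)[1])
```

But how is frame interpolation related to scan and format conversion?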

Format morphing

Think about converting a 525/30 signal to a 625/25 signal. Create mental pictures of the two sequences. Both have the same nature as samples in time of the real scene. But if you could sample the real scene at all 55 instants per second (30 plus 25), you would find that the samples are all intimately related but simply slightly shifted in time, with an irregular cadence.

For example, look at the sample times in Figure 1, and consider how converting from one format to the other can be viewed as an interpolation of the closest two frames from the source samples. That interpolation creates the missing time samples for the output format.
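The temporal bookkeeping is simple enough to write down. This hypothetical Python sketch maps each output frame time onto the two bracketing source frames and the blend weight between them:

```python
SRC_FPS, OUT_FPS = 30.0, 25.0   # e.g. 525/30 source to 625/25 output

def source_pair(out_index):
    """For one output frame, find the two bracketing source frames
    and the blend weight toward the later one."""
    t = out_index / OUT_FPS      # output frame time in seconds
    s = t * SRC_FPS              # the same instant in source-frame units
    earlier = int(s)
    return earlier, earlier + 1, s - earlier

for i in range(5):
    a, b, w = source_pair(i)
    print(f"output {i}: {1 - w:.2f} x source {a}  +  {w:.2f} x source {b}")
```

The weights step through 0.0, 0.2, 0.4, 0.6, 0.8 and then repeat, which is exactly the irregular cadence described above; a real converter would replace the plain blend with motion-aware interpolation.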

Looking at the problem this way makes its mathematical nature clear. If you can manufacture a frame where one does not exist in the source, you have the ability to convert any frame rate to any output frame rate. Now take the same analogy and think about the spatial samples, and you will realize that the same thing can be done spatially at the pixel level.

Imagine 525- and 625-line images scaled and overlaid in the real world. Their samples clearly would not lie on top of each other, but converting from one to the other requires only morphing, or interpolating, between the data points that exist in the source image to find any intermediate point corresponding to the output image. (See Figure 2.)
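In the same spirit, here is a minimal Python sketch that resamples a 525-line raster onto a 625-line grid by blending the two nearest source lines. Real converters use far better filters; this shows only the principle:

```python
import numpy as np

def resample_lines(image, out_lines):
    """Vertical resampling: map each output line to a fractional
    source line and blend its two nearest neighbors."""
    src_lines, width = image.shape
    out = np.empty((out_lines, width))
    for j in range(out_lines):
        s = j * (src_lines - 1) / (out_lines - 1)   # fractional source line
        lo = int(s)
        hi = min(lo + 1, src_lines - 1)
        w = s - lo
        out[j] = (1 - w) * image[lo] + w * image[hi]
    return out

frame_525 = np.random.rand(525, 720)          # stand-in for active picture
print(resample_lines(frame_525, 625).shape)   # (625, 720)
```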

Applying the technology

Intuitively, I have understood this process for a long time, but watching FrameFree's technology demonstration made the intuition much easier to describe. Of course, there is much more to the process. The company decomposes the image into a mathematical mesh using an existing technique called Critical Point Filters.

Applying this technology to modern motion imaging is powerful. FrameFree's only product to date is intended for the graphics and image processing industries, but I think the company is really on to something. The technology could be successful in many other applications.

For example, using this approach in cartoon animation could produce much smoother animation for 30-frame television delivery without creating any additional animation cels. Simply interpolating between cels could save the artist time and improve the product.

This process could also be used to smooth out sports replays. By creating many more in-between frames, replays could appear stunningly fluid.

Real-time compression

What if this technique could be used to compress TV images for transmission? If the technology could transmit fewer frames, it would greatly reduce the amount of data. This is essentially what compression systems, such as MPEG, do when they create the much smaller B- and P-frames.
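Some back-of-the-envelope arithmetic shows why the idea is tempting. Assuming the receiver could rebuild every dropped frame by interpolation, sending only every second frame would halve the raw picture data before conventional compression even begins:

```python
# Uncompressed 4:2:2 SD active picture: 720 x 486 pixels, 2 bytes/pixel, 30 frames/s.
BYTES_PER_FRAME = 720 * 486 * 2
FPS = 30

def data_rate_mbps(keep_every_n):
    """Data rate if only every Nth frame is sent and the rest are interpolated."""
    return BYTES_PER_FRAME * 8 * FPS / keep_every_n / 1e6

print(f"all frames sent:    {data_rate_mbps(1):.0f} Mb/s")   # ~168 Mb/s
print(f"every second frame: {data_rate_mbps(2):.0f} Mb/s")   # ~84 Mb/s
```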

Converting high-resolution computer images to television output, and vice versa, would be no problem with this process. The same goes for converting a 72-frame computer simulation to 25-frame film. Exactly the same processes happen, whether they are done using Critical Point Filters or any other approach. Interpolating in time and space is the general-case problem. And after only 41 years in the business, I am beginning to understand it!

Meeting of the minds

FrameFree is likely a long way from creating a real-time compression engine, or any of the other products imagined here, but I hope this helps you to see scan and format conversion in a new way.

John Luff is a broadcast technology consultant.

Send questions and comments to: john.luff@penton.com