Pre-scaling graphics for HD nonlinear editing

The key to implementing HD is to be able to initially use the current SDI workflow. The author shows how to easily integrate HD images into an SD timeline and practice

Leitch’s VelocityHD supports both uncompressed and full-raster compressed formats in real time.

At first glance, the creation and scaling of graphics to the correct size for incorporation with high-definition video in a nonlinear editing system may seem straightforward. However, the adage “looks can be deceiving” comes to mind.

Graphics-related issues such as color space conversions, color sampling and pixel aspect ratios have been covered in detail before, and the full raster sizes for HD video frames are well-known — 1920×1080 and 1280×720. Simply create your graphics to these sizes, and you're ready to import them into your NLE — or so it seems.

One size doesn't fit all

While this simple process is usable for bringing HD graphics into modern editing systems, internal factors within an NLE often make this workflow less than ideal. For instance, pre-scaling full-size HD graphics to sizes other than the full raster may be advantageous in terms of both quality and productivity. Without pre-scaling, some NLEs automatically re-scale the graphics in ways that editors would prefer to control themselves. And, on some NLE systems, pre-scaling the images may improve real-time layering and effects performance. Counter-intuitively, in many cases, the best results in both quality and performance may be achieved by pre-scaling graphics to lower than full raster size before bringing them into an NLE.

The key factor that can make pre-scaling HD graphics desirable is the frame size (in pixels) of the graphics versus the actual frame size at which the NLE processes the HD video. For example, not all 1080i is created equal. While NLE operators may think they're mixing graphics and video of the same frame size, that might not actually be the case.

Figure 1. Common acquisition formats and their native frame sizes.

This issue arises when full-resolution (1920×1080 or 1280×720) graphics are mixed with HD video that is being processed natively in a compression format that has been sub-sampled from the full HD raster. While formats such as Panasonic's D-5 HD and Sony's HDCAM-SR support the full raster (as does the HDV format in its 720p variant), many other common HD formats do not. For instance, HDV, as well as Sony's HDCAM, uses 1440 luma samples per line, for a 1440×1080 recorded frame size. DVCPRO HD sub-samples to 1280×1080 in its 1080-line mode, or 960×720 for 720p.

If the nonlinear system works with media in any of these acquisition formats natively, the addition of a full-size graphic in the editing process may result in the mixing of a 1920×1080 graphic with video at a 1280×1080 or 1440×1080 frame size. Various formats and their native frame sizes are shown in Figure 1.
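The mismatch described above can be checked with a small lookup. The following is a minimal sketch in Python, using only the frame sizes quoted in this article. The names NATIVE_FRAME_SIZES, FULL_RASTER and frame_mismatch are hypothetical and do not come from any NLE's software.

```python
# Native recorded frame sizes as quoted in the article, keyed by
# (format name, line count). This is illustrative, not an exhaustive list.
NATIVE_FRAME_SIZES = {
    ("D-5 HD", 1080): (1920, 1080),
    ("HDCAM-SR", 1080): (1920, 1080),
    ("HDCAM", 1080): (1440, 1080),
    ("HDV", 1080): (1440, 1080),
    ("HDV", 720): (1280, 720),
    ("DVCPRO HD", 1080): (1280, 1080),
    ("DVCPRO HD", 720): (960, 720),
}

# Full HD raster sizes for each line count.
FULL_RASTER = {1080: (1920, 1080), 720: (1280, 720)}


def frame_mismatch(fmt, lines, graphic_size=None):
    """Return (graphic_size, video_size, mismatched) for a format.

    If no graphic size is given, a full-raster graphic is assumed.
    """
    video_size = NATIVE_FRAME_SIZES[(fmt, lines)]
    if graphic_size is None:
        graphic_size = FULL_RASTER[lines]
    return graphic_size, video_size, graphic_size != video_size


if __name__ == "__main__":
    for fmt, lines in NATIVE_FRAME_SIZES:
        graphic, video, mismatched = frame_mismatch(fmt, lines)
        status = "MISMATCH" if mismatched else "match"
        print(f"{fmt} {lines}: graphic {graphic} vs video {video}: {status}")
```

A full-raster graphic dropped onto a DVCPRO HD 1080-line timeline is reported as a mismatch (1920×1080 against 1280×1080), while the same check for 720p HDV reports a match.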

Similarly, as an alternative to processing HD media natively in its acquisition format, some NLE manufacturers offer their own compression schemes, optimized for post-production. While some of these codecs support the full HD raster, others are sub-sampled similarly to the acquisition codecs mentioned above.

Scale up or scale down?

NLE codecs that are sub-sampled in this way face the same frame size mismatch as the sub-sampled acquisition formats. The NLE will invariably resolve this mismatch automatically. However, there are disadvantages to letting it do so, and it may be preferable to avoid the frame size mismatch in the first place.

There are two fundamental ways to resolve the differing frame sizes of the graphics and the HD video: scaling the graphics down to the size of the video, or scaling the video up to the size of the graphics. The particular method used varies between different NLEs, often influenced by the constraints of their internal pipelines. Both methods, however, have downsides.

In the first case, the NLE (internally, without user intervention) scales the graphic down appropriately (with the associated pixel aspect ratio change) to match the smaller frame size of the video content. On NLE systems that have internal processing limited to the sub-sampled raster size — such as systems that support only specific compressed HD formats — this is the only viable method. Because scaling a graphic is not a computationally intensive process, it can be handled quickly — likely without affecting the NLE's real-time performance. The question then becomes: Why not just let the NLE do it?
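The pixel aspect ratio change mentioned above follows from simple arithmetic: the sub-sampled frame still fills a 16:9 picture, so each stored pixel must be displayed wider than it is tall. The following Python sketch works this out for the frame sizes cited in this article. The function name pixel_aspect_ratio is hypothetical, and a 16:9 display aspect is assumed.

```python
from fractions import Fraction


def pixel_aspect_ratio(stored_width, stored_height,
                       display_aspect=Fraction(16, 9)):
    """Pixel aspect ratio needed for a stored frame to fill the display.

    PAR = display aspect ratio / storage aspect ratio.
    """
    return display_aspect / Fraction(stored_width, stored_height)


if __name__ == "__main__":
    for size in [(1920, 1080), (1440, 1080), (1280, 1080), (960, 720)]:
        print(size, pixel_aspect_ratio(*size))
```

Under that assumption, a 1440×1080 frame has pixels 4/3 as wide as they are tall, and a 1280×1080 frame has pixels 3/2 as wide. A graphic pre-scaled to one of these sizes in a paint application will therefore look horizontally squeezed until the NLE displays it at the matching pixel aspect ratio.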

The answer is control over quality. Downscaling the graphic manually before importing it into the NLE may give the operator better control of the scaling quality. Many NLEs offer few (if any) options for the scaling and interpolation methods used to reduce the size of these images. These systems may provide operator controls when the user explicitly scales an image of a non-standard size, but they seldom provide any adjustments for the internal, behind-the-scenes format conversion.

Figure 2. Scaling algorithm examples. Left: Section of original 1920×1080 image, zoomed in to 300 percent (top) and 1200 percent (bottom). Middle: Same section downscaled to 50 percent using nearest neighbor interpolation, zoomed in to 600 percent (top) and 2400 percent (bottom). The hard edges are retained, but the edges now appear more jagged. Right: Same section downscaled to 50 percent using bi-cubic interpolation, zoomed in to 600 percent (top) and 2400 percent (bottom). The result is much smoother, without the jagged look of the nearest neighbor algorithm, but the edges are visibly softened.

In contrast, dedicated paint and graphics software usually offers a wealth of such choices. Different scaling algorithms give different results depending on image characteristics such as hard edges, smooth gradations and overall complexity. Algorithms such as nearest neighbor or simple pixel duplication/removal may be best for preserving hard edges, but can also result in harsh, jagged-looking images. Other image conversion methods, such as bi-cubic interpolation, offer smoother results and vary between implementations, which may help preserve both smoothness and detail, but may result in a visual “softening” of the image.
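The trade-off between hard edges and softening can be seen on a single row of pixel values. The sketch below, in plain Python, downscales a row to 50 percent in two ways: by pixel removal (nearest neighbor) and by averaging neighboring pixels. The averaging (box) filter is used here only as a simple stand-in for smoothing interpolators; it is not bi-cubic interpolation, and the function names are hypothetical.

```python
def downscale_nearest(row, factor):
    """Nearest neighbor by pixel removal: keep every factor-th sample."""
    return row[::factor]


def downscale_box(row, factor):
    """Box filter: average each group of `factor` neighboring samples."""
    return [
        sum(row[i:i + factor]) / factor
        for i in range(0, len(row) - factor + 1, factor)
    ]


# One row of 8-bit luma values: black, a white bar, then one black pixel.
edge_row = [0, 0, 0, 255, 255, 255, 255, 0]

if __name__ == "__main__":
    print("nearest:", downscale_nearest(edge_row, 2))
    print("box:    ", downscale_box(edge_row, 2))
```

Pixel removal returns only pure black and white values, so edges stay hard, but the single black pixel at the end of the row disappears entirely. Averaging keeps a trace of that pixel and of the edge position as intermediate grays, which is the softening described above.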

The advantage of pre-scaling from a graphics application is that the operator gets to visually determine which interpolation method will maintain the best possible quality. Furthermore, graphics software often provides image filters that can reduce some of the undesirable side effects of downscaling. Some nonlinear editing systems do offer real-time rescaling (hardware- or software-based) that can produce even better results than common graphics software. However, as mentioned, a single fixed re-scaling method might not be ideal for all graphics and offers no manual control.

In short, pre-scaling graphics down to a lower pixel resolution than the full HD raster does lower their overall precision (and thus quality). But if an NLE is going to down-scale the graphics anyway, then, depending on the NLE's internal scaling methods, it may be advantageous for the operator to do it and retain control over the results.

The second way that the NLE can resolve the frame size mismatch is to expand the sub-sampled compressed video back to full raster for mixing with the full-size graphic. This has the advantage of maintaining optimal quality. The downside is that it takes far more CPU horsepower to scale every frame of HD video up to full raster size than it does to scale a graphic. The net result is that the process can have a negative impact on an NLE's real-time performance, especially when the segment involves multiple video layers, each of which must be scaled up.
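A rough count of output pixels shows why up-scaling video is so much heavier than scaling a still graphic. The Python sketch below is back-of-the-envelope arithmetic only: it assumes a round 30 frames per second and two video layers, counts only the pixels the scaler must produce, and ignores decoding and filtering cost. The function name is hypothetical.

```python
def pixels_per_second(width, height, frames_per_second, layers=1):
    """Output pixels a scaler must produce each second of playback."""
    return width * height * frames_per_second * layers


# A static graphic is scaled once: one full-raster frame of output.
graphic_once = 1920 * 1080

# Two layers of sub-sampled video up-scaled to full raster.
# 30 frames per second is an assumed round figure, not a measured rate.
video_per_second = pixels_per_second(1920, 1080, 30, layers=2)

if __name__ == "__main__":
    print("graphic, one time:", graphic_once)
    print("video, per second:", video_per_second)
    print("ratio:", video_per_second // graphic_once)
```

Under these assumptions, every second of the two-layer segment requires as many scaled pixels as 60 still graphics, and that load continues for as long as the overlay is on screen.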

Scaling up the video

As an example of this performance cost, on one NLE system, superimposing a 1920×1080 graphical overlay over an otherwise real-time segment of two layered video clips captured from DVCPRO HD requires rendering so that the video can be up-scaled for full-quality output. In contrast, superimposing a 1280×1080 graphical overlay over the same segment can be done in real time.

Thus, pre-scaling the graphic down to 1280×1080 can save considerable time in the workflow. If a graphic is used as an overlay that runs the duration of an hour-long program, the short time taken to pre-scale the graphic may save having to render the entire project in the NLE, which is a lengthy process.

Of course, pre-scaling the graphic down to 1280×1080 imposes a quality penalty (relative to letting the NLE process the graphic and up-scale the video at 1920×1080), but at least the user (the graphic artist, editor or producer) now has the ability to make that decision. In high-demand environments, such as near-to-air applications, the need for real-time productivity may outweigh any loss in image quality.

When not to pre-scale

The above discussion outlines how pre-scaling can minimize or eliminate the problems associated with mixing full-resolution graphics with HD compressed video that has been sub-sampled from the full HD raster.

However, some NLE systems support compression formats that can handle the full HD raster. These systems also feature full-raster internal processing, and can easily combine full-size HD graphics with video content in these compressed formats without any internal re-scaling of video or graphics. These systems maintain the same frame size (1920×1080 or 1280×720) throughout the workflow.

Similarly, NLEs that support uncompressed HD editing will also handle the full HD raster when using uncompressed media. Full-size HD graphics can be mixed with uncompressed HD video clips without re-scaling. However, working with uncompressed HD video creates other issues. For instance, many of the new affordable HD editing systems offer better real-time layering and effects performance with compressed media than with uncompressed (if they support uncompressed at all). And naturally, working with uncompressed HD video requires far more storage and higher bandwidth than operating in the compressed domain.

The moral of the story is that it is important to thoroughly understand how an NLE internally processes both graphics and video. This will help operators make the best decisions as to whether it's beneficial to pre-scale graphics before ingesting them into the NLE.

If the NLE provides full-quality, real-time performance on uncompressed video or full-raster compressed formats, then pre-scaling may not be advantageous. If, however, you plan to work in the compressed domain with sub-sampled compression codecs (including native acquisition formats), it may be beneficial in terms of quality or performance to pre-scale first.
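The guidance above can be summarized as a small decision helper. This Python sketch only restates the article's reasoning; the enum, its member names and the wording of the advice strings are hypothetical, and how a particular NLE resolves a mismatch has to be established from its documentation or by testing.

```python
from enum import Enum


class MismatchHandling(Enum):
    """How a given NLE pipeline treats a full-raster graphic."""
    FULL_RASTER = "processes video and graphics at full raster"
    DOWNSCALES_GRAPHIC = "shrinks the graphic to the video frame size"
    UPSCALES_VIDEO = "expands the video to the graphic frame size"


def prescale_advice(handling):
    """Return the article's recommendation for a pipeline behavior."""
    if handling is MismatchHandling.FULL_RASTER:
        return "keep the graphic at full raster"
    if handling is MismatchHandling.DOWNSCALES_GRAPHIC:
        return "consider pre-scaling to control scaling quality"
    return "consider pre-scaling to preserve real-time performance"


if __name__ == "__main__":
    for handling in MismatchHandling:
        print(f"NLE {handling.value}: {prescale_advice(handling)}")
```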

Protect your HD

With the enhanced pixel resolution of HD, there's a natural inclination to take advantage of it by using detailed and intricate graphics. However, be careful because the extra detail can end up working against you when the finished HD project is distributed.

As explained above, if the NLE system will be working at less than full raster, the graphics may be down-scaled or sub-sampled, resulting in lower horizontal resolution. It's also important to remember that, for the foreseeable future, a high percentage of HD content will be downconverted to SD for at least some of its distribution, which means a loss in both vertical and horizontal resolution.

Detailed graphics that look exceptional when created at full-raster HD resolution may lose considerable detail (making elements such as text all but unreadable) when converted to SD for playout.
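A quick calculation shows how much vertical detail a graphic element keeps after downconversion. The Python sketch below assumes 480 active lines for the SD raster, which is an assumption for illustration (576-line SD systems would fare somewhat better), and the function name is hypothetical.

```python
def downconverted_height(hd_lines_tall, hd_raster_lines=1080,
                         sd_raster_lines=480):
    """Height in SD lines of an element that is hd_lines_tall in HD."""
    return hd_lines_tall * sd_raster_lines / hd_raster_lines


if __name__ == "__main__":
    for hd_height in (18, 27, 54):
        sd_height = downconverted_height(hd_height)
        print(f"{hd_height} lines tall in HD -> {sd_height:.1f} lines in SD")
```

Text that is 27 lines tall on a 1080-line raster is left with only 12 lines in this example, so fine strokes and small type designed for HD have far fewer lines to be drawn with after conversion.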

Here are a couple of solutions. If all of the graphics will effectively be used as downstream keys in the NLE (superimposed over other video layers), it's often best to first downconvert a version of the finished HD project without the graphics in place. Then, add the graphical overlays (which have been specifically designed for SD) separately to this downconverted version. This process protects the HD version, while maintaining the best possible graphics quality and readability for the downconverted SD distribution.

This isn't always possible, of course, as graphics are often layered between other elements in the overall project. Even so, keep in mind the potential future downconversions that may occur when creating graphic elements, and you can avoid many of the associated pitfalls.

Mike Nann is the technical marketing manager, Professional Post Production, for Leitch Technology.