Digital file formats

There are many different types of video and audio files in use in the modern broadcast plant. This tutorial looks at some of the more common file types and the various forms they take when they are stored and transported along with their associated data.

The Essentials

To begin, let's start with some basics about video and audio and how they are stored and transported. (For this tutorial, audio and audio files will be mentioned only as they pertain to the subject at hand.) Video and audio have always been recorded together. When these sources were recorded onto videotape, they were laid down as separate tracks, but they were combined in the sense that they shared the same piece of tape. As digital video developed, there came a need for the audio and video to be combined within one file or stream.

A stream is any data that is transported and viewed or recorded in real time; examples include analog video, SDI video and several of the video formats used on networks such as the Internet. A video file, by contrast, must reside as a complete file on the receiving equipment before it can be viewed. A file transfer may take some time to complete, and the file cannot be viewed while it is still transferring.

Special file formats called containers are used to combine the audio and video elements (files) into one file for convenient storage and transport. Some video servers store the audio and video elements separately on their storage systems, while others use container formats to keep these separate elements together. All video servers store both the video clips themselves and a database with information about those clips. When the audio/video files are transferred to another system, the data about them also needs to be transferred; that is where metadata comes in. Metadata is the data about the data that makes up the audio and video elements. To combine the container holding the audio and video elements with the metadata, a wrapper is used. A wrapper is a type of container used in professional video to combine the elements (the audio and video files) with the metadata that describes them (see below).

The differences among elements, containers and wrappers can become confusing because some elements and containers share the same name. For example, MPEG-2 is a compression codec for digital video, but it is also a container when the audio is combined with it. The difference is in the file extension used with the file. When the files are contained inside a wrapper, the question of whether it's an MPEG-2 video element with a separate AIFF audio element or an MPEG-2 stream (container) with the audio element combined can only be answered by the metadata, because all you will see of the file is the wrapper and its file extension. This type of information becomes more important as we combine and wrap the basic elements for easier storage and transport.
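To make this concrete, here is a minimal sketch of how that metadata might be read in practice. It assumes the freely available FFmpeg ffprobe tool is installed on the workstation, and the file name is purely hypothetical; the point is simply that the container (wrapper) and the codecs of the elements inside it are reported separately.

```python
# Minimal sketch: ask ffprobe (assumed to be installed) for the metadata of a
# wrapped file and report the container plus the codec of each stream inside it.
import json
import subprocess

def describe(path: str) -> None:
    # -show_format reports the container; -show_streams reports each element.
    result = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True)
    info = json.loads(result.stdout)
    print("Container:", info["format"]["format_name"])
    for stream in info["streams"]:
        print(f"  {stream['codec_type']}: {stream['codec_name']}")

describe("clip.mxf")  # hypothetical file name
```

Run against a wrapped clip, a tool like this would show, for example, an MXF container holding an MPEG-2 video stream and separate audio streams, which is exactly the distinction the file extension alone cannot make.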

Professional file formats

There are many different video file formats in use today, and they all have their pros and cons. Here's a description of some of the most common compressed formats.

MPEG-2 has been one of the most popular file formats for broadcast video server storage because of its high compression capability. MPEG-2 uses information from the frames before and after the current frame to compress the image (interframe compression). It's what is called a lossy format, because some information is lost and cannot be recovered. This method is fine for transport and playback but not for archiving and post production. MPEG-2 is a mature format, developed more than 15 years ago, and is found in everything from DVDs to DTV. Data rates for MPEG-2 can vary from 3.5Mb/s to more than 10Mb/s depending on the amount of detail or motion in a particular segment. The amount of compression and, thus, the data rate can be controlled and limited as needed in a trade-off between picture quality (picture artifacts) and data rate (file size). Some servers use a default MPEG-2 data rate of 25Mb/s, which gives the highest quality picture but the least storage time.
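To put those numbers in perspective, the back-of-the-envelope sketch below converts data rate into storage consumed per hour of video. It counts the video essence only; audio, metadata and filesystem overhead are ignored, and the rates chosen are simply illustrative.

```python
# Rough storage arithmetic for compressed video (video essence only; audio,
# metadata and filesystem overhead are ignored). Figures are illustrative.
def gigabytes_per_hour(megabits_per_second: float) -> float:
    # One hour of video at the given rate, converted from bits to gigabytes.
    bits = megabits_per_second * 1_000_000 * 3600
    return bits / 8 / 1_000_000_000

for rate in (3.5, 8.0, 10.0, 25.0):  # Mb/s, illustrative MPEG-2 rates
    print(f"{rate:>5} Mb/s -> {gigabytes_per_hour(rate):5.1f} GB per hour")
```

At 25Mb/s, an hour of video occupies roughly 11GB, while the same hour at 3.5Mb/s fits in well under 2GB. That is the storage side of the quality trade-off described above.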

MPEG-1 is the older brother of MPEG-2 and is mostly used for proxy files in video servers. MPEG-1 is limited to a data rate of 1.5Mb/s at its highest quality, but it shares many of MPEG-2's compression characteristics, including being a lossy format. The proxy files are created automatically by the server system at the time of ingest, or just after, as a much smaller copy of the original file. These proxy files are then used when desktop computers access the server to view the files stored there. Being highly compressed, the MPEG-1 proxy files can easily be transported over 100BASE-T or even 10BASE-T Ethernet networks. But because of the high compression, the image quality is very low, and the proxies are displayed in a very small window similar to streaming Internet video, so they are good only for content checks and not quality checks. Proxies do, however, allow many more people to view the files without high-speed networks or expensive equipment.
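As a rough illustration of why such proxies travel so easily over ordinary office networks, the sketch below estimates how many 1.5Mb/s proxy streams fit on 10BASE-T and 100BASE-T links. The 70 percent usable-payload figure is an assumption made to allow for protocol and network overhead, not a measured value.

```python
# Illustrative only: how many MPEG-1 proxy streams fit on an Ethernet link,
# assuming roughly 70% of the nominal rate is usable for payload.
PROXY_RATE = 1.5       # Mb/s per MPEG-1 proxy stream
USABLE_FRACTION = 0.7  # assumed share of nominal rate left after overhead

for link_rate in (10, 100):  # 10BASE-T and 100BASE-T nominal rates in Mb/s
    streams = int(link_rate * USABLE_FRACTION / PROXY_RATE)
    print(f"{link_rate:>3} Mb/s Ethernet: about {streams} simultaneous proxy streams")
```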

M-JPEG (Motion JPEG) is an older and much less compressed format than MPEG-1 or MPEG-2. M-JPEG's main characteristic and advantage is that it compresses only the information within a frame of video (intraframe compression). This means that all the information for every frame (picture) is sent, so no frame is dependent on the frame before or after it. When editing, a splice can be made at any point without reference to any other frame. These days, however, M-JPEG is considered an obsolete format kept alive only in legacy systems; it has been replaced by several newer digital video formats that combine smaller file sizes with higher picture quality and editing flexibility.

MPEG-4 Part 10/H.264/AVC (Advanced Video Coding) is the newest compression standard in use today and offers the highest compression. Although it has several names, it is one standard from the ITU-T and the Moving Picture Experts Group. MPEG-4 achieves much higher compression ratios than MPEG-2, on average two to three times the compression with equal or better picture quality. MPEG-4 uses several schemes to achieve these high compression ratios, but it basically follows in the footsteps of MPEG-2: it looks for similarities and differences within a frame and between frames (interframe compression), but it can use blocks of pixels of different sizes in its analysis. Another difference is the amount of processing power required in the decoder; decoding MPEG-4 takes several times the processing power needed to decode MPEG-2. MPEG-4 is rapidly being deployed in broadcasting because of the high-quality images it produces and, equally important, the small file sizes it creates and the lower bandwidth required to transport them.

Panasonic has developed AVC-Intra (AVC-I) for HD recording with its P2 solid-state memory cards. It is a version of MPEG-4 AVC, the main difference being that it uses intraframe compression only. This makes it somewhat less efficient, but it also eliminates any artifacts caused by motion or rapid changes in scene content.

The DV (Digital Video) format was created in 1996 as a consumer/prosumer format that has evolved into several professional video formats. The original DV format starts with optical low-pass filtering at the lens and goes on to use DCT intraframe compression at a fixed 25Mb/s data rate. DV was originally designed for tape but has found wide acceptance in NLE systems and video servers.

Sony developed DVCAM based on the DV format. It uses a higher tape speed and wider track width, which reduces dropouts, but essentially, the format and compression are the same as DV.

Panasonic developed DVCPRO as a high-end format but still relied on the basic DV format. DVCPRO uses an even faster tape speed and wider track width than DVCAM but still uses a 25Mb/s data rate. Newer versions of this format have increased the data rate to 50Mb/s and 100Mb/s for even higher quality images.

Next time

The next Transition to Digital will cover containers and wrappers and how they factor into today’s broadcast facilities.