Dealing With Multiple Formats in a File-based System

Tuning the components of a file-based media operation is much like arranging the instruments of an orchestra: each element must be finely tuned so the combined output is effective and harmonious. The task is to marry an ever-expanding set of disparate file formats, compression schemes, bit-rates, hardware platforms and other elements to a platform that properly handles delivery of real-time baseband video; file aggregation for storage, archive, redundancy and protection; and concatenation or encapsulation for varying delivery services, all while still enabling preview, quality analysis, profanity clearing, editing and audio shuffling. In short, it is a daunting challenge.

Broadcast, cable, satellite, IPTV, VOD, mobile, Web services and other media entities are searching for a universal platform that can handle routine operational tasks in an agile, adaptable structure that is also ready for the future. Media servers are well positioned to deal with these requirements, but the platform of the future needs to address many of these tasks through internal systems, i.e., applications in silicon and/or software, instead of through several separate devices that are complicated to integrate. The future killer app may address the any-format-in/specific-format-out paradigm.


Until recently, video server manufacturers built multiformat codecs to deal with compression formats, mainly DV, MPEG-2 and, soon, AVC (MPEG-4 Part 10, or H.264). Servers capable of wrapping, streaming and transport stream processing are here, albeit employed mostly at 19.39 Mbps (ATSC) or 34/45 Mbps (E3/DS3). These digital transport solutions are mostly dedicated to DTV applications, to transporting wider-band high-definition streams to affiliates over satellite or fiber, and to backhauls for remote events.

At NAB2006, real-time stream splicing for insertion of program or elementary streams into live play-out was shown, offering remedies for the live program chain. There remain, however, dozens of compliance (and compatibility) issues in assuring that the compression format, GOP structure, bit-rates and more are normalized for effective real-time play-out.
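A minimal sketch of what such a normalization check might look like follows. It assumes a hypothetical "house" profile made up of a compression format, a GOP length and a bit-rate; the class name, field names, tolerance and sample values are illustrative only and are not drawn from any particular server product.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class StreamProfile:
    """Hypothetical description of a clip's key stream parameters."""
    codec: str            # compression format, e.g. "MPEG-2" or "DV"
    gop_length: int       # frames per GOP
    bitrate_mbps: float   # video bit-rate in Mbps


def compliance_issues(clip: StreamProfile, house: StreamProfile,
                      bitrate_tolerance: float = 0.05) -> list[str]:
    """List every way the clip departs from the house play-out profile.

    bitrate_tolerance is an assumed fractional tolerance (5% by default).
    """
    issues = []
    if clip.codec != house.codec:
        issues.append(f"codec is {clip.codec}, expected {house.codec}")
    if clip.gop_length != house.gop_length:
        issues.append(
            f"GOP length is {clip.gop_length}, expected {house.gop_length}")
    allowed = bitrate_tolerance * house.bitrate_mbps
    if abs(clip.bitrate_mbps - house.bitrate_mbps) > allowed:
        issues.append(
            f"bit-rate is {clip.bitrate_mbps} Mbps, "
            f"expected {house.bitrate_mbps} Mbps")
    return issues


if __name__ == "__main__":
    # Illustrative values only.
    house = StreamProfile(codec="MPEG-2", gop_length=15, bitrate_mbps=8.0)
    incoming = StreamProfile(codec="MPEG-2", gop_length=12, bitrate_mbps=8.0)
    for issue in compliance_issues(incoming, house):
        print(issue)
```

An empty list means the clip already matches the profile; anything else is a candidate for conversion before it reaches the play-out chain. A real system would check many more parameters than these three.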

A trend is under way toward bundling third-party software components capable of managing file and compression format processing, from ingest to storage to play-out, along with data movement and archive management. One extension of this solution is to embed those digital asset conversion applications into the server, which may ultimately become the key component in the media server of the future. We are seeing a few server systems embracing embedded conversion in base systems or storage subsystems, but it is slow going for now.

Today's workflows deliver files to a catch server's "watch folder" and signal the media management system. A translation server with accelerators, integrated with dedicated serving components (i.e., disk storage, CPU and memory), then pulls those files into its store, where the number crunching begins. Once converted to the end server's native format, often faster than real time, the files are pushed to the video server's store and readied for play-out, editing or long-term storage (see Fig. 1).
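The watch-folder workflow above can be sketched in a few lines. This is a simplified illustration, not any vendor's implementation: the function names, the directory layout and the `placeholder_convert` step (which merely copies the file under a made-up `.native` extension) are all assumptions standing in for a real catch server, translation engine and video server store.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, List, Set


def poll_watch_folder(watch_dir: Path, seen: Set[str]) -> List[Path]:
    """Return files that have appeared in the watch folder since the last poll."""
    new_files = [p for p in sorted(watch_dir.iterdir())
                 if p.is_file() and p.name not in seen]
    seen.update(p.name for p in new_files)
    return new_files


def placeholder_convert(staged: Path) -> Path:
    """Stand-in for the real conversion; it only copies to a new extension."""
    target = staged.with_suffix(".native")  # hypothetical native extension
    shutil.copy2(staged, target)
    return target


def flip_and_deliver(src: Path, work_dir: Path, server_dir: Path,
                     convert: Callable[[Path], Path]) -> Path:
    """Pull a file into the translation store, convert it, push it onward."""
    staged = work_dir / src.name
    shutil.copy2(src, staged)            # translation server pulls the file
    converted = convert(staged)          # the number crunching happens here
    delivered = server_dir / converted.name
    shutil.move(str(converted), str(delivered))  # push to video server store
    return delivered


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    watch, work, server = root / "watch", root / "work", root / "server"
    for d in (watch, work, server):
        d.mkdir()
    (watch / "promo.mxf").write_bytes(b"placeholder essence")

    seen: Set[str] = set()
    for clip in poll_watch_folder(watch, seen):
        print("delivered:", flip_and_deliver(clip, work, server,
                                              placeholder_convert))
```

A production system would also have to confirm that a file has finished arriving before pulling it, and would signal the media management system at each stage; both are omitted here for brevity.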

Fig. 1: Understanding the elements of file and format conversion (coined as "flipping") starts with a base knowledge of what happens during the process.

Video server applications then take the composite (or wrapped) files and disassemble them to suit the unique requirements of that manufacturer's platform. Once the files are in the server's native file format, additional platform-specific operations are performed. At this point in the process, for an external converter to function from the server's disk-storage level, the conversion process must deal with both the vendor's native file format (e.g., the Omneon, Grass Valley, Leitch or Pinnacle format) and the compression format (e.g., DV, MPEG, etc.). This adds yet another level of complication to the mix of issues in file translation.
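One way to picture these two layers is to treat the file format (the wrapper) and the compression format as separate decisions. The sketch below is a hypothetical planner, with made-up wrapper labels, that lists the steps a converter would need; it is a simplification under those assumptions and does not describe how any named vendor's platform actually works.

```python
from typing import List, NamedTuple


class MediaFile(NamedTuple):
    """Hypothetical two-layer description of a media file."""
    wrapper: str      # file format, e.g. a vendor's native format
    compression: str  # compression format of the essence, e.g. "DV"


def plan_conversion(source: MediaFile, target: MediaFile) -> List[str]:
    """List the steps needed to turn the source into the target format."""
    if source == target:
        return []  # already in the target server's format
    steps = ["unwrap"]
    if source.compression != target.compression:
        # The essence itself must be decoded and re-encoded.
        steps.append(
            f"transcode {source.compression} -> {target.compression}")
    # The essence, transcoded or not, is rewrapped for the target.
    steps.append(f"wrap as {target.wrapper}")
    return steps


if __name__ == "__main__":
    # Wrapper labels are placeholders, not real format names.
    src = MediaFile(wrapper="vendor-A-native", compression="DV")
    dst = MediaFile(wrapper="vendor-B-native", compression="MPEG-2")
    for step in plan_conversion(src, dst):
        print(step)
```

When only the wrapper differs, the plan skips the transcode and the compressed essence passes through untouched, which is the cheaper case; a mismatch in compression format forces the full decode and re-encode.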

Understanding the elements of file and format conversion (coined as "flipping") starts with a base knowledge of what happens during the process.

One of the more comprehensible analogies for this process is optically scanning print material into an image file. Here the file must be universally recoverable for various operations, such as image processing, general display or viewing, and even optical character recognition. This translation process is well understood by even the most elementary users, and the tools that can perform the translations are plentiful. However, that was not the case before programs like Photoshop were conceived. While Photoshop is technically not a translator, its baseline images (in .psd file format) may be saved in various universally accepted formats (EPS, JPEG, GIF, etc.), which are suitable for most fixed image reproduction, Web services and other graphics applications.

Another familiar translation grew from the differences between the Windows and Macintosh operating systems. Known issues such as font conversion, word processing, spreadsheet and database conversion, file naming conventions and the like were pretty much ironed out in the early 1990s. Today, for the most part, a file created in either environment is fairly well accepted by the other. While anomalies might exist, the fundamental exchange process has been boiled down to a uniform structure that is accepted by both the industry and the marketplace. Ironing out these issues for the professional moving image space would just be nirvana!

In the static image domain, the concepts involved in moving from one image format to another are fairly straightforward. Color space conversions, pixel depth, aspect ratio changes and formatting for print are universally accepted throughout the industry. But once you venture into the moving image domain, especially spatially and temporally compressed content, the issues get tougher and the complexities grow by orders of magnitude.

Although much of the professional space now deals with releasing content into the consumer space, the tools for MPEG-2, DV/DVCPRO and other compression formats are quite different from those used in just the static or moving image consumer space.

File conversion and translation for workflow improvement necessitate a nearly hands-free operation, and automation is a key enabler for this process. However, the tool sets that can successfully make this happen must also be able to monitor and qualify incoming files, add overlays such as audio or graphics, and ascertain what corrective functions need to occur before, during and after conversion. The capability to analyze, modify and restore at the file level, without decoding, will become crucial to the file-based operation, as the issues associated with compression and file management should not be underestimated. When these systems are fully embedded into the media server's platform, we will be much closer to that true end-to-end solution.

Karl Paulsen is chief technology officer for AZCAR. He is a Fellow in the SMPTE and an SBE Life Certified Professional Broadcast Engineer.

Karl Paulsen

Karl Paulsen is the CTO for Diversified, the global leader in media-related technologies, innovations and systems integration. Karl provides subject matter expertise and innovative visionary futures related to advanced networking and IP technologies, workflow design and assessment, media asset management, and storage technologies. Karl is a SMPTE Life Fellow, an SBE Life Member & Certified Professional Broadcast Engineer, and the author of hundreds of articles focused on industry advances in cloud, storage, workflow, and media technologies. For over 25 years he has continually featured topics in TV Tech magazine, penning the magazine’s Storage and Media Technologies and its Cloudspotter’s Journal columns.