Preserving Public Television

For years, PBS has been at the forefront of DTV deployment in the United States, yet the very content that is breaking new ground on air and on the Web is in danger of becoming irretrievable by future generations. The reason: There’s no guarantee that the proprietary formats being used to produce digital television today will be viewable 200 years in the future.

In an effort to address this problem, PBS stations WNET (New York) and WGBH (Boston)—public broadcasting’s two production powerhouses—received funding from the Library of Congress to conduct a three-year study, “Preserving Digital Public Television (PDPTV).” Well into its second year, this study is meant to design an affordable preservation repository for public television.

The PBS study is one of many being funded by the LOC under its National Digital Information Infrastructure and Preservation Program (NDIIPP). The NDIIPP’s goal is to ensure that today’s digital content is safely preserved for the future in ways that are secure, capable of migrating to new storage media as old media become obsolete, and easily accessible to researchers and the public alike.

“We need to be able to preserve digital content for 250 years,” said WNET’s Nan Rubin, PDPTV project manager. “Moreover, we have to decide what to keep. There’s so much material collected during the production process that doesn’t go into the program yet is of real value to future historians.”

“Beyond what we save, we have to decide what formats to save the content in,” added Mary Ide, director of WGBH’s archives, already home to half a million items on film and videotape. “We also have to decide what information is stored in the metadata that accompanies the archived video files—not just what the video is and who shot it, but also who has viewed and altered it over the years. That’s known as the item’s ‘provenance,’ in archival terminology.”

THE FORMAT CHALLENGE
Formats, formats everywhere, but none that will necessarily stand the test of time. That’s the current dilemma facing both PBS and the LOC. “There are no standards in place at the moment that really speak to a preservation package for video,” said Rubin. “The issue is to be able to reconstitute and then migrate the material over time because the initial medium becomes unstable or deteriorates.”

Decaying media is one thing, but you have to wonder why transferring video files from one medium to another is such a big deal. After all, isn’t digital video nothing but a bunch of zeros and ones, capable of being reconstructed by any binary-based computer?

Yes and no. It’s not the zeros and ones of digital video that cause the storage and retrieval problems, said Carl Fleischhauer, project coordinator in the LOC’s Office of Strategic Initiatives. “It’s the encoding of these zeros and ones by the digital recording system used, and the decoding process to render these files back into video, that poses the real problem,” he explained. “At the Library of Congress, we worry about the ability to decode and render different kinds of video streams, especially as the number of proprietary digital formats keeps proliferating.”
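To make Fleischhauer’s point concrete, here is a minimal Python sketch. The byte values and the two "formats" are invented for illustration and do not correspond to any real video codec. The same eight stored bytes yield entirely different sample values depending on which decoding rule is applied, which is why perfectly preserved bits are of little use once the matching decoder is gone.

```python
import struct

# Eight bytes standing in for a tiny slice of a stored video file
# (values chosen arbitrarily for this illustration).
raw = bytes([0x00, 0x10, 0x00, 0x80, 0x00, 0xEB, 0x00, 0x80])

# Hypothetical "format A": four unsigned 16-bit samples, big-endian.
format_a = struct.unpack(">4H", raw)

# Hypothetical "format B": the same bytes read as little-endian samples.
format_b = struct.unpack("<4H", raw)

print(format_a)  # (16, 128, 235, 128)
print(format_b)  # (4096, 32768, 60160, 32768)
```

Real video formats differ in far more than byte order, but the principle is the same: the zeros and ones survive intact, and the meaning still depends on knowing how they were encoded.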

The obvious solution is to keep digital players on hand for every format that the LOC is intent on preserving, as is being done for its current collection of videotapes. However, such a solution is really no solution at all, because these machines and their stored spare parts are doomed to wear out eventually.

Meanwhile, the idea that the LOC can simply store a range of incompatible video files on an unlimited number of servers is not financially realistic. Then there’s the amount of physical space that such an approach would require, especially when “we already have tens of thousands of 2-inch tapes that are heavy and take up a lot of space, plus hundreds of thousands of other videocassette formats,” said Gregory Lukow, chief of the LOC’s Motion Picture, Broadcasting and Recorded Sound Division.

Clearly, the ideal solution is to develop a standardized digital video format for archiving and retrieval. Investigating which formats (likely open source) would be best suited for long-term video archiving is the job of PDPTV partner New York University, specifically the Moving Image Archiving & Preservation program directed by Dr. Howard Besser.

“We’re leading the studies that are being done in terms of digital video standards, wrapper standards, video quality and compression,” Besser explained. “So far we haven’t settled on the format. It will likely be MPEG-2 SD or Motion JPEG.”

STORAGE SOLUTIONS
The wrapper is one of this project’s most interesting elements. As the name suggests, it’s a container format that packages the video file together with its metadata, plus information to aid in its decoding and storage. Although the PDPTV partners are still debating which wrapper format to use, it appears that MXF (Material Exchange Format) is the front-runner, Besser said.
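The general idea of a wrapper can be sketched in a few lines of Python. This is a toy format invented for illustration, not MXF and not anything the PDPTV partners have specified: the `TOYW` marker, the JSON header and the metadata field names are all assumptions. It simply shows metadata and video data travelling together in one file, with enough structure to pull them apart again.

```python
import json
import struct
from typing import Tuple

MAGIC = b"TOYW"  # made-up marker for this toy format; not part of MXF


def wrap(metadata: dict, essence: bytes) -> bytes:
    """Bundle descriptive metadata and the encoded video bytes into one blob."""
    header = json.dumps(metadata, sort_keys=True).encode("utf-8")
    # Layout: marker, 4-byte big-endian header length, header, then the video data.
    return MAGIC + struct.pack(">I", len(header)) + header + essence


def unwrap(blob: bytes) -> Tuple[dict, bytes]:
    """Split a wrapped blob back into its metadata and video bytes."""
    if blob[:4] != MAGIC:
        raise ValueError("not a toy wrapper file")
    (header_len,) = struct.unpack(">I", blob[4:8])
    header = blob[8:8 + header_len]
    essence = blob[8 + header_len:]
    return json.loads(header.decode("utf-8")), essence


# Hypothetical record: the field names and values are illustrative only.
record = {
    "title": "Example program",
    "codec": "hypothetical-codec",
    "provenance": ["ingested by archive", "viewed by researcher"],
}
blob = wrap(record, b"\x00\x01\x02\x03")
metadata, video_bytes = unwrap(blob)
print(metadata["codec"], len(video_bytes))
```

Note that the wrapper only records which codec was used. It does nothing to decode the video bytes themselves.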

Even if MXF is chosen, many issues remain. “MXF is just a wrapper,” offered David MacCarn, WGBH’s chief technologist and asset management architect. “You still have to deal with the contents within the wrapper, and that’s where the tricky bit is.”

Devising a robust, easily retrievable video archiving format isn’t enough. Servers also have to be created to store these files and protect them from being altered over time.

Complicating matters is the fact that “what’s being used at stations now is an asset management system, not a preservation system,” explained Besser. “An asset management system puts its emphasis on quick input and output. A preservation system is optimized for protection against write-overs and for handling migration or emulation of file formats.”
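One common way archives guard against silent alteration is a fixity check: record a cryptographic digest when a file is ingested, then recompute and compare it later. The sketch below is an illustration under that assumption, not a description of the PDPTV design, and the function names are hypothetical.

```python
import hashlib


def fixity(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, recorded_digest: str) -> bool:
    """Compare the current digest with the one recorded at ingest."""
    return fixity(data) == recorded_digest


# Stand-in bytes for an archived file; a real system would read from storage.
original = b"archived program master"
recorded = fixity(original)           # stored alongside the file at ingest

overwritten = b"archived program mastex"  # a single byte changed

print(is_unaltered(original, recorded))     # True
print(is_unaltered(overwritten, recorded))  # False
```

A check like this only detects change. Recovering from it still depends on keeping additional copies of the file.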

Ideally, a preservation server system should make it easy not just to retrieve content, but also to reuse it in new programs on air and on the Web. In conducting the PDPTV study, one of the project team’s key goals is to create a working test bed of such a system, both to iron out the operational details and to learn how such a system needs to be financed and managed.