How long will the archive last?

Recently, I attended a seminar on advanced media workflows. The panel talked about the advantages that file-based operations bring to production, especially the increased speed of throughput. However, when it came to the final remarks, one of the panel's concerns was the archive. The studios would like to see their product still around in 100 years, just like 35mm film. Black-and-white separations have a long life as the silver image does not fade like color dyes.

The data guys will tell you that by virtualizing files from the storage media, files can similarly last for a long time. With automatic migration from today's favored substrate to tomorrow's yet-to-be-invented format, the files should last forever. But will they?

Film could be considered self-describing; a future librarian uncovering a dusty film canister could open it up, hold the film up to the light and figure out it would need some form of image scanner to replay the content.

To recover content from a file, you need to know how to read the file. What is its structure? What is the codec? This information is generally held as metadata, either embedded in the file or referenced in an associated digital asset management (DAM) system. But what happens to the metadata over time? Media companies come and go, as do storage vendors. Many broadcasters are on their third file-based system.

Who is going to ensure the data migration systems are running, and who is going to preserve the DAM database through mergers, takeovers and company collapses?

Film canisters may be pulled out of an underground repository decades from now and the content recovered, but imagine trying to read a 30-year-old hard drive or data tape, and then figuring out how to read the file with no metadata because the DAM system has been lost.

Right now there aren't many answers to this problem, other than printing film separations. This may be affordable for important movies, but not for more regular TV programming or news.

Does it matter? We can read eye-witness accounts of great battles from 2000 years ago. We can read the political intrigues of the Roman senate. Through the Middle Ages, artists captured events as paintings. We can imagine these historic events.

In the 20th century, we acquired the ability to capture our history as moving pictures with associated sound. It has become much easier to understand those past events or to be entertained by that historic content. But will humans be able to view those moving images 2000 years from now?

Most broadcasters may not be concerned with such long periods of time, but it would be dead handy if the archive could last a few decades. Right now there is an element of trust when you store precious content on a data storage system. You can be fairly certain that the file will be around in five years, but it all gets fuzzy as you move further into the future. One only has to think of the 8-inch floppy disk to realize how short the lifetime of data storage formats can be.

In a film and videotape archive, each content item is inherently isolated from the rest. One videotape may become unplayable, but the library survives. A catastrophe in a file archive could lose everything.

As broadcasters move to file-based acquisition, content will only exist as a file. The role of the data archive is going to become important in order to preserve programs as future assets and as a historical record. When planning your data archive, I suggest that you explore scenarios for a potential total loss of the archive. This is not just a matter of deploying disaster recovery (DR), but of looking well into the future, beyond the life of your current DR system. If future generations are not to be confronted with a mysterious pile of antique data storage equipment, we must act now to design true archive systems.
