WGBH’s Object Storage-Based Archive Illustrates the Power of Metadata

BOSTON—As one of the nation’s premier public broadcasters, Boston’s WGBH-TV has a strong legacy of programming that it has provided to PBS and other outlets for decades. From the classic children’s series “ZOOM” to “Masterpiece Theatre,” “Nova” and “Antiques Roadshow,” WGBH has enriched the nation’s library of television programming since it first signed on the air more than 60 years ago.

That program library contains thousands of hours of film and video in a variety of formats and has become increasingly hard to manage over the years, according to Shane Miner, senior director of technical services, who joined WGBH four years ago. Problems such as increasing rates of ingest, manual processes, no digital rights and inadequate media asset management led the WGBH IT team to search for a better solution.

Old school search

“When I started, I noticed what I thought was an inefficiency around our archives,” Miner said. “It was too manual and we were going to struggle to scale as both the amount and size of media increased. I felt like it wasn’t providing producers with what they needed to be able to meet their shorter deadlines and quicker turnarounds.”

Initially WGBH looked into a cloud-only solution but found that it didn’t meet the station’s workflow requirements: download times were long and egress charges for video became too expensive.


The key to managing and maximizing the full potential of WGBH’s vast archive could be summed up in one word: metadata, according to Miner. This led him to consider object storage as a hybrid solution that combines the strengths of a NAS/SAN and an all-cloud system.

Today's search

“You can’t find any video unless you have a good metadata structure to be able to search, so the idea with object store is that you can marry the metadata to the video as a single discrete object that gives you a ‘self-describing’ video file,” Miner said.
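The idea of a "self-describing" object can be illustrated with a small sketch. This is a toy in-memory model, not Cloudian's or AWS's API: the class names, the object key and the metadata fields (`series`, `shot_date`, `source_format`) are all hypothetical and chosen only to show metadata travelling with the media as one unit.

```python
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    """One stored object: the media bytes plus the metadata that describes them."""
    key: str
    payload: bytes
    metadata: dict = field(default_factory=dict)


class ToyObjectStore:
    """In-memory stand-in for an object store (hypothetical, illustration only)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, payload, **metadata):
        # Metadata is written together with the payload, not to a separate database.
        self._objects[key] = MediaObject(key, payload, dict(metadata))

    def describe(self, key):
        # Return a copy of the metadata without reading the (potentially huge) payload.
        return dict(self._objects[key].metadata)


store = ToyObjectStore()
store.put(
    "archive/example-series/reel-0001.mxf",  # hypothetical object key
    b"\x00" * 16,                             # placeholder for the video essence
    series="Example Series",
    shot_date="1975-03-14",
    source_format="16mm film",
)
info = store.describe("archive/example-series/reel-0001.mxf")
print(info["series"], info["source_format"])
```

Because the description is stored on the object itself, any tool that can reach the object can also learn what it is, which is the property Miner describes.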

In addition to enhanced search and retrieval, other advantages of object storage are that it is limitlessly scalable and can be deployed in both on-prem and cloud environments. In the end this lowers costs by cutting the amount of time staff spend dealing with physical media, an important consideration for a public TV station with a limited budget.

WGBH stored a large amount of its programming on LTO and the time consumed in accessing physical tape was a big consideration in cost reductions, according to Miner. “It’s a much easier process when it comes to people time and energy,” he said. “We don’t have to plan large projects around the changeover [from physical media]. Definitely object storage locally with the second copy in the cloud is cheaper than what we looked at over many, many years than a traditional LTO solution.”

WGBH’s current setup includes 3 petabytes of on-prem storage occupying 15RU in a data center, a 90 percent reduction in physical space compared with its tape library. It uses AWS Glacier for cloud/DR and AWS S3 for extra capacity. The on-prem object store is based on the Cloudian platform, which supports editing and archive curation. Final versions are stored in Sony Ci.

WGBH's new production workflow

Using object store reduces the need for a media asset management system, Miner said. Whereas search in the traditional archive was limited to material that had already been archived, object store allows content to be searchable as soon as media enters the workflow.

“What we’re hoping to do with object store is actually break apart the MAM,” he said. “Because we can store the metadata directly on the objects and then use the file system to search, you actually don’t need a big heavyweight MAM to handle all the processes.”
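A minimal sketch of what searching on object metadata might look like, under stated assumptions: the catalog below is a plain Python dictionary standing in for the metadata attached to each object, and the keys, field names and `search` helper are hypothetical. How a real platform indexes and queries metadata varies by product; this only shows the principle of finding media from its own description.

```python
def search(catalog, **criteria):
    """Return the keys of objects whose metadata matches every criterion.

    Matching is case-insensitive and exact; a missing field never matches.
    """
    def matches(meta):
        return all(
            field in meta and str(meta[field]).lower() == str(wanted).lower()
            for field, wanted in criteria.items()
        )

    return sorted(key for key, meta in catalog.items() if matches(meta))


# Hypothetical metadata, keyed by object name, as it might be read from the objects.
catalog = {
    "raw/show-a/interview-01.mxf": {"series": "Show A", "kind": "raw", "year": 2019},
    "raw/show-a/broll-07.mxf": {"series": "Show A", "kind": "raw", "year": 2020},
    "final/show-a/ep-12.mxf": {"series": "Show A", "kind": "finished", "year": 2020},
    "raw/show-b/reel-03.mxf": {"series": "Show B", "kind": "raw", "year": 2020},
}

hits = search(catalog, series="show a", kind="raw")
print(hits)
```

Raw footage is findable here on the same terms as finished programs, without a separate catalog having to be populated first.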

When Miner and his team were looking at implementing this new storage system, the station was in the process of launching its Public Media Management cloud-based master control system. Because PMM targeted centralized master control and handled only finished shows, that model lent itself to cloud-only storage, according to Miner. For archiving, however, the requirements were different.

“The problem we have with our actual archive is that for example, ‘Frontline’’s hour-long show generates hundreds of hours of footage so storing all of that content becomes costly and cumbersome in terms of pull down cost,” he said. “So as we looked at the full archive, we’ll use Sony’s Ci to store our finished program material [for PMM], but that’s such a small subset of the actual full archive that it’s much easier to manage. But the two systems [PMM and the archive] are pretty disparate; central master control has its own archive of broadcast material that specifically focuses on master control and nothing else and this archive is specifically focused on WGBH and nothing else.”

Using metadata as the driver of the transition not only helps reduce physical space and the time station staff need to retrieve content, it also gives producers more capabilities in handling content through the production workflow. For an archive that stretches all the way back to early Julia Child film reels, this illustrates how metadata can be used to maximize the value of legacy programming.

“We take our archive to be our mission element,” Miner said. “We believe that maintaining our archive is a public good and that’s something we should do because it’s essential to our mission.”

Tom Butts

Tom has covered the broadcast technology market for the past 25 years, including three years handling member communications for the National Association of Broadcasters followed by a year as editor of Video Technology News and DTV Business executive newsletters for Phillips Publishing. In 1999 he launched digitalbroadcasting.com for internet B2B portal Verticalnet. He is also a charter member of the CTA's Academy of Digital TV Pioneers. Since 2001, he has been editor-in-chief of TV Tech (www.tvtech.com), the leading source of news and information on broadcast and related media technology and is a frequent contributor and moderator to the brand’s Tech Leadership events.