Protecting Digital Data Resiliency


When an organization has the ability to bounce back from a disruption, such as a power outage, a natural disaster, or a cyberattack like ransomware, that ability to recover is referred to as being “data resilient.” Typically, data resiliency is framed within a disaster recovery (DR) plan and is best protected when data is backed up regularly and stored in multiple locations.

Examples of data center (or private cloud) resiliency include server power supply redundancy, where each server’s power supply is duplicated to protect against failure of the primary supply; or, at the extreme, a duplicated server (a “secondary” server) that runs active or in hot-standby mode and automatically takes over should the primary server fail.
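
As a minimal illustration of the hot-standby concept, the sketch below probes a primary server and promotes the standby only after several consecutive missed health checks. The host names, port and thresholds are hypothetical assumptions for the sketch, not any particular vendor’s failover mechanism.

```python
import socket
import time

PRIMARY = ("primary.example.internal", 8443)   # hypothetical primary server
STANDBY = ("standby.example.internal", 8443)   # hypothetical hot-standby server
MISSES_BEFORE_FAILOVER = 3                     # consecutive failed checks tolerated
CHECK_INTERVAL_SECONDS = 5

def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def monitor_and_failover() -> None:
    """Poll the primary; after repeated misses, direct traffic to the standby."""
    misses = 0
    active = PRIMARY
    while True:
        if is_reachable(*PRIMARY):
            misses = 0
            active = PRIMARY
        else:
            misses += 1
            if misses >= MISSES_BEFORE_FAILOVER and is_reachable(*STANDBY):
                active = STANDBY   # promote the hot standby
        print(f"active server: {active[0]}")
        time.sleep(CHECK_INTERVAL_SECONDS)

if __name__ == "__main__":
    monitor_and_failover()
```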

On a larger scale, data center redundancy is another resiliency option, with the same redundancy concept holding true at the level of the data center facility itself—that is, a portion (or all) of the data center is replicated on site or at an alternative location including the cloud.

Co-Lo Options
Colocation (also called “co-lo”), in which an organization maintains “hot sites” readily primed to take over should the primary site be compromised, is another methodology for digital data resiliency.

Data center colocation is often pitched as a way to “future-proof” operations because it allows your digital footprint to be replicated. This serves multiple purposes, but most importantly it allows the organization to scale faster and more easily while providing flexibility and the ability to adapt quickly. It further enables you to take advantage of new opportunities without depending upon other “non-controlled” means (as in a short-term application running in the cloud).

Critically Safe
Your IT infrastructure is everything. Keeping it safe from natural disasters, threats and bad actors falls into the category of “critical services.” Modernizing a hybrid IT infrastructure not only improves performance (as in applications and services), it also allows the organization to unlock innovation, efficiency and cost savings. When critical services are fine-tuned through practices associated with resiliency, connectivity is improved, reliability increases and responsiveness becomes faster.

No matter where the data center is located, where you are or where you’re going, colocation can enable you to adapt to an ever-changing digital landscape. Flexibility constraints are reduced while resiliency is enhanced. Services become instantly scalable, enabling a flexible infrastructure with high-speed connectivity and proximity to partners and carriers.

International Data Corp. (IDC) has identified several reasons why organizations should store more of the data they create.

First, data is essential to any organization’s efforts to establish digital resiliency. Defined as “the ability for an organization to rapidly adapt to business disruptions by leveraging digital capabilities,” digital resiliency applies not only to restoring operations but also to capitalizing on changing conditions. It is sometimes framed, in part, as a “digital operating model”: an efficient digital environment enables resiliency because businesses are dependent on their data.

Second, digitally transformed companies (those who have adopted a digital operating model) use their data to develop innovative solutions for their future. Companies are quickly discovering that having more data not only helps affirm the direction they are heading, but also creates opportunities to launch new revenue streams.

Today, organizations need to monitor the pulse of their employees, customers and partners in order to retain the high level of trust and empathy that ensures customer satisfaction and loyalty. Data is the source of that pulse, and these organizations believe there is latent, potentially unmined value in analyzing both current and older data.

However, the flip side is that the cost to store more (or all) data holds organizations back from modifying their data retention policies. Media is certainly part of that “video hoarding” model, especially news organizations that never know when one lone, exclusive piece of story content might propel them forward. A deep “glacial” archive that could take hours to days to access is not the answer, as the news mandate demands accessibility; no one knows when that “special clip” might need to be recovered, fast.
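
To make that retrieval-time tradeoff concrete, here is a small sketch of choosing the cheapest storage tier that can still return a clip within a newsroom deadline. The tier names, retrieval times and relative costs are illustrative assumptions, not any specific provider’s published service levels.

```python
# Illustrative storage tiers: (name, typical retrieval time in seconds, relative cost per TB).
# These figures are assumptions for the sketch only.
TIERS = [
    ("hot",          5,             10.0),  # online disk/flash: seconds, most expensive
    ("cool",         60 * 10,        4.0),  # nearline: minutes
    ("deep/glacial", 60 * 60 * 12,   1.0),  # "glacial" archive: hours to days, cheapest
]

def cheapest_tier_for_deadline(deadline_seconds: int) -> str:
    """Pick the lowest-cost tier that can still deliver the asset within the deadline."""
    candidates = [t for t in TIERS if t[1] <= deadline_seconds]
    if not candidates:
        return "no tier meets this deadline"
    return min(candidates, key=lambda t: t[2])[0]

# A breaking-news clip needed within five minutes can only live in the hot tier;
# an archival clip needed "sometime this week" can sit in the glacial tier.
print(cheapest_tier_for_deadline(5 * 60))          # -> hot
print(cheapest_tier_for_deadline(7 * 24 * 3600))   # -> deep/glacial
```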

Never Enough Storage
There is a cost to pay for these capabilities, as IDC reported in a May 2022 forecast, which stated: “worldwide, the (circa 2021) global StorageSphere forecast for 2022–2026 would produce a base of 7.9 zettabytes (1 zettabyte = 1 trillion gigabytes) of storage capacity at a base install cost of $370 billion.” However you read this, IDC still sees total data in excess of 175 ZB by 2025. Forecasters speculated in 2021 that even this much storage “may still not be enough.” There may never be a known point at which storage stops growing, and no one knows what those numbers will really be.

Organizations are just now beginning to show a positive ROI on data analytics initiatives, especially with older data, which lends weight to the need for a well-protected and resilient data management agenda. A proven ROI on analytics initiatives would only amplify the need to store more data or retain it longer. Leveraging AI for active search and recovery of older information will certainly depend on the ability to access this older data; the looming question is, “How does this get paid for?” But that is a topic for another time…

The sum of all this data, whether it is created, captured or replicated, is called the “Global Datasphere,” and it is experiencing tremendous growth. IDC predicts that the Global Datasphere will grow from 33 ZB in 2018 to 175 ZB by 2025. Notably, more than 30% of that 175 ZB of global data in 2025 is expected to be generated in real time, up from about 15% of the Datasphere in 2017.

To keep up with the storage demands stemming from all this data creation, IDC forecasts that over 22 ZB of storage capacity must ship across all media types from 2018 to 2025, with nearly 59% of that capacity supplied from the hard disk drive (HDD) industry.

For equivalency, 1,000 ZB equals 1 yottabyte (YB) in SI (decimal) terms, and a yottabyte of storage would take up a data center the size of the states of Delaware and Rhode Island combined. The yottabyte is among the largest units approved as a standard size by the International System of Units (SI).
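
For scale, those decimal (SI) relationships can be checked with a few lines of arithmetic; a quick sketch using standard SI prefixes:

```python
# SI (decimal) storage prefixes
GB = 10**9    # gigabyte, in bytes
ZB = 10**21   # zettabyte, in bytes
YB = 10**24   # yottabyte, in bytes

print(ZB // GB)       # 1_000_000_000_000 -> one zettabyte is a trillion gigabytes
print(YB // ZB)       # 1_000 -> one yottabyte is a thousand zettabytes
print(175 * ZB / YB)  # 0.175 -> the 175 ZB Datasphere is under a fifth of a yottabyte
```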

Enterprise Renaissance
The enterprise is fast becoming the world’s data steward… again. In the recent past, consumers were responsible for much of their own data, but their reliance on, and trust in, today’s cloud services, especially from connectivity, performance and convenience perspectives, continues to increase, while the need to store and manage data locally continues to decrease.

Moreover, businesses are looking to centralize data management and delivery (e.g., online video streaming, data analytics, data security and privacy) as well as to leverage data to control their businesses and the user experience (e.g., machine learning, machine-to-machine-only communication, IoT, persistent personalization profiling). The responsibility to maintain and manage all this consumer and business data supports the growth of provider cloud datacenters.

As a result, the enterprise’s role as a data steward continues to grow, and consumers are not just allowing this, but expecting it. 

IDC predicted that by 2019 more data would be stored in the enterprise core than in all the world’s existing endpoints (reference: “The Digitization of the World – From Edge to Core,” an IDC white paper published in November 2018). The diagram above shows the Data Hierarchy, adapted from the SNIA Dictionary 2023.

M&E Jump Start
According to IDC, “mankind is on a quest to digitize the world,” and one of the key drivers of growth in the core is the shift from traditional datacenters to the cloud. As companies continue to pursue the cloud (both public and private) for their data processing needs, cloud datacenters are becoming the new enterprise data repository; in essence, the cloud is becoming the new core. IDC predicts that by 2025, 49% of the world’s stored data will reside in public cloud environments.

Not all industries are prepared for their digitally transformed future. So, to help companies understand their level of data readiness, IDC developed a DATCON (DATa readiness CONdition) index, designed to analyze various industries regarding their own Datasphere, level of data management, usage, leadership and monetization capabilities.

It examined four industries as part of its DATCON analysis: financial services, manufacturing, healthcare, and media and entertainment.

Manufacturing’s Datasphere is by far the largest, given its maturity, investment in IoT and 24×7 operations. Manufacturing and financial services are the leading industries in terms of data readiness, with media and entertainment most in need of a jump start.

A Data Explosion
Every geographic region has its own Datasphere size and trajectory, shaped by population, digital transformation progress, IT spend and maturity, and many other metrics. China’s Datasphere is on pace to become the largest in the world, expected to grow 30% on average over the next seven years; by 2025, it will surpass all other regions (EMEA, APJxC, the U.S. and the rest of the world) as its connected population grows and its video surveillance infrastructure proliferates. (APJxC comprises Asia-Pacific countries, including Japan but excluding China.)
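
Reading that 30% figure as an average annual growth rate (an assumption made here purely for illustration), a quick compounding check shows how fast such a Datasphere multiplies over seven years:

```python
growth_rate = 0.30   # assumed average annual growth rate (illustrative)
years = 7

multiplier = (1 + growth_rate) ** years
print(f"{multiplier:.1f}x")   # ~6.3x: 30% a year roughly sextuples the Datasphere in seven years
```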

Consumers are addicted to data, and increasingly to data in real time: active or stored video from entities such as TikTok, Facebook and Instagram, among others.

You cannot hide from the data. More than 5 billion consumers interact with data every day, and by 2025 that number may be 6 billion or more, equating to 75% of the world’s population. By 2025, each connected person will have at least one data interaction every 18 seconds. Many of these interactions stem from the billions of IoT devices connected across the globe, which by themselves are expected to create over 90 ZB of data in 2025.
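
The “one interaction every 18 seconds” claim is easy to sanity-check with back-of-envelope arithmetic; a quick sketch using the figures cited above:

```python
SECONDS_PER_DAY = 24 * 60 * 60

# One data interaction every 18 seconds per connected person...
interactions_per_person_per_day = SECONDS_PER_DAY / 18
print(round(interactions_per_person_per_day))   # 4800 interactions per person per day

# ...across roughly 6 billion connected people by 2025
total_daily_interactions = 6e9 * interactions_per_person_per_day
print(f"{total_daily_interactions:.1e}")        # ~2.9e13 interactions every day
```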

Karl Paulsen

Karl Paulsen is the CTO for Diversified, the global leader in media-related technologies, innovations and systems integration. Karl provides subject matter expertise and innovative, visionary perspectives related to advanced networking and IP technologies, workflow design and assessment, media asset management, and storage technologies. Karl is a SMPTE Life Fellow, an SBE Life Member and Certified Professional Broadcast Engineer, and the author of hundreds of articles focused on industry advances in cloud, storage, workflow, and media technologies. For over 25 years he has continually featured topics in TV Tech magazine, penning the magazine’s Storage and Media Technologies and Cloudspotter’s Journal columns.