Making the Cloud Invisible

What does it take to make the cloud “invisible”? Put another way, in the context of an end user running a media-focused application, what parameters create an environment in which the user cannot discern whether the app is running locally or in a remote cloud? For example, when using a video editor app and performing the classic jog/shuttle function across the timeline, can a user tell by the “feel of the app” whether the runtime code is local or remote? If the app feels local in all respects, the cloud is invisible to the end user. For SaaS apps in particular, this is a goal worth aiming for; users will demand it.

For sure, it’s not easy to create an invisible cloud environment. There are many aspects of “Quality of Service” that determine the user experience. The main quality domains are:

Transport QoS from premise to cloud, including the lossy and delay-prone Internet;

Compute QoS including deterministic latency of a short operation and speed of a long operation (e.g., transcode);

Storage QoS including IOPS, bandwidth, deterministic latency and other storage-related parameters;

Availability QoS including uptime percentage and access latency to a service or resource; this is tightly bound to system reliability; and

Security QoS including access control, authentication, encryption, DDoS attack prevention and more.

This column is the first of several to explore the QoS of the cloud from the perspective of a user or system component at a remote facility. Let’s start by examining transport QoS (No. 1 in the list). Fig. 1 outlines the salient aspects of transport QoS.


THE BIG FOUR QOS METRICS
Transport QoS is measured using four main parameters: bandwidth (data rate), latency, packet jitter and packet loss. Internet marketing has corrupted the meaning of “bandwidth” to mean data rate, so I use this term reluctantly. Of course, availability (uptime) is also a measure of transport QoS, but for now let’s treat availability separately. Fig. 1 shows that the end-to-end QoS is divided across three areas: facility premise, Internet (or direct connect) and cloud provider. Each of these areas contributes to the QoS either positively or negatively.

Here is a brief summary of the effects of each of the big four:

Data rate (bandwidth): sufficient to meet the simultaneous and peak needs of all the apps and services required at the premise. Some apps will require continuous streaming bandwidth and others can use variable file-transfer rates. The key is not to starve any media streams.

Round-trip latency: the lower the better. Delay is the enemy of fast file transfer. TCP (used by FTP and HTTP) is very delay sensitive and can operate roughly 80 times slower in the presence of large (200 ms) round-trip delays compared to small ones (10 ms). There are practical ways to circumvent the slowdown using “transfer acceleration” techniques.

Jitter: the time variation in latency. For most apps and modest jitter (±25 ms), this metric is not critical. Large jitter values, however, ruin TCP transfer rates.

Packet loss: the lower the better. The raw Internet has a packet loss of about 0.1 percent, although this can vary widely depending on traffic conditions. Also, loss is often not directionally symmetric, and this can adversely affect transfer rates because of how TCP responds to loss. TCP-based data rates drop by roughly a factor of 3 when loss increases by a factor of 10, as the rough throughput sketch following this list illustrates.
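
To see how latency and loss combine to throttle TCP, here is a back-of-envelope sketch using the widely cited Mathis approximation, in which steady-state throughput is roughly proportional to MSS/RTT and to 1/√loss. The MSS, RTT and loss values below are illustrative assumptions only, not measurements of any particular link.

```python
# Rough TCP throughput estimate using the Mathis model:
#   throughput ≈ (MSS / RTT) * (C / sqrt(loss))
# C (~1.22 for Reno-style TCP), MSS, and the RTT/loss pairs below are
# illustrative assumptions, not measured figures from any real connection.

import math

def tcp_throughput_bps(mss_bytes: float, rtt_s: float, loss: float, c: float = 1.22) -> float:
    """Approximate steady-state TCP throughput in bits per second."""
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss))

for rtt_ms, loss in [(10, 0.0001), (200, 0.0001), (200, 0.001)]:
    bps = tcp_throughput_bps(mss_bytes=1460, rtt_s=rtt_ms / 1000, loss=loss)
    print(f"RTT {rtt_ms:4d} ms, loss {loss:.4%}: ~{bps / 1e6:6.1f} Mbps")
```

With the values shown, the sketch reproduces the roughly 3x throughput drop for a tenfold loss increase noted above; real transfers can fare even worse than this steady-state model once slow-start and limited window sizes come into play.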

THE QOS CHAIN
There are three links in the overall QoS chain. Let’s consider the middle link first. Fig. 1 shows two paths from premise to cloud. The most common connection is the “best effort” Internet, path A. No carrier will guarantee Internet delay, loss or bandwidth; you get what you get. Of course, your facility connection to the Internet (the so-called “last mile”) has some QoS guarantees, but this is a small contributor to the overall Internet QoS. So, purchasing a 100 Mbps clean pipe to the Internet does not guarantee 100 Mbps end-to-end transfers by any means.

An alternative to Internet connectivity is to purchase a direct connection to your cloud vendor (path B in Fig. 1). Amazon, Terremark and other cloud providers offer this option. One example relies on the famous One Wilshire telecom hub in Los Angeles, which has direct paths to many cloud vendors, bypassing the Internet. It’s possible to link from a media facility to One Wilshire using Metro Ethernet (for example), thus creating an Internet bypass and guaranteeing an excellent QoS for the facility-to-cloud transport chain.

For sure, path A or B is a big contributor to the overall transport QoS, but next in line is usually the local premise QoS. Running apps and other services in the cloud puts strict demands on in-house networking, and some facilities are not geared to support the required QoS. This can make the facility the weak link in the transport chain. Don’t assume “all will be OK” with in-house networks; run tests and measure performance. A simple starting point is sketched below.
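
As one illustrative starting point (not a substitute for a full test plan covering sustained throughput and packet loss), the short script below samples TCP connect latency from a workstation to a cloud endpoint and reports mean round-trip time and jitter. The host name, port and sample count are placeholder assumptions; substitute your own provider’s endpoint.

```python
# Minimal sketch of an in-house transport check: repeatedly time a TCP
# connection to a cloud endpoint, then report mean RTT and jitter (stdev).
# HOST, PORT and SAMPLES are placeholders for illustration only.

import socket
import statistics
import time

HOST = "example.com"   # hypothetical endpoint; replace with your cloud service
PORT = 443
SAMPLES = 20

rtts_ms = []
for _ in range(SAMPLES):
    start = time.perf_counter()
    with socket.create_connection((HOST, PORT), timeout=5):
        pass  # connection established; close immediately
    rtts_ms.append((time.perf_counter() - start) * 1000)
    time.sleep(0.5)  # space out the samples

print(f"samples: {len(rtts_ms)}")
print(f"mean RTT: {statistics.mean(rtts_ms):.1f} ms")
print(f"jitter (stdev): {statistics.stdev(rtts_ms):.1f} ms")
print(f"min/max: {min(rtts_ms):.1f} / {max(rtts_ms):.1f} ms")
```

Run such a test from the same network segment the media apps will use, and at the busiest times of day, so the numbers reflect the conditions the apps will actually see.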

The good news is that facility managers have total control over local transport QoS, so you have a good chance of achieving an excellent overall end-to-end QoS. Finally, the cloud-provider portion should offer the best QoS performance in the three-link chain.

It has been shown that the cloud can be invisible if transport QoS parameters are sufficiently defined, measured and managed on a daily basis. I have personally used a video editor app executing 4,000 round-trip miles from the UI, and the cloud was invisible in my tests. My first experience of this made me a believer in the promise of the cloud.

Transport QoS is only one aspect of the total “quality of experience.” In the next column, other cloud QoS metrics will be considered.

Al Kovalick is the founder of Media Systems Consulting in Silicon Valley. He can be reached via TV Technology.
