Briefing: Cloud storage performance metrics

by Joseph K. Clark

About 50% of business data is now stored in the cloud – and the volume held on cloud technologies is higher still once private and hybrid clouds are factored in.

Cloud storage is flexible and potentially cost-effective. Organizations can pick from the hyperscalers – Amazon Web Services, Google Cloud Platform (GCP), and Microsoft Azure – as well as local or more specialist cloud providers.

But how do we measure the performance of cloud storage services? When storage is on-prem, numerous well-established metrics allow us to keep track of storage performance. In the cloud, things can be less precise.

That is partly because choice brings complexity when it comes to cloud storage. Cloud storage comes in a range of formats, capacities, and performance levels, including file, block, and object storage, hard drive-based systems, VM storage, NVMe SSDs, and even tape, as well as technology that works on a “cloud-like” basis on-premise.

This can make comparing and monitoring cloud storage instances harder than it is for on-premise storage. As well as conventional storage performance metrics, such as IOPS and throughput, IT professionals specifying cloud systems need to account for cost, service availability, and even security criteria.


Conventional storage metrics

Conventional storage metrics also apply in the cloud. But they can be somewhat harder to unpick. Enterprise storage systems have two main “speed” measurements: throughput and IOPS. Throughput is the data transfer rate to and from storage media, measured in bytes per second; IOPS measures the number of reads and writes – input/output (I/O) operations – per second.
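
The two measures are linked by I/O block size: throughput is, roughly, IOPS multiplied by the size of each operation. A minimal sketch of that relationship, using illustrative figures rather than any vendor's specification:

# Throughput (bytes/sec) is approximately IOPS x I/O block size.
def throughput_mbps(iops: float, block_size_kb: float) -> float:
    """Approximate sustained throughput in MB/s for a given IOPS figure."""
    return iops * block_size_kb / 1024

# The same IOPS figure gives very different throughput at different block sizes:
print(throughput_mbps(10_000, 4))   # ~39 MB/s with 4KB blocks
print(throughput_mbps(10_000, 64))  # ~625 MB/s with 64KB blocks

This is why a quoted IOPS figure means little without the block size at which it was measured.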

In these measurements, hardware manufacturers distinguish between read and write speeds, with read speeds usually the faster. Hard disk, SSD, and array manufacturers also distinguish between sequential and random reads and writes.

These metrics are affected by the movement of read/write heads over disk platters and, on flash storage, by the need to erase existing data before writing. Random read/write performance is usually the best guide to real-world performance.

Hard-drive manufacturers quote revolutions per minute (rpm) figures for spinning disks: typically 7,200rpm for mainstream storage, 10,000rpm or 15,000rpm for higher-grade enterprise systems, and 5,400rpm for lower-performance hardware. These measures do not apply to solid-state storage, however.

As a rule, the higher the IOPS, the better the system performs. Spinning disk drives usually fall in the 50 IOPS to 200 IOPS range.
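
That range can be sanity-checked from the mechanics of the drive: each random I/O pays, on average, half a revolution of rotational delay plus a seek. A back-of-envelope sketch, using assumed rather than vendor-quoted seek times:

# Rough estimate of random IOPS for a spinning disk, derived from its rpm.
# The average seek times passed in below are illustrative assumptions.
def estimate_hdd_iops(rpm: int, avg_seek_ms: float) -> float:
    rotational_delay_ms = 60_000 / (2 * rpm)  # half a revolution, on average
    return 1000 / (avg_seek_ms + rotational_delay_ms)

print(round(estimate_hdd_iops(7_200, avg_seek_ms=9.0)))   # ~76 IOPS
print(round(estimate_hdd_iops(15_000, avg_seek_ms=4.0)))  # ~167 IOPS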

Solid-state systems are significantly faster. A high-performance flash drive can reach 25,000 IOPS or even higher on paper. However, real-world differences will be smaller once storage controller, network, and other overheads, such as the use of RAID and cache memory, are factored in.

Latency is the third key performance measure. It is the time taken to complete each I/O request: typically 10ms to 20ms for an HDD-based system, and a few milliseconds for SSDs. Latency is often the most crucial metric in determining whether storage can support an application.
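
Latency, IOPS and the number of requests in flight are tied together by Little's law, which makes it a useful sanity check on quoted figures. A small sketch, with illustrative latencies:

# Little's law: IOPS is roughly queue depth / average latency.
def iops_from_latency(latency_ms: float, queue_depth: int = 1) -> float:
    return queue_depth * 1000 / latency_ms

print(round(iops_from_latency(15)))                 # HDD at ~15ms, QD1: ~67 IOPS
print(round(iops_from_latency(1, queue_depth=32)))  # SSD at ~1ms, QD32: ~32,000 IOPS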

Cloud metrics

But translating conventional storage metrics into the cloud is rarely straightforward. Usually, buyers of cloud storage will not know precisely how their capacity is provisioned. The exact mix of flash, spinning disk, and even tape or optical media depends on the cloud provider’s service levels.

Most large-scale cloud providers operate a blend of storage hardware, caching, and load-balancing technologies, making raw hardware performance data less useful. Cloud providers also offer different storage formats – block, file, and object – making comparing performance measurements even harder.

Measures will also vary with the type of storage an organization buys, because the hyperscalers now offer several tiers of storage based on performance and price. Then there are service-focused offerings, such as backup and recovery, and archiving, which have their own metrics, such as recovery time objective (RTO) or retrieval times.

The most accessible area for comparisons, at least between the large cloud providers, is block storage. For example, Google Cloud Platform lists the maximum sustained IOPS and maximum sustained throughput (in MBps) for its block storage, broken down into read and write IOPS and throughput per GB of data and per instance. But as Google states: “Persistent disk IOPS and throughput performance depends on disk size, instance vCPU count, and I/O block size, among other factors.”
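
The pattern Google describes, with IOPS scaling alongside provisioned capacity up to a per-instance ceiling, can be sketched as follows. The per-GB rate and the cap used here are illustrative assumptions, not Google's current published figures:

# Illustrative model of per-GB performance scaling for provisioned block storage.
# The 30 IOPS/GB rate and 15,000 IOPS instance cap are assumptions, for illustration.
def provisioned_read_iops(disk_gb: int, iops_per_gb: float = 30.0,
                          instance_cap: int = 15_000) -> int:
    return min(int(disk_gb * iops_per_gb), instance_cap)

print(provisioned_read_iops(100))    # 3,000 IOPS: limited by disk size
print(provisioned_read_iops(1_000))  # 15,000 IOPS: limited by the instance cap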

Google also lists a helpful comparison of its infrastructure performance against a 7,200rpm physical drive. Microsoft publishes guidance aimed at IT users who want to monitor its Blob (object) storage, which serves as a helpful primer on storage performance measurement in the Azure world.

AWS has similar guidance based on its Elastic Block Store (EBS) offering. Again, this can guide buyers through the various storage tiers, from high-performance SSDs to disk-based cold storage.

Cost, service availability… and other valuable measures

As cloud storage is a pay-as-you-use service, cost is always a critical metric. Again, all the leading cloud providers have tiers based on price and performance. AWS, for example, has gp2 and gp3 general-purpose SSD volumes, io1 and io2 performance-optimized volumes, and st1 throughput-focused HDD volumes aimed at “large, sequential workloads”. Buyers will want to compile their own cost and performance analysis to make like-for-like comparisons.
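
A like-for-like comparison can start with something as simple as normalizing each tier to a monthly cost for the same capacity. The prices below are hypothetical placeholders, not AWS's published rates, and provisioned IOPS are often billed as a separate line item:

# Hypothetical $/GB-month rates, for illustration only; real EBS pricing
# varies by region and changes over time.
tiers = {"gp3": 0.08, "io2": 0.125, "st1": 0.045}
size_gb = 500
for name, usd_per_gb_month in tiers.items():
    print(f"{name}: ${size_gb * usd_per_gb_month:.2f}/month for {size_gb} GB")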

But there is more to cloud storage metrics than headline cost and performance. The price per GB or instance needs to be considered alongside other fees, including data ingress and especially data egress or retrieval. Some very cheap long-term storage offerings can become very expensive once data has to be retrieved.
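
The effect is easy to model. All the rates in the sketch below are hypothetical placeholders, but the shape is typical: a single large restore can cost more than a year of storing the same data.

# All rates are hypothetical, for illustration only.
def monthly_cost(stored_gb: int, retrieved_gb: int,
                 storage_rate: float = 0.004,   # $/GB-month stored
                 retrieval_rate: float = 0.03,  # $/GB retrieved from the archive tier
                 egress_rate: float = 0.09) -> float:  # $/GB leaving the cloud
    return stored_gb * storage_rate + retrieved_gb * (retrieval_rate + egress_rate)

print(monthly_cost(10_000, 0))      # $40: storage only
print(monthly_cost(10_000, 5_000))  # $640: one large restore dominates the bill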

A further measure is usable capacity: how much of the purchased storage is actually available to the client application, and at what level of utilization does real-world performance start to degrade? Again, this might differ from the figures for on-premise technology.

CIOs will also want to look at service availability. The reliability of storage components and sub-systems is traditionally measured as mean time between failures (MTBF) or, for SSDs, the newer terabytes written (TBW) figure.

But for large-scale cloud provision, availability is a more common and practical measure. Cloud providers increasingly use datacentre or telecoms-style availability, or uptime, measures, with “five nines” (99.999%) often the best and most expensive SLA tier.
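
Those percentages translate directly into permitted downtime, which is often the easiest way to compare SLA tiers:

# Convert an availability SLA into allowed downtime per year.
def downtime_minutes_per_year(availability_pct: float) -> float:
    return (1 - availability_pct / 100) * 365 * 24 * 60

for sla in (99.0, 99.9, 99.99, 99.999):
    print(f"{sla}%: ~{downtime_minutes_per_year(sla):,.1f} minutes/year")

At “five nines”, that works out to just over five minutes of downtime a year.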

Even then, these metrics are not the only factors to consider. Buyers of cloud storage will also need to consider geographical location, redundancy, data protection and compliance, security, and even the cloud provider’s financial robustness. Although these are not performance measures in the conventional sense, if a provider falls short, it could be a barrier to using its service at all.
