Understanding and Selecting Storage for IBM Cloud

A go-to guide for how to select storage in IBM Cloud.

IBM Cloud supports a range of different compute options for executing the programs that make up a cloud architecture, and each compute option has a collection of storage options available. There are three basic technologies for data storage:

Block storage: Random read/write access for fixed-size “blocks” of data (e.g., the disk drive on your laptop). Operating systems provide file system abstractions layered on block storage and an API for direct access to the blocks. The file system is available to programs hosted on the operating system. A few specialized programs, like databases, use the block storage API directly. Block storage is low latency and high throughput. A block storage device is generally dedicated to a single computer at a time.
Network file storage: These are network-accessible directory (hierarchical) file systems. A popular protocol is the Network File System (NFS). A network file storage instance can be shared by multiple computers at the same time. The directory structure and files provided are integrated into the operating system’s directory system.
Cloud object storage: These are streams of bytes (objects) accessed by a name (key). This system is designed to store buckets of objects for retrieval with a high degree of durability and to deliver objects over a worldwide network at web browser latencies. The network size (data center, region, cross-region) selected for a bucket controls the data residency and also influences performance and potentially durability. The smaller the network, the better the performance for applications inside that network, but a smaller network is also more exposed: an outage or localized disaster there will affect the availability of the objects. Relaxing latency (e.g., “near line” or “off line” access) can reduce the price per terabyte stored. IBM Cloud Object Storage (COS) is an object storage system that can be configured to suit your price, performance, connectivity, availability and durability requirements.
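To make the difference between the access models concrete, here is a minimal in-memory sketch in Python. It is purely illustrative: the class names BlockDevice and ObjectStore, the 512-byte block size and the sample key are assumptions made for this example and do not correspond to any IBM Cloud API. The point is only that block storage is addressed by block number, while object storage reads and writes whole objects by key.

```python
BLOCK_SIZE = 512  # assumed block size for this illustration


class BlockDevice:
    """Toy block device: random read/write of fixed-size blocks."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self._data = bytearray(num_blocks * BLOCK_SIZE)

    def write_block(self, index, payload):
        if not 0 <= index < self.num_blocks:
            raise IndexError("block index out of range")
        if len(payload) > BLOCK_SIZE:
            raise ValueError("payload larger than one block")
        start = index * BLOCK_SIZE
        # Pad short payloads with zero bytes so every block keeps the same size.
        self._data[start:start + BLOCK_SIZE] = payload.ljust(BLOCK_SIZE, b"\x00")

    def read_block(self, index):
        if not 0 <= index < self.num_blocks:
            raise IndexError("block index out of range")
        start = index * BLOCK_SIZE
        return bytes(self._data[start:start + BLOCK_SIZE])


class ObjectStore:
    """Toy object store: whole objects (byte streams) addressed by key."""

    def __init__(self):
        self._bucket = {}

    def put(self, key, body):
        self._bucket[key] = body  # an object is always replaced as a whole

    def get(self, key):
        return self._bucket[key]


# Block access: any block can be read or written independently of the others.
disk = BlockDevice(num_blocks=8)
disk.write_block(3, b"database page")
page = disk.read_block(3)

# Object access: the key names the entire object.
bucket = ObjectStore()
bucket.put("images/logo.png", b"\x89PNG...")
logo = bucket.get("images/logo.png")
```

A file system, local or networked, sits between the two: it maps hierarchical names onto blocks, which is why it can be layered on block storage.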

For a deeper dive and comparison of these types of storage, see “Object vs. File vs. Block Storage: What’s the Difference?”

In addition, the IBM Cloud Catalog has storage services (e.g., database, streaming, logging and messaging). Once provisioned, the resources are accessed over the private or public TCP/IP network.

Compute options and associated storage options

It is useful to consider the compute options in the IBM Cloud and their associated storage options.

IBM Cloud Virtual Private Cloud (VPC) server instances

Instance storage is block storage that has the same lifetime as the associated compute instance and is only accessible by that instance. It will generally have the highest throughput and IOPS performance.
Volume is block storage that can be attached to one instance at a time but can be detached and then attached to a different instance.
File storage is network file storage.
Cloud object storage is accessible using the network API.
Storage services are accessible using the network API.

VPC Storage

Red Hat OpenShift on IBM Cloud

Block storage options are available through the volumes of the worker nodes (VPC server instances) in the cluster.
OpenShift Data Foundation (see diagram below) provides file storage, block storage and object storage.
Portworx is third-party software that layers on top of block storage to provide high availability, aggregation, file storage and lifecycle management.
File storage on COS provides file system access to the contents of a bucket. See Kubernetes Persistent Volumes Backed by IBM Cloud Object Storage Buckets.
Cloud object storage is accessible using the network API.
Storage services are accessible using the network API.

IBM Cloud Kubernetes Service

Block storage options are available through the volumes of the worker nodes (VPC server instances) in the cluster.
Portworx is third-party software that layers on top of block storage to provide high availability, aggregation, file storage and lifecycle management.
File storage on COS provides file system access to the contents of a bucket. See Kubernetes Persistent Volumes Backed by IBM Cloud Object Storage Buckets.
Cloud object storage is accessible using the network API.
Storage services are accessible using the network API.

IBM Cloud Code Engine

Cloud object storage is accessible using the network API.
Storage services are accessible using the network API.

IBM Cloud for VMware Solutions

Block storage and network file storage are an integral part of IBM Cloud for VMware Solutions. A rich set of options is described in the documentation. See Storage to use with VMware Systems.
Cloud object storage is accessible using the API.
Storage services are accessible using the API.

IBM Cloud Satellite

Satellite storage templates are available for fully supported IBM storage systems and third-party systems for block storage and network file storage. The physical devices can be the Red Hat Local and OpenShift Data Foundation systems currently in your environment. Templates can also address IBM-specific storage systems like IBM Spectrum Scale and third-party systems.
OpenShift Data Foundation provides file storage, block storage and object storage (see diagram above).
IBM Cloud Satellite is available on third-party clouds. Satellite storage templates native to those clouds should be used, like AWS EBS for block storage and AWS EFS for file storage.
Satellite Link endpoints for cloud storage services:
- Cloud object storage is accessible using the API.
- Storage services are accessible using the API.
Storage services provisioned within Satellite:
- Cloud object storage is accessible using the API.
- Storage services are accessible using the API.

Choosing storage

As you map architectural components onto the compute options listed above, consider the storage requirements. For example, if the component requires block storage, it cannot be hosted on IBM Cloud Code Engine.
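One way to apply this check is to encode the lists above as data and filter on the requirement. The Python sketch below does that for four of the compute options. The dictionary, the category labels and the function name compute_options_supporting are hypothetical names chosen for this example; the entries simply restate the lists in this article (VMware Solutions and Satellite are left out for brevity) and do not query any IBM Cloud API.

```python
# Storage categories offered by each compute option, restating the lists above.
BLOCK, FILE, OBJECT, SERVICES = "block", "file", "object", "storage services"

STORAGE_BY_COMPUTE = {
    "IBM Cloud VPC server instances": {BLOCK, FILE, OBJECT, SERVICES},
    "Red Hat OpenShift on IBM Cloud": {BLOCK, FILE, OBJECT, SERVICES},
    "IBM Cloud Kubernetes Service": {BLOCK, FILE, OBJECT, SERVICES},
    "IBM Cloud Code Engine": {OBJECT, SERVICES},
}


def compute_options_supporting(required):
    """Return the compute options that offer every required storage category."""
    needed = set(required)
    return sorted(
        name for name, offered in STORAGE_BY_COMPUTE.items() if needed <= offered
    )


# A component that needs block storage rules out Code Engine.
candidates = compute_options_supporting({BLOCK})
```

Pass the requirements as a set (or list) of the category constants; a component that only needs object storage and catalog services keeps all four options open.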

If you need a storage service such as a database, logging or message handling, look in the IBM Cloud Catalog. The flexibility of a fully managed service may be a good fit, and you can let IBM handle some of the grunt work.

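If you must host an application (like a database) that requires direct access to a block device, it will require block storage. A volume is more accessible and durable because it can be detached from one instance and attached to another. Instance storage offers higher performance but shares the lifetime of its instance.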

If the application needs file storage, there is a tradeoff between sharing and performance. A file system layered on block storage can be mounted into your compute instance and will provide a high level of performance but no simultaneous sharing of files with other server instances. A single network file storage device can be mounted on multiple compute instances and allows file sharing.

Another tradeoff is between latency and price. Cloud object storage has inexpensive options for storing globally accessible, web-latency content. Buckets can be configured with higher latencies (like cold vault storage) for pennies per GB per month. You can store video, audio, logs, backups, etc. at a fraction of the cost of block storage or file storage.

Durability and latency are generally tradeoffs, as well. Cloud object storage is durable, and server instance storage is low latency.

Personally, as I decompose an architecture into executable components, I use this decision tree to help me with storage decisions:

Note: If the software has a file system dependency, it cannot be hosted on Code Engine.

The thick arrows show the typical flow. Let me explain:

If it is a third-party storage system, I go straight to the IBM Cloud Catalog. IBM handles the configuration, security, elasticity, etc. I leave that to the professionals.
Software I control will use objects for images, documents, compressed files, static HTML, etc. The objects are persisted in COS, which is cheap and durable. Only use the file system for object access that is latency-sensitive.
Some software I do not control. I just host it in the IBM Cloud. Databases typically use block devices directly. Most other software layers on a file system.
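The flow described in the three points above can be approximated in code. The Python sketch below is one reading of the prose, not a transcription of the decision tree itself. The Component fields and the returned labels are hypothetical names introduced for this example, and the final choice between network file storage and a file system on block storage borrows the sharing-versus-performance tradeoff discussed earlier in this section.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical description of one component in the architecture."""
    is_third_party_storage_system: bool = False   # database, logging, messaging, ...
    under_my_control: bool = False                # software I write myself
    latency_sensitive: bool = False               # object access must be fast
    uses_block_device_directly: bool = False      # typical for hosted databases
    shares_files_across_instances: bool = False   # several servers mount the files


def choose_storage(component: Component) -> str:
    """Return a storage suggestion following the flow described in the text."""
    if component.is_third_party_storage_system:
        # Let the managed service handle configuration, security and elasticity.
        return "IBM Cloud Catalog service"
    if component.under_my_control and not component.latency_sensitive:
        # Objects are persisted in COS, which is cheap and durable.
        return "cloud object storage"
    if not component.under_my_control and component.uses_block_device_directly:
        return "block storage"
    # Everything else layers on a file system; sharing decides which kind.
    if component.shares_files_across_instances:
        return "network file storage"
    return "file system on block storage"


suggestion = choose_storage(Component(under_my_control=True))
```

For a hosted database that drives its block device directly, `choose_storage(Component(uses_block_device_directly=True))` returns "block storage"; the choice between a volume and instance storage then follows the durability and performance tradeoff described above.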

Comparison of storage systems:

*IBM Cloud Kubernetes Service and OpenShift instance storage – Classic only

Prices can be accurately determined in the IBM Cloud cost estimator:

Instance storage (you must select a profile, like bx2d-2x8, that has instance storage)
Volume
File storage
COS

Conclusion

The IBM Cloud Catalog offers a wide choice of cloud storage services. These can be provisioned from the IBM Cloud Console or via automation. For those new to the cloud, it seems almost magical when resources are provisioned in the cloud, but that does not imply 100% availability and durability. Storage management issues like backup and restore will continue to be important considerations. On the IBM Cloud, you pay only for what you use, so experiment to find your optimal solution and test to ensure operational success.

Powell Quiring

Offering Manager
