Compressed volumes

When you create volumes, you can specify compression as a method to save capacity for the volume. With compressed volumes, data is compressed as it is written to disk, which saves space. When hosts read the data, it is decompressed.

Compression is available through data reduction support as part of the system. If you want volumes to use compression as part of data reduction support, compressed volumes must belong to data reduction pools. Data reduction pools also support reclaiming unused capacity automatically after mapped hosts no longer need the capacity for operations. These hosts issue SCSI unmap commands, and the released capacity is reclaimed by the data reduction pool for redistribution. For compressed volumes in data reduction pools, the used capacity before compression indicates the total amount of data that is written to volume copies in the storage pool before data reduction occurs.

Compressed volumes are also supported in standard pools, but these pools do not support reclaiming unused capacity. If you have existing compressed volumes in standard pools, the following values help determine capacity for each compressed volume:
Real capacity
Indicates the extent capacity that is allocated from the standard pool. The real capacity is set when the compressed volume is created and can be expanded or shrunk down to the used capacity.
Virtual capacity
Indicates the capacity that is available to hosts. The virtual capacity is set when the compressed volume is created and can be expanded or shrunk afterward.
Used capacity
Indicates the amount of real capacity that is used to store customer data and metadata after compression.

You can also monitor information on compression usage to determine the savings to your storage capacity when volumes are compressed. To monitor system-wide compression savings and capacity, select Monitoring > System. You can compare the amount of capacity that is used before compression is applied to the capacity that is used for all compressed volumes. In addition, you can view the total percentage of capacity savings when compression is used on the system. You can also monitor compression savings across individual pools.
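The comparison that is described above can be sketched in a few lines of Python. This is an illustrative calculation only, not the formula that the management GUI is documented to use. The function name and the example figures are hypothetical.

```python
TIB = 1024 ** 4  # bytes in one tebibyte


def compression_savings(before_bytes: int, after_bytes: int) -> float:
    """Return the percentage of capacity that compression saves.

    before_bytes: capacity used before compression is applied
    after_bytes:  capacity used by the compressed volumes
    """
    if before_bytes <= 0:
        raise ValueError("before_bytes must be positive")
    return 100.0 * (before_bytes - after_bytes) / before_bytes


# Hypothetical figures: hosts wrote 10 TiB, which occupies 4 TiB after compression.
print(f"{compression_savings(10 * TIB, 4 * TIB):.1f}% saved")
```

The same calculation applies to a single pool if you substitute the before and after values for that pool.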

Benefits of compression

Using compression reduces the amount of physical storage that is needed across your environment. You can reuse free disk space in the existing storage without archiving or deleting data.

Compressing data as it is written to the volume also reduces the environmental requirements per unit of storage. After compression is applied to stored data, the required power and cooling per unit of logical storage are reduced because more logical data is stored on the same amount of physical storage. Within a particular storage system, more data can be stored, which reduces overall rack unit requirements.

Compression can be implemented without impacting the existing environment and can be used with other storage processes, such as mirrored volumes and Copy Services functions.

Compressed volumes provide the same level of availability as regular volumes. Compression can be implemented in an existing environment without an impact to service, and existing data can be compressed transparently while it is being accessed by users and applications.

Common uses for compressed volumes

Compression can be used to consolidate storage in both block storage and file system environments. Compressing data reduces the amount of capacity that is needed for volumes and directories. Compression can also be used to minimize the storage utilization of logged data. Many applications, such as those that record lab test results, require constant recording of application or user status. Logs are typically represented as text files or binary files that contain a high repetition of the same data patterns. Database information is stored in table space files, and it is common to observe high compression ratios in database files.

By using volume mirroring, you can convert an existing fully allocated volume to a compressed volume without disrupting access to the original volume content. The management GUI contains specific directions on converting a basic volume to a compressed volume.

Planning for compressed volumes

Attention: Do not compress volumes on an AE3 storage enclosure. These volumes are already hardware-compressed.

Before you implement compressed volumes on your system, assess the current types of data and volumes that are used on your system. Do not compress data that is already compressed as part of its normal workload. Data such as video, compressed file formats (.zip files), or compressed user productivity file formats (.pdf files), is compressed as it is saved. It is not effective to spend system resources for compression on these types of files since little extra savings can be achieved. Encrypted data also cannot be compressed.
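The difference between compressible and incompressible data can be illustrated with a short Python sketch. It uses the standard zlib module as a stand-in and makes no claim about the compression algorithm that the system uses. The sample data is invented: a repetitive log-style buffer represents compressible data, and random bytes represent data that is already compressed or encrypted.

```python
import os
import zlib


def savings_percent(data: bytes) -> float:
    """Return the percentage of space saved when data is compressed with zlib."""
    compressed = zlib.compress(data, 6)
    return 100.0 * (1 - len(compressed) / len(data))


# Repetitive, log-style text: the same pattern occurs many times.
log_like = b"2024-01-01 12:00:00 INFO status=OK user=alice\n" * 2000

# Random bytes of the same length: a stand-in for encrypted or
# already-compressed data, which has no patterns left to exploit.
random_like = os.urandom(len(log_like))

print(f"log-like data:    {savings_percent(log_like):6.2f}% saved")
print(f"random-like data: {savings_percent(random_like):6.2f}% saved")
```

The log-style buffer shrinks to a small fraction of its size, while the random buffer saves nothing and can even grow slightly, which is why compressing such data wastes system resources.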

There are two types of volumes to consider: homogeneous and heterogeneous. Homogeneous volumes are typically better candidates for compression. Homogeneous volumes contain data that was created by a single application, and these volumes store the same kind of data. Examples of homogeneous volumes include database applications, email, and server virtualization data. Heterogeneous volumes contain data that was created by several different applications and contain different types of data. Because different data types populate such volumes, compressed or encrypted data can be stored on these volumes. In such cases, system resources can be spent on data that cannot be compressed. Avoid compressing heterogeneous volumes, unless the heterogeneous volumes contain only compressible, unencrypted data.

If your system currently does not use compression, the system automatically analyzes your configuration to determine the potential storage savings if compression is used. The management GUI incorporates the Comprestimator utility, which uses mathematical and statistical algorithms to estimate potential compression savings for the system. The analysis for potential savings can be used to determine whether purchasing a compression license for the system is necessary to reduce the cost of extra storage devices. To estimate compression savings in the management GUI, select Volumes > Actions > Space Savings > Estimate Compression Savings. You can also run the analysis from the command-line interface. For example, you can run the analyzevdisk command on a single volume, or use the analyzevdiskbysystem command to analyze all of the volumes that are on the system. Any volumes that are created after the compression analysis completes can be evaluated individually for compression savings. Ensure that the volumes to be analyzed contain as much active data as possible and are not mostly empty. Analyzing active data increases accuracy and reduces the risk of analyzing old data that is already deleted but can still have traces on the device.

After the analysis completes, you can download a savings report that shows estimated savings for all the volumes with enough data to be analyzed. This report lists all currently configured volumes on the system and their potential compression savings. To download a report, select Volumes > Volumes > Actions > Space Savings > Download Savings Report. You can also display the results by using the lsvdiskanalysis command. You can display results for all volumes, or for a single volume by specifying its name or identifier.

Various configuration items affect the performance of compression on the system. To attain high compression ratios and performance on your system, ensure that the following guidelines are met:
  • If you have only a small number (10 - 20) of compressed volumes, configure them on one I/O group and do not split compressed volumes between different I/O groups.
  • For larger numbers of compressed volumes on systems with more than one I/O group, distribute compressed volumes across I/O groups to ensure that access to these volumes is evenly distributed among the I/O groups.
  • Identify and use compressible data only. Different data types have different compression ratios, and it is important to determine the compressible data currently on your system. You can use tools that estimate the compressible data or use commonly known ratios for common applications and data types. Storing these data types on compressed volumes saves disk capacity and improves the benefit of using compression on your system. The following table shows the compression ratio for common applications and data types.
    Table 1. Compression ratios for common data types and applications that provide high compression ratios
    Data types/applications             Compression ratio
    Databases                           Up to 80%
    Server or desktop virtualization    Up to 75%
    Engineering data                    Up to 70%
    Email                               Up to 80%
  • Ensure that you have an extra 10% of capacity in the pools that are used for compressed volumes for the additional metadata and to provide an error margin in the compression ratio.
  • Use compression on homogeneous volumes.
  • Avoid using any client-based, file system-based, or application-based compression together with the system compression.
  • Do not compress encrypted data.
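The capacity guideline above can be turned into a rough planning calculation. The following Python sketch assumes that the extra 10% is applied on top of the estimated compressed size, which is one reading of the guideline and is not confirmed by this topic. The function name and the example workload are hypothetical. Because the ratios in Table 1 are upper bounds ("up to"), a real estimate would normally use a more conservative figure, for example one obtained from the compression analysis.

```python
def required_pool_capacity(logical_tb: float,
                           savings_pct: float,
                           margin_pct: float = 10.0) -> float:
    """Estimate the pool capacity (TB) needed for compressed data.

    logical_tb:  amount of host data before compression
    savings_pct: expected compression savings, as a percentage
    margin_pct:  extra capacity for metadata and ratio error
                 (assumed to apply to the compressed size)
    """
    if not 0 <= savings_pct < 100:
        raise ValueError("savings_pct must be in the range [0, 100)")
    compressed_tb = logical_tb * (1 - savings_pct / 100)
    return compressed_tb * (1 + margin_pct / 100)


# Hypothetical workload: 50 TB of database data.
for savings in (80, 60):  # best case from Table 1, then a more cautious guess
    need = required_pool_capacity(50, savings)
    print(f"{savings}% savings -> plan for about {need:.1f} TB")
```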
The following planning guidelines apply to compressed volumes in standard pools only:
  • Compression requires dedicated hardware resources within the node which are assigned or de-assigned when compression is enabled or disabled. Compression is enabled whenever the first compressed volume in an I/O group is created and is disabled when the last compressed volume is removed from the I/O group.
  • Because compression reduces the hardware resources that are available to process non-compressed host-to-disk I/O, do not create compressed volumes if the CPU utilization of the nodes in an I/O group is consistently above certain values. Performance might be degraded for existing non-compressed volumes in the I/O group if compressed volumes are created. Use Monitoring > Performance in the management GUI during periods of high host workload to measure CPU utilization.

Size limits

If you are using compressed volumes in standard pools, these volumes have the following size limits. If a new or existing compressed volume in a standard pool approaches the maximum size, the system issues an alert. The system does not monitor the size of compressed volumes in data reduction pools.

96 TB
Maximum virtual size of a new, individual compressed volume. You cannot create a new compressed volume that exceeds this size. In addition, you cannot increase the size of an existing compressed volume beyond this value. If one or more compressed volumes in a system exceed this limit, you receive an alert. To reduce the risk of losing or corrupting data, you must take action soon to remove data from the compressed volume.
120 TB
Maximum virtual size of an existing compressed volume in a system. If any compressed volumes in the system approach or exceed this value, the system issues an alert.
Important: Immediate action is required to remove all data from the compressed volume and prevent the loss of data.
128 TB
Maximum physical size of a compressed volume.
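As an illustration of how the virtual size limits above relate to each other, the following Python sketch classifies a compressed volume in a standard pool by its virtual size. The function and the status labels are hypothetical and are not part of the system. The sketch checks only whether a limit is exceeded, because this topic does not state how close to a limit a volume must be before the system issues an alert.

```python
NEW_VOLUME_MAX_TB = 96        # limit for creating or expanding a volume
EXISTING_VOLUME_MAX_TB = 120  # limit for an existing volume


def check_virtual_size(virtual_tb: float) -> str:
    """Classify the virtual size of a compressed volume in a standard pool."""
    if virtual_tb > EXISTING_VOLUME_MAX_TB:
        return "critical: remove all data from the volume immediately"
    if virtual_tb > NEW_VOLUME_MAX_TB:
        return "alert: remove data from the volume soon"
    return "ok: within the limit for new volumes"


# Hypothetical volume sizes in TB.
for size in (40, 100, 125):
    print(f"{size} TB -> {check_virtual_size(size)}")
```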

For information about how to move data off a compressed volume in a standard pool, view the topic on the IBM Support portal website for your product. Search for your product, then select the Flashes, alerts and bulletins link under Documents on the support page for your product.