TS7700 Cache thresholds and removal policies

This topic describes the boundaries (thresholds) of free cache space in a TS7700 and the policies that can be used to manage available (active) cache capacity in a grid configuration.

Cache thresholds for a disk-only TS7700 Cluster

There are three thresholds that define the capacity of cache partition 0 (CP0) in a TS7700 Tape Attach and the active cache capacity in a disk-only TS7700. (Because a disk-only TS7700 does not attach to a physical backend library, all of its virtual volumes are stored in the cache.)

These thresholds determine the state of the cache as it relates to remaining free space. In ascending order of occurrence, they are:
Automatic removal
By default this state occurs when the cache is 3 TB below the out-of-cache-resources threshold. In the automatic removal state, the TS7700 automatically removes volumes from the disk-only cache to prevent the cache from reaching its maximum capacity. This state is identical to the limited-free-cache-space-warning state unless the Temporary Removal Threshold is enabled.
Note:
  • To perform removal operations in a TS7700 Tape Attach Cluster, the size of cache partition 0 (CP0) must be at least 10 TB.
  • You can enable or disable automatic removal within any given disk-only TS7700 Cluster by using the following library request command:
    LIBRARY REQUEST library-name,SETTING,CACHE,REMOVE,{ENABLE|DISABLE}
  • The default automatic removal threshold can be changed from the command line by using the following library request command:
    LIBRARY REQUEST library-name,SETTING,CACHE,REMVTHR,{VALUE}
Limited free cache space warning
This state occurs when there is less than 3 TB of free space left in the cache. When the cache passes this threshold and enters the limited-free-cache-space-warning state, write operations can use only an additional 2 TB before the out-of-cache-resources state is encountered. When a disk-only TS7700 Cluster enters the limited-free-cache-space-warning state, it remains in this state until the amount of free space in the cache exceeds 3.5 TB. Messages that can be displayed on the management interface during the limited-free-cache-space-warning state include:
  • HYDME0996W
  • HYDME1200W
For more information about each of these messages, see the Related information section.
Note: Host writes to the TS7700 Cluster and inbound copies continue during this state.
Out of cache resources
This state occurs when there is less than 1 TB of free space left in the cache. When the cache passes this threshold and enters the out-of-cache-resources state, it remains in this state until the amount of free space in the cache exceeds 3.5 TB. When a disk-only TS7700 Cluster is in the out-of-cache-resources state, volumes on that cluster become read-only and one or more out-of-cache-resources messages are displayed on the management interface. These messages can include:
  • HYDME0997W
  • HYDME1133W
  • HYDME1201W
For more information about each of these messages, see the Related information section.
Note: New host allocations do not choose a disk-only TS7700 Cluster in this state as a valid tape volume cache candidate. New host allocations that are issued to a disk-only TS7700 Cluster in this state choose a remote tape volume cache instead. If all valid clusters are in this state or unable to accept mounts, the host allocations fail. Read mounts can choose the disk-only TS7700 Cluster in this state, but modify and write operations fail. Copies inbound to this cluster are queued as deferred until the cluster exits this state.
Table 1 displays the start and stop thresholds for each of the active cache capacity states defined.
Table 1. Active cache capacity state thresholds

| State | Enter state (free space available) | Exit state (free space available) | Host message displayed |
| --- | --- | --- | --- |
| Automatic removal | <4 TB | >4.5 TB | CBR3750I when automatic removal begins |
| Limited free cache space warning (CP0 for a TS7700 Tape Attach) | ≤3 TB or ≤15% of the size of cache partition 0, whichever is less | >3.5 TB or >17.5% of the size of cache partition 0, whichever is less | CBR3792E upon entering state; CBR3793I upon exiting state |
| Out of cache resources (CP0 for a TS7700 Tape Attach) | <1 TB or ≤5% of the size of cache partition 0, whichever is less | >3.5 TB or >17.5% of the size of cache partition 0, whichever is less | CBR3794A upon entering state; CBR3795I upon exiting state |
| Temporary removal (1) | <(X + 1 TB) (2) | >(X + 1.5 TB) (2) | Console message |

Notes:
  1. When enabled.
  2. Where X is the value given by the TS7700 Temporary Removal Threshold for the given cluster.
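
The enter and exit values in Table 1 form a hysteresis band: a state is entered at one free-space level and is left only at a higher one. The following Python sketch illustrates that behavior for a disk-only cluster with the default values. The names Band, BANDS, and update_states are hypothetical, the percentage limits that apply to CP0 on a TS7700 Tape Attach are omitted, and the code only mirrors the table; it does not describe how the TS7700 microcode is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Band:
    """Free-space band for one cache state, in TB (values from Table 1)."""
    enter_tb: float
    exit_tb: float
    enter_inclusive: bool  # True for "<=", False for "<"

    def next_active(self, free_tb: float, active: bool) -> bool:
        if active:
            # Remain in the state until free space rises above the exit value.
            return not free_tb > self.exit_tb
        if self.enter_inclusive:
            return free_tb <= self.enter_tb
        return free_tb < self.enter_tb


# Disk-only defaults taken from Table 1 (illustrative only).
BANDS = {
    "automatic_removal": Band(4.0, 4.5, False),
    "limited_free_cache_space_warning": Band(3.0, 3.5, True),
    "out_of_cache_resources": Band(1.0, 3.5, False),
}


def update_states(free_tb: float, active: dict[str, bool]) -> dict[str, bool]:
    """Return the active flag of every state after a new free-space reading."""
    return {name: band.next_active(free_tb, active.get(name, False))
            for name, band in BANDS.items()}


states: dict[str, bool] = {}
for reading in (5.0, 3.8, 2.9, 0.9, 3.2, 3.6, 4.6):
    states = update_states(reading, states)
    print(reading, [name for name, on in states.items() if on])
```

In this sketch, a reading of 3.2 TB after the cache has dropped to 0.9 TB leaves all three states active, because none of the exit values has been exceeded yet.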

Volume removal policies in a grid configuration

Removal policies determine when virtual volumes are removed from the cache of a TS7700 Cluster in a grid configuration. These policies provide more control over the removal of content from a TS7700 Cache as the active data reaches full capacity. To perform removal operations in a TS7700 Tape Attach Cluster, the size of cache partition 0 (CP0) must be at least 10 TB.

A TS7700 Tape Attach Cluster can have up to 7 tape-attached partitions. When a new partition is created, the resulting partition 0 (CP0) must have 2 TB free space. The temporary removal function, working with the volume removal policies you have defined, can be used to release the required space in CP0 (at least 2 TB plus the size of the new partition).
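
The space requirement for creating a partition can be worked through as simple arithmetic. The sketch below assumes that the free space needed in CP0 is 2 TB plus the size of the new partition, as stated above; the function and constant names are hypothetical and are not part of any TS7700 interface.

```python
CP0_REQUIRED_FREE_TB = 2.0  # free space that CP0 must retain after the partition exists


def cp0_space_to_release(cp0_free_tb: float, new_partition_tb: float) -> float:
    """Return how many TB must still be freed in CP0 before the partition fits."""
    required_tb = CP0_REQUIRED_FREE_TB + new_partition_tb
    return max(0.0, required_tb - cp0_free_tb)


# A 5 TB partition needs 7 TB free in CP0; with 6 TB free, 1 TB must be released.
print(cp0_space_to_release(cp0_free_tb=6.0, new_partition_tb=5.0))  # 1.0
```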

To guarantee that data always resides in a TS7700, or resides there for at least a minimum amount of time, a pinning time must be associated with each removal policy. The pin time, specified in hours, keeps a volume in a TS7700 tape volume cache for at least x hours before it becomes a candidate for removal, where x is between 0 and 65,536. A pinning time of zero means that there is no minimum pinning requirement. In addition to pin time, three policies are available for each volume within a disk-only TS7700 and for cache partition 0 (CP0) within a TS7700 Tape Attach. These policies are as follows:
Pinned
The copy of the volume is never removed from this TS7700 Cluster. The pinning duration is not applicable and is implied as infinite. When a pinned volume is moved to scratch, it becomes a priority candidate for removal similarly to the next two policies. This policy must be used cautiously to prevent TS7700 Cache overruns.
Prefer Remove - When Space is Needed Group 0 (LRU)
The copy of a private volume is removed if an appropriate number of copies exists on peer clusters, the pinning duration (in x hours) has elapsed since last access, and the available free space on the cluster has fallen below the removal threshold. The order in which volumes are removed under this policy is based on their least recently used (LRU) access times. Volumes in Group 0 are removed before volumes in Group 1, except for any volumes in scratch categories, which are always removed first. Archive and backup data are good candidates for this removal group because they are unlikely to be accessed after they are written.
Prefer Keep - When Space is Needed Group 1 (LRU)
The copy of a private volume is removed if an appropriate number of copies exists on peer clusters, the pinning duration (in x hours) has elapsed since last access, the available free space on the cluster has fallen below a threshold, and LRU group 0 has been exhausted. The order in which volumes are removed under this policy is based on their least recently used (LRU) access times. Volumes in Group 0 are removed before volumes in Group 1, except for any volumes in scratch categories, which are always removed first.

Prefer Remove and Prefer Keep are similar to cache preference groups PG0 and PG1, with the exception that removal treats both groups as LRU rather than ordering by volume size.

In addition to these policies, volumes that are assigned to a scratch category and have not been previously delete-expired are also removed from cache when the free space on a cluster has fallen below a threshold. Scratch category volumes, regardless of their removal policies, are always removed before any other removal candidates, in descending order of volume size. Pin time is also ignored for scratch volumes. Group 0 and Group 1 candidates are analyzed for removal only when the removal of scratch volumes does not satisfy the removal requirements. The requirement for a scratch removal is that an appropriate number of volume copies exist elsewhere. If one or more peer copies cannot be validated, the scratch volume is not removed.
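
The removal order described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the TS7700 implementation: the Volume fields, the policy strings, and the peer_copies_ok flag (standing in for "an appropriate number of consistent peer copies was verified") are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    volser: str
    size_gb: float
    policy: str                # "pinned", "prefer_remove" (Group 0), "prefer_keep" (Group 1)
    pin_hours: int             # 0 means no minimum pinning requirement
    hours_since_access: float
    scratch: bool
    peer_copies_ok: bool       # assumed flag: required peer copies were validated


def removal_order(volumes: list[Volume]) -> list[Volume]:
    """Order removal candidates: scratch first, then Group 0 LRU, then Group 1 LRU."""
    eligible = [v for v in volumes if v.peer_copies_ok]

    # Scratch volumes ignore policy and pin time; largest volumes go first.
    scratch = sorted((v for v in eligible if v.scratch),
                     key=lambda v: v.size_gb, reverse=True)

    def lru(policy: str) -> list[Volume]:
        # Private volumes qualify only after their pin time has elapsed.
        group = [v for v in eligible
                 if not v.scratch and v.policy == policy
                 and v.hours_since_access >= v.pin_hours]
        return sorted(group, key=lambda v: v.hours_since_access, reverse=True)

    # Pinned private volumes never appear in the result.
    return scratch + lru("prefer_remove") + lru("prefer_keep")


volumes = [
    Volume("A00001", 4, "prefer_keep", 0, 100, False, True),
    Volume("B00001", 2, "prefer_remove", 24, 10, False, True),   # pin time not elapsed
    Volume("C00001", 1, "pinned", 0, 30, True, True),            # scratch, so removable
    Volume("D00001", 6, "prefer_remove", 48, 5, True, True),     # scratch, pin ignored
    Volume("E00001", 3, "prefer_remove", 0, 50, False, True),
    Volume("F00001", 3, "prefer_remove", 0, 200, False, False),  # peer copy not validated
    Volume("G00001", 8, "pinned", 0, 500, False, True),          # pinned private volume
    Volume("H00001", 1, "prefer_remove", 12, 72, False, True),
]
print([v.volser for v in removal_order(volumes)])
# ['D00001', 'C00001', 'H00001', 'E00001', 'A00001']
```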

These new policies are visible within the management interface only when all TS7700s within a grid are operating at microcode level 8.7.0.xx or later. All records created before this time should retain the default Removal Group 1 policy and be assigned a zero pin time duration.

Note: As of microcode level 8.7.0.xx, there is no automatic method to re-introduce a consistent instance of a previously removed volume into a TS7700 Cache simply by accessing the volume. A consistent version of a previously removed volume is re-introduced into a TS7700 Cache as a result of a mount operation only when the copy override Force Local Copy is used or the volume is modified.
Removal policy settings can be configured by using the TS7700 Temporary Removal Threshold option on the Actions menu available on the Grid Summary page of the TS7700 Management Interface. These settings include:
(Permanent) Removal Thresholds
Note: The Removal Threshold is not supported on the TS7740.
The default, or permanent, Removal Threshold is used to prevent a cache overrun condition in a TS7700 Cluster that is configured as part of a grid. By default it is a 4 TB value (3 TB fixed plus 1 TB) that, when taken with the amount of used cache, defines the upper size limit for a TS7700 Cache or for a TS7700 Tape Attach CP0. Above this threshold, virtual volumes begin to be removed from a TS7700 Cache. Virtual volumes are removed from a TS7700 Cache in this order:
  1. Volumes in scratch categories
  2. Private volumes least recently used, using the enhanced removal policy definitions
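Once removal begins, the TS7700 continues to remove virtual volumes until the Stop Threshold is met. The Stop Threshold is the Removal Threshold minus 500 GB of used cache. Expressed as free space, this matches the exit value in Table 1: with the default settings, removal starts when free space falls below 4 TB and stops when it exceeds 4.5 TB.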
A particular virtual volume cannot be removed from a TS7700 Cache until the TS7700 verifies that a consistent copy exists on a peer cluster. If a peer cluster is not available, or a volume copy has not yet completed, the virtual volume is not a candidate for removal until the appropriate number of copies can be verified later.
Note: The default removal threshold can be changed from the command line, by using the following library request command:
LIBRARY REQUEST library-name,SETTING,CACHE,REMVTHR,{VALUE}
Temporary Removal Thresholds
Note: The Temporary Removal Threshold is not supported on the TS7740.
The Temporary Removal Threshold lowers the default Removal Threshold to a value lower than the Stop Threshold in anticipation of a Service mode event.
The Temporary Removal Threshold value must be equal to or greater than the expected amount of compressed host workload written, copied, or both to the TS7700 during the service outage. The default Temporary Removal Threshold is 4 TB, which provides 5 TB (4 TB plus 1 TB) of free space, but you can set the threshold to any value between 2 TB and full capacity minus 2 TB.
All TS7700 Clusters in the grid that remain available automatically lower their Removal Thresholds to the Temporary Removal Threshold value defined for each. Each TS7700 Cluster can use a different Temporary Removal Threshold. The default Temporary Removal Threshold value is 4 TB, which is 1 TB more than the default removal threshold of 3 TB. Each TS7700 Cluster uses its defined value until any cluster in the grid enters Service mode or the temporary removal process is canceled. The cluster that initiates the temporary removal process does not lower its own removal threshold during this process.
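
As an illustration of how a Temporary Removal Threshold X relates to the free-space values in Table 1, the following Python sketch checks X against the permitted range and derives the start and stop points (X + 1 TB and X + 1.5 TB). The function name is hypothetical, and treating the range limits as inclusive is an assumption.

```python
def temporary_removal_band(threshold_tb: float, capacity_tb: float) -> tuple[float, float]:
    """Return (start, stop) free-space values in TB for temporary removal."""
    low_tb, high_tb = 2.0, capacity_tb - 2.0
    # Assumption: both ends of the permitted range are inclusive.
    if not low_tb <= threshold_tb <= high_tb:
        raise ValueError(
            f"threshold must be between {low_tb} TB and {high_tb} TB, got {threshold_tb} TB"
        )
    # Removal starts below X + 1 TB of free space and stops above X + 1.5 TB.
    return threshold_tb + 1.0, threshold_tb + 1.5


start_tb, stop_tb = temporary_removal_band(threshold_tb=4.0, capacity_tb=100.0)
print(start_tb, stop_tb)  # 5.0 5.5
```

With the default value of 4 TB, the sketch returns a start point of 5 TB, which agrees with the 4 TB plus 1 TB figure given above.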