IBM Support

Cloud-container storage pools FAQs

Question & Answer


Question

This document discusses frequently asked questions (FAQs) about cloud-container storage pools used by IBM® Spectrum Protect™. For the FAQs about directory-container storage pools, see the Directory-container storage pools FAQs.

Note: Cleversafe is now known as IBM Cloud Object Storage.

Answer

Q: What type of cloud environments can I use for my cloud-container storage pool?

A: For information about the types of cloud object storage environments that you can use for cloud-container storage pools, see IBM Spectrum Protect cloud object storage support.

When you use the DEFINE STGPOOL command, set the CLOUDTYPE parameter to the value that is appropriate for your provider. For more information, see the DEFINE STGPOOL topics in the IBM Knowledge Center.
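
For example, CLOUDTYPE=S3 is used for S3-compatible providers such as Amazon Web Services and IBM Cloud Object Storage, and CLOUDTYPE=AZURE is used for Microsoft Azure. A minimal sketch of an S3 definition follows; the pool name, URL, and credentials are hypothetical placeholders that you replace with values from your provider:

    define stgpool cloudpool stgtype=cloud cloudtype=s3 cloudurl=https://your.object.storage.endpoint identity=your_access_key password=your_secret_key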

Q: Does my cloud environment require any special configuration?

A: The Configuring a cloud-container storage pool for data storage page includes instructions for configuring your cloud environment.

In addition to those instructions, if you use OpenStack, it is strongly recommended that your cloud provider be configured to keep replicas of the data that is written to its object storage. If an object is lost by the cloud provider, IBM Spectrum Protect cannot recover the lost data. It is possible to configure OpenStack Swift without replicas, but if a disk device fails in that configuration, the cloud objects cannot be recovered, and the data that is backed up to IBM Spectrum Protect is lost and unrecoverable.

Q: How do I optimize data throughput capability for my cloud-container storage pool?

A: When you size an IBM Spectrum Protect solution that uses cloud storage, it is important to determine the data ingestion throughput capability of the cloud-container storage pool. The following two characteristics factor into this capability:

  • The input/output operations per second (IOPS) of the cloud-container storage pool accelerator disk cache
  • The throughput performance of the object storage system and network that support the cloud-container storage pool

Both of these characteristics must meet certain performance thresholds to optimize the data ingestion capability of the cloud-container storage pool.

Tools are available to help you with sizing and benchmarking so that you can configure an optimal IBM Spectrum Protect solution. Attached to the IBM Spectrum Protect Cloud Blueprints page is a document, "Cloud Cache and Object Storage Benchmarking," which explains how to use the tools provided within the "Cloud benchmarking tools" package. For more information, see the IBM Spectrum Protect Cloud Blueprints page.


Q: How can I set up self-signed certificates for my object storage system?

A: You can set up secure communications (HTTPS) between IBM Spectrum Protect and your object storage system by using a self-signed certificate. For more information about configuring secure communication, see the Securing communications topic.

The following steps provide a method for importing certificates:

Part 1: Get the Certificate

Use a web browser to get a copy of the certificate that is used by the object storage system. The following steps apply to Mozilla Firefox.  Other browsers provide similar functionality. Refer to your preferred browser’s instructions for exporting certificates.

  1. Type the URL for your object storage system in the browser address bar and press Enter. Use either the keystone server URL for OpenStack or the Accesser node URL for IBM Cloud Object Storage.
    Tip: If you use IBM Cloud Object Storage as your object storage system, log in to IBM Cloud Object Storage and click the Security tab. In the dsNet Fingerprint section, click dsNet certificate authority and copy the certificate information into a certificate file for Part 2. 
  2. Accept any warnings that are displayed by the browser.
  3. Click the lock icon in the browser address bar, and then click More Information in the pop-up window.
  4. Click View Certificate in the Page Info window.
  5. Click the Details tab on the Certificate Viewer page, and then click Export.
  6. Save the exported file to your preferred location.

Part 2: Add the Certificate to the Keystore

After you get the certificate file, add it to the default Java™ keystore. The following steps assume that both your client nodes and your server are running on Linux.

There are two options for adding your certificate to the Java keystore:

  • The first option is preferred. You add your certificate to a directory, and the server automatically imports it into the Java keystore. Certificates that are delivered with newer versions of the Java runtime are still picked up, and you do not have to import your certificate again manually after every JVM upgrade.
  • If you changed the keystore password, you must use the second option. With the second option, you add your certificate directly to the keystore that is provided with the JVM. You can also relocate the keystore and specify the new location in the server options file.

Part 2, Option 1 (use with IBM Spectrum Protect Version 8.1.9 and later)

  1. Create a certs directory in your instance home directory.
  2. Copy the certificate file from Part 1 into the certs directory that you created.
  3. Restart the server.  The server automatically adds the certificate files from the certs directory to the Java keystore.

    Note: The server must be restarted when new certificates are added to the directory.
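
For example, assuming a server instance home directory of /home/tsminst1 and a certificate that was saved in Part 1 as /tmp/objectstorage.crt (both paths are hypothetical), the commands might look like this:

    mkdir /home/tsminst1/certs
    cp /tmp/objectstorage.crt /home/tsminst1/certs/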

Part 2, Option 2

  1. Open a terminal and change the directory to the IBM Spectrum Protect jre/bin directory. The default installation location for this directory is /opt/tivoli/tsm/jre/bin.
  2. Make a backup copy of the Java cacerts keystore file by running the following command: 
    cp ../lib/security/cacerts ../lib/security/cacerts.original
  3. Run the following command: 
    ./keytool -import -keystore ../lib/security/cacerts -alias some_alias -file your_file
    where:
    some_alias is a unique alias that you choose to identify this certificate in the keystore, which is important if you have more than one certificate.
    your_file is the path and file name of the certificate from Part 1 of these instructions.
  4. When you are asked for the password, type changeit. If you changed your password from the default password, type your current password.
  5. When you are asked to trust this certificate, type yes. If the certificate was added, the message "Certificate was added to keystore" is displayed.
  6. Restart IBM Spectrum Protect.
  7. (Optional) Copy the keystore to another location and specify the new location in the server options file to prevent the upgrade from overwriting the keystore with your imported certificates. 

    dsmserv.opt:

        jvmTrustStore full_path_to_keystore
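
For example, assuming the default Linux installation path and a hypothetical relocation target of /home/tsminst1/security, step 7 might look like this:

    mkdir -p /home/tsminst1/security
    cp /opt/tivoli/tsm/jre/lib/security/cacerts /home/tsminst1/security/cacerts

    dsmserv.opt:

        jvmTrustStore /home/tsminst1/security/cacerts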

Additional notes:

  • By default, each IBM Cloud Object Storage Accesser has its own certificate. Add the certificate for each Accesser that you use to the keystore, and use a different alias for each certificate.
  • On Windows, the following paths apply:
    • Location of cacerts keystore: install_dir\jre\lib\security\
    • Location of keytool: install_dir\jre\bin\
  • The default certificates have a short expiration window. When they expire, you might lose access to the object storage until you update the certificates. You can create your own certificates and use them. Creating and installing these certificates on object storage systems is outside the scope of this document.
  • Additional syntax options are available in the Java keytool documentation.

Q: Why do I create one or more storage pool directories after I define a storage pool?

A: With IBM Spectrum Protect V8.1.5 and later, you must use storage pool directories with cloud-container storage pools, except for storage pools that are used only for tiering. If data is sent to the storage pool from client nodes or by replication, a storage pool directory is required.

IBM Spectrum Protect temporarily stores data in one or more local storage pool directories during data ingestion before it moves the data to the cloud. By using local storage in this manner, you can improve data backup and archive performance.

After you create a cloud-container storage pool, use the DEFINE STGPOOLDIRECTORY command to define one or more local storage pool directories for that storage pool. If you use the Operations Center, define the storage pool directories after you create the storage pool with the Add Storage Pool wizard.
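
For example, a sketch that defines two local directories for a cloud-container storage pool; the pool name and directory paths are hypothetical:

    define stgpooldirectory cloudpool /tsminst1/clouddir1,/tsminst1/clouddir2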

For more information about storage pool directories, see Optimizing performance for cloud object storage.

Q: Can I define a cloud-container storage pool on my Linux® server?

A: You can define cloud-container storage pools on the Linux x86_64 and Linux on Power Systems™ (little endian) operating systems. Cloud-container storage pools cannot be defined on Linux on IBM System z® and Linux on Power Systems™ (big endian) operating systems.

Q: What kind of performance can I expect with a cloud-container storage pool?

A: The performance of a cloud-container storage pool largely depends on the network capabilities between the server and the cloud.

Starting with IBM Spectrum Protect V7.1.7, backup operations from the server to the cloud, and restore operations from the cloud to the server occur more quickly than in earlier releases.

For best performance, use S3 and Azure protocols. 

Q: What do I use for the identity and cloud URL values when I define my cloud-container storage pool?

A: The Configuring a cloud-container storage pool for data storage page includes instructions on finding this information.

Q: When I configure cloud-container storage pools to use Amazon Web Services (AWS) with S3, which URL do I use for the Region or CLOUDURL value?

A: In most cases, you can use the region that is closest to your physical location, based on the AWS Regions and Endpoints page. Because an AWS bucket exists in only one region, you can specify only one endpoint URL for a region. 

If you require a GovCloud region, specify a URL from the AWS GovCloud (US) Endpoints page. You must have GovCloud credentials to use a GovCloud region.

If you use the Operations Center and you select Other, specify a region endpoint URL in the URL field, and include the protocol, usually https://.

Warning: Be sure to use only the AWS endpoint URL for the CLOUDURL or Region value, such as https://s3-us-west-1.amazonaws.com. Do not use the static website hosting URL for this value. For more information, see Preparing to configure cloud-container storage pools for Amazon Web Services.
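
For example, a sketch that defines a cloud-container storage pool against the us-west-1 region endpoint; the pool name and credentials are hypothetical placeholders for your AWS access key ID and secret access key:

    define stgpool awspool stgtype=cloud cloudtype=s3 cloudurl=https://s3-us-west-1.amazonaws.com identity=your_access_key_id password=your_secret_access_key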

Q: How do I configure cloud-container storage pools to use IBM Cloud Object Storage off-premises with S3?

A: You can set up cloud-container storage pools to use IBM Cloud Object Storage off-premises with the S3 protocol. The off-premises implementation of IBM Cloud Object Storage is managed through IBM Cloud. In this setup, only the owner of the IBM Cloud account can create buckets and administrators.

Use the credentials from your IBM Cloud account when you configure the storage pools in the Operations Center or with the DEFINE STGPOOL command. For more information, see the IBM Cloud Object Storage page. To use this configuration, select IBM Cloud Object Storage - S3 API from the IBM Cloud Order Object Storage page.

If you use the Operations Center, select IBM Cloud Object Storage - S3 API as the Cloud type. If you use the DEFINE STGPOOL command, specify S3 for the CLOUDTYPE parameter. Also, only one cloud provider endpoint is needed with this configuration. If all of your servers are inside the IBM Cloud network, you can use a private authentication endpoint. 

Q: On a storage pool that uses an on-premises IBM Cloud Object Storage appliance, is it possible to use multiple Accessers?

A: Yes, you can use multiple Accessers or a load balancer for optimal performance. Normally, when you define a storage pool to use on-premises IBM Cloud Object Storage, you set the CLOUDURL parameter to the address of the Accesser that you want to use. To use more than one Accesser, list the Accesser addresses separated by a vertical bar (|), with no spaces. For example: CLOUDURL=<accesserURL1>|<accesserURL2>|<accesserURL3>. Note: The CLOUDURL parameter cannot exceed 870 characters.
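
For example, a sketch that defines a storage pool against three Accessers; the pool name, IP addresses, and credentials are hypothetical:

    define stgpool cospool stgtype=cloud cloudtype=s3 cloudurl=http://192.0.2.11|http://192.0.2.12|http://192.0.2.13 identity=your_access_key password=your_secret_key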

If you use the Operations Center, type an Accesser IP address in the URL field of the Add Storage Pool wizard, and then press Enter to add additional IP addresses.

For information about how to locate the required values for on-premises IBM Cloud Object Storage, see Preparing to configure a cloud-container storage pool for IBM Cloud Object Store.

Q: How many Accesser® endpoints do I define for my on-premises IBM® Cloud Object Storage System that is used with IBM Spectrum Protect™?

A: Generally, to optimize performance, dedicate the following number of Accessers for exclusive use by small, medium, and large Blueprint systems:

  • Small system: 1 Accesser
  • Medium system: 2 Accessers
  • Large system: 3-4 Accessers

The previous list provides general guidelines. Based on client data ingestion rates and simultaneous data transfer to object storage, you might not require as many dedicated Accessers as the guidelines suggest. To calculate a more precise number of Accessers, consider the following factors:

  • IBM Spectrum Protect daily data ingestion rates and the data reduction that is achieved through deduplication and compression, because these factors affect the required back-end throughput to the IBM Cloud Object Storage System
  • The Information Dispersal Algorithm (IDA) setting for the IBM Cloud Object Storage vault
  • The network bandwidth for the Accessers

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Use the following two examples to help determine Accesser requirements.

Example A:

Requirement: An IBM Spectrum Protect server ingesting 10 TB of data/hour, with a ratio of 4:1 data reduction (deduplication and compression).

What is the required throughput in megabytes per second to the IBM Cloud Object Storage System?

10 TB/hour = 10,485,760 MB/hour = 2912.7 MB/second. With 4:1 data reduction: 2912.7 MB/second x 1/4 = 728.2 MB/second. This is the required throughput after data reduction.

Requirement: The IDA setting is 12.5 (Slicestor® node width=12, read threshold=5).

What is the required throughput in megabytes per second?

To allow reading data from a minimum of 5 Slicestor nodes, each Accesser must add parity to every 5 units of incoming data to create a total of 12 IBM Cloud Object Storage data slices. This parity information is distributed to each Slicestor node. Assuming an expansion factor of 2.4, the throughput to the Slicestor nodes is calculated as shown:

728 MB/second x 2.4 = 1747.2 MB/second

Requirement: Each Accesser has 1 active 10 Gb/second front-end Ethernet port and 1 active 10 Gb/second back-end Ethernet port.

What is the required throughput in megabytes per second?

Typically, dual-port Accessers use an active-passive bonding configuration. Even if you have 2 front-end and 2 back-end Ethernet ports, only 1 of each is active at a time. A 10 Gb/second port provides 10,000,000,000 bits/second = 1,250,000,000 bytes/second = 1192.1 MB/second, and TCP/IP protocol overhead reduces this to approximately 1000 - 1100 MB/second of obtainable throughput. For this requirement, a maximum of 1100 MB/second throughput per Accesser is possible.

Final calculation: 728 MB/s x 2.4 / 1100 MB/s = 1.59 Accessers (rounded up to 2 Accessers)

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Example B:

Requirement: An IBM Spectrum Protect server ingesting 10 TB of data/hour, with a ratio of 8:1 data reduction (deduplication and compression).

What is the required throughput in megabytes per second to the IBM Cloud Object Storage system?

10 TB/hour = 10,485,760 MB/hour = 2912.7 MB/second. With 8:1 data reduction: 2912.7 MB/second x 1/8 = 364 MB/second. This is the required throughput after data reduction.

Requirement: The IDA setting is 12.8 (Slicestor node width=12, read threshold=8).

What is the required throughput in megabytes per second?

To allow reading data from a minimum of 8 Slicestor nodes, each Accesser must add parity to every 8 units of incoming data, to create a total of 12 IBM Cloud Object Storage data slices. This parity information is distributed to each Slicestor node. Assuming an expansion factor of 1.5, the throughput to the Slicestor nodes is calculated as shown:

364 MB/second x 1.5 = 546 MB/second

Requirement: Each Accesser has 1 active 10 Gb/second front-end Ethernet port and 1 active 10 Gb/second back-end Ethernet port.

What is the required throughput in megabytes per second?

Typically, dual-port Accessers use an active-passive bonding configuration. Even if you have 2 front-end and 2 back-end Ethernet ports, only 1 of each is active at a time. A 10 Gb/second port provides 10,000,000,000 bits/second = 1,250,000,000 bytes/second = 1192.1 MB/second, and TCP/IP protocol overhead reduces this to approximately 1000 - 1100 MB/second of obtainable throughput. For this requirement, a maximum of 1100 MB/second throughput per Accesser is possible.

Final calculation: 364 MB/s x 1.5 / 1100 MB/s = 0.50 Accessers (rounded up to 1 Accesser)

Q: When I use an IBM Cloud Object Storage appliance with IBM Spectrum Protect, can I use multiple vaults?

A: You can specify one vault per storage pool by using the new BUCKETNAME parameter in the UPDATE STGPOOL and DEFINE STGPOOL commands for cloud-container storage pools. Use the BUCKETNAME parameter to specify the name of an IBM Cloud Object Storage vault (or an S3 bucket) to use with that storage pool. This configuration allows one server to use multiple vaults.

Restriction: You cannot change the bucket or vault if any cloud containers exist in this storage pool. For more information, see the DEFINE STGPOOL and UPDATE STGPOOL commands.
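
For example, a sketch that defines a cloud-container storage pool that uses a specific vault; the pool name, Accesser URL, credentials, and vault name are hypothetical:

    define stgpool cospool stgtype=cloud cloudtype=s3 cloudurl=http://cos-accesser.example.com identity=your_access_key password=your_secret_key bucketname=tsmvault01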

Q: On a storage pool that uses the S3 protocol, how is the default bucket or vault name determined?

A: If you want to use a specific name other than the default bucket or vault name, use the BUCKETNAME parameter when you run the DEFINE STGPOOL command on that storage pool. You can update the default name of the bucket or vault by using the BUCKETNAME parameter when you run the UPDATE STGPOOL command on that storage pool, but you can only update the default name if no data is stored in the storage pool.

If IBM Spectrum Protect creates a new Amazon Web Services (AWS) bucket or IBM Cloud Object Storage vault (with a default bucket or vault name) by using the DEFINE STGPOOL command, you can view the bucket or vault name by running the QUERY STGPOOL FORMAT=DETAILED command and looking for Bucket Name in the output. 
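
For example, to display the detailed attributes, including the bucket or vault name, for a hypothetical storage pool named cloudpool:

    query stgpool cloudpool format=detailed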

The ability to specify the bucket name or vault name allows multiple storage pools and multiple servers to use the same cloud storage bucket or vault.   

Q: Can IBM Spectrum Protect share a bucket with other applications?

A: Yes. The bucket can be shared with other IBM Spectrum Protect servers and is safe to use for both cloud-container storage pools and database backups. The bucket can also be shared with other applications that do not interfere with data that is written by IBM Spectrum Protect. In most cases, it is best not to share the bucket with other applications, but it is possible. If the bucket is shared with other applications, be aware of the following:

  1. Uncommitted parts of a multipart upload that are older than 3 days are subject to deletion. IBM Spectrum Protect deletes older uncommitted parts from the bucket, regardless of which application wrote them.
  2. Other applications should not write into "directories" that are written by IBM Spectrum Protect. Any data that is written in these paths could be deleted.

Q: How do I avoid ANR1880W server messages during node replication?

A: If you are replicating data from a source server in the cloud and frequently get an ANR1880W server message on the target server, lower the value of the REPLSIZETHRESH option on the source server.
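
For example, a sketch of a server options file entry on the source server; the value 2048 is only a hypothetical illustration, so check the REPLSIZETHRESH option reference for valid values and the default before you change it, and restart the source server so that the new value takes effect:

    dsmserv.opt:

        REPLSIZETHRESH 2048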

Q: Which API calls does IBM Spectrum Protect run against S3?

A: IBM Spectrum Protect uses the following S3 operations:

  • GET Bucket (List Objects)
  • HEAD Bucket
  • PUT Bucket (Optional. This operation can be restricted if the bucket is predefined in Amazon Web Services and configured in IBM Spectrum Protect by using the BUCKETNAME parameter.)
  • DELETE object
  • Delete multiple objects
  • GET object
  • HEAD object
  • PUT object
  • Multipart uploads (and associated operations, such as initiate, upload part, complete, and abort)

For safety reasons, IBM Spectrum Protect never deletes buckets, so the DELETE Bucket operation might be one of the main operations to restrict in your policy. For more information, see Configuring Amazon bucket policy for IBM Spectrum Protect.

Q: I see both damaged and orphaned data when I audit my cloud storage pool. What is the difference?

A: A damaged data extent is an extent that is referenced in the server database but whose data in the cloud is missing or corrupted.

An orphaned data extent is an object that is stored with the cloud service provider but has no reference in the server database.

Q: How can I remove orphaned data extents from the cloud?

A: Running an audit with ACTION=REMOVEDAMAGED will delete both damaged and orphaned data extents.
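
For example, a sketch that audits all containers in a hypothetical cloud-container storage pool named cloudpool and removes the damaged and orphaned extents:

    audit container stgpool=cloudpool action=removedamaged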


Q: If I run the AUDIT CONTAINER command while cloud services are not active, what can I expect?

A: Depending on what action you use, the audit might or might not work as expected when there is no connection to the cloud:

  • Running the audit with ACTION=SCANALL will fail because the cloud connection cannot be made.
  • Running the audit with ACTION=SCANDAMAGED will fail if there are any damaged data extents to be scanned.
  • Running an audit with ACTION=MARKDAMAGED does not require any communication with the cloud, so it will work as expected.
  • Running an audit with ACTION=REMOVEDAMAGED will mark any damaged data as orphaned, allowing those extents to be automatically deleted when the connection to cloud services is restored. Because orphaned extents cannot be removed from the cloud environment without a connection to cloud services, this command does not remove references to these extents from the server database, unless the FORCEORPHANDBDEL parameter in the AUDIT CONTAINER command is set to YES. Be aware that running the AUDIT CONTAINER command with ACTION=REMOVEDAMAGED and FORCEORPHANDBDEL=YES removes references from the server database, but could leave extents physically stored in the cloud. Depending on the amount of data involved, this command might result in storage charges from the cloud provider.


Q: Can I replicate my cloud-container storage pool by using the PROTECT STGPOOL command?

A: No, the PROTECT STGPOOL command is not supported for cloud-container storage pools. You cannot use any of the following functions with cloud-container storage pools:

  • Replication of a cloud-container storage pool with the PROTECT STGPOOL command
  • Migration
  • Reclamation
  • Aggregation
  • Collocation
  • Simultaneous-write operations
  • Storage pool backup operations
  • Use of virtual volumes

Q: I do not see a NEXTSTGPOOL parameter when I use the DEFINE STGPOOL command to define my cloud-container storage pool. Is failover for a cloud-container storage pool not supported?

A: The NEXTSTGPOOL parameter does not apply to cloud-container storage pools because the server cannot determine when the cloud is full. Therefore, the overflow capability is not available.

Q: Are there any client data types that should not be stored in a cloud storage pool?

A: In general, do not store client data in a cloud-container storage pool if existing recommendations advise against storing that data in removable media storage pools. Two specific types of client data to avoid storing in cloud pools are Data Protection for VMware control files and Data Protection for SQL metadata files (for legacy SQL backups). For more information, see the Data Protection for VMware and Data Protection for SQL documentation.


Q: Is there an option to prevent directory objects from being tiered to a cloud-container storage pool?

A: Yes. Use the dirmc client option to bind directory objects to a management class that stores data in a storage pool that is not tiered to the cloud.
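
For example, a minimal sketch of a client options file entry (dsm.opt on Windows, or the server stanza of dsm.sys on UNIX and Linux); the management class name nocloudmc is hypothetical and must exist in the client's active policy set with a destination storage pool that is not tiered to the cloud:

    dirmc nocloudmc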

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SSEQVQ","label":"IBM Spectrum Protect"},"Component":"","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF016","label":"Linux"},{"code":"PF033","label":"Windows"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"LOB26","label":"Storage"}}]

Document Information

Modified date:
18 June 2020

UID

ibm13241809