Backing up and restoring components

Backup and recovery of data files and databases is an essential operation for any business system, particularly for data and applications that are running in production environments. Create and follow a plan for backing up and recovering the data for all components of your Cloud APM infrastructure.

For disaster recovery planning, consider backing up the data on your Cloud APM server regularly and saving the backup file to a mounted drive or copying it to another server. If your Cloud APM server uses a remote Db2® server, you should also complete backups on the Db2 server. For more information, see Backing up components.
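As an illustration of the "save the backup file to a mounted drive" step, the following minimal Python sketch copies the most recent backup file to a second location and verifies the copy with a checksum. It does not create the backup and is not part of Cloud APM. The directory paths and the `*.tar` file pattern are assumptions for illustration; substitute the location and name of the backup file that your backup procedure actually produces.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_latest_backup(backup_dir: Path, dest_dir: Path, pattern: str = "*.tar") -> Path:
    """Copy the newest file matching `pattern` from backup_dir to dest_dir.

    `pattern` is a hypothetical default; adjust it to match your backup files.
    """
    candidates = [p for p in backup_dir.glob(pattern) if p.is_file()]
    if not candidates:
        raise FileNotFoundError(f"no backup files matching {pattern!r} in {backup_dir}")

    # Pick the file with the most recent modification time.
    latest = max(candidates, key=lambda p: p.stat().st_mtime)

    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = Path(shutil.copy2(latest, dest_dir / latest.name))

    # Verify the copy so that a truncated transfer is noticed immediately.
    if sha256_of(copied) != sha256_of(latest):
        raise IOError(f"checksum mismatch after copying {latest} to {copied}")
    return copied
```

A scheduled job could call `copy_latest_backup(Path("/path/to/backups"), Path("/mnt/dr/apm"))` after each backup completes; both paths here are placeholders.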

The Cloud APM server stores configuration data in local files and also in local internal Derby databases. The Cloud APM server Db2 databases contain the following data:
  • DATAMART and prefetch (WAREHOUS) databases contain transaction tracking and resource monitoring metric data. If you changed the historical data retention values, the WAREHOUS database also contains configuration data.
  • The SCR32 database contains resource group and application definitions.
It is important to keep the Cloud APM server local configuration data synchronized with the configuration data in the Db2 databases when the Db2 server is remote.

If you have a recent backup file for the Cloud APM server and you need to recover only the local configuration data on the Cloud APM server, you can complete a recovery procedure that restores data on the Cloud APM server and not on a remote Db2 server. Otherwise, follow one of the recovery scenarios in Table 1, which restore configuration data on both the Cloud APM server and the remote Db2 server.

The Cloud APM server where the backup is completed is called the source server. The Cloud APM server where the recovery is completed is called the target server. If there is a hardware issue with the source server, the target server can be on a different computer system from the source server. Otherwise, the source and target server can be the same computer system. The target server must be at the same release and interim fix level as the source Cloud APM server, and must use the same type of database (local or remote) and the same database version.
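The compatibility requirements for the target server can be checked before you start a recovery. The following Python sketch compares a source and a target server description field by field. The `ServerProfile` class, its field names, and the sample values are assumptions made for illustration; how you collect the release, interim fix level, and database details from each server is up to you.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ServerProfile:
    """Hypothetical description of a Cloud APM server for comparison purposes."""
    release: str           # for example "8.1.4.0"
    interim_fix: int       # interim fix level
    database_type: str     # "local" or "remote"
    database_version: str  # Db2 version string (placeholder values below)


def target_mismatches(source: ServerProfile, target: ServerProfile) -> List[str]:
    """Return one message per field where the target differs from the source.

    An empty list means that the target meets the requirements stated above.
    """
    problems = []
    for field in fields(ServerProfile):
        source_value = getattr(source, field.name)
        target_value = getattr(target, field.name)
        if source_value != target_value:
            problems.append(
                f"{field.name}: source={source_value!r}, target={target_value!r}"
            )
    return problems


# Example with placeholder values: the target is one interim fix behind.
source = ServerProfile("8.1.4.0", 6, "remote", "11.1")
target = ServerProfile("8.1.4.0", 5, "remote", "11.1")
for problem in target_mismatches(source, target):
    print("Mismatch:", problem)
```

An empty result is only a precondition check; it does not replace the recovery procedures in Table 1.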

If a remote Db2 server is configured for the source server, the target server can either use the same Db2 server, instance, and databases or different ones depending on whether the Cloud APM Db2 databases require recovery.

Recovery is a disruptive action. The Cloud APM server components are stopped and started multiple times throughout the process. Choose one of the following recovery methods to restore the configuration data from a backup file.

Table 1. Recovery scenarios, approaches, and procedures to follow.
Note: Install IBM® Cloud Application Performance Management Private, V8.1.4.0 interim fix 6 or later. This level contains the fixes that are required to complete disaster recovery when the Cloud APM server is operational, and to run the restore.sh script on the remote Db2 server when you are logged in as the Db2 instance user.
Note: A Cloud APM server is operational if you can use the Cloud APM console to log in and view monitoring data, custom views, events, thresholds, resource groups, synthetic scripts, and role-based access control policies.
Recovery scenario | Recovery approach | Recovery procedures to complete
The Cloud APM server is configured to use a local Db2 server and the Cloud APM server is operational.

If the Cloud APM server is operational and you want to restore a previous version of the configuration data, run the restore.sh script on the Cloud APM server. The restore script restores the Cloud APM server configuration data and the data in the local Db2 databases.

See Restoring components for disaster recovery.
The Cloud APM server is configured to use a local Db2 server and the Cloud APM server is not operational.

If the Cloud APM server is not operational and requires complete recovery, you must reinstall the Cloud APM server, and then run the restore script to restore the configuration data and Db2 databases.

If there is no hardware issue, you can either reinstall the Cloud APM server on the source server or you can install the Cloud APM server on another computer system.

If you must install the Cloud APM server on another computer system, then you should configure the target Cloud APM server to use the same host name and IP address as the source Cloud APM server. Otherwise, Cloud APM is not fully functional after the disaster recovery completes until you reconfigure your managed systems to point to a different Cloud APM server and inform your Cloud APM console users to log in with a different URL.

Complete these procedures in the following order:
  1. See Installing the Cloud APM server for disaster recovery.
  2. See Restoring components for disaster recovery.
  3. If the target server is not the same system as the source server, and you cannot change the host name and IP address of the target server, then see Changing the server IP address and host name.
The Cloud APM server is configured to use a remote Db2 server and the Cloud APM server is operational.

If the Cloud APM server is operational and you want to restore a previous version of the configuration data, first run the restore.sh script on the remote Db2 server if the databases require recovery, and then run the restore script on the Cloud APM server.

See Restoring components for disaster recovery.
The Cloud APM server is configured to use a remote Db2 server, the Cloud APM server is not operational, and the remote Db2 databases do not require recovery.

If the Cloud APM server is not operational but the Db2 databases on the remote Db2 server are in a good state, then you must reinstall the Cloud APM server, and run the restore script on the Cloud APM server only.

Because the Cloud APM server installation resets the database schema, you must install the Cloud APM server to use a local Db2 server. Then, reconfigure it to use the existing remote Db2 server after the installation completes. Finally, run the restore script on the Cloud APM server.

If there is no hardware issue, you can either reinstall the Cloud APM server on the source server or you can install the Cloud APM server on another computer system.

If you must install the Cloud APM server on another computer system, then you should configure the target Cloud APM server to use the same host name and IP address as the source Cloud APM server. Otherwise, Cloud APM is not fully functional after the disaster recovery completes until you reconfigure your managed systems to point to a different server and inform your Cloud APM console users to log in with a different URL.

Complete these procedures in the following order:
  1. See Installing the Cloud APM server for disaster recovery.
  2. See Cataloging Db2 databases after changing the Db2 server.
  3. See Updating the Cloud APM server configuration for Db2 server changes.
  4. See Restoring components for disaster recovery.
  5. If the target server is not the same as the source server and you cannot change the host name and IP address of the target server, see Changing the server IP address and host name.
The Cloud APM server is configured to use a remote Db2 server, the Cloud APM server is not operational, and the remote Db2 databases also require recovery.

For this scenario, you can either use the same remote Db2 server and databases that are configured for the source server or use a different Db2 server or databases. If you use the same Db2 server and databases, you must drop the existing remote Db2 databases.

If there is no hardware issue, you can either reinstall the Cloud APM server on the source server or you can install the Cloud APM server on another computer system. Before installing the Cloud APM server, you must create the databases on the remote Db2 server. After the installation is complete, run the restore.sh script on the remote Db2 server and on the Cloud APM server.

If you must install the Cloud APM server on another computer system, then you should configure the target Cloud APM server to use the same host name and IP address as the source Cloud APM server. Otherwise, Cloud APM is not fully functional after the disaster recovery completes until you reconfigure your managed systems to point to a different server and inform your Cloud APM console users to log in with a different URL.

Complete these procedures in the following order:
  1. See Installing the Cloud APM server for disaster recovery, which also refers you to Connecting to a remote Db2 server.
  2. See Restoring components for disaster recovery.
  3. If the target server is not the same as the source server and you cannot change the host name and IP address of the target server, see Changing the server IP address and host name.
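The scenarios in Table 1 can be summarized as a small decision function. The following Python sketch returns the ordered list of procedures for a given situation, using the procedure titles from the table. The function name and its parameters are assumptions for illustration; the sketch is a reading aid for the table, not an automation of the recovery itself.

```python
from typing import List

# Procedure titles as they appear in Table 1.
INSTALL = "Installing the Cloud APM server for disaster recovery"
CATALOG = "Cataloging Db2 databases after changing the Db2 server"
UPDATE_CONFIG = "Updating the Cloud APM server configuration for Db2 server changes"
RESTORE = "Restoring components for disaster recovery"
CHANGE_HOST = "Changing the server IP address and host name"


def recovery_procedures(
    db_is_remote: bool,
    server_operational: bool,
    remote_db_needs_recovery: bool = False,
    target_has_source_identity: bool = True,
) -> List[str]:
    """Return the procedures to complete, in order, for one recovery scenario.

    target_has_source_identity is True when the target server uses the same
    host name and IP address as the source server (same system, or a new
    system that was reconfigured to match).
    """
    # Operational server: only the restore procedure applies. For a remote
    # Db2 server, that procedure covers running restore.sh on the Db2 server
    # first when the databases require recovery.
    if server_operational:
        return [RESTORE]

    # Non-operational server: reinstall first.
    steps = [INSTALL]

    # Remote databases that are still in a good state are reattached after
    # the server was installed against a local Db2 server.
    if db_is_remote and not remote_db_needs_recovery:
        steps += [CATALOG, UPDATE_CONFIG]

    steps.append(RESTORE)

    # A target that cannot take over the source host name and IP address
    # needs the extra reconfiguration procedure.
    if not target_has_source_identity:
        steps.append(CHANGE_HOST)
    return steps
```

For example, `recovery_procedures(db_is_remote=True, server_operational=False)` yields the five-step sequence for a non-operational server with healthy remote databases, minus the final host name step because the target keeps the source identity by default.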

A high availability solution that uses backup and restore is also available for your Cloud APM server. For more information, see the APM_High_Availability_V<version>.pdf document in the IBM-cloud-apm-samples GitHub repository, where <version> is the latest published version. Only the latest version is available in the repository.

For information about backing up and restoring when you are upgrading your Cloud APM server, see Upgrading your server.