Backing up and restoring the analytics database

The analytics database can be backed up to, and restored from, an S3 repository. S3-compatible object storage is required, for example, IBM Cloud Object Storage.

Before you begin

If you configure Analytics for offloading and your endpoint requires a particular certificate, add trust certificates to the Analytics subsystem before attempting to back up or restore the Analytics database. For information on adding certificates to Analytics for back-up and restore, see Adding certificates for Analytics for back-up and restore.

Tip: The Procedure section includes examples that illustrate how the backup and restore commands work.

About this task

These commands apply to OVA, pure Kubernetes, and IBM Cloud Private (ICP) deployments. To back up and restore the analytics database, you need S3-compatible object storage. Backups are created on an as-needed basis and cannot be automated in API Connect.

Important: All arguments are required. Replace an argument with empty quotes ("") to use the default setting.
Command Values/Definition
apicup subsys exec <ANALYTICS_SUBSYS> create-s3-repo
Create the S3 repository to store analytics backups. These settings are identical to those used when creating a repository in Elasticsearch. The arguments are:
  • REPO_NAME - Name of repository to be created.
  • REGION - The region where bucket is located. Defaults to US Standard.
  • BUCKET - The name of the bucket to be used for snapshots.
    Note: Analytics backup and restore supports virtual-host style bucket access, such as bucket.s3-example.com, but does not support path style bucket access such as s3-example.com/bucket.
  • ENDPOINT - The endpoint to the S3 API.
  • ACCESS_KEY - The access key to use for authentication.
  • SECRET_KEY - The secret key to use for authentication.
  • BASEPATH - The path name within the bucket where backup information is stored. The default is the root path. For S3 storage, you must include the port using the following format: BASEPATH:PORT.
  • COMPRESS_TRUE_FALSE - Default is true. Determines whether metadata files are stored in compressed format.
  • CHUNK_SIZE_GB - Default is 1GB. Large files can be stored as chunks when the snapshot is created. This setting specifies the size of the chunks as GB, MB, or KB.
  • SERVER_SIDE_ENCRYPTION_TRUE_FALSE - Default is false. Determines whether files are encrypted. When set to true, files are encrypted on the server side using AES256.

If the create-s3-repo command results in the following error, complete the steps in Adding certificates for Analytics for back-up and restore to add the certificate to the Analytics subsystem and then run the command again:

PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
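
Because create-s3-repo takes ten positional arguments, it is easy to put a value in the wrong position. The following is a minimal sketch, not an official script, that names each value in a shell variable and prints the resulting command instead of running it. The values are placeholders, the helper quote_arg is hypothetical, and the sketch assumes that no value contains whitespace. It also prints the virtual-host style hostname that results from combining BUCKET and ENDPOINT.

```shell
#!/bin/sh
# Sketch only: prints the create-s3-repo command rather than executing it.
# All values below are placeholders; replace them with your own settings.
ANALYTICS_SUBSYS=analytics
REPO_NAME=myrepo
REGION=US
BUCKET=bucket
ENDPOINT=myrepo.s3repo.com
ACCESS_KEY=access_key
SECRET_KEY=secret_key
BASEPATH=my_folder
COMPRESS=""                 # empty -> default (true)
CHUNK_SIZE=""               # empty -> default (1GB)
SERVER_SIDE_ENCRYPTION=""   # empty -> default (false)

# quote_arg (hypothetical helper): render an empty value as "" so that it
# stays visible in the printed command line.
quote_arg() {
  if [ -z "$1" ]; then printf '""'; else printf '%s' "$1"; fi
}

# Append the arguments in the documented order.
CMD="apicup subsys exec $ANALYTICS_SUBSYS create-s3-repo"
for arg in "$REPO_NAME" "$REGION" "$BUCKET" "$ENDPOINT" "$ACCESS_KEY" \
           "$SECRET_KEY" "$BASEPATH" "$COMPRESS" "$CHUNK_SIZE" \
           "$SERVER_SIDE_ENCRYPTION"; do
  CMD="$CMD $(quote_arg "$arg")"
done
printf '%s\n' "$CMD"

# Virtual-host style access: the host that is contacted is BUCKET.ENDPOINT.
S3_HOST="$BUCKET.$ENDPOINT"
printf 'S3 host: %s\n' "$S3_HOST"
```

Review the printed command before you run it yourself. If the S3 host requires a particular certificate, add that certificate to the Analytics subsystem first.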

apicup subsys exec <ANALYTICS_SUBSYS> list-repos
List repositories for analytics backups. There are no arguments or defaults for the list-repos command.
apicup subsys exec <ANALYTICS_SUBSYS> delete-repo
Delete the specified analytics backup repository.

The argument is:

  • REPO_NAME - Defaults to the existing repo if there is only one. The repository must be specified if there is more than one.

Examples:

  • If only one repo exists: apicup subsys exec <ANALYTICS_SUBSYS> delete-repo ""
  • If multiple repos exist: apicup subsys exec <ANALYTICS_SUBSYS> delete-repo REPO_NAME
apicup subsys exec <ANALYTICS_SUBSYS> backup
Perform a backup of analytics data on an as-needed basis.
  • INDICES - Default is all - All indices will be backed up.

    The INDICES arguments can consist of a series of index names, a single argument of comma-separated indices, or one or more keywords. Multiple keywords must be comma-separated. Values with whitespace must be enclosed in double-quotes. If no indices are specified, then all indices will be backed up. The keywords are mapped to indices or aliases in Elasticsearch. The keywords are as follows:

    • all - Back up all data.
    • apievents - Back up all analytics data.
    • ui - Back up all UI visualizations and dashboards.
    • config - Back up all configuration information, such as retention period.
  • BACKUP_NAME - Enter the name of the backup.
  • REPO_NAME - Defaults to the existing repo if there is only one. The repository must be specified if there is more than one.
  • IGNORE_UNAVAILABLE_TRUE_FALSE - Default is false. If false, the backup will fail if an index is missing. If true, missing indices will be skipped and the backup will continue.
apicup subsys exec <ANALYTICS_SUBSYS> list-backups
List analytics backups per S3 repo.

The argument is:

  • REPO_NAME - Defaults to the existing repo if there is only one. The repository must be specified if there is more than one.
apicup subsys exec <ANALYTICS_SUBSYS> details-backup
Get details on the specified analytics backup.

The arguments are:

  • BACKUP_NAME - Defaults to backup-<indices>-<time date> if not explicitly set. The name must be all lowercase.

    For example, if the backup command was run as apicup subsys exec <ANALYTICS_SUBSYS> backup ui apievents then the backup name would be backup-ui-apievents-2018-10-10t16:04:25z. If more than three indices are specified, the backup name is truncated to backup-<time date>.

  • REPO_NAME - The name of the repository where the backup is stored. Defaults to the existing repository if there is only one. The repository must be specified if there is more than one.
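
The default naming rule described for BACKUP_NAME can be illustrated with a short shell function. This is a sketch that mirrors the documented pattern, not the product's implementation. The function name default_backup_name is hypothetical, and the timestamp layout is inferred from the single example above. The sketch does not cover the case where no indices are given, because the rule above does not describe it.

```shell
#!/bin/sh
# default_backup_name TIMESTAMP INDEX [INDEX...]   (hypothetical helper)
# Prints backup-<indices>-<time date> in lowercase, or backup-<time date>
# when more than three indices are given.
default_backup_name() {
  ts=$1
  shift
  [ "$#" -gt 0 ] || return 1   # no indices: not covered by the documented rule
  if [ "$#" -gt 3 ]; then
    name="backup-$ts"
  else
    name="backup"
    for index in "$@"; do
      name="$name-$index"
    done
    name="$name-$ts"
  fi
  # Backup names must be all lowercase.
  printf '%s\n' "$name" | tr '[:upper:]' '[:lower:]'
}

# Reproduces the example name: backup-ui-apievents-2018-10-10t16:04:25z
default_backup_name 2018-10-10T16:04:25Z ui apievents

# Four indices: the index names are dropped from the name.
default_backup_name "$(date -u +%Y-%m-%dT%H:%M:%SZ)" ui apievents config all
```

Knowing the expected default name helps when you pass BACKUP_NAME to details-backup for a backup that was created without an explicit name. Use list-backups to confirm the actual name.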
apicup subsys exec <ANALYTICS_SUBSYS> delete-backup
Delete the specified analytics backup.

The arguments are:

  • BACKUP_NAME - Enter the name of the backup to be deleted. You can view the names of backups using apicup subsys exec <ANALYTICS_SUBSYS> list-backups.
  • REPO_NAME - Defaults to the existing repository if there is only one. The repository must be specified if there is more than one.
apicup subsys exec <ANALYTICS_SUBSYS> restore
Restore analytics data.
  • INDICES - Default is all - All indices will be restored.

    The INDICES arguments can consist of a series of index names, a single argument of comma-separated indices, or one or more keywords. Multiple keywords must be comma-separated. Values with whitespace must be enclosed in double-quotes. If no indices are specified, then all indices will be restored. The keywords are mapped to indices or aliases in Elasticsearch. The keywords are as follows:

    • all - Restore all data.
    • apievents - Restore all analytics data.
    • ui - Restore all UI visualizations and dashboards.
    • config - Restore all configuration information, such as retention period.
  • BACKUP_NAME - Enter the name of the backup to be restored.
  • REPO_NAME - Defaults to the existing repository if there is only one. The repository must be specified if there is more than one.
  • IGNORE_UNAVAILABLE_TRUE_FALSE - Default is false. If false, the restore will fail if an index is missing. If true, missing indices will be skipped and the restore will continue.
  • OVERRIDE_TRUE_FALSE - Default is false. If set to true, then the restore operation will override existing indices.
apicup subsys exec <ANALYTICS_SUBSYS> restore-status
Displays analytics status for determining the progress of the restore process.
Note: The restore-status command is available in Version 2018.4.1.7 or later.
Attention: Naming conventions for indices and backups follow the Elasticsearch rules:
  • Lowercase only
  • Cannot include \, /, *, ?, ", <, >, |, the space character, the comma (,), or #
  • Colons (:) are not supported in 7.0+ (indices prior to 7.0 could contain a colon)
  • Cannot start with -, _, +
  • Cannot be . or ..
  • 255 byte limit (multi-byte characters will reach the 255 limit sooner)
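
The naming rules above can be checked before you run a backup command. The following is a minimal sketch of such a check, assuming a POSIX shell. The function name is_valid_name and its optional second argument are hypothetical, and the lowercase test covers ASCII letters only. Colons are rejected by default; pass yes as the second argument to allow them for pre-7.0 names.

```shell
#!/bin/sh
# is_valid_name NAME [ALLOW_COLON]   (hypothetical helper)
# Returns 0 if NAME passes the naming rules listed above, 1 otherwise.
# ALLOW_COLON=yes permits ":" (pre-7.0 behaviour); the default is no.
is_valid_name() {
  name=$1
  allow_colon=${2:-no}

  [ -n "$name" ] || return 1

  # Cannot be . or .., and cannot start with -, _ or +
  case $name in
    .|..|-*|_*|+*) return 1 ;;
  esac

  # Cannot include \ / * ? " < > | space , #
  case $name in
    *'\'*|*/*|*'*'*|*'?'*|*'"'*|*'<'*|*'>'*|*'|'*|*' '*|*,*|*'#'*) return 1 ;;
  esac

  # Colons are rejected unless explicitly allowed.
  case $name in
    *:*) [ "$allow_colon" = yes ] || return 1 ;;
  esac

  # Lowercase only (tr handles ASCII letters; other scripts are not checked).
  lower=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
  [ "$name" = "$lower" ] || return 1

  # 255-byte limit, counted in bytes rather than characters.
  bytes=$(( $(printf '%s' "$name" | wc -c) ))
  [ "$bytes" -le 255 ]
}

is_valid_name mybackup && echo "mybackup: ok"
is_valid_name "My Backup" || echo "My Backup: rejected"
```

The check is deliberately stricter than necessary in one respect: it treats the colon as invalid unless told otherwise, which matches the 7.0+ rule.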

Procedure

This section contains example commands with output. The name of the analytics subsystem in the examples is analytics.

  • How to create a backup
    1. Create the S3 repository. The example creates a repository with the following values:
      • REPO_NAME - myrepo
      • REGION - US
      • BUCKET - bucket
      • ENDPOINT - myrepo.s3repo.com
      • ACCESS_KEY - access_key
      • SECRET_KEY - secret_key
      • BASEPATH - my_folder
      • COMPRESS_TRUE_FALSE - "" uses default of true
      • CHUNK_SIZE_GB - "" uses default of 1GB.
      • SERVER_SIDE_ENCRYPTION_TRUE_FALSE - "" sets to the default of false.
      apicup subsys exec analytics create-s3-repo myrepo US bucket myrepo.s3repo.com access_key secret_key my_folder "" "" "" 
      OUTPUT:
      Creating repository myrepo. List repos to see if creation completed, or check logs for errors.
      Note:

      With virtual host style repositories, the Analytics backup and restore process connects to the hostname formed by combining the values for BUCKET and ENDPOINT. For example, using the values given in this step, the host is bucket.myrepo.s3repo.com.

    2. List the repositories. For example:
      apicup subsys exec analytics list-repos
      OUTPUT:
      Name     Repo Type   Bucket    BasePath   Region   Endpoint             Chunk Size   Compress   Server Side Encryption
      myrepo   s3          bucket    my_folder  US       myrepo.s3repo.com     1gb          true                             
    3. Create a backup with the following values:
      • INDICES - all (or "" for the default of all)
      • BACKUP_NAME - mybackup
      • REPO_NAME - myrepo (use "" if only one repo exists)
      • IGNORE_UNAVAILABLE_TRUE_FALSE - "" sets to the default of false.
      apicup subsys exec analytics backup all mybackup myrepo ""
      OUTPUT:
      Successfully created backup mybackup.
    4. List backups in the repo named myrepo.
      apicup subsys exec analytics list-backups myrepo
      OUTPUT:
      Name       Start Time                 End Time                   State
      mybackup   2019-02-20T17:05:56.415Z   2019-02-20T17:06:09.833Z   SUCCESS
      
    5. Display details for a backup named mybackup in the repo named myrepo.
      apicup subsys exec analytics details-backup mybackup myrepo
      OUTPUT:
      Backup Name: mybackup
      State: SUCCESS
      Failures: 
      Shards Failed: 0
      Shards Successful: 13
      Start Time: 2019-02-20T17:05:56.415Z
      End Time: 2019-02-20T17:06:09.833Z
      Version: 5.6.8
      Indices: .export-status, apic-api-2019.02.19-1, apic-api-2019.02.20-000002, .apic-config, .kibana-6
    6. Delete a backup named mybackup in the repo named myrepo.
      apicup subsys exec analytics delete-backup mybackup myrepo
      OUTPUT:
      Deleting backup mybackup. List backups to see the status, or check logs for errors.
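
The backup steps above can be chained in a small script. The following is a sketch only: it prints each command by default instead of executing it, the helper run_step is hypothetical, and the subsystem, repository, and backup names are the example values from this procedure. The create-s3-repo step is left out because it requires your credentials and completes asynchronously.

```shell
#!/bin/sh
# Sketch: print (or run) the backup steps from this procedure in order.
SUBSYS=analytics
REPO=myrepo
BACKUP=mybackup
DRY_RUN=1   # 1 prints each command; change to 0 to execute the commands

# run_step (hypothetical helper): print the command line, showing empty
# arguments as "", or execute it and stop the script on the first failure.
run_step() {
  if [ "$DRY_RUN" = 1 ]; then
    line=
    for arg in "$@"; do
      [ -n "$arg" ] || arg='""'
      line="${line:+$line }$arg"
    done
    printf '%s\n' "$line"
  else
    "$@" || exit 1
  fi
}

backup_sequence() {
  run_step apicup subsys exec "$SUBSYS" list-repos
  run_step apicup subsys exec "$SUBSYS" backup all "$BACKUP" "$REPO" ""
  run_step apicup subsys exec "$SUBSYS" list-backups "$REPO"
  run_step apicup subsys exec "$SUBSYS" details-backup "$BACKUP" "$REPO"
}

backup_sequence
```

Because backups cannot be automated in API Connect itself, a script like this still has to be started by an operator or by an external scheduler that you manage.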
  • How to restore a backup
    Note: Restoration requires a functioning Analytics subsystem. In a disaster recovery scenario, you might need to install the Analytics subsystem before you can restore the backed-up data. To install, refer to Installing the Analytics subsystem into a Kubernetes environment.
    1. Restore a backup with the following values:
      • INDICES - all (or "")
      • BACKUP_NAME - mybackup
      • REPO_NAME - myrepo (or "")
      • IGNORE_UNAVAILABLE_TRUE_FALSE - "" sets to the default of false.
      • OVERRIDE_TRUE_FALSE - true
      apicup subsys exec analytics restore all mybackup myrepo "" true
      
    2. Check the status of the restore process. Enter the following command:
      apicup subsys exec analytics restore-status
      Note: The restore-status command is available in Version 2018.4.1.7 and later.

      Following is example output:

      Status   Active Primary Shards   Active Shards   Initializing Shards   Unassigned Shards
      green    104                     312             0                     0
      

      The restore process is successful when the status is green and there are no unassigned shards and no initializing shards. The results are different if you are running in dev mode, or in standard mode with fewer than three nodes. In those cases, the restore process is finished when the number of initializing shards drops to 0. However, the status remains yellow and the number of unassigned shards does not drop to 0.
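
The completion criteria for restore-status can be expressed as a small check. The following is a sketch that assumes the output has the column layout of the example above, with the status in the first column and the initializing and unassigned shard counts in the last two columns. The function name restore_finished and its mode argument (full or reduced) are hypothetical.

```shell
#!/bin/sh
# restore_finished MODE   (hypothetical helper)
# Reads restore-status output on stdin and returns 0 when the restore is done.
# MODE is "full" (three or more nodes) or "reduced" (dev mode, or standard
# mode with fewer than three nodes). Default is "full".
restore_finished() {
  mode=${1:-full}
  awk -v mode="$mode" '
    # The data line is the first line that starts with a cluster status.
    $1 ~ /^(green|yellow|red)$/ && !found {
      found = 1
      status = $1
      initializing = $(NF - 1)
      unassigned = $NF
      if (mode == "reduced")
        ok = (initializing == 0)
      else
        ok = (status == "green" && initializing == 0 && unassigned == 0)
    }
    END { exit !(found && ok) }
  '
}

# Example polling loop (shown as a comment; it is not executed here):
#   until apicup subsys exec analytics restore-status | restore_finished full; do
#     sleep 30
#   done
```

In reduced mode the check ignores the status and the unassigned shard count, because they are not expected to reach green and 0 in those deployments.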

  • Troubleshooting

    Look for information in the apicup logs:

    • Locate the analytics-operator pod with: kubectl get pods
    • To view the logs, run: kubectl logs <analytics-operator-pod>
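
The two troubleshooting steps can be combined. The following is a sketch that filters the kubectl get pods listing for pods whose name contains analytics-operator; that naming pattern is an assumption based on the step above, and the helper operator_pods is hypothetical.

```shell
#!/bin/sh
# operator_pods (hypothetical helper): read `kubectl get pods` output on stdin
# and print the names of pods whose name contains "analytics-operator".
# The first line of the listing is the column header and is skipped.
operator_pods() {
  awk 'NR > 1 && $1 ~ /analytics-operator/ { print $1 }'
}

# Typical use (shown as a comment; it is not executed here):
#   for pod in $(kubectl get pods | operator_pods); do
#     kubectl logs "$pod"
#   done
```

If the Analytics subsystem is not in your current namespace, add the namespace option to both kubectl commands.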