IBM Support

QRadar EDR (On-Premise CP4S: 3.12.5.0): Restoration of EDR data terminates with exit code 1

Troubleshooting


Problem

The EDR data restoration command terminates with exit code 1.

Symptom

The Cassandra restore ends with the following message:
Keyspace not containing the number of keys expected

The log traces will show the following:

{"level":"error","ibm_datetime":"202<redacted>37:40.516Z","caller":"executor/<redacted>.go:177","message":"cassandra","error":"exit status 2","stacktrace":"git<redacted>processResults\n\t/opt<redacted>/<redacted>.go:177\ngit<redacted>.Restore\n\t/opt<redacted>.go:116\nmain.main\n\t/opt<redacted>.go:106\nruntime.main\n\t/usr/lib/<redacted>.go:250"}
{"level":"info","ibm_datetime":"20<redacted>:40.734Z","caller":"misc/<redacted>.go:316","message":"Application scaled","Name":"cp4s-foundations-operator","Replicas":1}
command terminated with exit code 1

Cause

The Cassandra restore process incorrectly determines that a table should exist after the restore.

Environment

Affected Version:

QRadar EDR On-Premise: 3.12.5.0
Release information:
A fix for this issue is planned for the upcoming release 3.12.6.0.

Diagnosing The Problem

After you run the restoration command, the log traces on the screen show the following:
 
{"level":"info","ibm_datetime":"20<redacted>7:40.734Z","caller":"misc/<redacted>.go:316","message":"Application scaled","Name":"cp4s-foundations-operator","Replicas":1}
command terminated with exit code 1
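
If the output of the restoration command was saved to a file, the two messages that identify this issue can be checked with a short script. The following is a minimal sketch and is not part of the product: the function name has_restore_symptom and the log file path are hypothetical, and it assumes that the restore output was captured to a file, for example with tee.

```shell
# Hypothetical helper: returns 0 when a saved restore log contains both
# messages described in this document, and 1 otherwise.
has_restore_symptom() {
  log_file="$1"
  # Message reported at the end of the Cassandra restore.
  grep -q 'Keyspace not containing the number of keys expected' "$log_file" || return 1
  # Final line printed when the restoration command fails.
  grep -q 'command terminated with exit code 1' "$log_file" || return 1
  return 0
}
```

For example, `has_restore_symptom /tmp/restore.log && echo "Log matches this issue"` prints the message only when both lines are present in the hypothetical file /tmp/restore.log.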

Resolving The Problem

To mitigate this issue, complete the following steps:
  1. Log in to your Red Hat OpenShift Container Platform cluster as a cluster administrator by using one of the following commands, where <openshift_url> is the URL for your Red Hat OpenShift Container Platform environment.
    1. Using a username and password:
      oc login <openshift_url> -u <cluster_admin_user> -p <cluster_admin_password>
    2. Using a token:
      oc login --token=<token> --server=<openshift_url>
  2. Run the following command:
      oc edit deployment cp4s-backup-restore -n <namespace>
    1. Change "readOnlyRootFilesystem: true" to "readOnlyRootFilesystem: false".
    2. Save and exit, then wait for the cp4s-backup-restore* pod to restart.
    3. Confirm that the pod has restarted before you continue:
      [Image: image-20240402171956-1]
  3. Open a shell in the backup pod by running the following command:
      oc exec -n <namespace> <backup-pod> -it -- bash
  4. Using the vi editor, edit the following file: /opt/ansible/cassandra/roles/restore/files/restore_cassandra_sstableloader.sh
    1. Search for "check_restore" and remove it. Save the file and exit.
  5. Using the vi editor, edit the following file: /opt/ansible/cassandra/roles/restore/tasks/restore_keyspace.yaml
    1. Search for the line "failed_when: restore_result.rc != 0 or 'Restore successful!!!' not in restore_result.stdout".
    2. Remove the following part from the line: "or 'Restore successful!!!' not in restore_result.stdout"
    3. The line should then look like this:
      failed_when: restore_result.rc != 0
    4. Save the file and exit.
  6. Exit the pod and run the restoration command.
  7. Repeat step 2 and set "readOnlyRootFilesystem" back to true.
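
The edit to restore_keyspace.yaml described above can also be made without an interactive editor. The following is a minimal sketch, not an IBM-provided script: the function name relax_failed_when is hypothetical, it assumes GNU sed is available inside the backup pod, and it assumes that the line in the file matches the text quoted in this document exactly.

```shell
# Hypothetical helper: removes the stdout check from the failed_when line
# so that it reads "failed_when: restore_result.rc != 0".
# Assumes GNU sed (for -i) and that the line matches the quoted text.
relax_failed_when() {
  yaml_file="$1"
  # Keep a copy of the original file next to it.
  cp "$yaml_file" "$yaml_file.bak"
  # Delete only the " or 'Restore successful!!!' not in restore_result.stdout" part.
  sed -i 's/ or '\''Restore successful!!!'\'' not in restore_result\.stdout//' "$yaml_file"
}
```

Inside the pod, this could be called as `relax_failed_when /opt/ansible/cassandra/roles/restore/tasks/restore_keyspace.yaml`, followed by `grep -n 'failed_when' /opt/ansible/cassandra/roles/restore/tasks/restore_keyspace.yaml` to verify the result. Similarly, `grep -n 'check_restore' /opt/ansible/cassandra/roles/restore/files/restore_cassandra_sstableloader.sh` shows where "check_restore" occurs before you remove it by hand. To watch for the pod restart after the deployment is edited, standard commands such as `oc get pods -n <namespace>` can be used.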

Document Location

Worldwide

Product Synonym

ReaQta

Document Information

Modified date:
16 April 2024

UID

ibm17145702