IBM Support

QRadar: Docker services do not start when 7.2.8 or earlier Console appliances are updated to 7.5.0 UP2 IF2 or later (APAR IJ41796)

News


Abstract

QRadar® SIEM development identified a defect where Docker services fail to start on QRadar appliances that were originally installed at version 7.2.8 or earlier, then upgraded to 7.5.0 Update Pack 2 Interim Fix 2 or later. This service issue prevents applications from starting because the original Red Hat Enterprise Linux (RHEL) install for QRadar 7.2.8 or earlier formats XFS with ftype=0. After QRadar is upgraded to 7.5.0 Update Pack 2 Interim Fix 2 or later, the Docker service expects ftype=1 and does not start. This issue affects only the QRadar Console appliance; all other appliance types are unaffected.

Content

Technical note updates


  • 19 June 2023 4:00 PM ET: Added a note for newer versions that are affected by this issue, such as 7.5.0 Update Package 5 and Update Package 6 that were upgraded from 7.2.8 versions.
  • 25 January 2023 4:00 PM ET: Added a note that the hostname of the new Console and old Console must be identical, including the case. Administrators reported issues where deploy changes did not complete after a migration when the hostnames were not identical.
  • 20 January 2023 2:00 PM ET: Updated article to add a process for installing and migrating data to a new Console to resolve the Red Hat Docker issue described in APAR IJ41796.
  • 22 September 2022 2:45 PM ET: Updated this technical note to list that all versions of QRadar SIEM 7.5.0 UP2 IF2 and 7.5.0 UP3 and later are affected when the system started at 7.2.8, then upgraded.
  • 10 September 2022 3:00 PM ET: Updated this technical note as this issue also impacts QRadar 7.5.0 Update Pack 3.
  • 12 August 2022 4:00 PM ET: Initial release of the flash notice to users.

Urgency


Important: Several users reported a critical Docker service issue in QRadar SIEM 7.5.0 Update Pack 2 Interim Fix 2 and later versions where applications were not running after a software update. Users with QRadar appliances originally installed at 7.2.8 or earlier must confirm their file system ftype setting before they install QRadar 7.5.0 Update Pack 2 Interim Fix 2 or later.

Affected products

QRadar SIEM Console appliances at 7.5.0 Update Pack 2 Interim Fix 2 and later with a file system type that reports ftype=0 as described in this technical note.

QRadar Console appliances at:
  • 7.5.0 Update Package 2 Interim Fix 2 or later. This includes:
    • 7.5.0 Update Package 3
    • 7.5.0 Update Package 4
    • 7.5.0 Update Package 5
    • 7.5.0 Update Package 6
Note: QRadar on Cloud appliances and App Hosts are not affected by this issue. QRadar managed hosts are not affected by this issue as they do not run Docker services.

Am I affected?

Before you update to 7.5.0 Update Pack 2 Interim Fix 2 or later, you must confirm whether you are affected by this known issue.

Procedure
  1. Use SSH to log in to the QRadar Console as the root user.
  2. Type the following command:
    xfs_info /store | grep ftype
    Example output
    naming=version 2 bsize=4096 ascii-ci=0 ftype=0 
  3. Review the output to determine the file system ftype for your QRadar Console appliance:
    • If the output displays ftype=1, you can continue with an upgrade to 7.5.0 Update Pack 2 Interim Fix 2 or later.
    • If the output displays ftype=0, do NOT upgrade. Administrators of Console appliances impacted by this issue must migrate to a newly installed Console appliance to avoid this Docker service issue.
    • If you upgraded to QRadar 7.5.0 Update Pack 2 IF2 or later and experience application issues, review the stack trace to confirm the error. When this issue occurs, the following error message displays in /var/log/qradar.log:
      Hostname dockerd[9053]: Error starting daemon: error initializing graphdriver: overlay2: the backing xfs filesystem is 
      formatted without d_type support, which leads to incorrect behavior. Reformat the filesystem with ftype=1 to enable 
      d_type support. Backing filesystems without d_type support are not supported. 
      Hostname systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE 
      Hostname systemd[1]: Failed to start Docker Application Container Engine

      Results
       If you upgraded to 7.5.0 Update Pack 2 Interim Fix 2 or later and confirmed the stack trace for this error, you must migrate your Console to resolve this issue. For more information, see Migrating to a new Console.
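
The ftype check and the log review in this procedure can be scripted. The following is a minimal sketch, not an IBM-provided tool. The function names are hypothetical, and the sketch assumes that the xfs_info output and the log message look like the examples in this technical note.

```shell
# Hypothetical helpers (not part of QRadar).

# Print the ftype value (0 or 1) found in xfs_info output read from stdin.
ftype_from_xfs_info() {
  sed -n 's/.*ftype=\([01]\).*/\1/p' | head -n 1
}

# Translate an ftype value into the guidance from this technical note.
ftype_verdict() {
  case "$1" in
    1) echo "ftype=1: not affected, upgrade can continue" ;;
    0) echo "ftype=0: affected, do NOT upgrade; migrate to a new Console" ;;
    *) echo "ftype not found in input"; return 2 ;;
  esac
}

# Return 0 if the log text on stdin contains the Docker d_type error.
has_dtype_error() {
  grep -q 'formatted without d_type support'
}

# Example use on a Console, as the root user:
#   ftype_verdict "$(xfs_info /store | ftype_from_xfs_info)"
#   has_dtype_error < /var/log/qradar.log && echo "Docker d_type error found"
```

Keeping the parsing in small functions that read from standard input makes it possible to try them against saved command output before running them on a production Console.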
     

    Migrating to a new Console

    About migrating your Console
    To resolve the Docker ftype issue, administrators must install a new Console appliance from an ISO file and migrate a configuration backup to the new appliance. Because this issue is caused by upgrades from installs on older Red Hat versions, such as those from QRadar 7.2.8, reinstalling and migrating resolves the issue. You are not required to remove managed hosts from the old QRadar Console because the new QRadar Console takes over any existing hosts in the deployment. This procedure allows managed hosts in the deployment to continue to receive events while the Console is offline.
     

    Before you begin

    • Write down the network information for the old Console; you must enter this information into the network configuration for the new appliance. Ensure that the old Console and the new Console are in the same network.
    • Save a recent configuration backup from the old Console. The configuration backup is used to restore settings, users, rules, log sources, and more to the new Console.

      Important: Configuration backups can only be restored to the same version of QRadar that they were created with. If you plan to change the overall QRadar version in the deployment, you must create a new configuration backup after any software change and keep these files in a safe place for your hardware migration. Moving from a smaller Console to a larger or newer appliance is supported by the migration or backup process. For example, a 3105 Console's configuration backup can be applied to a 3128 or a 3148 appliance. To view the installed software version for any appliance from the command line, type:
      /opt/qradar/bin/myver
    • It is not necessary to remove managed hosts from the old QRadar Console because the new QRadar Console takes over any existing hosts in the deployment. This procedure allows managed hosts in the deployment to continue to receive events while the Console is offline.
    • Ensure the new Console has remote management configured, such as IMM or DRAC. Certain changes outlined in this procedure cannot be completed from an SSH session, such as changing IP addresses. Access to remote management of the Console is required unless you have physical keyboard access to the Console.
    • If you are using managed WinCollect, ensure that you download and reinstall the WinCollect SFS version that matches the old Console before you migrate. For the latest WinCollect managed versions, see https://getwincollect7.
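
Instead of writing the network information down by hand, it can be captured in a file on the old Console. This is a minimal sketch under assumptions: the function name and output path are hypothetical, and the commands it calls (hostname, ip, cat) are standard Linux tools rather than QRadar utilities. Commands that are unavailable are skipped.

```shell
# Hypothetical helper: save basic network details of this host to a file.
save_network_info() {
  out="$1"
  {
    echo "## hostname"
    hostname 2>/dev/null || true
    echo "## management interface"
    cat /etc/management_interface 2>/dev/null || true
    echo "## IPv4 addresses"
    ip -4 addr show 2>/dev/null || true
    echo "## routes"
    ip route show 2>/dev/null || true
    echo "## DNS"
    cat /etc/resolv.conf 2>/dev/null || true
  } > "$out"
}

# Example use on the old Console (the path is an assumption):
#   save_network_info /root/old_console_network.txt
```

Copy the resulting file off the old Console so that it is available when you configure the new appliance.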

    1. Prepare your new Console

    In this section, you must complete a QRadar installation on the new Console by using the software version that matches that of the old Console. The installation of the new Console uses a temporary IP address until the old hardware is removed from the deployment.
    1. Rack the replacement Console and connect network connections.
    2. Download a QRadar ISO from IBM Fix Central that matches your existing Console version.
      Note: Depending on the version on your old Console, you might be required to download an ISO file, plus SFS files or interim fixes. To view the installed software version for any appliance from the command line, type:
      /opt/qradar/bin/myver
    3. Turn on the appliance and log in as a root user.
    4. When the system displays the license agreement (EULA), press Y to continue.
    5. Configure QRadar.
    6. Assign a temporary IP address and network information for the new hardware.
    7. Type a root password for the appliance.
    8. Follow the installation wizard to complete the installation.
    9. Apply any SFS updates or interim fixes to confirm the new hardware is at the same version level as the old Console.
    10. Download any other required software, such as WinCollect, if you have agents managed by your Console.
    11. Click the Admin tab, and then click Auto Update > Get Manual Updates to ensure all files are at the latest version.

      Results
      When the new and old Consoles are at the same version, you are ready to create a configuration backup to migrate settings, such as rules, log sources, and other information to the new Console.
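
Because the version and the hostname of the new Console must match the old Console exactly, a simple string comparison can catch a mismatch before you continue. The following sketch uses a hypothetical function name. It assumes that you recorded the old Console's values beforehand, and it makes no assumption about the format of the myver output because it compares the strings as-is, including case.

```shell
# Hypothetical helper: compare a value from the old and the new Console.
# The comparison is case-sensitive.
check_match() {
  label="$1"; old="$2"; new="$3"
  if [ "$old" = "$new" ]; then
    echo "OK: $label matches ($new)"
  else
    echo "MISMATCH: $label old='$old' new='$new'"
    return 1
  fi
}

# Example use on the new Console, with values recorded from the old Console:
#   check_match version  "$old_version"  "$(/opt/qradar/bin/myver)"
#   check_match hostname "$old_hostname" "$(hostname)"
```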

    2. Prepare a configuration backup on your old Console

    Configuration backups can only be restored to the same version of QRadar that they were created with. If you plan to change the overall QRadar version in the deployment, you must create a new configuration backup after any software change and keep these files in a safe place for your hardware migration.
     
    1. Log in to the old Console.
    2. Click the Admin tab, and then click the Backup and Recovery icon.
    3. From the navigation menu, click On Demand Backup.
    4. Type a name and description for the new configuration backup.
    5. Click Run Backup and wait for the configuration backup to complete.
    6. After the backup finishes, click the new configuration backup name that you created to download the file.
    7. Copy the configuration backup from the old QRadar Console to a safe location.
    8. Stop services on the old Console by typing the following commands:
      systemctl stop hostcontext
      systemctl stop tomcat
      systemctl stop hostservices
      systemctl stop tunnel_manager
      Note: These services must be stopped on your old Console before you reassign the IP address for the old Console.

      Results
      A configuration backup file is created for the new Console to use. This file is required later on in the procedure to restore users, rules, log sources, offenses, reports, admin configurations, and other system settings to the new hardware. You are now ready to assign the old Console an IP address in your decommissioned or unused IP range for your network.
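
The four stop commands in step 8 can be run as a loop. This is a minimal sketch, not an IBM-provided script: the function name and the DRY_RUN variable are hypothetical. With DRY_RUN=1 the function only prints the commands so that you can review them first.

```shell
# Hypothetical helper: stop the services listed in step 8, in the same order.
stop_console_services() {
  for svc in hostcontext tomcat hostservices tunnel_manager; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "systemctl stop $svc"
    else
      systemctl stop "$svc" || echo "warning: could not stop $svc" >&2
    fi
  done
}

# Example use on the old Console, as the root user:
#   DRY_RUN=1 stop_console_services   # print the commands only
#   stop_console_services             # stop the services
```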

    3. Reassign the IP address on the old QRadar Console

    This process is done manually by editing the network configuration file for the management interface directly, instead of using the qchange_netsetup utility. You can use this method to change the system's physical IP address to avoid conflicts. If you experience any issues on the new Console, you can easily revert the IP address to its original value. After the IP address is changed on the old Console, the old Console cannot make any changes to the other hosts in the deployment unless the IP address is reverted.
     
    Note: You must use IMM or a physical keyboard to prevent connection and lockout issues when you manually change the IP address for your old Console. If you're used to editing network configuration files in Linux®, you can use SSH and the screen command. Using a direct SSH session with systemctl restart network results in the loss of network connectivity and causes issues with the address change and service restart.
    1. Use IMM for remote access, or the local Console keyboard, to log in to the command line of the old appliance as the root user.
    2. Verify which network interface is the management interface by typing the following command:
      cat /etc/management_interface
      The interface that is listed in this file is the QRadar management interface.
    3. Change the directory to /etc/sysconfig/network-scripts/.
    4. Open the ifcfg-<name> file that was listed in the /etc/management_interface file.
    5. Change the IP address to an unused or decommissioned range by editing the IPADDR= line.
    6. Save the changes to the file.
    7. Restart networking by typing the following command:
      systemctl restart network
      Results
      After the network service restarts, the IP address change is complete, freeing up the old IP address for use on the new Console. Do NOT unrack the old hardware until after you transfer the data to the new appliance and verify functionality. If you experience issues with your new Console, you can revert the IP changes from this procedure.
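
Steps 4 to 6 can be sketched as a small function that keeps a copy of the original file so that the change is easy to revert. The function name, the backup location, and the example interface and address are assumptions for illustration. The sketch only rewrites the IPADDR= line and leaves every other setting in the file untouched.

```shell
# Hypothetical helper: replace the IPADDR= line in an ifcfg file.
# $1 = ifcfg file, $2 = new IP address, $3 = backup file (optional).
set_ipaddr() {
  cfg="$1"; new_ip="$2"
  backup="${3:-/root/$(basename "$cfg").orig}"
  grep -q '^IPADDR=' "$cfg" || { echo "no IPADDR= line in $cfg" >&2; return 1; }
  cp -p "$cfg" "$backup" || return 1
  sed "s/^IPADDR=.*/IPADDR=$new_ip/" "$backup" > "$cfg"
}

# Example use from IMM or the local keyboard (interface and IP are examples):
#   set_ipaddr /etc/sysconfig/network-scripts/ifcfg-eth0 192.0.2.50
#   systemctl restart network
# To revert, copy the backup file over the ifcfg file and restart networking.
```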
       

    4. Set IP addresses on the new QRadar Console

    In this section, you can use remote management or the Console's keyboard to set the IP address of your new Console to use the IP address previously assigned to your old Console. This step is important as all managed hosts use this IP address to get configurations and talk to your new Console at the expected IP address.
     
    1. Use IMM for remote access or the local Console keyboard to log in to the command line of the new appliance as the root user.
    2. Change the IP address by typing the following command:
      /opt/qradar/bin/qchange_netsetup
    3. Use the Configuration Wizard to change the IP address of the system to the old Console's IP address.
    4. Save and exit the wizard to complete the address change.

      Results
      The new Console is updated with the IP address required to manage the deployment. You are now ready to copy keys and certificates to the new Console.

    5. Set up SSH keys and certificates on the new QRadar Console

    In this section, you can copy certificates and custom-generated key pairs from the old appliance to the new appliance to ensure that log sources and scanners can connect to remote sources. You must also migrate any custom-generated private keys that you have by transferring the /etc/ssh and /root/.ssh directories.
    1. Log in to the old QRadar Console as the root user.
      Note: The SSH session should use the decommissioned IP range. Ensure you are logging in to your old Console.
    2. Copy the data from the old hardware to the new appliance by using rsync, as in the following examples.
      Note: If you use a cross-over cable between your old and new Console, you can use -av instead of -avz (no compression) to improve transfer speed.
      To copy certificates, type:
      rsync -avz /opt/qradar/conf/trusted_certificates/ root@new_appliance:/opt/qradar/conf/trusted_certificates/
      To copy SSH keys, type both commands:
      rsync -avz /etc/ssh/ root@new_appliance:/etc/ssh
      rsync -avz /root/.ssh/ root@new_appliance:/root/.ssh
    3. Wait for the transfer to complete.
    4. If you are using custom SSL certificates, follow these steps:
      1. Copy the certificate or intermediate certificate from the /etc/httpd/conf/certs directory on the old Console to the /tmp directory or your preferred location on the new Console.

        Do not copy the certificate to the /etc/httpd/conf/certs directory on a new Console.

      2. Install the SSL certificate that you copied on the new Console by using /opt/qradar/bin/install-ssl-cert.sh -i and follow the instructions.

        The wizard prompts you for a private key. You might have to copy the private key to the server if it is not stored in the /etc/httpd/conf/certs/ directory. It is usually a best practice not to store the private key on the server itself.

        Important: If the Console on your new appliance has a different certificate authority (CA) certificate than the Console on your old appliance, place the CA certificate from your old appliance in the /etc/pki/ca-trust/source/anchors directory and run the update-ca-trust command.

        Results
        The required certificate and SSH key files are transferred to the new Console. You can now restore your configuration backup on the new Console.
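
To review the copy commands before you run them, you can generate them for the three directories named in this section. This is a sketch with a hypothetical function name. It only prints the commands. The second argument selects the rsync options, for example -av on a cross-over cable, or -avzn to add the standard rsync dry-run option (-n) and preview what would be transferred.

```shell
# Hypothetical helper: print the rsync commands for certificates and SSH keys.
# $1 = new appliance host name or IP, $2 = rsync options (default: -avz).
rsync_cmds() {
  target="$1"; flags="${2:--avz}"
  for dir in /opt/qradar/conf/trusted_certificates /etc/ssh /root/.ssh; do
    echo "rsync $flags $dir/ root@$target:$dir/"
  done
}

# Example use on the old Console:
#   rsync_cmds new_appliance          # commands as listed in step 2
#   rsync_cmds new_appliance -avzn    # dry run: list files without copying
```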

    6. Restore your configuration backup on the new QRadar Console

    In this section, you are ready to place your configuration backup file in the inbound directory to allow the user interface to display the new configuration backup.
    1. Using SCP, copy the configuration backup file that you downloaded previously to the /store/backupHost/inbound/ directory on the new Console.
    2. Log in to the new QRadar Console as an administrator.
    3. Click the Admin tab and select the Backup and Recovery icon.
    4. Select the configuration backup that you copied to the Console and click Restore.
    5. In the restore options list, check Select All Configuration Items and Select All Data Items.
    6. Click Restore to start the configuration restore process.
      Note: The restore process is a full database update and might take a while to complete.
    7. After the restore process is complete, log in to QRadar.
    8. From the Admin tab, click Advanced > Deploy Full Configuration.
      Note: The hostname of the new Console must match the value of the old Console appliance you are replacing, including capitalization. If the hostname differs when you install the new appliances, you might experience issues with the deploy after you restore the configuration backup.
    9. Verify that event or flow sources that reported to the original host are now processed in QRadar.

      Results
      After the host is added back to the QRadar deployment, the deployment process ensures that the required configuration is regenerated on the new appliance. Verify that log source data is pulled and that flow data is received by the new Console. Any log sources that are not collecting data might require certificates to be moved to the new host.
       

      When the configuration is finished restoring on the new console, you might receive an error that indicates that the console license keys expired. You can add the new licenses to resolve this error.

    7. Transfer event and flow data to the new QRadar Console

    The data transfer can be a lengthy process. You can use cross-over cables to quicken the transfer of event and flow information if your appliances are located in the same data center. Data is moved in one month intervals to keep the performance impact at a minimum. The syncAriel.sh utility does not move certificates or configurations, only data that is stored in the /store/ariel/ directory. SSH traffic must be allowed to migrate the data. You might be required to accept SSH keys and provide the root password for the target server to start the transfer.

    1. Download syncAriel.sh from this technical note.
    2. Log in to the old QRadar Console as the root user.
    3. Using SCP, copy the syncAriel.sh utility to the old Console.
    4. Navigate to the directory with the syncAriel.sh utility and type the following command to set permissions:
      chmod +x syncAriel.sh
    5. Type the following command:
      screen
      Note: For data transfers, start a screen session so that you can reestablish the connection in case of a minor network outage. To detach from screen and let a data transfer continue, press Ctrl+a, then d, and log out. To reattach to an existing session, type screen -r.
    6. Run the utility by typing the following command:
      sh syncAriel.sh -i <new_Console's_IPAddress>
    7. Wait for the transfer to complete, then close the screen session.

      Results
      Data is migrated from the /store/ariel directory of the old Console to the new Console. If your connection dropped or a network outage occurred, you can run the syncAriel.sh utility again to migrate data. The syncAriel.sh utility keeps track of files that have been rsync'd to the new appliance and data that has already been transferred is not copied a second time. If the transfer fails or encounters errors, transfer the data manually by using SCP, SFTP, or another file transfer method.
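
Because the transfer can be lengthy, it can help to check how much data is stored in /store/ariel before you start, and to compare the totals on both Consoles afterward. The following sketch is an assumption and not part of syncAriel.sh. The function name is hypothetical, and it relies only on the standard du and awk tools.

```shell
# Hypothetical helper: print the size of a directory in kilobytes.
dir_size_kb() {
  du -sk "$1" 2>/dev/null | awk '{print $1}'
}

# Example use, as the root user, on the old and then the new Console:
#   dir_size_kb /store/ariel
```

The totals on the two Consoles are not expected to match exactly while new data is still arriving, so treat the comparison as a rough check only.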

    Post-migration checklist for administrators

    To assess the general health of your deployment, it is helpful to have standard checks to follow in order to make sure core functionality has been restored after updating your software. Administrators can complete this check list to confirm Console migration functionality.

    1. Is the user interface for QRadar accessible?

    Can you log in to the Console user interface from a browser?  
    TIP: It is best to clear your browser cache or attempt to log in using the browser's private or incognito mode after a software update.

    What to review
    1. Attempt to log in to the QRadar command-line using SSH as the root user.
      If a software update is still in progress, the command-line prints, "Patch still in progress - DO NOT REBOOT!". Users might need to wait for the patch update to complete on the Console and services to be restarted before the user interface is available.
    2. If you cannot access the user interface or the command line by using SSH, connect by using IMM or a hypervisor session and log in to the Console to see whether it is running. If you can connect and the Console is running, contact your network administrator.
    3. If you cannot access the user interface, but can access the Console system via SSH, type systemctl status tomcat.
      If status is active or activating, the service is still initializing. In some cases, you may need to wait for 10-15 minutes after starting Tomcat for the user interface to become accessible.
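
The Tomcat check in step 3 can be reduced to a one-line state query. This sketch uses a hypothetical function name and applies the guidance from this checklist to the state that systemctl is-active reports.

```shell
# Hypothetical helper: interpret a service state reported by systemctl is-active.
tomcat_hint() {
  case "$1" in
    active|activating)
      echo "Tomcat is $1: the user interface can take 10-15 minutes to become accessible" ;;
    *)
      echo "Tomcat is ${1:-unknown}: review the output of systemctl status tomcat" ;;
  esac
}

# Example use on the Console:
#   tomcat_hint "$(systemctl is-active tomcat)"
```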

    2. Are all Managed Hosts showing the expected Status?

    If any of the Managed Hosts are not in Active status, check the following:
    1. From the Console command-line interface (CLI), try establishing an SSH connection to each non-Active Managed Host (MH).

      If the connection fails or times out:
    2. Does a Full Deploy task complete successfully for all Managed Hosts?
      • After the Deploy task has completed and returned status for all hosts, for any Managed Hosts that report Timed Out or otherwise fail to deploy:
        • Check for the file /opt/qradar/conf/hostcontext.NODOWNLOAD on the MH. For more information on how to resolve this issue, see Technote 10744001 - Deploy Changes Does Not complete or Times out.
        • Check that all partitions have sufficient free disk space:
          • Connect to the MH via SSH as 'root' and type: df -h. If the percentage of disk space used ('Use%') for any partition is above 85%, free up space on the relevant partitions and try again. For additional information on how to resolve disk space issues, see our Support 101 page on Troubleshooting disk space issues.
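
The disk space check can be automated by filtering df output for partitions above the 85% threshold. This is a minimal sketch with a hypothetical function name. It assumes the POSIX output format of df -P, where the fifth column is the usage percentage and the sixth is the mount point, and it does not handle mount points that contain spaces.

```shell
# Hypothetical helper: list mount points whose usage exceeds a limit.
# Reads `df -P` output on stdin; $1 = limit in percent (default: 85).
partitions_over() {
  limit="${1:-85}"
  awk -v limit="$limit" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 > limit) print $6, $5
  }'
}

# Example use on a managed host, as the root user:
#   df -P | partitions_over 85
```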

    3. Do all Dashboards populate?

    Dashboards largely rely on ariel queries against accumulated data, also referred to as Global Views (GV) or Aggregated Data Views (ADV).
    • If some Dashboards are failing to populate, but others are working, we're likely seeing a problem with the individual Global Views. To troubleshoot corrupted Global Views, please note which Dashboards are affected.
    • If all Dashboards are failing to populate, check the statuses of the accumulator service as the accumulator should be an active service on all managed hosts.
      • On all hosts the accumulator service should be Active. From the Console command line, type:
          /opt/qradar/support/all_servers.sh -C 'systemctl is-active accumulator'
        For any hosts where accumulator is not Active, try manually restarting the service with the command `systemctl restart accumulator`.
      • The Console and all EP, FP, EP/FP, and Data Node hosts should also have a running ariel service. On the Console ariel runs as ariel_proxy_server. All other relevant hosts will have ariel as ariel_query_server.
        • On the Console run:
          systemctl is-active ariel_proxy_server
        • To check ariel on the remaining Managed Hosts, run this command from the Console CLI:
          /opt/qradar/support/all_servers.sh -a '16%,17%,18%,21%' 'systemctl is-active ariel_query_server'
        • If the ariel* service is stopped, try starting the relevant service manually with `systemctl restart <service>` on the affected systems.
          • If ariel fails on any system, you may not be able to retrieve event or flow data from that Managed Host, accumulated or otherwise.

    4. Are Offenses generating and updating?

    1. In the 'Last Event/Flow Seen' for the most recently updated Offenses, is the timestamp recent? 
    2. Confirm whether ecs-ep is running on all 31xx, 18xx, 17xx, and 16xx hosts:
           - For 7.2.8: /opt/qradar/support/all_servers.sh -C -a '31%,18%,17%,16%,software' "service ecs-ep status"
           - For 7.3.x: /opt/qradar/support/all_servers.sh -C -a '31%,18%,17%,16%,software' "systemctl is-active ecs-ep"

      A. If service status is failed
      If the service state is 'failed', try restarting the service:
        - For 7.2.8: /opt/qradar/support/all_servers.sh -C "service ecs-ep restart"
        - For 7.3.x: /opt/qradar/support/all_servers.sh "systemctl restart ecs-ep"
      If ecs-ep fails to start on any systems, collect the logs for the Console system and any affected managed host where the ecs-ep is not starting as expected and open a case with QRadar Support.

      B. If service status is running on the managed hosts
      Rule out the possibility that the system is working correctly and no events or flows are triggering rules to generate offenses. Administrators can confirm offense creation by creating a rule:
      1. In the QRadar UI, click the Offenses tab, then select Rules.
      2. After the Rules display loads, select Actions > New Event Rule.
      3. Identify a Source IP (or IP range) in your environment that is consistently generating events.
      4. Configure the new event rule with these criteria, using the address or range identified above as the value for:
        • Apply Offenses Test Rule on events which are detected by the local system and when the source IP is one of the following IP addresses and click Next.
      5. Set the Responses options:
        • Select the Ensure the detected event is part of an offense check box.
        • Respond no more than 1 per 1 minute per source IP.
        • Click Finish.
        • On the Offenses tab, ensure the rule is enabled.
        • On the Log Activity tab, add a filter for the Source IP or range you configured in the rule, and watch for incoming events.
      6. When events show up for this IP or range, you should see that a new Offense is created and the events for the source IP or range are associated with that Offense.
      7. After you confirm that offenses are generated by the Console, disable the rule because the test is complete.

    5. Can you search on and view events or flows?

    1. Do you see new event and flow data in QRadar?
           1A. Click the Log Activity tab and select View > Real Time (streaming).
           1B. Click the Network Activity tab and select View > Real Time (streaming).
    2. Are your Console, Event Processors, and Event and Flow Processors receiving events?
      In the Log Activity tab, select Quick Searches > Event Processor Distribution - Last 6 Hours. Verify that your Console (31xx), Event Processors (16xx), and Event and Flow Processors (18xx) appliances are represented in the search results.
    3. Can I run normal searches?

    6. Is your Assets tab populated as expected?

    1. Is the Assets page loading and populating as expected?
    2. Is the Last Modified Time for the most recently modified asset relatively recent?

    7. Can all users log in to the Console user interface?

    1. Can the local 'admin' account log in to the user interface?
    2. Can local non-admin users log in to the user interface?
    3. Can non-local users (LDAP, etc., where applicable) log in to the user interface?

    8. Are my apps all loaded and running?

    9. Additional CLI checks

    1. The checks described above should cover most core functionality. You can run the following command to ensure all QRadar processes are started:
        /opt/qradar/upgrade/util/setup/upgrades/wait_for_start.sh
      NOTE: Wait for at least 3 iterations after an upgrade to confirm all processes display a status of running. If any services display as stopped after a few iterations, note the services and press Ctrl+C to stop the utility.
    2. If using high availability (HA), administrators can use the ha cstate command to confirm the status and synchronization of HA pairs. See Troubleshooting QRadar HA deployments.
    3. Review System Notifications in the QRadar Dashboard for any Predictive Disk Failure. See QRadar: Troubleshooting Disk Failure or Predictive Disk Failure Notifications.
    4. To confirm the Console appliance polls for X-Force Threat Intelligence feeds, type the following command: tail -f /var/log/dca/sca_server.log

      The Console attempts to poll for data every 3 minutes and must not return errors in the sca_server.log file. When successful, the logs display the new database version for IP reputation (IPR) and URL classifications as they are installed.
      2020-02-17 17:50:22.354 I UPDATE: Result #1 id=20, module=IPR Classification, contentUpdated=true, engineUpdated=false, details=1 0 19614
      2020-02-17 17:50:22.354 I UPDATE: Detail #0: component=ipr_database old version=6.01117773 new version=6.01117774 available=true downloaded=true installed=false return=download or install scheduled 0 19614   
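
To pick the version changes out of a busy log, you can filter the Detail lines. This is a sketch with a hypothetical function name. It assumes that the log lines follow the format of the example shown in this technical note and prints the component with its old and new database version.

```shell
# Hypothetical helper: summarize component version changes from
# sca_server.log lines read on stdin.
xforce_updates() {
  sed -n 's/.*component=\([^ ]*\) old version=\([^ ]*\) new version=\([^ ]*\).*/\1 \2 -> \3/p'
}

# Example use on the Console:
#   xforce_updates < /var/log/dca/sca_server.log | tail -n 5
```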

    My migration failed a post-install check, how do I proceed?

    1. Record all troubleshooting steps from any procedure that failed.
    2. If possible, collect logs from the Console. For more information on how to collect logs, see Getting logs from a QRadar deployment.
    3. Open a case with IBM QRadar Support. Include the logs and the migration steps that you completed in the case.
    4. If your appliances are unavailable or not functional, you can indicate that you have a 'System down' issue.
    5. A QRadar Support representative will contact you using your preferred method of communication.



    We apologize for any inconvenience due to this issue. If you have questions about the contents of this technical note, contact QRadar Support.

    - QRadar Support

     


    Document Information

    Modified date:
    19 June 2023

    UID

    ibm16612393