IBM Support

QRadar: What is DRBD split-brain?

Question & Answer


Question

What is DRBD split-brain, why is it a concern, and how can it be resolved?

Cause

Split brain can occur when cluster management software or a human operator intervenes while the network links between cluster nodes are down, causing both nodes to switch to the primary role while disconnected.

Answer

Split brain occurs when both High Availability (HA) nodes switch to the primary role while disconnected. Data can then be modified on either node without being replicated to the peer, which produces two diverging data sets that can be difficult to merge.

Important: If your system is experiencing split brain, it cannot fail over when a host goes down, and data might be lost because it is not replicated properly. This is a serious issue that must be addressed immediately.
How to identify split brain
  1. SSH into the QRadar Console.
  2. Check the HA state by using the following command:
    cat /proc/drbd
  3. If both hosts are in the StandAlone state, or one is in StandAlone while the other is in WFConnection, this might be a split brain situation.
  4. Search the /var/log/messages log for the string "Split-Brain detected".
    grep "Split-Brain detected" /var/log/messages

    Result
    If you see a message similar to the following, you are in a split brain situation:
    Generic-primary kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
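The checks above can be combined into one small script. The following is a minimal sketch, not an IBM-supplied tool: the function name check_split_brain is hypothetical, it assumes the DRBD 8.x /proc/drbd layout where the connection state appears as a cs: field, and it assumes the system log is at the standard path /var/log/messages. Both paths can be overridden as arguments.

```shell
#!/bin/sh
# Sketch only: flags a possible DRBD split brain on one HA host.
# check_split_brain is a hypothetical helper, not part of QRadar.
# Usage: check_split_brain [drbd_status_file] [system_log_file]
# Returns 0 if split brain is suspected, 1 otherwise.
check_split_brain() {
    drbd_status="${1:-/proc/drbd}"        # assumed DRBD 8.x status file
    syslog_file="${2:-/var/log/messages}" # assumed system log path
    suspect=1

    # cs:StandAlone means this node has dropped its connection to the peer.
    if grep -q "cs:StandAlone" "$drbd_status" 2>/dev/null; then
        echo "Connection state is StandAlone"
        suspect=0
    # cs:WFConnection alone is inconclusive; the peer must also be checked.
    elif grep -q "cs:WFConnection" "$drbd_status" 2>/dev/null; then
        echo "Connection state is WFConnection; check whether the peer is StandAlone"
    fi

    # The kernel logs this string when DRBD detects split brain.
    if grep -q "Split-Brain detected" "$syslog_file" 2>/dev/null; then
        echo "Log contains \"Split-Brain detected\""
        suspect=0
    fi

    return "$suspect"
}
```

Run check_split_brain with no arguments on each HA host. An exit status of 0 means that the host shows at least one of the split brain indicators described above; it does not replace a review of the logs by support.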
What to do when you have split brain
Contact support, consider raising the case as Severity 1, and provide logs from both hosts. The support team must identify which High Availability node has valid data, which is typically the node that was last active.


Document Information

Modified date:
23 November 2022

UID

ibm16841045