Question & Answer
Question
What is DRBD split-brain, why is it a concern, and how can it be resolved?
Cause
Split brain can occur when cluster management software or an administrator intervenes while the network links between cluster nodes are down, so that both nodes switch to the primary role while disconnected.
Answer
Split brain occurs when both High Availability nodes switch into the primary role while disconnected. Data can then be modified on either node without being replicated to the peer, which leaves the two nodes with diverging data sets that can be difficult to merge.
Important: While your system is in split brain, it cannot fail over when a host goes down, and data might be lost because it is not replicated to the peer. This is a serious issue that must be addressed immediately.
How to identify split brain
- SSH into the QRadar Console.
- Check the HA state by using the following command:
cat /proc/drbd
- If both hosts are in the StandAlone state, or one is in StandAlone while the other is in WFConnection, this might be a split brain situation.
- Search the /var/log/messages log for the string "Split-Brain detected".
grep "Split-Brain detected" /var/log/messages
Result
If you see a message similar to the following, you are in a split brain situation:
Generic-primary kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
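The two checks above can be combined into one small script that is run on each HA host. The following is a minimal sketch, not an IBM-provided tool: the function name check_split_brain and its two optional arguments are hypothetical, and it assumes that /proc/drbd reports the connection state in a cs: field (as DRBD 8.x does) and that kernel messages are written to /var/log/messages. It inspects only the local node, so a StandAlone or WFConnection state is reported as a hint, while the log message is the stronger indicator.

```shell
#!/bin/sh
# Sketch: flag a possible DRBD split brain from the local node's point of view.
# check_split_brain and its arguments are hypothetical names for illustration.
# Usage: check_split_brain [drbd_status_file] [log_file]
check_split_brain() {
    drbd_status="${1:-/proc/drbd}"
    log_file="${2:-/var/log/messages}"
    suspect=1   # 1 = nothing suspicious found, 0 = possible split brain

    # A StandAlone or WFConnection connection state is only a hint:
    # it also appears when the peer is simply unreachable.
    if grep -Eq 'cs:(StandAlone|WFConnection)' "$drbd_status" 2>/dev/null; then
        echo "connection state suggests a disconnected peer"
        suspect=0
    fi

    # The kernel log message is the stronger indicator.
    if grep -q 'Split-Brain detected' "$log_file" 2>/dev/null; then
        echo "kernel log reports: Split-Brain detected"
        suspect=0
    fi

    return "$suspect"
}
```

Calling check_split_brain with no arguments reads the default locations. An exit status of 0 means that at least one indicator was found and the host should be reviewed. The script only reads state and makes no changes to DRBD.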
What to do when you have split brain
Contact support, consider raising the case as a Severity 1, and provide logs from both hosts. The support team must identify which High Availability node has valid data, which is typically the node that was last active.
Document Information
Modified date:
23 November 2022
UID
ibm16841045