Troubleshooting
Problem
Cassandra pods are down and repeatedly enter the CrashLoopBackOff state.
Symptom
ERROR [main] 2023-06-15 12:19:44,673 JVMStabilityInspector.java:124 - Exiting due to error while processing commit log during initialization. org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException: Encountered bad header at position 49377 of commit log /opt/ibm/cassandra/bin/../data/commitlog/CommitLog-6-1681292370141.log, with bad position but valid CRC
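The file to delete later is named in this message. As a minimal sketch, assuming the message text has been copied into a shell variable (the variable names are illustrative and not part of the product), the path and file name can be pulled out with sed:

```shell
# Message text taken from the symptom above; in practice, copy it from the pod log.
log_line='Encountered bad header at position 49377 of commit log /opt/ibm/cassandra/bin/../data/commitlog/CommitLog-6-1681292370141.log, with bad position but valid CRC'

# Capture the path that follows "of commit log" and ends in ".log".
bad_log=$(printf '%s\n' "$log_line" | sed -n 's/.*of commit log \([^ ,]*\.log\).*/\1/p')

# Keep only the file name; the reported path goes through "bin/..".
bad_log_name=$(basename "$bad_log")

echo "$bad_log_name"
```

The reported path contains `bin/..`, so it points to the same directory as `/opt/ibm/cassandra/data/commitlog/`.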
Resolving The Problem
To fix this, delete the corrupted commit log. To do that, you first need to stop the pod from crashing, so edit the stateful set for Cassandra and add the line:
command: ["sh","-c","sleep 1000"]
after the line starting:
image: cp.icr.io/cp/noi/cassandra@sha256:b10b...
The result looks like this, for example:
image: cp.icr.io/cp/noi/cassandra@sha256:37c07e695d2cdd5b765f5829fd9f0eadc479f2ccf8ccc80fbf0890e4e262ee37
command: ["sh","-c","sleep 1000"]
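When you save the stateful set, the pods should restart; if they do not, restart the failing pod. With the sleep command in place of the normal start command, Cassandra does not start, so the container stays up for 1000 seconds (about 16 minutes).
Log in (rsh) to the previously failing pod and remove the bad commit log, for example: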
rm -f /opt/ibm/cassandra/data/commitlog/CommitLog-6-1681292370141.log
Then re-edit the stateful set and remove the sleep command. Restart the pods again if they do not restart automatically.
After this, log in to the pod once more and run:
/opt/ibm/cassandra/bin/nodetool repair --full
Note: Removing the commit log discards any writes in it that had not yet been flushed to disk, so some data loss is expected.
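The procedure above can be summarized as a command checklist. The sketch below only prints the commands and does not run them. It assumes kubectl access to the cluster, the correct namespace in the current context, and hypothetical stateful set and pod names (`example-cassandra`, `example-cassandra-0`) that you replace with your own; `kubectl exec` stands in for the rsh login.

```shell
#!/bin/sh
# Hypothetical names: replace with the values from your own cluster.
STS="${STS:-example-cassandra}"      # Cassandra stateful set (assumption)
POD="${POD:-example-cassandra-0}"    # previously failing pod (assumption)
BAD_LOG="/opt/ibm/cassandra/data/commitlog/CommitLog-6-1681292370141.log"

# Print the commands in order; run them one at a time by hand,
# waiting for the pod to restart between steps.
print_plan() {
  # 1. Add the sleep command after the cassandra image line.
  echo "kubectl edit statefulset $STS"
  # 2. Restart the failing pod if it does not restart by itself.
  echo "kubectl delete pod $POD"
  # 3. Remove the bad commit log inside the pod.
  echo "kubectl exec $POD -- rm -f $BAD_LOG"
  # 4. Remove the sleep command again, then restart the pod if needed.
  echo "kubectl edit statefulset $STS"
  echo "kubectl delete pod $POD"
  # 5. When Cassandra is running again, repair the node.
  echo "kubectl exec $POD -- /opt/ibm/cassandra/bin/nodetool repair --full"
}

print_plan
```

Printing instead of executing is deliberate: each step depends on the pod having restarted, which is easier to confirm by hand than to script reliably.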
Document Location
Worldwide
Product
Netcool Operations Insight
Component
Cloud Native Event Analytics (CNEA)
Platform
Platform Independent
Version
All Versions
ARM Case Number
TS013332876
Document Information
Modified date:
22 June 2023
UID
ibm17006093