IBM Support

QRadar SOAR: Diagnosing disk space problems on an IBM Security SOAR appliance

Troubleshooting


Problem

Running out of disk space on your appliance can affect IBM QRadar SOAR and the applications that it relies on.

Symptom

If you see search-related errors in the UI, "internal server" errors, or general functionality problems, consult the server logs and check the free disk space.

Diagnosing The Problem

The command sudo df -h outputs the file system disk usage statistics, which show whether you are low on free disk.
Often, an administrator does not run this command first and instead checks the logs (see MustGather: Collecting logs for IBM Security SOAR).
Here are examples of the messages you might see when a server is low on free disk.
/usr/share/co3/logs/client.log
09:58:01.502 [ForkJoinPool.commonPool-worker-7] ERROR [] o.h.engine.jdbc.spi.SqlExceptionHelper - ERROR: could not extend file "pg_tblspc/16385/PG_9.6_201608131/16386/16714.2": No space left on device   Hint: Check free disk space.
09:58:01.504 [ForkJoinPool.commonPool-worker-7] ERROR [] com.co3.context.Co3ContextRunnable - Exception in runnable
java.lang.IllegalStateException: Unable to commit transaction
/var/lib/pgsql/9.x/data/pg_log/postgresql-<day of week>.log
<  1460 2021-11-15 00:00:01.189 UTC > LOG:  could not open temporary statistics file "pg_stat_tmp/global.tmp": No space left on device
/var/log/elasticsearch/elasticsearch.log
[2021-11-15T06:38:23,035][WARN ][o.e.c.r.a.DiskThresholdMonitor] [lqB90fu] flood stage disk watermark [95%] exceeded on [lqB90fuCQuGwTjJvmADSdw][lqB90fu][/var/lib/elasticsearch/nodes/0] free: 20kb[3.2E-5%], all indices on this node will marked read-only
/var/log/resilient-email/resilient-email.log
07:55:11.902 [Camel (camel-1) thread #18 - Camel Thread #16 - Org 201 Mailbox 4] ERROR v=unknown  c.r.email.EwsServerToJMSRouteBuilder - Unable to connect to the server. Review the connection details provided and try again later.
org.springframework.jms.UncategorizedJmsException: Uncategorized exception occurred during JMS processing; nested exception is javax.jms.JMSException: Batch entry 0 INSERT INTO activemq_msgs(ID, MSGID_PROD, MSGID_SEQ, CONTAINER, EXPIRATION, PRIORITY, MSG, XID) VALUES (147723623, 'ID:Resilient-J.marafiq.com.sa-42469-1635750027318-1:108:1:1', 1, 'queue://email-service.save-email-data', 0, 0, ?, NULL) was aborted: ERROR: could not extend file "pg_tblspc/16385/PG_9.6_201608131/16386/25293": No space left on device
  Hint: Check free disk space.  Call getNextException to see other errors in the batch.
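The log checks above can be sketched as a small shell helper. This is a minimal sketch, not an IBM-provided tool: the function name find_disk_full_logs is hypothetical, and the search pattern is an assumption based only on the example messages shown above. Reading some of these log files might require root privileges.

```shell
# Sketch: report which of the given log files contain a disk-full message.
# find_disk_full_logs is a hypothetical helper name, not part of SOAR.
find_disk_full_logs() {
  # Assumed pattern, taken from the example messages in this document
  local pattern='No space left on device|disk watermark .* exceeded'
  local file count
  for file in "$@"; do
    # Skip paths that do not exist or are not readable
    [ -r "$file" ] || continue
    # -c counts matching lines; -E enables the alternation in the pattern
    count=$(grep -cE "$pattern" "$file")
    if [ "$count" -gt 0 ]; then
      printf '%s: %s matching line(s)\n' "$file" "$count"
    fi
  done
}
```

For example, find_disk_full_logs /usr/share/co3/logs/client.log /var/log/elasticsearch/elasticsearch.log /var/log/resilient-email/resilient-email.log prints one line for each file that contains at least one matching message.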
The output from sudo df -h looks like this on a healthy server.
Filesystem                     Size  Used Avail Use% Mounted on
devtmpfs                       2.3G     0  2.3G   0% /dev
tmpfs                          2.3G  4.0K  2.3G   1% /dev/shm
tmpfs                          2.3G   12M  2.3G   1% /run
tmpfs                          2.3G     0  2.3G   0% /sys/fs/cgroup
/dev/mapper/resilient-root      60G   11G   49G  19% /
/dev/sda1                      397M  216M  181M  55% /boot
/dev/mapper/resilient-co3       30G   15G   15G  51% /usr/share/co3
/dev/mapper/resilient-var_log  1.4G  175M  1.3G  13% /var/log
tmpfs                          466M     0  466M   0% /run/user/1001
tmpfs                          466M     0  466M   0% /run/user/992
The exact output of the command differs depending on whether the appliance was upgraded from an earlier version, which had a different partition table, or whether you installed SOAR as stand-alone software on your own licensed RHEL server.
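To pick out the file systems that are close to full without reading the whole table, the df output can be filtered. The following is a minimal sketch: the function names filter_df_by_usage and check_disk_usage are hypothetical, the default threshold of 90 is an arbitrary assumption, and the parsing assumes that mount points contain no spaces.

```shell
# Sketch: print file systems whose usage is at or above a threshold.
# Both function names are hypothetical helpers, not part of SOAR.

# Reads df-style output on stdin; $1 is the threshold in percent.
filter_df_by_usage() {
  awk -v limit="$1" '
    NR > 1 {                      # skip the header line
      use = $5                    # fifth column is the Use% value
      sub(/%/, "", use)           # strip the percent sign
      if (use + 0 >= limit + 0)
        printf "%s is %s%% full (%s)\n", $6, use, $1
    }'
}

# Runs df and filters it; the default threshold of 90 is an assumption.
check_disk_usage() {
  # -P forces the POSIX format, so long device names do not wrap onto a second line
  df -P | filter_df_by_usage "${1:-90}"
}
```

For example, check_disk_usage 80 lists every mounted file system that is at least 80% used, and prints nothing on a healthy server.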

Resolving The Problem

If you determine that the server has run out of free disk space, you can do one or both of the following:
  1. Remove unnecessary files.
  2. Add more disk.
Removal of unnecessary files
Look at the following locations for files that might be old and can be removed. Consider copying the files off the server before deletion.
/usr/share/co3/logs/
The client_access_log<YYYY-MM-DD>.log files are not removed automatically and can accumulate.
/usr/share/co3/logs/daily/
This directory includes rotated logs such as the client.log and monitoring.log.
/usr/share/co3/logs/dumps/
Files in this directory can be large. These files are useful to IBM Support when troubleshooting memory-related problems.
/var/log/elasticsearch/
The files here are not rotated out and can accumulate.
/tmp
Heap dumps, which can be rather large, might be written to this location.
/crypt/backups
Backup files are saved here.
/
The resPackageLogs utility writes to this location by default.
/home/resadmin
As the home of the default user, many clients scp files to this location for upgrades. Old installers can be removed to free up disk.
A useful command to determine the total size of a directory, including all of its contents, is sudo du -ch /<path to directory>/ | grep total. The -h flag reports the value in human-readable units such as K, M, or G.
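When reviewing the locations listed above, it can help to see each directory's size and which files are old. The following is a minimal sketch under stated assumptions: the function names dir_size_kb, report_dir_sizes, and list_old_files are hypothetical, and the 30-day default age is an arbitrary choice. Nothing is deleted by these helpers.

```shell
# Sketch: summarize directory sizes and list old files for review.
# All function names are hypothetical helpers, not part of SOAR.

# Prints the total size of directory $1 in kilobytes.
dir_size_kb() {
  # -s prints a single summary line; -k reports the size in kilobytes
  du -sk "$1" 2>/dev/null | awk '{ print $1 }'
}

# Prints "<directory><TAB><size> KB" for each existing directory given.
report_dir_sizes() {
  local dir
  for dir in "$@"; do
    [ -d "$dir" ] || continue
    printf '%s\t%s KB\n' "$dir" "$(dir_size_kb "$dir")"
  done
}

# Lists regular files under $1 last modified more than $2 days ago.
# The default of 30 days is an assumption; nothing is removed.
list_old_files() {
  find "$1" -type f -mtime +"${2:-30}" -print 2>/dev/null
}
```

For example, report_dir_sizes /usr/share/co3/logs /var/log/elasticsearch /tmp /home/resadmin shows the size of each location, and list_old_files /usr/share/co3/logs 60 lists the files there that have not changed in more than 60 days. Run the helpers as a user with permission to read those directories.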
Adding more disk
SOAR comes with a predetermined disk size that is suitable for many clients, but usage might dictate that more disk is required as data is added to the database and attachments are saved to disk.
Adding or extending disk involves running operating system commands. Refer to Red Hat documentation for assistance. IBM Support is unable to provide support for the operating system.
Other side effects
When the disk reaches 95% used, Elasticsearch stops writing to its indices as a protective mechanism, so that Elasticsearch does not contribute to the lack of disk space. If the indices are locked, the article "Elasticsearch indices are locked after a shortage of disk space" describes what must be done after disk is freed so that data is indexed again.

Document Location

Worldwide


Document Information

Modified date:
11 June 2024

UID

ibm16516428