
QRadar: Troubleshooting Slow User Interface Response Times

Troubleshooting


Problem

There are certain conditions that can cause applications or other pages in the QRadar User Interface (UI) to become slow or unresponsive. This technote provides steps to check environmental factors such as CPU utilization, available memory, and running database queries to determine the source of UI performance issues.

Resolving The Problem

CPU Utilization
1. Check the appliance type of the console by running:
cat /opt/qradar/conf/capabilities/hostcapabilities.xml
Example Output:
<?xml version='1.0' encoding='UTF-8' standalone='yes'?>
<HostCapabilities
	isConsole="true"
	IP="10.1xx.xx.xxx"
	applianceType="3178"
	hostName="abc"
	qradarVersion="7.5.0"
	hardwareSerial="d1234-c3456-a678"
	activationKey="XXXXX-XXXXX"
	managementInterface="eth0"
	xmlns="http://www.q1labs.com/products/qradar"
/>
The output shows applianceType="3178".
2. Check the system CPU by running:
lscpu
Example Output:
(Image: lscpu example output)
The output shows that the host has 12 CPU cores. Check the appliance type from step 1 against the System Requirements to confirm there are sufficient CPU resources in the environment.
If the number of CPU cores deployed to the host matches the suggested number of CPU cores specified in the System Requirements, proceed to the next troubleshooting steps.
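As a convenience, the two checks above can be combined. A minimal sketch; the grep pattern assumes the hostcapabilities.xml attribute layout shown in step 1:
# Print the appliance type next to the CPU core count
grep -o 'applianceType="[^"]*"' /opt/qradar/conf/capabilities/hostcapabilities.xml
lscpu | grep '^CPU(s):'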

Memory

1. Check the amount of available memory on the console command line:
watch -n 7 free -m
Example Output:
(Image: free -m example output)
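To watch for memory pressure over time, a small awk filter over free -m can highlight when available memory runs low. A sketch, assuming the procps-ng column layout (the seventh column of the Mem: line is "available") and an illustrative 10% threshold:
free -m | awk '/^Mem:/ { printf "available: %d MiB of %d MiB (%.0f%%)\n", $7, $2, $7/$2*100; if ($7/$2 < 0.10) print "WARNING: low available memory" }'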
2. Alternatively, use the top command to see which services are consuming the most CPU and memory:
Command:
top
Sample Output:
image-20230311144908-1
This output displays which services are using the most CPU and memory on the host. In the preceding sample output, ecs-ec-ingress and postgres are the highest consumers and might require more attention.
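To capture a point-in-time snapshot of this view (for example, to attach to a support case), top can run in batch mode. A sketch, assuming procps-ng top with field-sorting support:
# One batch-mode iteration, sorted by memory usage; keep the first 25 lines
top -b -n 1 -o %MEM | head -n 25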
QRadar installations also contain the threadTop.sh script. This script provides a more granular view of CPU utilization for QRadar-specific services; see the technote that outlines the use of this script.
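For example, the following invocation samples thread-level CPU usage for the Tomcat JVM (port 7779, the same JMX port used by the data-collection command at the end of this technote):
/opt/qradar/support/threadTop.sh -p 7779 --full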

Disk Usage

1. Use the command:
df -h
to check the active disk partitions on a host and the amount of disk space used on each partition. Look out for any partitions where disk utilization is more than 80%.
Sample output:
Filesystem                        Size  Used Avail Use% Mounted on
devtmpfs                           63G     0   63G   0% /dev
tmpfs                              63G  708K   63G   1% /dev/shm
tmpfs                              63G  4.1G   59G   7% /run
tmpfs                              63G     0   63G   0% /sys/fs/cgroup
/dev/xvda2                         20G   12G  7.0G  63% /
/dev/xvda1                        2.0G  236M  1.6G  13% /boot
/dev/mapper/rootrhel-tmp          3.0G   53M  3.0G   2% /tmp
/dev/mapper/rootrhel-opt           10G  4.2G  5.9G  42% /opt
/dev/mapper/rootrhel-home        1014M   33M  982M   4% /home
/dev/mapper/rootrhel-var          5.0G  2.5G  2.6G  50% /var
/dev/mapper/rootrhel-varlog        15G  7.4G  7.7G  49% /var/log
/dev/mapper/rootrhel-varlogaudit  3.0G  429M  2.6G  14% /var/log/audit
/dev/mapper/conf                   10G  1.2G  8.9G  12% /opt/qradar/conf
/dev/mapper/storetmp               15G  634M   15G   5% /storetmp
/dev/mapper/store                 7.9T  658G  7.2T   9% /store
For more information on troubleshooting disk space issues on QRadar hosts, see the related technote.
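To surface only the partitions above the 80% threshold, the df output can be filtered. A minimal sketch, assuming GNU coreutils df with --output support:
# Print mount points whose utilization exceeds 80%
df -h --output=pcent,target | awk 'NR > 1 { p = $1; sub(/%/, "", p); if (p + 0 > 80) print }'
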
Disk I/O metrics

Commands:

iostat -dmx sda
iostat -dmx sdb

Sample output:

Linux 3.xx.0-xx.xx.1.exx.x86_64 (qradar.csdd.lx)   01/09/2023     _x86_64_       (20 CPU)
Device:        rrqm/s  wrqm/s    r/s    w/s   rMB/s   wMB/s avgrq-sz avgqu-sz  await r_await w_await svctm %util
sda1             0.00    0.00   0.00   0.00    0.00    0.00    8.00    0.00   0.25   0.25   0.00  0.25  0.00
[root@qradar ~]# iostat -dmx sda
Linux 3.xx.0-xx.xx.1.exx.x86_64 (qradar.csdd.lx)   01/09/2023     _x86_64_       (20 CPU)
Device:        rrqm/s  wrqm/s    r/s    w/s   rMB/s   wMB/s avgrq-sz avgqu-sz  await r_await w_await svctm %util
sda             70.02    2.13 793.60  16.71  121.59    1.10  310.09    1.89   2.33   2.35   1.73  0.47 37.94
[root@qradar ~]# iostat -dmx sdb
Linux 3.xx.0-xx.xx.1.exx.x86_64 (qradar.csdd.lx)   01/09/2023    _x86_64_       (20 CPU)
Device:        rrqm/s  wrqm/s    r/s    w/s   rMB/s   wMB/s avgrq-sz avgqu-sz  await r_await w_await svctm %util
sdb            335.46    3.30 3411.21  45.34  653.43    2.11  388.41    0.48   0.14   0.12   1.57  0.23 78.06

Use the await and r_await metrics from the preceding output to monitor current disk reads and writes. Values consistently higher than 15 ms might indicate an issue. A consistently high r_await means multiple processes are waiting on disk reads.

More information about working with iostat in RHEL can be found here: https://www.redhat.com/sysadmin/io-reporting-linux

  • avgqu-sz - average queue length of a request issued to the device
  • await - average time for I/O requests issued to the device to be served (milliseconds)
  • r_await - average time for read requests to be served (milliseconds)
  • w_await - average time for write requests to be served (milliseconds)

If the avgrq-sz is greater than 200 or avgqu-sz is greater than 20-30, it can be indicative of decreased disk performance. 
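To watch for sustained latency rather than a single sample, iostat can run on an interval and be filtered on the await column. A sketch, assuming the column layout of the sample output above (await is the tenth column) and the 15 ms threshold mentioned earlier:
# Report 5-second intervals where a disk's await exceeds 15 ms
iostat -dmx 5 | awk '$1 ~ /^sd/ && $10 + 0 > 15 { print $1, "await:", $10 }'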

Next, use iotop -aoP to track which QRadar services are using the most disk reads.

Command:

iotop -aoP

Sample Output:

Total DISK READ :    476.90 M/s | Total DISK WRITE :      7.35 M/s
Actual DISK READ:    477.48 M/s | Actual DISK WRITE:    579.87 K/s
  PID PRIO USER    DISK READ DISK WRITE SWAPIN    IO>   COMMAND
 52249 be/7 root        83.57 G    42.30 M 0.00 % 0.84 % java -Dapplication.name=ariel_proxy -Dapp_id=ariel_proxy_server -Dj~tutil-8.3.0.jar:/opt/qradar/jars/fontbox-2.0.4.jar:/opt/qradar/jars/
 58549 be/4 postgres     0.00 B   524.00 K 0.00 % 0.04 % postgres: qradar qradar 127.0.0.1(44058) idle
  125 be/4 root        16.00 K     0.00 B 0.00 % 0.03 % [kswapd0]
130888 be/4 postgres     0.00 B  1120.00 K 0.00 % 0.03 % postgres: walwriter
 24370 be/4 postgres     0.00 B   688.00 K 0.00 % 0.02 % postgres: qradar qradar 127.0.0.1(43475) SELECT     saction
 58541 be/4 postgres     4.00 K   536.00 K 0.00 % 0.02 % postgres: qradar qradar 127.0.0.1(44047) idle in transaction
 54296 be/4 postgres     0.00 B   476.00 K 0.00 % 0.02 % postgres: qradar qradar 127.0.0.1(43764) idle
 2869 be/0 root         0.00 B   552.00 K 0.00 % 0.01 % [loop0]
 58553 be/4 postgres    16.00 K   108.00 K 0.00 % 0.00 % postgres: qradar qradar 127.0.0.1(44060) idle
 57107 be/4 postgres     0.00 B   104.00 K 0.00 % 0.00 % postgres: qradar qradar 127.0.0.1(43992) idle
 1155 be/4 root      1504.00 K     0.00 B 0.00 % 0.00 % [xfsaild/dm-9]
105851 be/4 root         2.74 M     0.00 B 0.00 % 0.05 % defect-inspector -fingerprint /opt/qradar/support/data/inspector/ /var/log/qradar.log
 58586 be/4 postgres    16.00 K    40.00 K 0.00 % 0.00 % postgres: qradar qradar 127.0.0.1(44081) idle
 81830 be/4 root         0.00 B     0.00 B 0.00 % 0.00 % [kworker/15:3]
 19857 be/4 nobody       0.00 B     8.00 K 0.00 % 0.00 % python3.6 /usr/bin/celery beat -A app.celery_worker.beat-config --l~vel=INFO --schedule /tmp/celerybeat.db --pidfile=/tmp/celerybeat.pid
 1266 be/3 root         0.00 B   116.00 K 0.00 % 0.00 % auditd
 58587 be/4 postgres     0.00 B    32.00 K 0.00 % 0.00 % postgres: qradar qradar 127.0.0.1(44080) idle
 35651 be/4 root         0.00 B     0.00 B 0.00 % 0.00 % [kworker/15:1]
 49933 be/4 root        50.49 M    76.62 M 0.00 % 0.00 % java -Dapplication.name=accumulator -Dapp_id=accumulator -Djava.lib~.3.0.jar:/opt/qradar/jars/fontbox-2.0.4.jar:/opt/qradar/jars/fop-2.2
 50112 ?dif root         3.32 M    49.48 M 0.00 % 0.00 % java -Dapplication.name=ecs-ep -Dapp_id=ecs-ep -Djava.library.path=~/ibm/si/services/ecs-ep/current/eventgnosis ecs-ep.ecs 220 noconsole
 9826 be/4 root       168.00 K    16.00 K 0.00 % 0.00 % conman-server --scheme=https --tls-host=:: --tls-port=9000 --tls-ce~nman.key --tls-ca=/etc/conman/tls/conman_ca.crt --write-timeout=900s
 50948 be/4 root       648.00 K     0.00 B 0.00 % 0.00 % java -Dapplication.name=arc_builder -Dapp_id=arc_builder -Djava.lib~.3.0.jar:/opt/qradar/jars/fontbox-2.0.4.jar:/opt/qradar/jars/fop-2.2
 5406 be/4 root        96.00 K     0.00 B 0.00 % 0.00 % conwrap -healthCheckPrefix=HEALTH_CHECK_ -portPrefix=PORT -volumePrefix=VOL -envPrefix=ENV -secretPrefix=SECRET
 50779 ?dif root        24.00 K   112.00 K 0.00 % 0.00 % java -Dapplication.name=ecs-ec-ingress -Dapp_id=ecs-ec-ingress -Dja~/ecs-ec-ingress/current/eventgnosis ecs-ec-ingress.ecs 220 noconsole
 70311 be/4 postgres     0.00 B     8.00 K 0.00 % 0.00 % postgres: qradar qradar 127.0.0.1(46078) idle
 73954 be/4 nobody       0.00 B     4.00 K 0.00 % 0.00 % httpd -DFOREGROUND
 53666 be/4 root         0.00 B    12.00 K 0.00 % 0.00 % qflow -p -r60 -c /opt/qradar/conf/nva.qflow.qflow0.conf -nens224 -t0 -w56 -ndefault_Netflow -t3 -w55
 98740 be/4 nobody       0.00 B     4.00 K 0.00 % 0.00 % httpd -DFOREGROUND
104980 be/4 root         0.00 B     4.00 K 0.00 % 0.00 % bash --login /opt/qradar/perf/systemStabMon -interval 23
105009 be/4 root         0.00 B     4.00 K 0.00 % 0.00 % bash --login /opt/qradar/perf/systemStabMon -interval 23
103036 be/4 nobody       0.00 B     4.00 K 0.00 % 0.00 % httpd -DFOREGROUND
 70292 be/4 postgres     0.00 B     4.00 K 0.00 % 0.00 % postgres: qradar qradar 127.0.0.1(46064) idle
 51869 ?dif root         0.00 B   112.00 K 0.00 % 0.00 % java -Dapplication.name=ecs-ec -Dapp_id=ecs-ec -Djava.library.path=~/ibm/si/services/ecs-ec/current/eventgnosis ecs-ec.ecs 220 noconsole
 2724 be/4 root         0.00 B    16.00 K 0.00 % 0.00 % bash --login /opt/qradar/perf/systemStabMon -interval 23
 70313 be/4 postgres     0.00 B     8.00 K 0.00 % 0.00 % postgres: qradar qradar 127.0.0.1(46080) idle
 98986 be/4 nobody       0.00 B     4.00 K 0.00 % 0.00 % httpd -DFOREGROUND
 84662 be/4 root       140.00 K     0.00 B 0.00 % 0.00 % [kworker/u256:2]
 17116 be/4 nobody       0.00 B    20.00 K 0.00 % 0.00 % coreutils --coreutils-prog-shebang=tee /usr/bin/tee -a /opt/app-root/store/log/startup.log
101093 be/4 nobody       0.00 B     4.00 K 0.00 % 0.00 % httpd -DFOREGROUND
101135 be/4 nobody       0.00 B     4.00 K 0.00 % 0.00 % httpd -DFOREGROUND
115506 be/4 nobody       0.00 B     4.00 K 0.00 % 0.00 % httpd -DFOREGROUND
 88912 be/4 nobody       0.00 B     8.00 K 0.00 % 0.00 % httpd -DFOREGROUND
 3001 be/4 root         0.00 B    20.00 K 0.00 % 0.00 % bash /opt/qradar/perf/runningAvgDStat.sh 20 /var/log/systemStabMon /tmp/runningAvgDStat.tmp
 54222 be/4 postgres     0.00 B     4.00 K 0.00 % 0.00 % postgres: qradar qradar 127.0.0.1(43750) idle
 17372 be/4 nobody       0.00 B     8.00 K 0.00 % 0.00 % nginx: worker process
 86711 be/4 nobody       0.00 B     8.00 K 0.00 % 0.00 % httpd -DFOREGROUND
 72931 be/4 nobody       0.00 B     8.00 K 0.00 % 0.00 % httpd -DFOREGROUND
 81128 be/4 nobody       0.00 B     8.00 K 0.00 % 0.00 % httpd -DFOREGROUND
 93457 be/4 nobody       0.00 B     8.00 K 0.00 % 0.00 % httpd -DFOREGROUND

The preceding sample output shows a QRadar host that is read bound and might have degraded performance.

For further assistance with troubleshooting specific services, contact QRadar Support.


Ariel_proxy_server

To troubleshoot ariel_proxy_server performance, check which searches are currently running, how long they are taking to complete, and how much data is being polled.

Troubleshooting Ariel in the UI:

1. Go to the "Log Activity" tab, click the "Search" drop-down, and then click "Manage Search Results":

2. Order by duration to see whether any searches have been running a long time. Order by size to see whether there are any searches that are returning a large amount of data.

3. When the largest and longest-running searches have been identified, refer to the technote on searching efficiently for tuning pointers.

4. If no large or long-running searches are identified, run a new search. While the search is running, click the "More details" link under "Current Statistics":

The output shows the current progress of the search on a per host basis:

(Image: Search Progress)

If a particular host is taking longer to complete than others, proceed to the next set of steps.
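The list of searches is also available programmatically through the Ariel REST API, which can help when the UI itself is too slow to use. A sketch, assuming a valid authorized service token (placeholders shown in angle brackets):
# List recent Ariel search IDs; inspect one with /api/ariel/searches/<search_id>
curl -k -H 'SEC: <authorized-service-token>' 'https://<console-ip>/api/ariel/searches?Range=items=0-49'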

Troubleshooting Ariel from the Command Line Interface:

Run the following command on the console to check the Ariel queues:

/opt/qradar/support/jmx.sh -p 7782 -b 'com.q1labs.ariel:application=ariel_proxy.ariel_proxy_server,type=Query server,a1=Queries*'

Run the following command on a Managed Host to check the Ariel queues:

/opt/qradar/support/jmx.sh -p 7782 -b 'com.q1labs.ariel:application=ariel.ariel_query_server,type=Query server,a1=Queries*'

Sample output:

com.q1labs.ariel:application=ariel_proxy.ariel_proxy_server,type=Query server,a1=Queries,a2=NORMAL,a3=flows,a4=12d45c1234
-------------------------------------------------------------------------------------------------------------------------
FileStats: compressedDataFileCount=0,compressedDataTotalSize=0,dataFileCount=21180,dataTotalSize=52869662764,duration=104117,host=global,indexFileCount=2229,indexTotalSize=46869240379,processedRecordCount=554,progress=28.661197416652122,progressDetails=null,serialversionuid=1
Duration: 0:01:44.117
QueryParameters: Id:31a-12cd-12345, DB:<flows@/store/ariel/flows/records, /store/ariel/flows/payloads>, Time:<23-02-20,13:18:40 to 23-02-28,12:17:37>, Criteria=((((<DomainID:[0,0]> AND <SourceIP:[172.xx.xx.xx,172.xx.x20.53]>) AND <PartialMatchList:[100236,100236]>) AND <EndTime:[1677566520000,1677566857482]>) AND <EventProcessorId:[8,8],[103,103],[133,133],[165,165],[238,238]>) AND Predicate=com.q1labs.frameworks.util.predicate.NotPredicate@b9984328[p=[mc=ContributesMatchList,e=[100236,100236]]], MappingFactory=com.q1labs.core.types.flow.mapping.FlowRecordMappingFactory@4ee, prio=NORMAL
ProcessedRecordCount: 554
Id: 31a-12cd-12345
ErrorMessages: <null>
DsStats: accessedTime=0,collectedRecordCount=0,retentionTime=0,sizeOnDisk=0
StartTime: 1677934267107
Status: EXECUTE
Progress: 28.661197416652122

If investigation in the UI or the commands executed on the CLI reveal a long-running or large search, restarting the Ariel services on the affected host might be enough to resolve the issue.
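On current QRadar versions the Ariel components run as systemd services, so a targeted restart is a sketch along these lines (the service names assume the Console runs ariel_proxy_server and managed hosts run ariel_query_server, matching the JMX bean names above; restarting interrupts in-flight searches):
# On the Console:
systemctl restart ariel_proxy_server
# On a managed host:
systemctl restart ariel_query_server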


Large Reference Data

Reference sets that contain a large number of elements and have no Time to Live value set can impact performance and are candidates for tuning.

Command to run on the Console CLI:

psql -U qradar -c "select name, time_to_live, current_count from reference_data order by current_count DESC;"

Sample output:

(Image: reference_data query output, names redacted)

In the preceding example (names redacted), no Time to Live value is set on any of the reference sets, and in some cases the count is in the millions. To reduce the impact on system performance, set a Time to Live value on reference sets with a current_count greater than 100K.
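To list only the candidates for tuning directly, the same query can be filtered. A sketch, assuming unset TTLs appear as NULL in the time_to_live column:
psql -U qradar -c "select name, time_to_live, current_count from reference_data where current_count > 100000 and time_to_live is null order by current_count desc;"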

Refer to the guide on setting Time to Live values for large reference sets.

It is also possible to use the ReferenceDataUtil.sh script to set a Time To Live value from the command line.

Command to run on the CLI:

/opt/qradar/bin/ReferenceDataUtil.sh update "<Name of reference set from Table>" -timeoutType=FIRST_SEEN -timeToLive='<Number of days/Months/Year/Hours>'
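For example, a hypothetical reference set named "Blocked IPs" could be given a 30-day TTL as follows (the set name and duration are illustrative only):
/opt/qradar/bin/ReferenceDataUtil.sh update "Blocked IPs" -timeoutType=FIRST_SEEN -timeToLive='30 days'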

Database Bloat

Check database bloat in the QRadar postgres database by querying the q_table_bloat table.

You can check this by running the command:

psql -U qradar -c "select * from q_table_bloat;"

Sample output:

            relname            | n_live_tup | n_tup_upd  | n_dead_tup |  total  |     bloat_pct    |     last_autovacuum      | last_autoanalyze
-------------------------------+------------+------------+------------+---------+------------------+--------------------------+------------------
 reference_data_element        |    2561166 | 2404944201 |      68804 | 2629970 | 2.61615151503629 | 2023-01-09 10:50:32.2474 |

In the preceding example, the table has millions of live tuples and billions of updates. If any table has an n_dead_tup value greater than 100K, it is advised to perform a Vacuum and Reindex of the postgres instance, followed by clearing the Tomcat cache.
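As a minimal illustration of the SQL involved (schedule a maintenance window and engage QRadar Support before running maintenance on a production database; the table name comes from the sample above):
psql -U qradar -c "VACUUM ANALYZE reference_data_element;"
psql -U qradar -c "REINDEX TABLE reference_data_element;"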

What to do next

If the provided steps do not significantly help with your performance issues, you can open a case with QRadar Support. To speed up resolution, provide the following logs with your case:
 
  1. The QRadar log files. For more information, see How to collect log files for QRadar support from the user interface.
  2. The threadTop, pg_stat, and qlocks.out outputs. To collect these outputs, run the following command from the Console:
    mkdir -p /store/ibmsupport/refdump; for i in {1..40}; do (date; /opt/qradar/support/threadTop.sh -p 7779 --full)>> /store/ibmsupport/refdump/tomcat.out; (date; psql -U qradar -c "select * from q_locks" )>> /store/ibmsupport/refdump/qlocks.out; (date; psql -U qradar -c "select * from pg_stat_activity where state='active'")>> /store/ibmsupport/refdump/pg_stat.out; sleep 5 ; done
    Note: This command takes about 5 minutes to run and generates three files to /store/ibmsupport/refdump/ that you can add to your case.
  3. Upload all files to your QRadar case.

    Results
    A case is opened and set to Waiting on IBM. A support representative contacts you to discuss your case. If there is an alternate number or a better contact method, you can add a note to your case with the most recent contact information.

Document Location

Worldwide

[{"Type":"MASTER","Line of Business":{"code":"LOB24","label":"Security Software"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSBQAC","label":"IBM Security QRadar SIEM"},"ARM Category":[{"code":"a8m0z000000cwtiAAA","label":"Performance"}],"ARM Case Number":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Versions"}]

Document Information

Modified date:
06 November 2023

UID

ibm16962425