Troubleshooting
Problem
After installing IBM Resilient AppHost and successfully deploying applications, the AppHost status shows as offline. Checking the pods by running "sudo kubectl get pods -A" shows that they are Evicted.
Symptom
Typical symptoms are:
- App Host shows as offline under Administration Settings - Apps
- Apps might not be running and show that they are in an error state
- Pods are shown as "Evicted"
After checking the status of the pods by running sudo kubectl get pods -A, the output shows that the pods have been Evicted:
kube-system local-path-provisioner-6d59f47c7-s7vvk 0/1 Evicted 0 5d1h
kube-system coredns-0655855d6-sp7c7 0/1 Evicted 0 5d
kube-system metrics-server-7566d596c8-56r6p 0/1 Evicted 0 5d
Cause
Applications run inside pods, which store their data in /var/lib.
For some stand-alone installations of the software, too little disk space is assigned to /var, which causes the pods to stop running.
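To see which directories are consuming the space, du can be run against /var/lib; a minimal sketch (the path and depth are typical for a k3s-based App Host and might need adjusting):

```shell
# Sketch: list the largest consumers under /var/lib, where k3s and the
# pods typically store data; -x stays on one filesystem.
sudo du -xh --max-depth=2 /var/lib 2>/dev/null | sort -rh | head -n 10
```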
Environment
Stand-alone deployments require the client to add disk, while virtual appliance installations provision /var/lib with tens of gigabytes. This issue can be seen in stand-alone deployments where the disk is undersized.
Diagnosing The Problem
Following MustGather: Information to Collect when Troubleshooting Issues with IBM Security SOAR AppHost, the pods have a status of "Evicted".
$ sudo kubectl get pods -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system metrics-server-7566d596c8-4lzrr 1/1 Running 0 168d
8ddc6284-f5da-4521-9486-06c2c4d7acdc 701814f3-7ebc-432a-979a-79e8bec086f2-5546cf8779-d97z9 1/1 Running 10 36d
8ddc6284-f5da-4521-9486-06c2c4d7acdc d9f56acf-381a-4f6b-9f12-04d1e21c742f-587bd4cc9-qx9s9 1/1 Running 0 9d
8ddc6284-f5da-4521-9486-06c2c4d7acdc 52aa6dc0-fde7-4907-9ae7-03f22d2e0e12-6d49657476-5hgl2 1/1 Running 0 9d
8ddc6284-f5da-4521-9486-06c2c4d7acdc deployment-operator-b44c9d9b-zhfdx 0/1 Evicted 0 36d
8ddc6284-f5da-4521-9486-06c2c4d7acdc deployment-operator-b44c9d9b-kgvh7 0/1 Evicted 0 9d
8ddc6284-f5da-4521-9486-06c2c4d7acdc deployment-operator-b44c9d9b-rlwwf 0/1 Evicted 0 9d
8ddc6284-f5da-4521-9486-06c2c4d7acdc deployment-operator-b44c9d9b-2rpv5 0/1 Evicted 0 9d
8ddc6284-f5da-4521-9486-06c2c4d7acdc deployment-operator-b44c9d9b-vnmh8 0/1 Evicted 0 9d
8ddc6284-f5da-4521-9486-06c2c4d7acdc deployment-operator-b44c9d9b-z7sdz 0/1 Evicted 0 9d
8ddc6284-f5da-4521-9486-06c2c4d7acdc deployment-synchronizer-7768954475-6bkmz 0/1 Evicted 0 36d
kube-system coredns-c95899d75-v2l24 1/1 Running 0 91d
Running other commands to inspect the state of the deployment, NodeHasDiskPressure and EvictionThresholdMet events are seen.
$ sudo kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
26m Warning EvictionThresholdMet node/tmnl-cp4s-ah Attempting to reclaim ephemeral-storage
24m Normal Starting node/tmnl-cp4s-ah Starting kubelet.
24m Normal Starting node/tmnl-cp4s-ah Starting kube-proxy.
24m Warning InvalidDiskCapacity node/tmnl-cp4s-ah invalid capacity 0 on image filesystem
24m Normal NodeHasSufficientMemory node/tmnl-cp4s-ah Node tmnl-cp4s-ah status is now: NodeHasSufficientMemory
24m Normal NodeHasSufficientPID node/tmnl-cp4s-ah Node tmnl-cp4s-ah status is now: NodeHasSufficientPID
24m Normal NodeNotReady node/tmnl-cp4s-ah Node tmnl-cp4s-ah status is now: NodeNotReady
24m Normal NodeAllocatableEnforced node/tmnl-cp4s-ah Updated Node Allocatable limit across pods
24m Normal NodeReady node/tmnl-cp4s-ah Node tmnl-cp4s-ah status is now: NodeReady
24m Normal RegisteredNode node/tmnl-cp4s-ah Node tmnl-cp4s-ah event: Registered Node tmnl-cp4s-ah in Controller
17m Normal NodeHasNoDiskPressure node/tmnl-cp4s-ah Node tmnl-cp4s-ah status is now: NodeHasNoDiskPressure
16m Normal NodeHasDiskPressure node/tmnl-cp4s-ah Node tmnl-cp4s-ah status is now: NodeHasDiskPressure
14m Warning FreeDiskSpaceFailed node/tmnl-cp4s-ah failed to garbage collect required amount of images. Wanted to free 1587504742 bytes, but freed 0 bytes
9m36s Warning ImageGCFailed node/tmnl-cp4s-ah failed to garbage collect required amount of images. Wanted to free 1669768806 bytes, but freed 0 bytes
4m36s Warning EvictionThresholdMet node/tmnl-cp4s-ah Attempting to reclaim ephemeral-storage
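The DiskPressure condition can also be read directly from the node object; a sketch (tmnl-cp4s-ah is the node name from the events above, so substitute your own):

```shell
# Sketch: inspect the node's conditions for disk pressure directly.
# The node name is an example taken from the event output above.
sudo kubectl describe node tmnl-cp4s-ah | grep -i -A1 'DiskPressure'
```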
Examining the disk, the /var directory is low on disk space, triggering the disk-related errors seen in the previous output.
$ sudo df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 3.9G 0 3.9G 0% /dev
tmpfs 3.9G 0 3.9G 0% /dev/shm
tmpfs 3.9G 410M 3.5G 11% /run
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/mapper/rootvg-rootlv 2.0G 291M 1.8G 15% /
/dev/mapper/rootvg-usrlv 10G 2.7G 7.4G 27% /usr
/dev/sda2 494M 107M 388M 22% /boot
/dev/sda1 500M 9.9M 490M 2% /boot/efi
/dev/mapper/rootvg-optlv 2.0G 791M 1.3G 39% /opt
/dev/mapper/rootvg-homelv 1014M 75M 940M 8% /home
/dev/mapper/rootvg-tmplv 2.0G 33M 2.0G 2% /tmp
/dev/mapper/rootvg-varlv 8.0G 7.3G 752M 91% /var
/dev/sdb1 16G 45M 15G 1% /mnt
Resolving The Problem
Increase the disk associated with /var.
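The df output above suggests that /var is an LVM logical volume (rootvg-varlv). On such a layout, one way to grow it after expanding the underlying disk is lvextend; a sketch, assuming free extents are available in the rootvg volume group (the +16G size is only an example):

```shell
# Sketch: extend the /var logical volume and grow its filesystem in one
# step (-r). Assumes an LVM layout matching the df output above and
# free extents in rootvg; the +16G size is an example value.
sudo vgs rootvg                              # confirm free space in the volume group
sudo lvextend -r -L +16G /dev/rootvg/varlv   # extend the LV and resize the filesystem
sudo df -h /var                              # verify the new size
```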
If the pods do not return to a "Running" state, consider running the following commands.
$ sudo restartAppHost # restart the pods
$ sudo systemctl restart k3s # restart Kubernetes
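Evicted pods are not restarted in place; deleting them lets their controllers schedule fresh replicas. A sketch that removes all pods left in the Evicted state:

```shell
# Sketch: delete every pod reported as Evicted so its controller can
# schedule a fresh replica. STATUS is the 4th column of the -A listing.
sudo kubectl get pods -A | awk '$4 == "Evicted" {print $1, $2}' \
  | while read -r ns pod; do sudo kubectl delete pod -n "$ns" "$pod"; done
```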
The product documentation for IBM SOAR, specifically App Host Deployment Guide - Prerequisites, provides the suggested minimums for disk space.
The resources required by the App Host server are variable due to the requirements of the apps installed. Some apps that operate on files in memory can have extra memory requirements. Apps that perform considerable computations, such as decryption tasks, might need more CPU. Therefore, you might need to increase those resources.
Document Location
Worldwide
[{"Type":"MASTER","Line of Business":{"code":"LOB24","label":"Security Software"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SSIP9Q","label":"IBM Security SOAR"},"ARM Category":[{"code":"a8m0z0000001jTpAAI","label":"Integrations-\u003EAppHost"}],"ARM Case Number":"","Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Versions"}]
Document Information
Modified date:
30 June 2023
UID
ibm17005501