Troubleshooting
Problem
This is a problem determination document to help collect data for Liberty on Kubernetes with the Liberty Operator.
Diagnosing The Problem
Table of contents:
- Gather logs and configuration
- Enable diagnostic trace at runtime
- Enable diagnostic trace at startup
- Change Java options at startup
- Execute a server dump
- Gather data on a performance, hang, or high CPU issue
Gather logs and configuration
Perform the following steps to gather logs and configuration:
- Ensure you are in the right namespace of the target application pods by replacing $NAME with your namespace:
kubectl config set-context --current --namespace=$NAME
- Find the relevant pods:
kubectl get pods
- For each relevant pod, send the standard logs to a local file, replacing $POD twice based on the NAME in the previous step:
kubectl logs --all-containers=true $POD > $POD.txt
- For each relevant pod, open a shell inside it, replacing $POD based on the output above (if there are multiple containers, add -c $CONTAINER after $POD to specify the Liberty container):
kubectl exec -it $POD -- /bin/sh
- Optional: Copy some useful process information into /tmp:
cp -R --no-preserve=all --parents /proc/cpuinfo /proc/stat /proc/schedstat /proc/vmstat /proc/meminfo /proc/version /proc/pressure /proc/loadavg /proc/[0-9]*/cgroup /proc/[0-9]*/environ /proc/[0-9]*/cmdline /proc/[0-9]*/smaps /proc/[0-9]*/limits /proc/[0-9]*/stat /proc/[0-9]*/status /proc/[0-9]*/sched /proc/[0-9]*/schedstat /proc/[0-9]*/wchan /proc/[0-9]*/task/*/stat /proc/[0-9]*/task/*/wchan /proc/[0-9]*/task/*/sched /proc/[0-9]*/task/*/status /sys/fs/cgroup/cpu* /sys/fs/cgroup/memory* /sys/fs/cgroup/*/*/cpu* /sys/fs/cgroup/*/*/memory* /tmp/ 2>/dev/null
- Create a compressed file with all logs and configuration:
tar czhf /tmp/liberty_${HOSTNAME}_$(date +%Y%m%d_%H%M%S).tar.gz /logs /config /serviceability/*/${HOSTNAME} /opt/*/wlp/usr/servers/*/logs /opt/*/wlp/usr/servers/*/configDropins /opt/*/wlp/usr/servers/*/*xml /opt/*/wlp/usr/servers/*/*options /opt/*/wlp/usr/servers/*/*env /opt/*/wlp/usr/servers/*/*properties /opt/*/wlp/usr/servers/*/javacore* /opt/*/wlp/usr/servers/*/verbosegc* /opt/*/wlp/usr/servers/*/heapdump* /opt/*/wlp/usr/servers/*/core* ${LOG_DIR} ${X_LOG_DIR} ${WLP_OUTPUT_DIR} ${SERVER_WORKING_DIR} ${VARIABLE_SOURCE_DIRS} ${JAVA_HOME}/jre/lib/security/java.security /tmp/proc /tmp/sys 2>/dev/null
- Optional: If you performed step 5, delete only the temporary files produced in that step:
rm -rf /tmp/proc /tmp/sys
- List the compressed file and then exit the container:
cd /tmp; ls *.tar.gz; exit &>/dev/null
- Download the compressed file, replacing $POD with the pod name and $FILE twice from the output of the previous step:
kubectl cp $POD:/tmp/$FILE $FILE --retries=999
- Gather various resource state:
- For each relevant pod, describe the pod, replacing $POD twice with the pod name:
kubectl describe pod $POD > ${POD}_describe.txt
- Upload:
- Pod standard logs (step 3)
- Liberty logs and configuration (step 9)
- Resource state text files (step 10)
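The per-pod commands in steps 3 and 10 can be wrapped in a small shell function when several pods are involved. The following is a minimal sketch, not part of the official procedure: the function name collect_pod_state is hypothetical, and it assumes kubectl is already pointed at the right namespace (step 1).

```sh
# Hypothetical helper: for each pod name given as an argument, save the
# standard logs (step 3) and the pod description (step 10) to local files.
collect_pod_state() {
  for pod in "$@"; do
    # Standard logs from every container in the pod
    kubectl logs --all-containers=true "$pod" > "${pod}.txt"
    # Pod description; the braces stop the shell from reading $pod_describe
    kubectl describe pod "$pod" > "${pod}_describe.txt"
  done
}

# Example call (pod names are taken from the output of: kubectl get pods):
#   collect_pod_state liberty1-5545f8475b-zdwmg
```

The function only covers data that is gathered from outside the container; the compressed file from steps 4 to 9 still has to be created inside each pod.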
Enable diagnostic trace at runtime
Perform the following steps to enable diagnostic trace at runtime:
- If runtime configuration updates are not enabled, then you must enable diagnostic trace at startup.
- If runtime configuration updates are enabled (as they are by default):
- If the operator storage for serviceability is not configured, then follow the Enable diagnostic trace at runtime steps in Liberty on Kubernetes without the Liberty Operator.
- If the operator storage for serviceability is configured, then a WebSphereLibertyTrace custom resource (or OpenLibertyTrace with the OpenLiberty Operator) may be used:
- Ensure you are in the right namespace of the target application pods by replacing $NAME with your namespace:
kubectl config set-context --current --namespace=$NAME
- Find the relevant pods:
kubectl get pods
- For each relevant pod, create a local file named trace.yaml, replacing $POD based on the output above and replacing $TRACE with an IBM support-requested trace specification or your desired trace specification:
- If using the IBM WebSphere Liberty Operator:
apiVersion: liberty.websphere.ibm.com/v1
kind: WebSphereLibertyTrace
metadata:
  name: libertytrace1
  annotations:
    day2operation.openliberty.io/targetKinds: Pod
spec:
  license:
    accept: true
  podName: $POD
  traceSpecification: "*=info:$TRACE"
  maxFileSize: 100
  maxFiles: 5
  disable: false
- If using the OpenLiberty Operator:
apiVersion: apps.openliberty.io/v1
kind: OpenLibertyTrace
metadata:
  name: libertytrace1
  annotations:
    day2operation.openliberty.io/targetKinds: Pod
spec:
  license:
    accept: true
  podName: $POD
  traceSpecification: "*=info:$TRACE"
  maxFileSize: 100
  maxFiles: 5
  disable: false
- Apply the YAML:
kubectl apply -f trace.yaml
- List the Liberty trace resources and verify that TRACING shows true:
- If using the IBM WebSphere Liberty Operator:
kubectl get WebSphereLibertyTrace
- If using the OpenLiberty Operator:
kubectl get OpenLibertyTrace
- Reproduce the problem
- Disable the diagnostic trace:
- If using the IBM WebSphere Liberty Operator:
kubectl delete WebSphereLibertyTrace libertytrace1
- If using the OpenLiberty Operator:
kubectl delete OpenLibertyTrace libertytrace1
- Gather and upload all logs
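Because trace.yaml has to be created once per pod, it may be convenient to generate it from the pod name and trace specification. The following is a minimal sketch under stated assumptions: the function name write_trace_yaml is hypothetical, the field values are copied from the examples in this section, and the trace specification in the example call is illustrative only.

```sh
# Hypothetical helper: write trace.yaml in the current directory.
#   $1 = pod name
#   $2 = trace specification (appended after "*=info:")
#   $3 = resource kind, WebSphereLibertyTrace (default) or OpenLibertyTrace
write_trace_yaml() {
  pod=$1
  trace=$2
  kind=${3:-WebSphereLibertyTrace}
  # Pick the API group that matches the operator in use
  case "$kind" in
    WebSphereLibertyTrace) api=liberty.websphere.ibm.com/v1 ;;
    OpenLibertyTrace)      api=apps.openliberty.io/v1 ;;
    *) echo "unknown kind: $kind" >&2; return 1 ;;
  esac
  cat > trace.yaml <<EOF
apiVersion: $api
kind: $kind
metadata:
  name: libertytrace1
  annotations:
    day2operation.openliberty.io/targetKinds: Pod
spec:
  license:
    accept: true
  podName: $pod
  traceSpecification: "*=info:$trace"
  maxFileSize: 100
  maxFiles: 5
  disable: false
EOF
}

# Example call (illustrative trace specification), followed by the apply step:
#   write_trace_yaml "$POD" 'com.ibm.ws.webcontainer*=all'
#   kubectl apply -f trace.yaml
```

The resource name libertytrace1 is kept from the examples, so applying the file for a second pod updates the same resource rather than creating a new one.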
Enable diagnostic trace at startup
Perform the following steps to enable diagnostic trace at startup:
- Ensure you are in the right namespace of the target application pods by replacing $NAME with your namespace:
kubectl config set-context --current --namespace=$NAME
- In your current directory, create a local file named tracefromstartup.xml with the following contents and replace $TRACE with an IBM support-requested trace specification or your desired trace specification:
<?xml version="1.0" encoding="UTF-8"?>
<server>
  <logging traceSpecification="*=info:$TRACE" maxFileSize="100" maxFiles="10" />
</server>
- Create a ConfigMap entry based on this local file:
kubectl create configmap tracefromstartup --from-file tracefromstartup.xml
- List the Liberty Operator managed applications:
- If using the IBM WebSphere Liberty Operator:
kubectl get WebSphereLibertyApplication
- If using the OpenLiberty Operator:
kubectl get OpenLibertyApplication
- Edit the relevant Liberty Operator managed application, replacing $NAME with the application name from the previous step (not the namespace):
- If using the IBM WebSphere Liberty Operator:
kubectl edit WebSphereLibertyApplication $NAME
- If using the OpenLiberty Operator:
kubectl edit OpenLibertyApplication $NAME
- In the spec section, add or edit a volumes section that mounts the ConfigMap and a volumeMounts section that places the file into the container; for example:
spec:
  volumes:
  - name: tracefromstartup
    configMap:
      name: tracefromstartup
  volumeMounts:
  - name: tracefromstartup
    mountPath: /config/configDropins/overrides/tracefromstartup.xml
    subPath: tracefromstartup.xml
- Save and quit the editor. Ensure that the change succeeded by verifying the output ends with "edited"; for example:
webspherelibertyapplication.liberty.websphere.ibm.com/websphereliberty-app-sample edited
- After a short time, the old pods should be deleted and new pods with the new trace should be created.
- Reproduce the problem
- Gather and upload all logs
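The drop-in file and a check that it reached a new pod can be scripted. The following is a minimal sketch, not part of the official procedure: the function names write_trace_dropin and show_mounted_dropin are hypothetical, and the mount path is the one used in the volumeMounts example above.

```sh
# Hypothetical helper: write tracefromstartup.xml in the current directory.
#   $1 = trace specification (appended after "*=info:")
write_trace_dropin() {
  cat > tracefromstartup.xml <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<server>
  <logging traceSpecification="*=info:$1" maxFileSize="100" maxFiles="10" />
</server>
EOF
}

# Hypothetical helper: print the drop-in as seen inside a pod, to confirm
# that the ConfigMap was mounted where Liberty reads configuration overrides.
#   $1 = pod name
show_mounted_dropin() {
  kubectl exec "$1" -- cat /config/configDropins/overrides/tracefromstartup.xml
}

# Example calls (illustrative trace specification):
#   write_trace_dropin 'com.ibm.ws.webcontainer*=all'
#   kubectl create configmap tracefromstartup --from-file tracefromstartup.xml
#   show_mounted_dropin "$POD"    # run against one of the new pods
```

If show_mounted_dropin reports that the file does not exist, the volumes and volumeMounts entries in the application resource are the first thing to recheck.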
Change Java options at startup
The JVM_ARGS Liberty environment variable may be configured at container startup to change JVM options. Perform the following steps to add Java options at startup:
- First, check if this environment variable is already set:
- Ensure you are in the right namespace of the target application pods by replacing $NAME with your namespace:
kubectl config set-context --current --namespace=$NAME
- Find the relevant pods:
kubectl get pods
- For one of the relevant pods, replace $POD based on the output above to search its environment variables (if there are multiple containers, use -c $CONTAINER after $POD to specify the Liberty container):
kubectl exec -it $POD -- /bin/sh -c "cat /proc/[0-9]*/environ | tr '\0' '\n' | grep JVM_ARGS="
- Append your desired arguments to the value found in the above steps (if any).
- List the Liberty Operator managed applications:
- If using the IBM WebSphere Liberty Operator:
kubectl get WebSphereLibertyApplication
- If using the OpenLiberty Operator:
kubectl get OpenLibertyApplication
- Edit the relevant Liberty Operator managed application, replacing $NAME with the application name from the previous step (not the namespace):
- If using the IBM WebSphere Liberty Operator:
kubectl edit WebSphereLibertyApplication $NAME
- If using the OpenLiberty Operator:
kubectl edit OpenLibertyApplication $NAME
- In the spec section, add or edit the JVM_ARGS entry in the env section; for example:
spec:
  env:
  - name: JVM_ARGS
    value: -Djavax.net.debug=all
- Save and quit the editor. Ensure that the change succeeded by verifying the output ends with "edited"; for example:
webspherelibertyapplication.liberty.websphere.ibm.com/websphereliberty-app-sample edited
- After a short time, the old pods should be deleted and new pods with the new arguments should be created. You can verify by describing a new pod; for example:
$ kubectl describe pod websphereliberty-app-sample-5bc7bb657f-h9bp9
[...]
    Environment:
      JVM_ARGS:  -Djavax.net.debug=all
- Reproduce the problem
- Gather and upload all logs
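The step of appending new arguments to any existing JVM_ARGS value is easy to get wrong by overwriting what is already there. The following is a minimal sketch of that merge; the function name merge_jvm_args is hypothetical, and it assumes the existing value is passed either bare or in the JVM_ARGS=... form printed by the grep command above.

```sh
# Hypothetical helper: combine an existing JVM_ARGS value with new arguments.
#   $1 = existing value, optionally prefixed with "JVM_ARGS=" (may be empty)
#   $2 = arguments to append
merge_jvm_args() {
  # Drop the "JVM_ARGS=" prefix if the value came straight from the environ output
  existing=${1#JVM_ARGS=}
  extra=$2
  if [ -n "$existing" ]; then
    printf '%s %s\n' "$existing" "$extra"
  else
    printf '%s\n' "$extra"
  fi
}

# Example: an existing heap setting is kept and the debug option is appended
merge_jvm_args 'JVM_ARGS=-Xmx512m' '-Djavax.net.debug=all'
# prints: -Xmx512m -Djavax.net.debug=all
```

The printed string is what goes into the value field of the JVM_ARGS entry when editing the application resource.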
Execute a server dump
Perform the following steps to execute and gather a Liberty server dump.
Warning: These commands will start a new process which will consume some memory (likely in the range of dozens of MB). If your container has a memory limit and it is near its limit, this may cause the container to crash. Alternatively, you may gather logs and configuration manually.
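One way to judge whether the container is near its memory limit is to compare the cgroup memory usage and limit from a shell inside the container. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the cgroup file paths are the standard cgroup v2 and v1 locations and may differ in your environment, and the result is only a rough indication.

```sh
# Hypothetical helper: free headroom in whole MB.
#   $1 = current usage in bytes, $2 = limit in bytes
headroom_mb() {
  echo $(( ($2 - $1) / 1048576 ))
}

# Hypothetical helper: read usage and limit from the container's cgroup files.
# Assumes cgroup v2 paths first, then falls back to cgroup v1 paths.
container_headroom_mb() {
  if [ -r /sys/fs/cgroup/memory.max ]; then
    limit=$(cat /sys/fs/cgroup/memory.max)
    usage=$(cat /sys/fs/cgroup/memory.current)
  else
    limit=$(cat /sys/fs/cgroup/memory/memory.limit_in_bytes)
    usage=$(cat /sys/fs/cgroup/memory/memory.usage_in_bytes)
  fi
  # cgroup v2 reports the literal string "max" when no limit is set
  if [ "$limit" = "max" ]; then
    echo "no memory limit set"
    return 0
  fi
  headroom_mb "$usage" "$limit"
}

# Example: 400 MB used out of a 512 MB limit
headroom_mb 419430400 536870912
# prints: 112
```

If the reported headroom is only a few dozen MB, gathering logs and configuration manually may be the safer choice.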
- If the operator storage for serviceability is not configured, then follow the Execute a server dump steps in Liberty on Kubernetes without the Liberty Operator.
- If the operator storage for serviceability is configured, then a WebSphereLibertyDump custom resource (or OpenLibertyDump with the OpenLiberty Operator) may be used to perform a server dump. Note that a server dump starts a new process which uses some memory, so be careful performing this with a small memory limit.
- Ensure you are in the right namespace of the target application pods by replacing $NAME with your namespace:
kubectl config set-context --current --namespace=$NAME
- Find the relevant pods:
kubectl get pods
- For each relevant pod, create a local file named dump.yaml, replacing $POD based on the output above:
- If using the IBM WebSphere Liberty Operator:
apiVersion: liberty.websphere.ibm.com/v1
kind: WebSphereLibertyDump
metadata:
  name: libertydump1
  annotations:
    day2operation.openliberty.io/targetKinds: Pod
spec:
  license:
    accept: true
  podName: $POD
  include:
  - thread
- If using the OpenLiberty Operator:
apiVersion: apps.openliberty.io/v1
kind: OpenLibertyDump
metadata:
  name: libertydump1
  annotations:
    day2operation.openliberty.io/targetKinds: Pod
spec:
  license:
    accept: true
  podName: $POD
  include:
  - thread
- Apply the YAML:
kubectl apply -f dump.yaml
- List the Liberty dump resources:
- If using the IBM WebSphere Liberty Operator:
kubectl get WebSphereLibertyDump
- If using the OpenLiberty Operator:
kubectl get OpenLibertyDump
- Download the dump file, replacing $POD with the name of the pod and $FILE with the value of DUMP FILE in the previous command:
kubectl cp $POD:$FILE libertydump.zip --retries=999
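When dumps are taken from more than one pod, generating dump.yaml per pod and downloading each dump under a distinct local name avoids overwriting libertydump.zip. The following is a minimal sketch, not part of the official procedure: the function names write_dump_yaml and fetch_dump and the libertydump_<pod>.zip naming are hypothetical, while the YAML fields are copied from the examples in this section.

```sh
# Hypothetical helper: write dump.yaml in the current directory.
#   $1 = pod name
#   $2 = resource kind, WebSphereLibertyDump (default) or OpenLibertyDump
write_dump_yaml() {
  pod=$1
  kind=${2:-WebSphereLibertyDump}
  # Pick the API group that matches the operator in use
  case "$kind" in
    WebSphereLibertyDump) api=liberty.websphere.ibm.com/v1 ;;
    OpenLibertyDump)      api=apps.openliberty.io/v1 ;;
    *) echo "unknown kind: $kind" >&2; return 1 ;;
  esac
  cat > dump.yaml <<EOF
apiVersion: $api
kind: $kind
metadata:
  name: libertydump1
  annotations:
    day2operation.openliberty.io/targetKinds: Pod
spec:
  license:
    accept: true
  podName: $pod
  include:
  - thread
EOF
}

# Hypothetical helper: download a dump to a file named after the pod.
#   $1 = pod name, $2 = DUMP FILE value from the kubectl get output
fetch_dump() {
  kubectl cp "$1:$2" "libertydump_$1.zip" --retries=999
}

# Example calls:
#   write_dump_yaml "$POD" && kubectl apply -f dump.yaml
#   fetch_dump "$POD" "$FILE"
```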
Gather data on a performance, hang, or high CPU issue
Notes & Tips
- On macOS or Linux (or Cygwin on Windows), you may use shell variables to simplify the above commands. For example, many commands use $POD for the target pod, so you may first set the POD variable, and later references to $POD in the same terminal window are replaced with that value. In the following example, POD is set to liberty1-5545f8475b-zdwmg, so the final kubectl logs $POD command uses that value and you can copy and paste commands from the instructions above without modifying them.
$ kubectl get pods
NAME                                           READY   STATUS    RESTARTS   AGE
liberty1-5545f8475b-zdwmg                      1/1     Running   2          16d
websphereliberty-app-sample-6f698f5bcb-srn55   1/1     Running   2          16d
$ POD=liberty1-5545f8475b-zdwmg
$ kubectl logs $POD
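Setting the POD variable can itself be scripted instead of copying the name by hand. The following is a minimal sketch: the function name first_pod is hypothetical, and it assumes the target pod can be identified by the start of its name.

```sh
# Hypothetical helper: print the name of the first pod in the current
# namespace whose name starts with the given prefix.
#   $1 = pod name prefix
first_pod() {
  # "kubectl get pods -o name" prints entries such as pod/<name>;
  # strip the "pod/" part, keep names matching the prefix, take the first
  kubectl get pods -o name | sed 's|^pod/||' | grep "^$1" | head -n 1
}

# Example: set POD for later commands in this terminal window
#   POD=$(first_pod liberty1)
#   kubectl logs $POD
```

When several replicas share the prefix, this picks whichever is listed first, so check the chosen name before gathering data from it.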
Document Location
Worldwide
Document Information
Modified date: 22 May 2024
UID: ibm17152478