IBM Support

MustGather: Liberty on Kubernetes without the Liberty Operator

Troubleshooting


Problem

This is a problem determination document to help collect data for Liberty on Kubernetes without the Liberty Operator.

Diagnosing The Problem


Gather logs and configuration

Perform the following steps to gather logs and configuration:
  1. Ensure you are in the right namespace of the target application pods by replacing $NAME with your namespace:
    kubectl config set-context --current --namespace=$NAME
  2. Find the relevant pods:
    kubectl get pods
  3. For each relevant pod, send the standard logs to a local file, replacing $POD twice based on the NAME in the previous step:
    kubectl logs --all-containers=true $POD > $POD.txt
  4. For each relevant pod, open a shell inside it, replacing $POD based on the output above (if there are multiple containers, use -c $CONTAINER after $POD to specify the Liberty container):
    kubectl exec -it $POD -- /bin/sh
  5. Optional: Copy process, scheduler, memory, and cgroup information into /tmp:
    cp -R --no-preserve=all --parents /proc/cpuinfo /proc/stat /proc/schedstat /proc/vmstat /proc/meminfo /proc/version /proc/pressure /proc/loadavg /proc/[0-9]*/cgroup /proc/[0-9]*/environ /proc/[0-9]*/cmdline /proc/[0-9]*/smaps /proc/[0-9]*/limits /proc/[0-9]*/stat /proc/[0-9]*/status /proc/[0-9]*/sched /proc/[0-9]*/schedstat /proc/[0-9]*/wchan /proc/[0-9]*/task/*/stat /proc/[0-9]*/task/*/wchan /proc/[0-9]*/task/*/sched /proc/[0-9]*/task/*/status /sys/fs/cgroup/cpu* /sys/fs/cgroup/memory* /sys/fs/cgroup/*/*/cpu* /sys/fs/cgroup/*/*/memory* /tmp/ 2>/dev/null
  6. Create a compressed file with all logs and configuration:
    tar czhf /tmp/liberty_${HOSTNAME}_$(date +%Y%m%d_%H%M%S).tar.gz /logs /config /serviceability/*/${HOSTNAME} /opt/*/wlp/usr/servers/*/logs /opt/*/wlp/usr/servers/*/configDropins /opt/*/wlp/usr/servers/*/*xml /opt/*/wlp/usr/servers/*/*options /opt/*/wlp/usr/servers/*/*env /opt/*/wlp/usr/servers/*/*properties /opt/*/wlp/usr/servers/*/javacore* /opt/*/wlp/usr/servers/*/verbosegc* /opt/*/wlp/usr/servers/*/heapdump* /opt/*/wlp/usr/servers/*/core* ${LOG_DIR} ${X_LOG_DIR} ${WLP_OUTPUT_DIR} ${SERVER_WORKING_DIR} ${VARIABLE_SOURCE_DIRS} ${JAVA_HOME}/jre/lib/security/java.security /tmp/proc /tmp/sys 2>/dev/null
  7. Optional: If you performed step 5, delete the temporary files that it produced:
    rm -rf /tmp/proc /tmp/sys
  8. List the compressed file and then exit the container:
    cd /tmp; ls *.tar.gz; exit &>/dev/null
  9. Download the compressed file, replacing $POD with the pod name and $FILE twice from the output of the previous step:
    kubectl cp $POD:/tmp/$FILE $FILE --retries=999
  10. Gather various resource state:
    1. For each relevant pod, describe the pod, replacing $POD twice with the pod name (the braces keep the shell from reading $POD_describe as a single variable name):
      kubectl describe pod $POD > ${POD}_describe.txt
  11. Upload:
    1. Pod standard logs (step 3)
    2. Liberty logs and configuration (step 9)
    3. Resource state text files (step 10)
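
When several pods are involved, the per-pod commands in steps 3 and 10 can be wrapped in a small shell function. The following is a minimal sketch, not part of the official procedure; the function name collect_pod_state is hypothetical, and it assumes kubectl is already pointed at the right namespace (step 1).

```shell
# Sketch only: collect_pod_state is a hypothetical helper name.
# For every pod name passed as an argument, save the standard logs
# and the "describe" output to files in the current directory.
collect_pod_state() {
  for pod in "$@"; do
    # Step 3: standard logs from all containers in the pod
    kubectl logs --all-containers=true "$pod" > "${pod}.txt"
    # Step 10: resource state of the pod
    kubectl describe pod "$pod" > "${pod}_describe.txt"
  done
}
```

For example, running collect_pod_state $POD writes $POD.txt and ${POD}_describe.txt; several pod names may be passed at once.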

Enable diagnostic trace at runtime

Perform the following steps to enable diagnostic trace at runtime:
  1. If runtime configuration updates are not enabled, then you must enable diagnostic trace at startup.
  2. If runtime configuration updates are enabled (as they are by default):
    1. Ensure you are in the right namespace of the target application pods by replacing $NAME with your namespace:
      kubectl config set-context --current --namespace=$NAME
    2. Find the relevant pods:
      kubectl get pods
    3. For each of the relevant pods, replace $POD based on the output above and replace $TRACE with an IBM support-requested trace specification or your desired trace specification (if there are multiple containers, use -c $CONTAINER after $POD to specify the Liberty container):
      kubectl exec -it $POD -- /bin/sh -c "echo '<?xml version=\"1.0\" encoding=\"UTF-8\"?><server><logging traceSpecification=\"*=info:$TRACE\" maxFileSize=\"100\" maxFiles=\"10\" /></server>' > /config/configDropins/overrides/trace.xml"
    4. Reproduce the problem
    5. Disable the diagnostic trace:
      kubectl exec -it $POD -- /bin/sh -c "rm /config/configDropins/overrides/trace.xml"
    6. Gather and upload all logs
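
The nested quoting in step 3 is easy to get wrong when typed by hand. As an alternative sketch, the same XML can be generated locally and piped into the pod over standard input. The function names below are hypothetical; the file path and logging attributes are the ones used in the steps above. Note that piping requires kubectl exec -i without -t.

```shell
# trace_xml: print a Liberty logging override for the trace specification in $1.
trace_xml() {
  printf '<?xml version="1.0" encoding="UTF-8"?><server><logging traceSpecification="*=info:%s" maxFileSize="100" maxFiles="10" /></server>\n' "$1"
}

# enable_trace: $1 is the trace specification, remaining arguments are pod names.
# The XML is sent on stdin, so -i is used without -t (no terminal is allocated).
enable_trace() {
  spec=$1
  shift
  for pod in "$@"; do
    trace_xml "$spec" | kubectl exec -i "$pod" -- /bin/sh -c 'cat > /config/configDropins/overrides/trace.xml'
  done
}

# disable_trace: remove the override from every pod given as an argument.
disable_trace() {
  for pod in "$@"; do
    kubectl exec "$pod" -- /bin/sh -c 'rm /config/configDropins/overrides/trace.xml'
  done
}
```

For example, enable_trace "$TRACE" "$POD" enables the trace, and disable_trace "$POD" removes it after the problem has been reproduced.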

Enable diagnostic trace at startup

Perform the following steps to enable diagnostic trace at startup:
  1. Ensure you are in the right namespace of the target application pods by replacing $NAME with your namespace:
    kubectl config set-context --current --namespace=$NAME
  2. In your current directory, create a local file named tracefromstartup.xml with the following contents and replace $TRACE with an IBM support-requested trace specification or your desired trace specification:
    <?xml version="1.0" encoding="UTF-8"?>
    <server>
      <logging traceSpecification="*=info:$TRACE" maxFileSize="100" maxFiles="10" />
    </server>
  3. Create a ConfigMap entry based on this local file:
    kubectl create configmap tracefromstartup --from-file tracefromstartup.xml
  4. List the relevant Liberty Deployment:
    kubectl get deployment
  5. Edit the relevant Liberty Deployment, replacing $NAME with the Deployment name from the previous step (not the namespace):
    kubectl edit deployment $NAME
  6. In the spec section of the target container template, add or edit a volumes section that mounts the ConfigMap and a volumeMounts section that places the file into the container; for example:
    spec:
      template:
        spec:
          containers:
          - image: [...]
            volumeMounts:
            - name: tracefromstartup
              mountPath: /config/configDropins/overrides/tracefromstartup.xml
              subPath: tracefromstartup.xml
          volumes:
          - name: tracefromstartup
            configMap:
              name: tracefromstartup
    
  7. Save and quit the editor. Ensure that the change succeeded by verifying the output ends with "edited"; for example:
    deployment.apps/libertysample edited
    
  8. After a short time, Kubernetes should delete the old pods and create new pods with the trace enabled.
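
Rather than waiting an unspecified time, the rollout can be watched and the mounted file checked. The following is a sketch under assumptions: the function names are hypothetical, the 300-second timeout is an arbitrary choice, and the file path is the mountPath from the example in step 6.

```shell
# wait_for_rollout: $1 is the Deployment name.
# Blocks until the updated pods are rolled out or the timeout expires.
wait_for_rollout() {
  kubectl rollout status "deployment/$1" --timeout=300s
}

# check_trace_mounted: $1 is the name of one of the new pods.
# Lists the override file; a non-zero exit status means it is not present.
check_trace_mounted() {
  kubectl exec "$1" -- /bin/sh -c 'ls -l /config/configDropins/overrides/tracefromstartup.xml'
}
```

For example, run wait_for_rollout libertysample, then find a new pod with kubectl get pods and pass its name to check_trace_mounted.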

Change Java options at startup

The JVM_ARGS Liberty environment variable may be configured at container startup to change JVM options. Perform the following steps to add Java options at startup:
  1. First, check if this environment variable is already set:
    1. Ensure you are in the right namespace of the target application pods by replacing $NAME with your namespace:
      kubectl config set-context --current --namespace=$NAME
    2. Find the relevant pods:
      kubectl get pods
    3. For one of the relevant pods, replace $POD based on the output above to search its environment variables (if there are multiple containers, use -c $CONTAINER after $POD to specify the Liberty container):
      kubectl exec -it $POD -- /bin/sh -c "cat /proc/[0-9]*/environ | tr '\0' '\n' | grep JVM_ARGS="
  2. Append your desired arguments to the value found in the above steps (if any).
  3. List the relevant Liberty Deployments:
    kubectl get deployments
  4. Edit the relevant Liberty Deployment, replacing $NAME with the Deployment name from the previous step (not the namespace):
    kubectl edit deployment $NAME
  5. In the spec section, add or edit the JVM_ARGS entry in the env section; for example:
    spec:
      template:
        spec:
          containers:
          - image: [...]
            env:
            - name: JVM_ARGS
              value: -Djavax.net.debug=all
    
  6. Save and quit the editor. Ensure that the change succeeded by verifying the output ends with "edited"; for example:
    deployment.apps/libertysample edited
    
  7. After a short time, Kubernetes should delete the old pods and create new pods with the new arguments. You can verify by describing a new pod; for example:
    $ kubectl describe pod libertysample-5545f8475b-hzj42
      [...]
        Environment:
          JVM_ARGS:     -Djavax.net.debug=all
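
As a non-interactive alternative to kubectl edit, the same change can be sketched with kubectl set env, which updates the environment of a Deployment's pod template and triggers a rollout. The function names below are hypothetical. If the Deployment has more than one container, kubectl set env applies to all of them unless restricted with -c, so treat this as a starting point rather than a drop-in replacement for steps 3 to 6.

```shell
# merge_jvm_args: print the existing value ($1) and the new options ($2)
# joined by a single space, or only the new options if nothing was set before.
merge_jvm_args() {
  existing=$1
  new=$2
  if [ -n "$existing" ]; then
    printf '%s %s\n' "$existing" "$new"
  else
    printf '%s\n' "$new"
  fi
}

# set_jvm_args: $1 is the Deployment name, $2 is the complete JVM_ARGS value.
set_jvm_args() {
  kubectl set env "deployment/$1" "JVM_ARGS=$2"
}
```

For example, set_jvm_args libertysample "$(merge_jvm_args "$CURRENT" "-Djavax.net.debug=all")", where CURRENT is an assumed variable holding the value found in step 1 (empty if none was set).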
    

Execute a server dump

Perform the following steps to execute and gather a Liberty server dump.
Warning: These commands will start a new process which will consume some memory (likely in the range of dozens of MB). If your container has a memory limit and it is near its limit, this may cause the container to crash. Alternatively, you may gather logs and configuration manually.
  1. Ensure you are in the right namespace of the target application pods by replacing $NAME with your namespace:
    kubectl config set-context --current --namespace=$NAME
  2. Find the relevant pods:
    kubectl get pods
  3. For each of the relevant pods, replace $POD based on the output above (if there are multiple containers, use -c $CONTAINER after $POD to specify the Liberty container):
    kubectl exec -it $POD -- /bin/sh -c "/opt/*/wlp/bin/server dump --include=thread"
  4. The output of the command should state where the server dump is written; for example:
    Dumping server defaultServer.
    Server defaultServer dump complete in /opt/ibm/wlp/output/defaultServer/defaultServer.dump-24.05.15_17.11.35.zip.
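  5. Download the file, replacing $POD with the pod name and $DUMP with the full path from the output of the previous step (without the period at the end). Do not store this path in a shell variable named PATH, because that would break command lookup in your terminal:
    kubectl cp $POD:$DUMP serverdump.zip --retries=999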
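
The dump path can also be extracted from the command output automatically. The following is a sketch that assumes the output has the same form as the example in step 4; the function names are hypothetical. It runs kubectl exec without -it so that the output can be captured cleanly, and it strips any carriage returns in case a terminal was allocated.

```shell
# dump_path_from_output: read "server dump" output on stdin and print the
# dump file path without the sentence-ending period.
dump_path_from_output() {
  tr -d '\r' | sed -n 's/^Server .* dump complete in \(.*\)\.$/\1/p'
}

# fetch_server_dump: $1 is the pod name. Runs the dump, finds the path,
# and downloads the file as serverdump.zip in the current directory.
fetch_server_dump() {
  pod=$1
  out=$(kubectl exec "$pod" -- /bin/sh -c '/opt/*/wlp/bin/server dump --include=thread')
  dump=$(printf '%s\n' "$out" | dump_path_from_output)
  if [ -z "$dump" ]; then
    echo "no dump path found in command output" >&2
    return 1
  fi
  kubectl cp "$pod:$dump" serverdump.zip --retries=999
}
```

For example, fetch_server_dump "$POD" performs steps 3 to 5 for one pod. The memory warning at the top of this section still applies.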

Gather data on a performance, hang, or high CPU issue


Notes & Tips

  1. If you are on macOS or Linux (or Cygwin on Windows), you may use shell variables to simplify the above commands. For example, many commands use $POD for the target pod. If you first set the POD variable, later references to $POD in the same terminal window expand to that value, so you can copy and paste the commands above without modifying them. In the following example, POD is set to liberty1-5545f8475b-zdwmg, so the final kubectl logs $POD command uses that pod:
    $ kubectl get pods
    NAME                                           READY   STATUS    RESTARTS   AGE
    liberty1-5545f8475b-zdwmg                      1/1     Running   2          16d
    websphereliberty-app-sample-6f698f5bcb-srn55   1/1     Running   2          16d
    $ POD=liberty1-5545f8475b-zdwmg
    $ kubectl logs $POD
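
One caveat when relying on shell variables: if the variable is immediately followed by characters that are valid in a variable name, such as an underscore, wrap the name in braces so the shell knows where it ends. The short sketch below illustrates the difference using the pod name from the example above; the variable names logs_file and describe_file are only for illustration.

```shell
POD=liberty1-5545f8475b-zdwmg

# A period cannot be part of a variable name, so $POD.txt expands as expected.
logs_file="$POD.txt"

# An underscore can be part of a name, so $POD_describe.txt would look up a
# variable called POD_describe. Braces mark where the name ends.
describe_file="${POD}_describe.txt"

echo "$logs_file"
echo "$describe_file"
```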
    

Document Location

Worldwide

Document Information

Modified date:
22 May 2024

UID

ibm17152481