Troubleshooting

You can use the operator and Ansible containers to retrieve log files of the installed operator.

About this task

The ibm-cp4a-operator locates the Cloud Pak base images and uses Ansible roles to handle the reconciliation logic. Each role declares a set of playbook tasks for a component, together with all the variables and defaults that control how the role is executed.

The operator deployment creates two containers on your cluster, one for the operator and one for Ansible. By default Ansible uses /etc/ansible/hosts as its inventory file; the inventory represents the host machines that Ansible manages. You can create a different inventory file for each project that you have.

The following diagram shows how the operator watches for events, triggers an Ansible role when a custom resource changes, and then reconciles the resources for the deployed applications.

[Figure: Operator workflow]
Getting the Ansible and operator logs
Note: For 20.0.1 and 20.0.2, the Ansible container shows the standard Ansible stdout logs. To see the logs of the Ansible container, run the following command.

kubectl logs deployment/ibm-cp4a-operator -c ansible > ansible.log

The operator logs contain much more detail about the operator than Kubernetes itself reports. To see the logs of the operator container, run the following command.

kubectl logs deployment/ibm-cp4a-operator -c operator > operator.log
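If you need both logs together, for example to attach them to a support case, the two commands above can be wrapped in one small function. The following is a minimal sketch only: the function name collect_operator_logs and the default output directory are assumptions made for illustration, and --timestamps is a standard kubectl logs flag that prefixes each line with its time.

```shell
# Sketch: save the logs of both operator containers into one directory.
# collect_operator_logs and ./cp4a-logs are hypothetical names, not part of the product.
collect_operator_logs() {
  out_dir="${1:-./cp4a-logs}"
  mkdir -p "$out_dir" || return 1
  for container in ansible operator; do
    # One file per container, each line prefixed with a timestamp.
    kubectl logs deployment/ibm-cp4a-operator -c "$container" --timestamps \
      > "$out_dir/$container.log"
  done
}
```

Run collect_operator_logs /tmp/cp4a-logs in the namespace of the operator to produce ansible.log and operator.log in that directory.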
If the operator log does not provide the level of detail that you need, you can gather more details by adding an annotation like the following example to your custom resource YAML:
metadata:
 ...
 annotations:
  "ansible.operator-sdk/verbosity": "3"
spec:

For the verbosity value, the normal rules for Ansible verbosity apply, where higher values mean more output. Acceptable values range from 0 (only the most severe messages are output) to 7 (all debugging messages are output).

After you update the custom resource YAML, reapply the YAML for the changes to take effect.
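As an alternative to editing the YAML file, the same annotation can be set on the live custom resource with kubectl annotate. This is a sketch under two assumptions: the custom resource is of kind icp4acluster, and the operator reads the annotation on its next reconcile in the same way as when the YAML is reapplied. The function name set_operator_verbosity is hypothetical.

```shell
# Sketch: set the verbosity annotation on an existing custom resource.
# set_operator_verbosity is a hypothetical helper name.
set_operator_verbosity() {
  cr_name="$1"
  level="${2:-3}"
  # Accept only the documented range of 0 to 7.
  case "$level" in
    [0-7]) ;;
    *) echo "verbosity must be between 0 and 7" >&2; return 1 ;;
  esac
  kubectl annotate icp4acluster "$cr_name" \
    "ansible.operator-sdk/verbosity=$level" --overwrite
}
```

For example, set_operator_verbosity <name> 5 raises the verbosity to 5. If you later reapply a YAML file that contains a different verbosity annotation, the value in the file takes effect.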

Getting the runtime logs
For 20.0.1 and 20.0.2: For runtime logs, open a shell in the pod that runs the Ansible container. The runner keeps information about each Ansible run inside the container, under /tmp/ansible-operator/runner/<group>/<version>/<kind>/<namespace>/<name>.
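To look at the runner directory without opening an interactive shell, you can run a single command in the container with kubectl exec. The sketch below assumes that the container is named ansible, that ls is available in the image, and that the group, version, and kind are icp4a.ibm.com, v1, and ICP4ACluster. The function name list_runner_files is hypothetical.

```shell
# Sketch: list the runner files for one custom resource.
# list_runner_files is a hypothetical helper; the container name "ansible" is an assumption.
list_runner_files() {
  namespace="$1"
  cr_name="$2"
  kubectl exec deployment/ibm-cp4a-operator -c ansible -- \
    ls -l "/tmp/ansible-operator/runner/icp4a.ibm.com/v1/ICP4ACluster/$namespace/$cr_name"
}
```

Call it as list_runner_files <namespace> <name>, using the namespace and name of your custom resource.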
Copying the latest logs from the log volume
New in 20.0.3: For troubleshooting purposes, you can also copy the logs from the log volume at /logs/$operator_pod_name/ansible-operator/runner/<group>/<version>/<kind>/<namespace>/<name>/artifacts. The logs contain information about the first 10 reconciles and the latest reconcile. The following commands get the deployment name and the operator pod name, and then copy the logs to a local directory.
deployment_name=$(kubectl get icp4acluster | awk '{print $1}' | grep -v "NAME")
operator_pod_name=$(kubectl get pod|grep ibm-cp4a-operator | awk '{print $1}')
kubectl cp $operator_pod_name:/logs/$operator_pod_name/ansible-operator/runner/icp4a.ibm.com/v1/ICP4ACluster/<namespace>/$deployment_name/artifacts /<local_logpath>
Getting information about pending pods
If some pods are pending, choose one of the pods and run the following command to get more information.
kubectl describe pod <podname> 
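When several pods are pending, it can help to describe all of them in one pass. The following sketch uses the standard status.phase field selector of kubectl get pods. The function names are hypothetical, and the commands act on the current namespace.

```shell
# Sketch: describe every pod in the current namespace that is still Pending.
# list_pending_pods and describe_pending_pods are hypothetical helper names.
list_pending_pods() {
  # -o name prints one "pod/<podname>" entry per line.
  kubectl get pods --field-selector=status.phase=Pending -o name
}

describe_pending_pods() {
  for pod in $(list_pending_pods); do
    kubectl describe "$pod"
  done
}
```

The Events section at the end of each description usually states why the scheduler could not place the pod.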
Getting information about secrets
Kubernetes secrets are used extensively, so output about them might also be useful.
kubectl get secrets
Getting information about events
Kubernetes events are objects that provide more insight into what is happening inside a cluster, such as what decisions the scheduler makes or why some pods are evicted from a node. To get information about these events, run the following command.
kubectl get events > events.log

You can also add the verbosity flag (-v) to any kubectl command to get more detailed output. Higher values produce more detail.

kubectl -v=9 get pods
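The full event list can be long. As a sketch, the following function keeps only events of type Warning and sorts them by creation time, using the standard --field-selector and --sort-by options of kubectl get. The function name save_warning_events and the default file name are assumptions made for illustration.

```shell
# Sketch: save only Warning events, oldest first.
# save_warning_events and warning-events.log are hypothetical names.
save_warning_events() {
  out_file="${1:-warning-events.log}"
  kubectl get events --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp > "$out_file"
}
```

Run save_warning_events without arguments to write the filtered list to warning-events.log in the current directory.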
Recreating the image pull secret
If your Docker registry secret expires, you can delete the secret and re-create it:
oc delete secret admin.registrykey -n <namespace>
oc create secret docker-registry admin.registrykey -n <namespace> --docker-server=image-registry.openshift-image-registry.svc:5000 --docker-username=kubeadmin --docker-password=$(oc whoami -t)
Applying changes by restarting pods
In some cases, changes that you make in the custom resource YAML by using the operator or directly in the environment are not automatically propagated to all pods. For example, modifications to data source information or changes to Kubernetes secrets are not seen by running pods until the pods are restarted.

If changes applied by the operator or other modifications made in the environment do not produce the expected result, restart the pods: scale the impacted deployments down to 0 and then back up to the desired number of replicas, so that Kubernetes (OpenShift) terminates the existing pods and creates new ones.
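The scale-down and scale-up sequence can be sketched as follows. The function name restart_deployment is hypothetical, and the sketch assumes that the current replica count recorded in the deployment is the desired number to restore.

```shell
# Sketch: restart the pods of one deployment by scaling to 0 and back.
# restart_deployment is a hypothetical helper name.
restart_deployment() {
  deployment="$1"
  # Remember the current replica count so that it can be restored.
  replicas=$(kubectl get deployment "$deployment" -o jsonpath='{.spec.replicas}')
  [ -n "$replicas" ] || return 1
  kubectl scale deployment "$deployment" --replicas=0
  kubectl scale deployment "$deployment" --replicas="$replicas"
  # Wait until the new pods are rolled out.
  kubectl rollout status "deployment/$deployment"
}
```

Call restart_deployment <deployment> for each impacted deployment. If a deployment is managed by the operator, the operator might also reset the replica count during its next reconcile.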

What to do next

The custom resource can be configured to enable and disable specific logging parameters, log levels, log formats, and where these logs are stored for the various capabilities. If you need more information about specific Cloud Pak capabilities, go to the relevant troubleshooting topics.