Checking Kubernetes services status

The PowerAI Vision application runs on Kubernetes infrastructure. To check the status of the underlying Kubernetes services, run the kubectl command with the --namespace kube-system option.

Using kubectl get pods to check kube-system

The kubectl get pods command shows a one-line status summary for each pod in the kube-system namespace. These pods provide the Kubernetes infrastructure that the PowerAI Vision application runs on.

Example output

# kubectl get pods --namespace kube-system
NAME                                    READY     STATUS    RESTARTS   AGE
default-http-backend-77c86f88b4-bm9nq   1/1       Running   0          4d
kube-dns-c5b9d46b-fffn7                 3/3       Running   0          4d
nginx-ingress-lb-ppc64le-djkt6          1/1       Running   0          4d
tiller-deploy-5f954f4845-9sr64          1/1       Running   0          4d

Interpreting the output

  • When the Kubernetes system is running correctly, each pod should show:
    • Matching counts in the READY column, for example "1/1" or "3/3", which means that every container in the pod is ready.
    • A value of "Running" in the STATUS column.
  • A STATUS value other than "Running" indicates an issue with the Kubernetes infrastructure.
  • A nonzero value in the RESTARTS column that keeps increasing indicates an issue with that Kubernetes pod.
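
The READY and STATUS checks described above can be scripted. The following is a minimal sketch and not part of the product: check_pods is a hypothetical helper name, and the script assumes the default column layout shown in the example output (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# every pod that is not fully ready or whose STATUS is not "Running".
# Returns 1 if any such pod is found, 0 otherwise.
check_pods() {
  awk '
    $1 == "NAME" { next }                  # skip the header line
    NF >= 4 {
      split($2, ready, "/")                # READY column, for example "3/3"
      if (ready[1] != ready[2] || $3 != "Running") {
        print $1 ": READY=" $2 " STATUS=" $3 " RESTARTS=" $4
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  '
}
```

A possible invocation is kubectl get pods --namespace kube-system | check_pods. The helper reports the current RESTARTS value for each flagged pod but cannot tell whether that value is increasing. To check for a growing restart count, compare two runs of kubectl get pods taken a few minutes apart.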

Using kubectl describe pods to check kube-system

The kubectl describe pods command provides detailed information about each of the pods that provide the Kubernetes infrastructure. To see the output for a single pod only, run the command kubectl describe pod pod_name --namespace kube-system.
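
Pod names end in generated suffixes, so it can be convenient to select pods by the start of their name. The following is a minimal sketch under that assumption: describe_matching is a hypothetical helper name, and it relies on the -o name output option of kubectl get, which prints one resource name per line.

```shell
# Hypothetical helper: describes every kube-system pod whose name starts
# with the given prefix, for example "tiller-deploy".
describe_matching() {
  prefix=$1
  kubectl get pods --namespace kube-system -o name |
    sed 's|^.*/||' |                       # drop the resource-type prefix before the slash
    grep "^${prefix}" |
    while read -r name; do
      kubectl describe pod "$name" --namespace kube-system </dev/null
    done
}
```

For example, describe_matching tiller-deploy would describe the tiller-deploy pod regardless of its generated suffix.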

Example output

The output from the command is verbose, so sample output from only one pod is shown:

# kubectl describe pods --namespace kube-system
...
Name:           tiller-deploy-5f954f4845-9sr64
Namespace:      kube-system
Node:           127.0.0.1/127.0.0.1
Start Time:     Mon, 17 Sep 2018 12:26:33 -0500
Labels:         app=helm
                name=tiller
                pod-template-hash=1951090401
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"tiller-deploy-5f954f4845","uid":"d29f4d8c-ba9e-11e8-9e14-98b...
Status:         Running
IP:             172.17.0.4
Created By:     ReplicaSet/tiller-deploy-5f954f4845
Controlled By:  ReplicaSet/tiller-deploy-5f954f4845
Containers:
  tiller:
    Container ID:   docker://f049f19b4180d0406c04fa7d5ca8993ac1ef596a29d8d0096a54eb504182dd0b
    Image:          ibmcom/tiller-ppc64le:v2.6.0
    Image ID:       docker-pullable://docker.io/ibmcom/tiller-ppc64le@sha256:6dc8e12643d0c78b268f221205e00751ae20d37de31d45af2b21065652fca209
    Port:           44134/TCP
    State:          Running
      Started:      Mon, 17 Sep 2018 12:26:38 -0500
    Ready:          True
    Restart Count:  0
    Liveness:       http-get http://:44135/liveness delay=1s timeout=1s period=10s #success=1 #failure=3
    Readiness:      http-get http://:44135/readiness delay=1s timeout=1s period=10s #success=1 #failure=3
    Environment:
      TILLER_NAMESPACE:  kube-system
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xs2b4 (ro)
Conditions:
  Type           Status
  Initialized    True 
  Ready          True 
  PodScheduled   True 
Volumes:
  default-token-xs2b4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-xs2b4
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  beta.kubernetes.io/os=linux
Tolerations:     node.alpha.kubernetes.io/notReady:NoExecute for 300s
                 node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

Interpreting the output

Significant fields providing status of the Kubernetes pods include:

  • The Status field should be "Running". Any other status indicates issues with the environment.
  • In the Conditions section, the Ready condition should have a Status of "True". Any other value indicates that there are issues with the environment.
  • If a pod has issues, its Events section should contain information about the problems that the pod encountered.
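
Because the full output is verbose, the fields listed above can be filtered out of it. The following is a minimal sketch: pod_summary is a hypothetical helper name, and the script assumes the field layout shown in the example output, where section names such as Conditions and Events start in the first column.

```shell
# Hypothetical helper: reads `kubectl describe pod(s)` output on stdin and
# prints the pod name, the pod-level Status, the Ready condition, and the
# Events section for each pod.
pod_summary() {
  awk '
    /^Name:/   { print "Pod: " $2 }
    /^Status:/ { print "Status: " $2 }
    /^[^ \t]/  { section = $1 }            # an unindented line starts a new section
    section == "Conditions:" && $1 == "Ready" { print "Ready: " $2 }
    section == "Events:" && NF                { print }
  '
}
```

A possible invocation is kubectl describe pods --namespace kube-system | pod_summary. The container-level State and Ready fields are indented, so this filter ignores them.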