Troubleshooting Calico networks

Identifying and investigating Calico network issues.

Calico network issues might show up during or after IBM® Cloud Private installation. During installation, the installer runs checks to ensure seamless pod-to-pod connectivity in your cluster. However, if you still face issues, the following information might help you to identify the causes and resolve the issues.

Issues during installation of IBM Cloud Private

To avoid Calico network issues during installation, ensure that the following settings are correctly configured.

Issues after installation of IBM Cloud Private

After your cluster is installed, you might see IP connectivity issues between pods. Service name resolution issues are a symptom of pods being unable to reach the DNS service, but they are not always related to Calico networks.

In such situations, gather the following information from your cluster for troubleshooting. If you contact the support team for assistance, you can provide this information to the team.

  1. Set up the Kubernetes CLI (kubectl). See Accessing your cluster from the Kubernetes CLI (kubectl).
  2. Set up the calicoctl binary file that is available from the IBM Cloud Private installation media. See Installing the Calico CLI (calicoctl).
  3. Get the list of nodes in your cluster.

    kubectl get nodes -o wide
    
  4. Collect logs from the calico-node-* pod that runs on the node that is experiencing the mesh problem. For example, complete the following steps to get the logs from calico-node-amd64-48lf9 that runs on node 10.10.25.71.

    1. Get a list of Calico pods.

      kubectl get pods -o wide | grep calico-node
      

      The following is sample output:

      calico-node-amd64-2cbjh                              2/2       Running   0          7h        10.10.25.70    10.10.25.70
      calico-node-amd64-48lf9                              2/2       Running   0          7h        10.10.25.71    10.10.25.71
      calico-node-amd64-75667                              2/2       Running   0          7h        10.10.25.7     10.10.25.7
      
    2. Retrieve the logs from the calico-node container in the pod.

      kubectl logs calico-node-amd64-48lf9 -c calico-node
      
  5. Diagnose the problem.

    1. Get the routing table and interface details. Run these commands on all master nodes and on the nodes that have the pods that are experiencing connectivity issues.

      1. Get routing table details.

        route -n
        
      2. Get interface details.

        ifconfig -a
        
    2. Get the Calico node list. Run the command on any master node.

      calicoctl get nodes
      
    3. Get all the pods or endpoints that are on the Calico mesh. Run the command on any master node.

      calicoctl get workloadendpoints
      
    4. Get Calico node status and diagnostics information. Run these commands on any master node and on the nodes that have the pods that are experiencing connectivity issues.

      calicoctl node status
      calicoctl node diags
      
    5. Check the config.yaml and hosts files that are on your boot node.
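
The collection steps above can be gathered into one script so that the output is ready to attach to a support request. The following is a minimal sketch, not part of the product: the function names calico_pod_on_node and collect_calico_info, the output directory name, and the output file names are all illustrative. It assumes that kubectl and calicoctl are already set up as described in steps 1 and 2, that the current kubectl namespace contains the Calico pods (as the commands above do), and that the node name is the seventh column of the pod list, as in the sample output. The route, ifconfig, and calicoctl node status commands report only on the host where the script runs, so run it on each node of interest.

```shell
#!/usr/bin/env bash
# Sketch only: function, directory, and file names are illustrative.

# Print the calico-node pod that runs on the given node.
# Reads `kubectl get pods -o wide` output on stdin and assumes that the
# NODE column is field 7, as in the sample output above. The comparison
# is exact, so 10.10.25.7 does not also match 10.10.25.70.
calico_pod_on_node() {
  awk -v node="$1" '$1 ~ /^calico-node/ && $7 == node { print $1 }'
}

# Save the output of the commands from steps 3 to 5 under one directory.
# Usage: collect_calico_info <node-ip> [output-dir]
collect_calico_info() {
  local node_ip="$1" out_dir="${2:-calico-info}" pod

  mkdir -p "$out_dir"

  # Step 3: list of nodes in the cluster.
  kubectl get nodes -o wide > "$out_dir/nodes.txt"

  # Step 4: logs from the calico-node container on the affected node.
  pod=$(kubectl get pods -o wide | calico_pod_on_node "$node_ip")
  if [ -n "$pod" ]; then
    kubectl logs "$pod" -c calico-node > "$out_dir/$pod.log"
  else
    echo "No calico-node pod found on $node_ip" >&2
  fi

  # Step 5: routing table, interfaces, and Calico state for this host.
  route -n > "$out_dir/routes.txt"
  ifconfig -a > "$out_dir/interfaces.txt"
  calicoctl get nodes > "$out_dir/calico-nodes.txt"
  calicoctl get workloadendpoints > "$out_dir/calico-endpoints.txt"
  calicoctl node status > "$out_dir/calico-node-status.txt"
}
```

To use the sketch, source the file in a shell on the node and run, for example, collect_calico_info 10.10.25.71. The pod lookup uses an exact field comparison instead of grep because a plain text search for 10.10.25.7 would also match 10.10.25.70 and 10.10.25.71 in the sample output. The calicoctl node diags command is left out because it writes its own diagnostics bundle; run it separately as shown in step 5.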