How to Use the IBM Cloud Kubernetes Service Autoscaler

A look at how you can use autoscaler to scale your worker nodes.

If you want to scale the pods in a Kubernetes cluster, you can do so easily through a ReplicaSet within Kubernetes. But what if you want to scale your worker nodes? In this situation, the cluster autoscaler can help you avoid having pods in a pending state because of a lack of computational resources. It automatically increases or decreases the number of worker nodes in your cluster based on resource demand.

Scale-up and scale-down

To start, we need to understand how scale-up and scale-down work, and it’s important to understand the criteria used. The autoscaler works based on the resource request value defined for your deployments/pods and not on the value that is being consumed by the application.

Scale-up: This situation occurs when you have pending pods because there are insufficient computing resources.
Scale-down: This occurs when worker nodes are considered underutilized. By default, a node is considered underutilized when the resources requested on it fall below 50% of its capacity.
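
The two criteria above can be sketched in Python. This is a simplified model for illustration only, not the autoscaler's actual implementation: the Node class, its field names, and the sample numbers are hypothetical, and only the 50% default threshold comes from the text. The real autoscaler also looks at CPU and at other conditions, such as whether a node's pods can be moved elsewhere.

```python
from dataclasses import dataclass

# Default scale-down utilization threshold described above (50%).
SCALE_DOWN_THRESHOLD = 0.5


@dataclass
class Node:
    """Hypothetical, simplified view of a worker node (memory only, in Mi)."""
    name: str
    allocatable_mi: int  # memory the node can hand out to pods
    requested_mi: int    # sum of the memory *requests* of the pods on the node

    @property
    def utilization(self) -> float:
        # Based on requests, not on what the containers actually consume.
        return self.requested_mi / self.allocatable_mi


def needs_scale_up(pending_pods: int) -> bool:
    """Scale-up is triggered by pods that cannot be scheduled."""
    return pending_pods > 0


def scale_down_candidates(nodes, threshold=SCALE_DOWN_THRESHOLD):
    """Return the names of nodes whose requested share is below the threshold."""
    return [n.name for n in nodes if n.utilization < threshold]


nodes = [
    Node("node-a", allocatable_mi=4000, requested_mi=3000),  # 75% requested
    Node("node-b", allocatable_mi=4000, requested_mi=1000),  # 25% requested
]
print(needs_scale_up(pending_pods=1))  # True
print(scale_down_candidates(nodes))    # ['node-b']
```

Note that a node whose pods are idle can still sit above the threshold if those pods request a lot, which is why the requests in your deployments matter more than the observed load.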

Step-by-step instructions

In this step-by-step guide, we will show you how to install and configure the autoscaler on your IBM Cloud Kubernetes Service cluster and perform a little test to see how it works in practice.

Before you start, you'll need to install the required CLIs on your computer: ibmcloud and Helm version 3 (the correct version is important because the commands differ between Helm versions).

1. Confirm that your credentials are stored in your Kubernetes cluster

kubectl get secrets -n kube-system | grep storage-secret-store

If you do not have credentials stored, you can create one.

2. Check if your worker pool has the required label

ibmcloud ks worker-pool get --cluster <cluster_name_or_ID> --worker-pool <worker_pool_name_or_ID> | grep Labels

If you don’t have the required label, you have to add a new worker pool. If you don’t know the <worker_pool_name_or_ID>, you can get it through this command:

ibmcloud ks worker-pool ls --cluster <cluster_name_or_ID>

3. Add and update the Helm repo on your computer

helm repo add iks-charts https://icr.io/helm/iks-charts

helm repo update

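Note: If you try to add a repo and receive an error message that says "Error: Couldn't load repositories file…", you are probably running Helm version 2, where the fix is to initialize Helm by typing helm init at the prompt. Helm version 3 no longer has the helm init command, so check your version with helm version and upgrade to version 3 before you continue.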

4. Install the cluster autoscaler helm chart in the kube-system namespace:

helm install ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler --namespace kube-system --set workerpools[0].<pool_name>.max=<number_of_workers>,workerpools[0].<pool_name>.min=<number_of_workers>,workerpools[0].<pool_name>.enabled=(true|false)

workerpools[0]: The first worker pool to enable autoscaling.
<pool_name>: The name or ID of the worker pool.
max=<number_of_workers>: Specify the maximum number of worker nodes.
min=<number_of_workers>: Specify the minimum number of worker nodes.
enabled=(true|false): Set to true to enable autoscaling for the worker pool, or false to disable it.

Set the min value of the autoscaler to at least the current number of worker nodes in your pool, because setting a min size does not automatically trigger a scale-up.

If you set up the autoscaler with a min size below the current number of worker nodes, the autoscaler does not start, and you must correct the value before it works properly.

Note: In this option, we are using all the default values of the autoscaler and just specifying the minimum and maximum number of worker nodes. However, there are several options that you can change by using the --set option.
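
The --set value in the install command is easy to mistype, so it can help to generate it. The following is a minimal sketch in Python: build_set_values is a hypothetical helper (it is not part of Helm or the ibmcloud CLI), and the pool name and worker counts in the example are placeholders.

```python
def build_set_values(pool_name: str, min_workers: int, max_workers: int,
                     index: int = 0, enabled: bool = True) -> str:
    """Build the value passed to helm's --set flag for one worker pool.

    Hypothetical helper: it only assembles the string shown in step 4.
    """
    if min_workers < 0:
        raise ValueError("min must not be negative")
    if max_workers < min_workers:
        raise ValueError("max must be greater than or equal to min")
    prefix = f"workerpools[{index}].{pool_name}"
    return ",".join([
        f"{prefix}.max={max_workers}",
        f"{prefix}.min={min_workers}",
        f"{prefix}.enabled={'true' if enabled else 'false'}",
    ])


print(build_set_values("default", min_workers=2, max_workers=5))
# workerpools[0].default.max=5,workerpools[0].default.min=2,workerpools[0].default.enabled=true
```

When you paste the result into the helm command, consider wrapping it in quotes, because some shells (zsh, for example) treat the square brackets as a file name pattern.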

5. Confirm that the pod is running and the service has been created

kubectl get pods --namespace=kube-system | grep ibm-iks-cluster-autoscaler

kubectl get service --namespace=kube-system | grep ibm-iks-cluster-autoscaler

6. Verify if the status of ConfigMap is in SUCCESS state

kubectl get cm iks-ca-configmap -n kube-system -o yaml

How to check if Autoscaler is working

In this example, we will perform a basic test by increasing the number of pods to check the scale-up and then decreasing it to check the scale-down.

Scale-up

Please remember that the autoscaler acts on the pods' request values, so we do not need to perform a stress test; we just have to increase the number of pods until the worker nodes' capacity is reached.

If your deployment does not have the requests set up accordingly, the autoscaler won't work as you expect. In our example, we will use a simple ReplicaSet that deploys four nginx pods.

nginx-test-autoscaler.yaml

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-autoscaler-replicaset
spec:
  selector:
    matchLabels:
      app: webserver
      tier: webserver
  replicas: 4
  template:
    metadata:
      labels:
        app: webserver
        tier: webserver
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        resources:
          requests:
            cpu: 100m
            memory: 500Mi
        ports:
        - containerPort: 80

Note: Take a look at the requests value—we are requesting 100m for CPU and 500Mi for Memory. This is the value of the request for each pod.
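
To see what the whole ReplicaSet asks of the cluster, multiply the per-pod requests by the number of replicas. The sketch below does that arithmetic in Python; the two conversion helpers are hypothetical and only handle the common m, Ki, Mi, and Gi suffixes, not every quantity format that Kubernetes accepts.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '100m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_to_mib(quantity: str) -> float:
    """Convert a memory quantity with a Ki, Mi, or Gi suffix to MiB."""
    factors = {"Ki": 1 / 1024, "Mi": 1, "Gi": 1024}
    for suffix, factor in factors.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor
    raise ValueError(f"unsupported memory quantity: {quantity}")


# Values taken from nginx-test-autoscaler.yaml above.
replicas = 4
total_cpu = replicas * cpu_to_millicores("100m")
total_memory = replicas * memory_to_mib("500Mi")
print(total_cpu, total_memory)  # 400 2000.0
```

The four pods together request 400m of CPU and 2000Mi of memory, and it is these totals that the scheduler and the autoscaler compare against what the worker nodes can offer.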

Create a replica set using the yaml file:

kubectl apply -f nginx-test-autoscaler.yaml

Check the pod status—one of them is in the “Pending” state:

kubectl get pods

Let’s take a look at the pod’s events to verify the reason:

kubectl describe pod <podname> | grep Warning

In this case, the pod is in the pending state because there is insufficient memory on worker nodes.
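
Why exactly one pod ends up pending depends on how much memory is free on each node. The following Python sketch uses a simple first-fit placement on memory requests only; the free-memory figures are hypothetical (they are not taken from the cluster in this article), and the real Kubernetes scheduler weighs many more factors.

```python
def schedule(free_mi_per_node, pod_request_mi, replicas):
    """First-fit placement based on memory requests only.

    Returns a (scheduled, pending) tuple of pod counts.
    """
    free = list(free_mi_per_node)  # copy so the caller's list is untouched
    scheduled = 0
    for _ in range(replicas):
        for i, available in enumerate(free):
            if available >= pod_request_mi:
                free[i] = available - pod_request_mi
                scheduled += 1
                break
    return scheduled, replicas - scheduled


# Hypothetical free memory (Mi) on two worker nodes.
print(schedule([1100, 600], pod_request_mi=500, replicas=4))  # (3, 1)
```

Three pods fit and the fourth has nowhere to go, which is the situation that makes the autoscaler add a worker node.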

Looking at the Kubernetes cluster inside the IBM Cloud portal, we can verify that a new node is being provisioned. The autoscaler identified that there are not enough compute resources to start the pending pod, so it automatically scales up to bring the pod into the running state.

After the nodes are provisioned, we can check the pod’s status and the number of nodes:

kubectl get nodes

kubectl get pods | grep web

Scale-down

In our test, the pods carry no real workload, so we just need to decrease the number of pods, which decreases the total requests. After the autoscaler identifies that the pods on a node request less than 50% of its capacity, it starts the scale-down process.

Let’s change the replicaset from four to two pods and confirm when only two pods are running:

kubectl scale rs web-autoscaler-replicaset --replicas=2

kubectl get rs web-autoscaler-replicaset

kubectl get pods

Looking at the Kubernetes cluster inside the IBM Cloud portal, we can verify that one node is being deleted.

Customizing configuration values

You can change the cluster autoscaler configuration values using the helm upgrade command with the --set option. To learn more about the available parameters, see the "Customizing configuration values (--set)" documentation.

Here are two examples:

Change the scan interval to 5m and enable autoscaling for the default worker pool, with a maximum of five and minimum of three worker nodes per zone:

helm upgrade --set scanInterval=5m --set workerpools[0].default.max=5,workerpools[0].default.min=3,workerpools[0].default.enabled=true ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler -i --recreate-pods --namespace kube-system

Change the scale-down utilization threshold to 0.7 and set the maximum time before the pod is automatically restarted to five minutes:

helm upgrade --set scaleDownUtilizationThreshold=0.7 --set max-inactivity=5min ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler -i --recreate-pods --namespace kube-system

We saw in the examples above that there are options to customize the values, but what if you want to return to the default values? There is a command to reset the settings:
```
helm upgrade --reset-values ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler --recreate-pods --namespace kube-system
```

Conclusion

Autoscaler can help you to avoid having pods in a pending state in your environment due to lack of computational resources by increasing the number of worker nodes and by decreasing them if they are underutilized.

In this article, we gave just one example of how the autoscaler can be used and explored the customization options that let you adapt it to your environment. It is also important to consider how long IBM Cloud takes to provision a new worker node so that you can choose adequate thresholds for your environment. For this reason, we recommend validating your configuration before going into production.

Learn more

Want to get some free, hands-on experience with Kubernetes? Take advantage of interactive, no-cost Kubernetes tutorials by checking out IBM CloudLabs.


Vandiz Vieira Silva

IT Specialist

Carlos Guarany Gomes

IT Architect
