A look at how you can use the cluster autoscaler to scale your worker nodes.

If you want to scale the pods in a Kubernetes cluster, you can do it easily through a ReplicaSet; but what if you want to scale your worker nodes? In this situation, the cluster autoscaler can help you avoid having pods in a pending state in your environment due to a lack of computational resources. It can automatically increase or decrease the number of worker nodes in your cluster based on resource demand.

Scale-up and scale-down

To start, we need to understand how scale-up and scale-down work and the criteria they use. The autoscaler works based on the resource request values defined for your deployments/pods, not on the resources your application actually consumes.

  • Scale-up: This situation occurs when you have pending pods because there are insufficient computing resources.
  • Scale-down: This occurs when worker nodes are considered underutilized. By default, a node is a candidate for scale-down when the pods running on it request less than 50% of its capacity; you can check this yourself, as shown below.
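
You can verify the request totals that the autoscaler evaluates by inspecting a node's allocated resources. For example (substitute one of your own node names):

kubectl describe node <node_name> | grep -A 8 "Allocated resources"

The Requests column in that output, compared against the node's allocatable capacity, is what the autoscaler measures against the threshold.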

Step-by-step instructions

In this step-by-step guide, we will show you how to install and configure the autoscaler on your IBM Cloud Kubernetes Service cluster and perform a little test to see how it works in practice.

Before you start, you'll need to install the required CLIs on your computer: ibmcloud and Helm version 3 (the version is important because the commands differ between Helm 2 and Helm 3).
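
If you still need them, these are the commonly documented one-line installers for Linux (verify the commands for your platform in the official IBM Cloud and Helm docs before running them):

curl -fsSL https://clis.cloud.ibm.com/install/linux | sh
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash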

1. Confirm that your credentials are stored in your Kubernetes cluster

kubectl get secrets -n kube-system | grep storage-secret-store

If you do not have credentials stored, you can create them.

2. Check if your worker pool has the required label

ibmcloud ks worker-pool get --cluster <cluster_name_or_ID> --worker-pool <worker_pool_name_or_ID> | grep Labels

If you don't have the required label, you have to add a new worker pool. If you don't know the <worker_pool_name_or_ID>, you can get it with this command:

ibmcloud ks worker-pool ls --cluster <cluster_name_or_ID>
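
If you do need to create a new worker pool, the command looks roughly like the following sketch; the exact flags depend on your infrastructure provider and CLI version, so check ibmcloud ks worker-pool create --help before running it:

ibmcloud ks worker-pool create classic --name <new_pool_name> --cluster <cluster_name_or_ID> --flavor <flavor> --size-per-zone <number_of_workers>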

3. Add the Helm repo to your computer and update it

helm repo add iks-charts https://icr.io/helm/iks-charts
helm repo update

Note: If you try to add a repo and receive an error message that says "Error: Couldn't load repositories file…", the repositories file does not exist yet. On Helm 2 it is created by running helm init; on Helm 3 there is no helm init, and the file is created automatically the first time helm repo add succeeds.

4. Install the cluster autoscaler helm chart in the kube-system namespace:

helm install ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler --namespace kube-system --set workerpools[0].<pool_name>.max=<number_of_workers>,workerpools[0].<pool_name>.min=<number_of_workers>,workerpools[0].<pool_name>.enabled=(true|false)
  • workerpools[0]: The first worker pool to enable autoscaling.
  • <pool_name>: The name or ID of the worker pool.
  • max=<number_of_workers>: Specify the maximum number of worker nodes.
  • min=<number_of_workers>: Specify the minimum number of worker nodes.
  • enabled=(true|false): Set to true to enable autoscaling for this worker pool.
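
For example, to enable autoscaling on a hypothetical worker pool named default that currently has two workers per zone, allowing it to grow to five:

helm install ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler --namespace kube-system --set workerpools[0].default.max=5,workerpools[0].default.min=2,workerpools[0].default.enabled=true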

It is necessary to set the autoscaler's min value to at least the current number of worker nodes in your pool, because the min size does not automatically trigger a scale-up.

If you set up the autoscaler with a min size below the current number of worker nodes, the autoscaler does not initiate and needs to be set to the correct value before it works properly.
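
To check the current number of worker nodes in your pool before choosing the min value:

ibmcloud ks worker ls --cluster <cluster_name_or_ID> --worker-pool <worker_pool_name_or_ID>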

Note: In this example, we are using all the default values of the autoscaler and just specifying the minimum and maximum number of worker nodes. However, there are several options that you can change by using the --set option.
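
To see every option the chart exposes before overriding anything, you can dump its default values:

helm show values iks-charts/ibm-iks-cluster-autoscaler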

5. Confirm that the pod is running and the service has been created

kubectl get pods --namespace=kube-system | grep ibm-iks-cluster-autoscaler
kubectl get service --namespace=kube-system | grep ibm-iks-cluster-autoscaler

6. Verify that the ConfigMap status is SUCCESS

kubectl get cm iks-ca-configmap -n kube-system -o yaml

How to check if the autoscaler is working

In this example, we will perform a basic test by increasing the number of pods to check the scale-up and then decreasing it to check the scale-down.

Scale-up

Please remember that the autoscaler acts on the pods' request values, so we do not need to perform a stress test; we just have to increase the number of pods until their requests exceed the worker nodes' capacity.

If your deployment does not have its requests set up accordingly, the autoscaler won't work as you expect. In our example, we will use a simple ReplicaSet that deploys four nginx pods.

nginx-test-autoscaler.yaml

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: web-autoscaler-replicaset
spec:
  selector:
    matchLabels:
      app: webserver
      tier: webserver
  replicas: 4
  template:
    metadata:
      labels:
        app: webserver
        tier: webserver
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        resources:
          requests:
            cpu: 100m
            memory: 500Mi
        ports:
        - containerPort: 80

Note: Take a look at the requests values; we are requesting 100m of CPU and 500Mi of memory. This is the request value for each pod.
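
As a quick sanity check (the exact outcome depends on your worker flavor), the four replicas together request the following; if your current nodes cannot fit all four 500Mi requests alongside the system pods, the last pod stays pending:

4 pods x 100m CPU    = 400m CPU requested
4 pods x 500Mi memory = 2000Mi memory requested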

Create the ReplicaSet using the YAML file:

kubectl apply -f nginx-test-autoscaler.yaml

Check the pod status; one of them is in the "Pending" state:

kubectl get pods

Let’s take a look at the pod’s events to verify the reason:

kubectl describe pod <podname> | grep Warning
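
The warning typically looks something like the following illustrative event (the node counts and reason will differ in your cluster):

Warning  FailedScheduling  ...  0/2 nodes are available: 2 Insufficient memory.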

In this case, the pod is in the pending state because there is insufficient memory on the worker nodes.

Looking at the Kubernetes cluster inside the IBM Cloud portal, we can verify that a new node is being provisioned. The autoscaler identified that there were not enough computational resources to start the pending pod, so it is automatically scaling up to put the pod in a running state.

After the nodes are provisioned, we can check the pod’s status and the number of nodes:

kubectl get nodes
kubectl get pods | grep web

Scale-down

In our test, our pods have no workload, so we just need to decrease the number of pods, which decreases the total requests. Once the autoscaler identifies that the pods on a node request less than 50% of its capacity, it starts the scale-down process.

Let's change the ReplicaSet from four to two pods and confirm that only two pods are running:

kubectl scale rs web-autoscaler-replicaset --replicas=2
kubectl get rs web-autoscaler-replicaset
kubectl get pods

Looking at the Kubernetes cluster inside the IBM Cloud portal, we can verify that one node is being deleted.
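
You can also follow the autoscaler's decision-making from its logs. Assuming the deployment shares the release name from step 4 (confirm the exact name with the kubectl get pods command from step 5):

kubectl logs -n kube-system deploy/ibm-iks-cluster-autoscaler --tail=20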

Customizing configuration values

You can change the cluster autoscaler configuration values by using the helm upgrade command with the --set option. If you want to learn more about the available parameters, see the documentation: Customizing configuration values (--set)

Here are three examples:

  1. Change the scan interval to 5m and enable autoscaling for the default worker pool, with a maximum of five and minimum of three worker nodes per zone:
    helm upgrade --set scanInterval=5m --set workerpools[0].default.max=5,workerpools[0].default.min=3,workerpools[0].default.enabled=true ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler -i --recreate-pods --namespace kube-system
  2. Change the scale-down utilization threshold to 0.7 and set the maximum inactivity time before the pod is automatically restarted to five minutes:
    helm upgrade --set scaleDownUtilizationThreshold=0.7 --set max-inactivity=5min ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler -i --recreate-pods --namespace kube-system
  3. We saw in the examples above that there are options to customize the values, but what if you want to return to the default values? There is a command to reset the settings (a verification check follows below):
    helm upgrade --reset-values ibm-iks-cluster-autoscaler iks-charts/ibm-iks-cluster-autoscaler --recreate-pods --namespace kube-system
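
To confirm which overrides are currently applied (and that a reset took effect), you can inspect the release's user-supplied values:

helm get values ibm-iks-cluster-autoscaler --namespace kube-system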

Conclusion

The autoscaler can help you avoid having pods in a pending state in your environment due to a lack of computational resources: it increases the number of worker nodes when needed and decreases them when they are underutilized.

In this article, we gave just one example of how it can be used and explored the customization options so that you can adapt it to your environment. It's also important to take into account the time that IBM Cloud needs to provision a new worker node when choosing the thresholds for your environment, so validating your settings is recommended before going into production.

Learn more

Want to get some free, hands-on experience with Kubernetes? Take advantage of interactive, no-cost Kubernetes tutorials by checking out IBM CloudLabs.
