A look at how you can use the cluster autoscaler to scale your worker nodes.
If you want to scale the pods in a Kubernetes cluster, you can do so easily through a ReplicaSet; but what if you want to scale your worker nodes? In that situation, the cluster autoscaler can help you avoid pods stuck in a pending state due to a lack of computational resources. It increases or decreases the number of worker nodes in your cluster automatically based on resource demand.
Scale-up and scale-down
To start, we need to understand how scale-up and scale-down work and which criteria are used. The autoscaler acts on the resource request values defined for your deployments/pods, not on the resources the application actually consumes.
- Scale-up: This situation occurs when you have pending pods because there are insufficient computing resources.
- Scale-down: This occurs when worker nodes are considered underutilized, that is, when the resources requested on a node fall below the scale-down utilization threshold. The default threshold is 50%.
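To make the two criteria concrete, here is a minimal Python sketch of the decision logic described above. It is an illustration under stated assumptions, not the autoscaler's actual implementation: the names Node, requested_utilization, needs_scale_up, and is_scale_down_candidate are hypothetical, the node sizes are made up, and taking the larger of the CPU and memory ratios as a node's utilization is an assumption.

```python
from dataclasses import dataclass

# Default scale-down utilization threshold mentioned in the text (50%).
SCALE_DOWN_THRESHOLD = 0.5


@dataclass
class Node:
    """Hypothetical model of a worker node and the requests scheduled on it."""
    allocatable_cpu_m: int      # allocatable CPU in millicores
    allocatable_mem_mi: int     # allocatable memory in MiB
    requested_cpu_m: int = 0    # sum of CPU requests of the pods on the node
    requested_mem_mi: int = 0   # sum of memory requests of the pods on the node


def requested_utilization(node: Node) -> float:
    """Utilization computed from requests, not from actual consumption.

    Assumption: the larger of the CPU and memory ratios is used.
    """
    cpu = node.requested_cpu_m / node.allocatable_cpu_m
    mem = node.requested_mem_mi / node.allocatable_mem_mi
    return max(cpu, mem)


def needs_scale_up(pending_pods: int) -> bool:
    """Scale-up is considered when pods are pending for lack of resources."""
    return pending_pods > 0


def is_scale_down_candidate(node: Node, threshold: float = SCALE_DOWN_THRESHOLD) -> bool:
    """A node is a scale-down candidate when its requests fall below the threshold."""
    return requested_utilization(node) < threshold


# Made-up example nodes: 2 vCPU and 4000 MiB allocatable each.
busy = Node(2000, 4000, requested_cpu_m=400, requested_mem_mi=3000)
idle = Node(2000, 4000, requested_cpu_m=200, requested_mem_mi=1000)

print(requested_utilization(busy), is_scale_down_candidate(busy))  # 0.75 False
print(requested_utilization(idle), is_scale_down_candidate(idle))  # 0.25 True
```

Note that neither function looks at real CPU or memory usage: an idle pod with large requests still counts fully toward a node's utilization.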
Step-by-step instructions
In this step-by-step guide, we will show you how to install and configure the autoscaler on your IBM Cloud Kubernetes Service cluster and perform a little test to see how it works in practice.
Before you start, you’ll need to install the required CLIs on your computer: ibmcloud and Helm version 3 (the correct version is important because the commands differ between Helm versions).
1. Confirm that your credentials are stored in your Kubernetes cluster
If you do not have credentials stored, you can create them.
2. Check if your worker pool has the required label
If you don’t have the required label, you have to add a new worker pool. If you don’t know the <worker_pool_name_or_ID>, you can get it through this command:
3. Add and update the Helm repo on your computer
Note: If you try to add a repo and receive an error message that says “Error: Couldn’t load repositories file…”, Helm has not been initialized. At the prompt, type: helm init (this command exists only in Helm version 2; it was removed in Helm 3).
4. Install the cluster autoscaler helm chart in the kube-system namespace:
- workerpools[0]: The first worker pool to enable autoscaling.
- <pool_name>: The name or ID of the worker pool.
- max=<number_of_workers>: Specify the maximum number of worker nodes.
- min=<number_of_workers>: Specify the minimum number of worker nodes.
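These fragments are combined into a single value for Helm's --set option. The Python sketch below shows one way to assemble that string. It assumes the fragments are joined with dots into keys of the form workerpools[0].<pool_name>.max, which is inferred from the list above rather than confirmed by it, so check the chart's documentation before relying on it. The helper name autoscaler_set_value and the pool name default are hypothetical.

```python
def autoscaler_set_value(pool_name: str, min_workers: int, max_workers: int, index: int = 0) -> str:
    """Build a --set value for one worker pool.

    Assumption: keys follow the pattern workerpools[<index>].<pool_name>.<min|max>.
    """
    if min_workers < 0 or max_workers < min_workers:
        raise ValueError("expected 0 <= min_workers <= max_workers")
    prefix = f"workerpools[{index}].{pool_name}"
    return f"{prefix}.max={max_workers},{prefix}.min={min_workers}"


# Hypothetical pool named "default" with two to five worker nodes.
print(autoscaler_set_value("default", min_workers=2, max_workers=5))
# workerpools[0].default.max=5,workerpools[0].default.min=2
```

When you pass the resulting value on a command line, quote it so that the shell does not interpret the square brackets.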
You must set the autoscaler’s min value to at least the current number of worker nodes in your pool, because the min size does not automatically trigger a scale-up.
If you set up the autoscaler with a min size below the current number of worker nodes, the autoscaler does not start, and you need to set the correct value before it works properly:
Note: In this example, we use the autoscaler’s default values and specify only the minimum and maximum number of worker nodes. However, there are several options that you can change by using the --set option.
5. Confirm that the pod is running and the service has been created
6. Verify that the status of the ConfigMap is SUCCESS
How to check if Autoscaler is working
In this example, we will perform a basic test by increasing the number of pods to check the scale-up and then decreasing it to check the scale-down.
Scale-up
Please remember that the autoscaler acts on the pods’ request values, so we do not need to perform a stress test; we just have to increase the number of pods until their requests reach the worker nodes’ limit.
If your deployment does not have its requests set accordingly, the autoscaler won’t work as you expect. In our example, we will use a simple ReplicaSet that deploys four nginx pods.
nginx-test-autoscaler.yaml
Note: Take a look at the requests value: we are requesting 100m of CPU and 500Mi of memory. This is the request for each pod.
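As a rough stand-in for the manifest, the following Python sketch builds a comparable ReplicaSet definition from the values mentioned in the text (four nginx replicas, each requesting 100m of CPU and 500Mi of memory) and prints it as JSON, which kubectl accepts as well as YAML. It is not the original file: the resource name, the label, and the untagged nginx image are hypothetical choices made for the example.

```python
import json


def nginx_replicaset(replicas: int = 4, cpu: str = "100m", memory: str = "500Mi") -> dict:
    """Return a ReplicaSet manifest as a plain dictionary."""
    labels = {"app": "nginx-test-autoscaler"}  # hypothetical label
    return {
        "apiVersion": "apps/v1",
        "kind": "ReplicaSet",
        "metadata": {"name": "nginx-test-autoscaler", "labels": labels},  # hypothetical name
        "spec": {
            "replicas": replicas,
            # The selector must match the labels of the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": "nginx",
                            "image": "nginx",
                            # The autoscaler reacts to these requests, not to real usage.
                            "resources": {"requests": {"cpu": cpu, "memory": memory}},
                        }
                    ]
                },
            },
        },
    }


manifest = nginx_replicaset()
print(json.dumps(manifest, indent=2))
```

With four replicas, the ReplicaSet as a whole requests 400m of CPU and 2000Mi of memory, regardless of how little the nginx pods actually use.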
Create a ReplicaSet using the YAML file:
Check the pod status. One of them is in the “Pending” state:
Let’s take a look at the pod’s events to verify the reason:
In this case, the pod is in the pending state because there is insufficient memory on worker nodes.
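Whether a pod stays pending depends on how much memory each node can still allocate. The Python sketch below illustrates the idea with a simple first-fit placement by memory request. The free-memory figures are made up for the example, the function name place_pods is hypothetical, and the real Kubernetes scheduler considers many more factors than this.

```python
def place_pods(free_mem_mi, pod_request_mi, pod_count):
    """Place pods on nodes first-fit by memory request.

    Returns a tuple (running, pending).
    """
    free = list(free_mem_mi)  # copy so the caller's list is not modified
    pending = 0
    for _ in range(pod_count):
        for i, available in enumerate(free):
            if available >= pod_request_mi:
                free[i] = available - pod_request_mi
                break
        else:
            # No node has enough unrequested memory left for this pod.
            pending += 1
    return pod_count - pending, pending


# Two nodes with 1200 MiB and 700 MiB still available (made-up values),
# four pods requesting 500 MiB each.
print(place_pods([1200, 700], 500, 4))        # (3, 1)

# After a third node with 1200 MiB free is added, every pod fits.
print(place_pods([1200, 700, 1200], 500, 4))  # (4, 0)
```

The second call corresponds to the situation after a scale-up: once another node is available, the pending pod can be scheduled.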
Looking at the Kubernetes cluster in the IBM Cloud portal, we can verify that a new node is being provisioned. The autoscaler identified that there were not enough computational resources to start the pending pod, so it is automatically scaling up so that the pod can run:
After the nodes are provisioned, we can check the pod’s status and the number of nodes:
Scale-down
In our test, the pods carry no real workload, so we just need to decrease the number of pods, which decreases the total requests. After the autoscaler identifies that the pods on a node request less than 50% of its capacity, it starts the scale-down process.
Let’s change the ReplicaSet from four to two pods and confirm that only two pods are running:
Looking at the Kubernetes cluster in the IBM Cloud portal, we can verify that one node is being deleted:
Customizing configuration values
You can change the cluster autoscaler configuration values by using the helm upgrade command with the --set option. To learn more about the available parameters, see Customizing configuration values (--set).
Here are two examples:
- Change the scan interval to 5m and enable autoscaling for the default worker pool, with a maximum of five and minimum of three worker nodes per zone:
- Change the scale-down utilization threshold to 0.7 and set the maximum time in minutes before pods are automatically restarted:
We saw in the examples above that there are options to customize the values, but what if you want to return to the default values? There is a command to reset the settings:
Conclusion
The cluster autoscaler can help you avoid pods stuck in a pending state due to a lack of computational resources by increasing the number of worker nodes, and it decreases them again when they are underutilized.
In this article, we walked through one example of how it can be used and explored the customization options for adapting it to your environment. It’s also important to take into account how long IBM Cloud takes to provision a new worker node, so that you can choose thresholds that suit your environment. For that reason, we recommend validating your configuration before going into production.
Learn more
Want to get some free, hands-on experience with Kubernetes? Take advantage of interactive, no-cost Kubernetes tutorials by checking out IBM CloudLabs.