Adjusting the CPU and memory reservations for integration server or integration runtime pods with the Vertical Pod Autoscaler
The Vertical Pod Autoscaler (VPA) is a Kubernetes resource that automates the setting of resource limits and requests for the pods in a cluster. You can install the VPA in your cluster and then use it to configure autoscaling for your running integration server or integration runtime pods, so that they are not over-resourced and can be scaled up when demand increases.
You configure autoscaling for an integration server or integration runtime by deploying a VPA object that defines minimum and maximum CPU and memory limits for pod containers, with an update policy to automatically modify resource requests. The VPA then monitors CPU and memory usage over an extended period of time, generates recommendations to apply to the integration server or integration runtime pods, and dynamically scales up or scales down as required.
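The effect of the minimum and maximum limits can be pictured as a clamp that is applied to each raw recommendation. The following Python sketch is illustrative only and is not the VPA's implementation; the function name cap_recommendation and the use of integer millicores are assumptions made for this example.

```python
def cap_recommendation(uncapped_millicores: int,
                       min_allowed: int,
                       max_allowed: int) -> int:
    """Clamp a raw CPU recommendation (in millicores) into [min_allowed, max_allowed].

    Hypothetical helper for illustration; the real VPA performs this
    capping internally, based on the container policy in the VPA object.
    """
    if min_allowed > max_allowed:
        raise ValueError("min_allowed must not exceed max_allowed")
    return max(min_allowed, min(uncapped_millicores, max_allowed))


if __name__ == "__main__":
    # Bounds of 100m and 2 CPUs (2000m), as used in the examples in this topic.
    print(cap_recommendation(182, 100, 2000))   # within the bounds, unchanged
    print(cap_recommendation(40, 100, 2000))    # raised to the minimum
    print(cap_recommendation(3500, 100, 2000))  # lowered to the maximum
```

A recommendation that already lies between the two limits passes through unchanged, which is why a capped target and an uncapped target can show the same values.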
Before you begin
Ensure that your cluster administrator has configured cluster metrics.
Ensure that you have cluster administrator permissions and are logged in to your cluster.
Installing the VPA
The VPA comprises three components (admission-controller, recommender, and updater), which are deployed when you install the VPA.
Installing on Red Hat OpenShift
For Red Hat® OpenShift® clusters, see the Red Hat OpenShift documentation for installation details. The VPA is typically installed into the openshift-vertical-pod-autoscaler namespace.
Installing on Kubernetes
For traditional Kubernetes clusters, see the Kubernetes documentation for installation details. The VPA is installed into the kube-system namespace by default.
After you install, ensure that the required permissions are assigned to the VPA deployment as described in Configuring role-based access control (RBAC).
Configuring role-based access control (RBAC)
Specify which autoscaling actions can be performed on deployed integration servers or integration runtimes by assigning RBAC permissions to VPA cluster roles. You can do so by creating ClusterRole and ClusterRoleBinding resources that are linked to the vpa-recommender service account that is installed with the VPA.
To configure RBAC, complete the following steps:
- If you are unsure, identify the namespace of the vpa-recommender service account by running the following command. (You will need to specify this namespace in a later step.)
    oc get sa --all-namespaces | grep vpa-recommender
  You should see output similar to this, which identifies the namespace (openshift-vertical-pod-autoscaler) in the first column:
    openshift-vertical-pod-autoscaler vpa-recommender 2 65m
  Tip: In a Kubernetes environment, the equivalent kubectl get sa command should most likely identify the namespace as kube-system.
- Create the ClusterRole object:
  - From your local computer, create a YAML file (for example, cr_filename.yaml) by copying the following custom resource:
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      metadata:
        name: ibm-appconnect-vpa-recommender
      rules:
        - apiGroups:
            - appconnect.ibm.com
          resources:
            - integrationservers
            - integrationservers/scale
            - integrationruntimes
            - integrationruntimes/scale
          verbs:
            - get
            - list
            - watch
  - From the command line, run the following command to deploy the cluster role:
      oc apply -f cr_filename.yaml
- Create the ClusterRoleBinding object:
  - From your local computer, create a YAML file (for example, crb_filename.yaml) by copying the following custom resource, where vpaRecommenderNamespace is the namespace of the vpa-recommender service account:
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      metadata:
        name: ibm-appconnect-vpa-recommender
      subjects:
        - kind: ServiceAccount
          name: vpa-recommender
          namespace: vpaRecommenderNamespace
      roleRef:
        kind: ClusterRole
        name: ibm-appconnect-vpa-recommender
        apiGroup: rbac.authorization.k8s.io
  - From the command line, run the following command to deploy the cluster role binding:
      oc apply -f crb_filename.yaml
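If you prefer to script the namespace substitution rather than edit the file by hand, one possible approach is sketched below in Python. It reads the namespace from the first column of the oc get sa output and fills it into the ClusterRoleBinding that is described in the steps above. The function names are hypothetical, and the sketch assumes whitespace-separated columns with the namespace first and the service account name second, as in the sample output.

```python
import json
from typing import Optional


def find_sa_namespace(listing: str, sa_name: str = "vpa-recommender") -> Optional[str]:
    """Return the namespace (first column) of the row whose second column is sa_name.

    Assumes the column layout of `oc get sa --all-namespaces`.
    """
    for line in listing.splitlines():
        columns = line.split()
        if len(columns) >= 2 and columns[1] == sa_name:
            return columns[0]
    return None


def cluster_role_binding(namespace: str) -> dict:
    """Build the ClusterRoleBinding from this topic with the namespace filled in."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": "ibm-appconnect-vpa-recommender"},
        "subjects": [{
            "kind": "ServiceAccount",
            "name": "vpa-recommender",
            "namespace": namespace,
        }],
        "roleRef": {
            "kind": "ClusterRole",
            "name": "ibm-appconnect-vpa-recommender",
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }


if __name__ == "__main__":
    # Sample row copied from the output shown in the first step.
    sample = "openshift-vertical-pod-autoscaler   vpa-recommender   2   65m"
    namespace = find_sa_namespace(sample)
    print(json.dumps(cluster_role_binding(namespace), indent=2))
```

The printed JSON can be saved to a file and deployed with oc apply -f in the same way as the YAML file, because the command accepts JSON as well as YAML.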
Next, configure autoscaling for your running integration servers or integration runtimes.
Configuring autoscaling for your running integration servers or integration runtimes
After you install the VPA and set ClusterRole permissions, you can configure the VPA to target your running integration servers or integration runtimes. To configure autoscaling for an integration server or integration runtime, you must deploy a VerticalPodAutoscaler resource, which identifies the containers, the minimum and maximum limits, and the update policy for autoscaling.
Complete the following steps for any running integration server or integration runtime:
- From your local computer, create a file with a .yaml extension (for example, toolkit_vpa_is.yaml for an integration server or toolkit_vpa_ir.yaml for an integration runtime). Copy the following VerticalPodAutoscaler custom resource into the file, where:
  - metadata.name is a unique name for this instance of the VerticalPodAutoscaler custom resource. (In the examples, the value is shown as toolkit-vpa_is for an integration server and toolkit-vpa_ir for an integration runtime.)
  - spec.targetRef.name is set to the name of your integration server or integration runtime. (In the examples, the value is shown as is-toolkit for an integration server and ir-toolkit for an integration runtime.)
  - spec.resourcePolicy.containerPolicies.minAllowed and spec.resourcePolicy.containerPolicies.maxAllowed are set to the lower and upper resource limits for containers that are defined for the integration server or integration runtime.
  VPA custom resource for an integration server:
    apiVersion: "autoscaling.k8s.io/v1"
    kind: VerticalPodAutoscaler
    metadata:
      name: toolkit-vpa_is
    spec:
      targetRef:
        apiVersion: appconnect.ibm.com/v1beta1
        kind: IntegrationServer
        name: is-toolkit
      resourcePolicy:
        containerPolicies:
          - containerName: '*'
            minAllowed:
              cpu: 100m
              memory: 50Mi
            maxAllowed:
              cpu: 2
              memory: 500Mi
            controlledResources: ["cpu", "memory"]
  VPA custom resource for an integration runtime:
    apiVersion: "autoscaling.k8s.io/v1"
    kind: VerticalPodAutoscaler
    metadata:
      name: toolkit-vpa_ir
    spec:
      targetRef:
        apiVersion: appconnect.ibm.com/v1beta1
        kind: IntegrationRuntime
        name: ir-toolkit
      resourcePolicy:
        containerPolicies:
          - containerName: '*'
            minAllowed:
              cpu: 100m
              memory: 50Mi
            maxAllowed:
              cpu: 2
              memory: 500Mi
            controlledResources: ["cpu", "memory"]
  Note: An update policy is not explicitly set in the supplied YAML because by default, the VPA automatically applies resource recommendations by using this spec.updatePolicy.updateMode setting:
    spec:
      ...
      updatePolicy:
        updateMode: "Auto"
- From the command line, run the following command to deploy a VerticalPodAutoscaler object, where filename.yaml is the file to which you saved the VPA custom resource; for example, toolkit_vpa_is.yaml for an integration server or toolkit_vpa_ir.yaml for an integration runtime.
    oc apply -f filename.yaml
  The VPA begins to query for metrics on the pods and, after approximately 30 seconds, should begin to generate recommendations that are based on current and previous usage. (Under certain circumstances, this process might take longer.)
- To verify that vertical pod autoscaling is operational, run this command to view detailed information about the VerticalPodAutoscaler object, where vpaObjectName is the metadata.name value of the object:
    oc describe vpa vpaObjectName
  The following example shows the output of the oc describe command for a sample VerticalPodAutoscaler object called toolkit-vpa_is. The Status section of the output indicates when the recommendations were last provided, and shows the generated recommendations for containers belonging to an integration server called is-toolkit, after querying the resource metrics.
    oc describe vpa toolkit-vpa_is
    Name:         toolkit-vpa_is
    Namespace:    ace-cam
    API Version:  autoscaling.k8s.io/v1
    Kind:         VerticalPodAutoscaler
    Metadata:
      ...
    Spec:
      Resource Policy:
        Container Policies:
          Container Name:  *
          Controlled Resources:
            cpu
            memory
          Max Allowed:
            Cpu:     2
            Memory:  500Mi
          Min Allowed:
            Cpu:     100m
            Memory:  50Mi
      Target Ref:
        API Version:  appconnect.ibm.com/v1beta1
        Kind:         IntegrationServer
        Name:         is-toolkit
      Update Policy:
        Update Mode:  Auto
    Status:
      Conditions:
        Last Transition Time:  2020-11-03T17:46:15Z
        Status:                True
        Type:                  RecommendationProvided
      Recommendation:
        Container Recommendations:
          Container Name:  is-toolkit
          Lower Bound:
            Cpu:     100m
            Memory:  262144k
          Target:
            Cpu:     182m
            Memory:  272061154
          Uncapped Target:
            Cpu:     182m
            Memory:  272061154
          Upper Bound:
            Cpu:     2
            Memory:  500Mi
    Events:  <none>
  A pod with resource requests that are less than the Lower Bound or greater than the Upper Bound recommendations will be recreated with the Target (optimal) recommendation. The Uncapped Target values show the most recent resource recommendations.
For more information about the Vertical Pod Autoscaler, see the README.md file in the GitHub repository and Automatically adjust pod resource levels with the vertical pod autoscaler (on Red Hat OpenShift).
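The Lower Bound and Upper Bound rule can be checked by hand against the values that oc describe reports. The following Python sketch shows one way to do so; it is a simplified illustration, not part of the VPA. The helper names are hypothetical, the parsers handle only a subset of Kubernetes quantity suffixes (for example, m for CPU, and k, M, G, T, Ki, Mi, Gi, Ti for memory), and the 128Mi and 300m requests in the usage section are made-up values.

```python
# Multipliers for a subset of Kubernetes memory suffixes (assumption: the
# less common suffixes are not needed for this illustration).
_MEMORY_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '100m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '500Mi', '262144k', or '272061154' to bytes."""
    for suffix, factor in _MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return round(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)


def outside_bounds(request: int, lower: int, upper: int) -> bool:
    """Return True if a request is below the lower bound or above the upper bound."""
    return request < lower or request > upper


if __name__ == "__main__":
    # Bounds copied from the sample output; the requests are hypothetical.
    lower_mem = memory_to_bytes("262144k")
    upper_mem = memory_to_bytes("500Mi")
    print(outside_bounds(memory_to_bytes("128Mi"), lower_mem, upper_mem))
    print(outside_bounds(cpu_to_millicores("300m"),
                         cpu_to_millicores("100m"), cpu_to_millicores("2")))
```

In this sketch, a request that falls outside either bound corresponds to a pod that the VPA would recreate with the Target recommendation.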
Example: Testing autoscaling
If you have integration servers or integration runtimes that are running close to the limits, the VPA should readily spring into action. However, if all your integration servers or integration runtimes are well within the limits, you can test autoscaling by using a sample load. In this example, we illustrate autoscaling for a sample integration server that is running close to the CPU limit.
- Download the attached perf-rating.bar.zip file.
- Extract the contents of the ZIP file to a directory on your local computer. This ZIP file contains a PerfRating.bar file for a Toolkit integration.
- From the App Connect Dashboard, deploy the PerfRating.bar file to an integration server.
  An example of the integration server custom resource is as follows:
    apiVersion: appconnect.ibm.com/v1beta1
    kind: IntegrationServer
    metadata:
      name: vpa-is-toolkit
    spec:
      enableMetrics: true
      license:
        accept: true
        license: L-MJTK-WUU8HE
        use: AppConnectEnterpriseProduction
      pod:
        containers:
          runtime:
            resources:
              limits:
                cpu: 300m
                memory: 512Mi
              requests:
                cpu: 300m
                memory: 512Mi
      adminServerSecure: true
      router:
        timeout: 120s
      designerFlowsOperationMode: disabled
      createDashboardUsers: true
      service:
        endpointType: http
      version: '12.0'
      replicas: 2
      barURL: >-
        https://db-production-is-dash:3443/v1/directories/PerfRating?b4a71e4c-1eb5-45f5-83ef-55e66917f7cb
      configurations: []
- Follow the steps in Configuring autoscaling for your running integration servers or integration runtimes to deploy a VerticalPodAutoscaler object for the integration server. An example of the VPA custom resource is as follows:
    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: perfrating-is
    spec:
      resourcePolicy:
        containerPolicies:
          - containerName: '*'
            controlledResources:
              - cpu
              - memory
            maxAllowed:
              cpu: 4
              memory: 1024Mi
            minAllowed:
              cpu: 100m
              memory: 256Mi
      targetRef:
        apiVersion: appconnect.ibm.com/v1beta1
        kind: IntegrationServer
        name: vpa-is-toolkit
      updatePolicy:
        updateMode: Auto
- Optional. Use Grafana to monitor the CPU usage for the pod. You can enable OpenShift Container Platform monitoring and implement Grafana as described in Enabling OpenShift Container Platform monitoring and implementing Grafana.
  Alternatively, monitor CPU usage from the Red Hat OpenShift web console.
- Run the following command to obtain the HTTP route for the integration server:
    oc get routes
  In the output, take note of the entry that shows the integration server name followed by -http; for example, vpa-is-toolkit-http. You will need to specify its HOST/PORT value in the next step.
- To invoke the flow, run the following command, where BASEURL is the HOST/PORT value of the integrationServerName-http entry:
    curl --request GET --url 'BASEURL:80/perf/rating?SequenceNumber=50&NumberOfIterations=2&NumberOfThreads=1' --header 'accept: application/json'
  On Windows, you can run the command as follows:
    curl --request GET --url "BASEURL:80/perf/rating?SequenceNumber=50&NumberOfIterations=2&NumberOfThreads=1" --header "accept: application/json"
When the pod starts requesting resources beyond the lower limit, the VPA revises the figures and cycles the pod to apply the updated figures.
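A single request might not hold CPU usage high for long enough to be noticed, so you might want to invoke the flow repeatedly. The following Python sketch is one possible way to do that with the standard library only. The function names, the default of 100 calls, and the 120-second timeout are assumptions made for this example, and the base URL must include the http:// scheme (curl adds it implicitly, but urlopen does not).

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def rating_url(base_url: str, sequence_number: int = 50,
               iterations: int = 2, threads: int = 1) -> str:
    """Build the /perf/rating URL with the query parameters used in the curl command."""
    query = urlencode({
        "SequenceNumber": sequence_number,
        "NumberOfIterations": iterations,
        "NumberOfThreads": threads,
    })
    return f"{base_url}:80/perf/rating?{query}"


def send_load(base_url: str, calls: int = 100) -> None:
    """Invoke the flow repeatedly, one request at a time (hypothetical load loop)."""
    request = Request(rating_url(base_url), headers={"accept": "application/json"})
    for _ in range(calls):
        with urlopen(request, timeout=120) as response:
            response.read()  # Discard the body; only the generated load matters.


if __name__ == "__main__":
    # Replace BASEURL with the HOST/PORT value of the route before calling send_load.
    print(rating_url("http://BASEURL"))
```

Calling send_load with the real route host sends the same request as the curl command, once per iteration of the loop.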