Scaling deployments

You scale a deployment by changing its number of replicas. A replica is a copy of a pod that already contains a running service. Running multiple replicas of a pod gives your deployment the resources it needs to handle increasing load.

About this task

Restriction: By default, a single Rule Execution Server console is deployed and restarted when necessary. Due to a known limitation, it is not possible to scale up the number of deployments of the console.

All of the ODM services can be scaled up or down in a cluster, except for the Rule Execution Server console. The console is used to deploy new versions of a decision service dynamically and to notify all connected rule engines to pick up the newest version. Although you do not need a console for rule execution, it is useful for deploying new versions, notifying execution components of updated decision services, and gathering rule execution statistics.

If a deployment is exposed publicly when you change the number of replicas, the service distributes the traffic to the available pods during the update. An available pod is an instance that can be accessed by users.

Procedure

  1. From the Kubernetes command line, set the number of replicas by using the --replicas parameter.
    kubectl scale --replicas=2 deployment/odm-instance-odm-decisionrunner
    Where odm-instance-odm-decisionrunner is the name of the deployment you want to scale.

    You can also specify one or more preconditions for the scale action. If the --current-replicas or --resource-version parameter is specified, it is validated before the scale is attempted, and the scale is applied to the server only if the precondition holds true. For more information, see kubectl scale.

  2. Optional: Define a policy that automatically scales the number of deployment replicas.
    For more information, see Horizontal Pod Autoscaler.
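
The two steps above can be sketched as a small shell script. This is a minimal sketch, not a definitive procedure: the replica counts (2, 3, 5) and the 80% CPU target are hypothetical values, and the run, scale_with_precondition, and autoscale helpers are names introduced here for illustration. It assumes that kubectl is configured for your cluster and, for autoscaling on CPU, that a metrics source is available and the pods declare CPU requests. By default the script only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Deployment name taken from the procedure above; replace with your own.
DEPLOYMENT="deployment/odm-instance-odm-decisionrunner"

# Hypothetical helper: print the command unless DRY_RUN=0 is set,
# so the sketch can be reviewed without touching a cluster.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Scale to $2 replicas only if the deployment currently has $1 replicas.
# If the precondition does not hold, kubectl rejects the scale request.
scale_with_precondition() {
  run kubectl scale --current-replicas="$1" --replicas="$2" "$DEPLOYMENT"
}

# Create a Horizontal Pod Autoscaler that keeps the deployment between
# $1 and $2 replicas, targeting $3 percent average CPU utilization.
autoscale() {
  run kubectl autoscale "$DEPLOYMENT" --min="$1" --max="$2" --cpu-percent="$3"
}

# Example values (assumptions, not recommendations).
scale_with_precondition 2 3
autoscale 2 5 80
```

Do not use these helpers on the Rule Execution Server console deployment, which cannot be scaled.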

Results

The following values are updated after the deployment is scaled.
  • DESIRED - The desired number of replicas of a pod, which you define when you create or scale the deployment.
  • CURRENT - The number of replicas currently running.
  • READY - The number of replicas that are available to the users compared to the desired state.
  • AVAILABLE - The number of replicas that are available to the users.
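
To check these values from the command line, a sketch such as the following might help. It assumes that the four values correspond to the spec.replicas, status.replicas, status.readyReplicas, and status.availableReplicas fields of the Kubernetes Deployment object; that mapping is an assumption made here, not something this topic states. The run, wait_for_scale, and show_replicas helpers and the 120-second timeout are hypothetical. By default the script only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Deployment name taken from the procedure above; replace with your own.
DEPLOYMENT="deployment/odm-instance-odm-decisionrunner"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Wait until the deployment reports that the change is complete,
# or give up after the timeout (120s is an arbitrary example value).
wait_for_scale() {
  run kubectl rollout status "$DEPLOYMENT" --timeout=120s
}

# Print the replica counts in one line. The field-to-label mapping is an
# assumption; a field that is not set on the object may print as empty.
show_replicas() {
  run kubectl get "$DEPLOYMENT" \
    -o jsonpath='DESIRED={.spec.replicas},CURRENT={.status.replicas},READY={.status.readyReplicas},AVAILABLE={.status.availableReplicas}'
}

wait_for_scale
show_replicas
```

In dry-run mode the jsonpath expression is printed without its surrounding quotes, so quote it again if you copy the printed command into a shell.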