Cloud VM Policies

Turbonomic ships with default automation policies that we believe give you the best results based on our analysis. For certain entities in your environment, you can create automation policies to override the defaults.

Note:

The policy settings discussed in this topic only apply to cloud scale actions. For details about settings for parking actions (stop and start), see Parking: Stop or Start Cloud Resources.

Automation Workflow

For details about cloud VM actions, see Cloud VM Actions.

Cloud Scale

  • Cloud Scale All – Default mode: Manual. Automation is supported for AWS, Azure, and Google Cloud.

  • Cloud Scale for Performance – Default mode: Manual. Automation is supported for AWS, Azure, and Google Cloud.

  • Cloud Scale for Savings – Default mode: Manual. Automation is supported for AWS, Azure, and Google Cloud.

Other Actions

  • Buy discounts – Default mode: Recommend. Recommended-only for AWS and Azure; not yet supported for Google Cloud.

  • Provision container platform node (VM) – Default mode: Manual. Recommended-only for AWS, Azure, and Google Cloud.

  • Suspend container platform node (VM) – Default mode: Manual. Automation is supported for AWS, Azure, and Google Cloud.

Scaling Target Utilization - GPU (AWS Only)

Turbonomic uses these settings in conjunction with aggressiveness constraints to control scaling actions for VMs. You can set the aggressiveness as a percentile of utilization, and set the length of the sample period for more or less elasticity on the cloud.

For more information about GPU metrics, see this topic.

  • Scaling Target GPU Count Utilization – Default value: 100. The target utilization as a percentage of GPU count capacity.

  • Scaling Target GPU FP16 Utilization – Default value: 90. The target utilization as a percentage of GPU FP16 capacity.

  • Scaling Target GPU FP32 Utilization – Default value: 90. The target utilization as a percentage of GPU FP32 capacity.

  • Scaling Target GPU FP64 Utilization – Default value: 90. The target utilization as a percentage of GPU FP64 capacity.

  • Scaling Target GPU Memory BW Utilization – Default value: 90. The target utilization as a percentage of GPU Memory BW (bandwidth) capacity.

  • Scaling Target GPU Memory Utilization – Default value: 90. The target utilization as a percentage of GPU memory capacity.

  • Scaling Target GPU Tensor Utilization – Default value: 90. The target utilization as a percentage of GPU Tensor capacity.

Scaling Target Utilization - IOPS (Azure Only)

Turbonomic uses this setting in conjunction with aggressiveness constraints to control scaling actions for VMs. You can set the aggressiveness as a percentile of utilization, and set the length of the sample period for more or less elasticity on the cloud.

  • Scaling Target IOPS Utilization – Default value: 70. The target percentile value that Turbonomic will attempt to match.

For details on how IOPS utilization affects scaling decisions, see IOPS-aware Scaling for Azure VMs.

Scaling Target Utilization - vCPU, vMem, IO/Net Throughput

These advanced settings determine how much you would like a scope of workloads to utilize their resources. These are fixed settings that override the way Turbonomic calculates the optimal utilization of resources. You should only change these settings after consulting with Technical Support.

While these settings offer a way to modify how Turbonomic recommends actions, in most cases you should not need to use them. If you want to control how Turbonomic recommends resize actions for workloads, set the aggressiveness as a percentile of utilization and set the length of the sample period for more or less elasticity on the cloud.

  • Scaling Target VCPU Utilization – Default value: 70. The target utilization as a percentage of VCPU capacity.

  • Scaling Target VMEM Utilization – Default value: 90. The target utilization as a percentage of memory capacity.

  • Scaling Target IO Throughput Utilization – Default value: 70. The target utilization as a percentage of IO throughput (Read and Write) capacity.

  • Scaling Target Net Throughput Utilization – Default value: 70. The target utilization as a percentage of network throughput (Inbound and Outbound) capacity.
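
For a rough sense of how a fixed utilization target relates to observed utilization, the following sketch computes the capacity at which an observed load would sit exactly at a target. This is a simplified illustration only; the formula, values, and function name are assumptions made for the example, not Turbonomic's analysis.

# Simplified illustration: relate observed (percentile) utilization to a target
# utilization to estimate the capacity at which the load would meet the target.
# Purely illustrative; this is not the formula Turbonomic uses.

def desired_capacity(current_capacity, observed_utilization_pct, target_utilization_pct):
    """Capacity at which the observed load would sit right at the target utilization."""
    observed_load = current_capacity * observed_utilization_pct / 100.0
    return observed_load / (target_utilization_pct / 100.0)

# Hypothetical VM: 8 vCPUs, 95th-percentile vCPU utilization of 35%, target of 70%.
print(desired_capacity(8, 35, 70))   # 4.0 vCPUs: analysis would look toward smaller instance types

# The same VM under sustained pressure: 95th-percentile utilization of 90%.
print(desired_capacity(8, 90, 70))   # ~10.3 vCPUs: analysis would look toward larger instance types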

Aggressiveness and Observation Periods

Turbonomic uses these settings to calculate utilization percentiles and then recommends actions to improve utilization based on the observed values for a given time period.

  • Aggressiveness

    Default value: 95th Percentile

    When evaluating performance, Turbonomic considers resource utilization as a percentage of capacity. The utilization drives actions to scale the available capacity either up or down. To measure utilization, the analysis considers a given utilization percentile. For example, assume a 95th percentile. The percentile utilization is the highest value that 95% of the observed samples fall below. Compare that to average utilization, which is the average of all the observed samples.

    Using a percentile, Turbonomic can recommend more relevant actions, so that analysis can better exploit the elasticity of the cloud. For scheduled policies, these more relevant actions tend to remain viable when their execution is deferred to a later time.

    For example, consider decisions to reduce the capacity for CPU on a VM. Without using a percentile, Turbonomic never resizes below the recognized peak utilization. For most VMs, there are moments when peak CPU reaches high levels, such as during reboots, patching, and other maintenance tasks. Assume utilization for a VM peaked at 100% just once. Without the benefit of a percentile, Turbonomic will not reduce allocated CPU for that VM.

    With Aggressiveness, instead of using the single highest utilization value, Turbonomic uses the percentile you set. For the previous example, assume a single CPU burst to 100%, but for 95% of the samples CPU never exceeded 50%. If you set Aggressiveness to 95th Percentile, then Turbonomic can see this as an opportunity to reduce CPU allocation for the VM.

    In summary, a percentile evaluates the sustained resource utilization and ignores bursts that occurred for a small portion of the samples (a short sketch of this percentile calculation follows this list). You can think of this as the aggressiveness of resizing, as follows:

    • 100th and 99th Percentile – More performance. Recommended for critical workloads that need maximum guaranteed performance at all times, or workloads that need to tolerate sudden and previously unseen spikes in utilization, even though sustained utilization is low.

    • 95th Percentile (Default) – The recommended setting to achieve maximum performance and savings. This assures application performance while avoiding reactive peak sizing due to transient spikes, thus allowing you to take advantage of the elastic ability of the cloud.

    • 90th Percentile – More efficiency. Recommended for non-production workloads that can stand higher resource utilization.

    By default, Turbonomic uses samples from the last 30 days. Use the Max Observation Period setting to adjust the number of days. To ensure that there are enough samples to analyze and drive scaling actions, set the Min Observation Period.

  • Max Observation Period

    Default value: Last 30 Days

    To refine the calculation of resource utilization percentiles, you can set the sample time to consider. Turbonomic uses historical data from up to the number of days that you specify as a sample period. If the database has fewer days of data, it uses all of the stored historical data.

    You can make the following settings:

    • Less Elastic – Last 90 Days

    • Recommended – Last 30 Days

    • More Elastic – Last 7 Days

    Turbonomic recommends an observation period of 30 days to align with the monthly workload maintenance cycle seen in many organizations. VMs typically peak during the maintenance window as patching and other maintenance tasks are carried out. A 30-day observation period means that Turbonomic can capture these peaks and increase the accuracy of its sizing recommendations.

    You can set the value to 7 days if workloads need to resize more often in response to performance changes. For workloads that cannot handle changes very often or have longer usage periods, you can set the value to 90 days.

  • Min Observation Period

    Default value: None

    This setting requires a minimum number of days of historical data before Turbonomic will generate an action based on the percentile set in Aggressiveness. This ensures that actions are based on a minimum set of data points.

    Especially for scheduled actions, it is important that resize calculations use enough historical data to generate actions that will remain viable even during a scheduled maintenance window. A maintenance window is usually set for "down" time, when utilization is low. If analysis uses enough historical data for an action, then the action is more likely to remain viable during the maintenance window.

    • More Elastic – None

    • Less Elastic – 7 Days
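
The following sketch illustrates the percentile calculation described under Aggressiveness. The sample data, the nearest-rank helper, and the comments are hypothetical and only approximate the idea; they are not Turbonomic's implementation.

import math

def percentile(samples, pct):
    """Return the smallest sample value that at least `pct` percent of the samples fall at or below."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100.0 * len(ordered)) - 1
    return ordered[max(rank, 0)]

# Hypothetical CPU utilization samples (%) from the observation period:
# sustained utilization between 30% and 50%, plus a few 100% maintenance bursts.
cpu_samples = [30 + (i % 21) for i in range(95)] + [100] * 5

peak = max(cpu_samples)             # 100: a handful of transient bursts
p95 = percentile(cpu_samples, 95)   # 50: the sustained utilization

# Sizing to the peak would never allow a downsize; sizing to the 95th percentile
# treats the bursts as noise, so analysis can consider a smaller instance type.
# The Max and Min Observation Period settings control how many samples feed this calculation.
print(f"peak={peak}%, 95th percentile={p95}%")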

Cloud Instance Types

Default value: None

By default, Turbonomic considers all instance types currently available for scaling when making scaling decisions for VMs. However, you may have set up your cloud VMs to only scale to or avoid certain instance types to reduce complexity and cost, improve discount utilization, or meet application demand. Use this setting to identify the instance types that VMs can scale to.

Note:

Under most circumstances, when a cloud provider offers a new instance type that is meant to replace an older type, the provider offers it at a lower cost. However, a provider may offer a new instance type at the same cost as the older instance types. If this occurs, and capacity and cost are equal, Turbonomic cannot ensure that it chooses the newer instance type. To work around this issue, you can create an Action Automation policy that excludes the older instance type.

Click Edit to set your preferences. In the new page that displays, expand a cloud tier (a family of instance types, such as a1 for AWS or B-series for Azure) to see individual instance types and the resources allocated to them. If you have several cloud providers, each provider will have its own tab.

Select your preferred instance types or cloud tiers, or clear the ones that you want to avoid. After you save your changes, the main page refreshes to reflect your selections.

If you selected a cloud tier and the service provider deploys new instance types to that tier later, then those instance types will automatically be included in your policy. Be sure to review your policies periodically to see if new instance types have been added to a tier. If you do not want to scale to those instance types, update the affected policies.
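
To make the effect of this setting concrete, here is a minimal sketch of restricting scaling candidates to selected cloud tiers and individual instance types. The catalog and the selections are hypothetical examples, not data from Turbonomic.

# Illustrative sketch: limit scaling candidates to the tiers and instance types
# a policy allows. The catalog and selections below are hypothetical.

catalog = {
    "m5":  ["m5.large", "m5.xlarge", "m5.2xlarge"],
    "m6i": ["m6i.large", "m6i.xlarge", "m6i.2xlarge"],
    "t3":  ["t3.medium", "t3.large"],
}

selected_tiers = {"m6i"}        # selecting a whole tier also covers types added to it later
selected_types = {"m5.xlarge"}  # individually selected instance types

allowed = sorted(
    instance_type
    for tier, types in catalog.items()
    for instance_type in types
    if tier in selected_tiers or instance_type in selected_types
)

# Scaling decisions would only consider the allowed set.
print(allowed)   # ['m5.xlarge', 'm6i.2xlarge', 'm6i.large', 'm6i.xlarge']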

Consistent Resizing

Default setting: Off

Consistent Resizing for User-defined Automation Policies

When you create a policy for a group of VMs and turn on Consistent Resizing, Turbonomic resizes all the group members to the same size, such that they all support the top utilization of each resource commodity in the group. For example, assume VM A shows top utilization of CPU, and VM B shows top utilization of memory. A resize action would give all the VMs enough CPU capacity to satisfy VM A and enough memory capacity to satisfy VM B.

For an affected resize, the Actions List shows individual resize actions for each of the VMs in the group. If you automate resizes, Turbonomic executes each resize individually in a way that avoids disruption to your workloads.

Use this setting to enforce the same template across all VMs in a group when resizing VMs on the public cloud, so that Turbonomic sizes all the VMs in the group equally.
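
The following sketch illustrates the idea of sizing a group to the top utilization of each resource commodity. The VM requirements, the template catalog, and the selection rule are hypothetical illustrations, not Turbonomic's algorithm.

# Illustrative sketch: choose one size for a whole group so that it covers the
# highest requirement for every resource across the group's members.
# The requirements and template catalog below are hypothetical.

group_requirements = {
    "vm-a": {"vcpu": 8, "mem_gb": 16},   # VM A drives the CPU requirement
    "vm-b": {"vcpu": 4, "mem_gb": 64},   # VM B drives the memory requirement
}

templates = [
    {"name": "small",  "vcpu": 4,  "mem_gb": 16},
    {"name": "medium", "vcpu": 8,  "mem_gb": 32},
    {"name": "large",  "vcpu": 8,  "mem_gb": 64},
    {"name": "xlarge", "vcpu": 16, "mem_gb": 128},
]

# Top requirement for each resource commodity across the group.
required = {
    "vcpu":   max(r["vcpu"] for r in group_requirements.values()),
    "mem_gb": max(r["mem_gb"] for r in group_requirements.values()),
}

# Smallest template that satisfies every per-commodity requirement; every VM in
# the group would be resized to this one template.
candidates = [t for t in templates
              if t["vcpu"] >= required["vcpu"] and t["mem_gb"] >= required["mem_gb"]]
consistent_choice = min(candidates, key=lambda t: (t["vcpu"], t["mem_gb"]))
print(consistent_choice["name"])   # "large": 8 vCPUs for VM A, 64 GB for VM B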

Consistent Resizing for Auto-discovered Groups

In public cloud environments, Turbonomic discovers groups that should keep all their VMs on the same template, and then creates read-only policies for them to implement Consistent Resizing. The details of this discovery and the associated policy vary depending on the cloud provider.

  • Azure

    Turbonomic discovers Azure availability sets and scale sets.

    • For availability sets, Turbonomic does not enable Consistent Resizing, but it can recommend scale actions for individual VMs in the availability set.

      When a scale action for a VM in an availability set fails due to insufficient resources in the compute cluster, the action remains pending. When you hover over the pending action, you will see a message indicating that action execution has been temporarily disabled due to a previous execution error in the availability set. Turbonomic assumes that all other VMs in the availability set will fail to scale due to the same resource issue, so it creates a temporary policy that disables action execution for the availability set. Specifically, this policy sets the action acceptance mode for scale actions to Recommend and stays in effect for 730 hours (one month). This means that for the duration of the policy, Turbonomic will continue to generate read-only, non-executable scale actions for individual VMs, so you can evaluate their resource requirements and plan accordingly. You can delete this policy if you need to re-enable action execution in the availability set.

    • For scale sets, Turbonomic enables Consistent Resizing across all the VMs in the group. You execute those actions directly in Azure. If you do not need to resize all the members of a given scale set to a consistent template, create another policy for that scope and turn off Consistent Resizing.

  • AWS

    Turbonomic discovers Auto Scaling Groups and automatically enables Consistent Resizing across all the VMs in each group. You can choose to execute all the actions for such a group, either manually or automatically. In that case, Turbonomic executes the resizes one VM at a time. If you do not need to resize all the members of a given Auto Scaling Group to a consistent template, create another policy for that scope and turn off Consistent Resizing.

    If you execute one or all of the actions for the group, whether manually or automatically, Turbonomic changes the Launch Configuration for the Auto Scaling Group but does not terminate the EC2 instances.

The following examples are some use cases for employing Consistent Resizing for a group.

  • If you have deployed load balancing for a group, then all the VMs in the group should experience similar utilization. In that case, if one VM needs to be resized, then it makes sense to resize them all consistently.

  • A common HA configuration on the public cloud is to deploy mirror VMs to different availability zones, where the given application runs on only one of the VMs at a given time. The other VMs are on standby to recover in failover events. Without Consistent Resizing, Turbonomic would tend to size down or suspend the unused VMs, which would make them unready for the failover situation.

When working with Consistent Resizing, consider these points:

  • You should not mix VMs from a group that has a Consistent Resizing policy into other groups that also enable Consistent Resizing. One VM can be a member of more than one group. If one or more VMs in a group with Consistent Resizing are also in another group that has Consistent Resizing, then both groups enforce Consistent Resizing together, for all of their members.

  • If one or more VMs are in a group with Consistent Resizing turned on, and the same VMs are in a group with Consistent Resizing turned off, the affected VMs assume the ON setting. This is true whether you created both groups yourself, or Turbonomic created one of the groups for Azure scale sets or AWS Auto Scaling Groups.

  • For any group of VMs that enables Consistent Resizing, you should not mix the associated target technologies. For example, one group should not include VMs that are managed on both Azure and AWS platforms.

  • Charts that show actions and risks assign the same risk statement to all the affected VMs. This can seem confusing. For example, assume one VM needs to resize to address vCPU risk, and 9 other VMs are set to resize consistently with it. Then charts will state that 10 VMs need to resize to address vCPU risks.

Ignore NVIDIA GPU Compute Capability Constraints (AWS Only)

For AWS VMs running supported NVIDIA GPU instance types, Turbonomic can generate scale actions that change the GPU compute capability of VMs. By default, these actions are only executable in AWS. When you review a pending scale action, you will see the following message in the Execution Prerequisites section that explains why action execution in Turbonomic is blocked.

CUDA applications running on <GPU_VM> may need to be re-configured or recompiled 
to execute on the new accelerated instance. The compute capability on the current instance 
is <x>, while the new compute capability will be <y> if the action is taken. 
When the pre-requisites are verified, this action must be executed manually in the AWS console. 
Alternatively, you can configure a policy to ignore GPU compute capability constraints.

You can turn on this setting to allow scale actions that change compute capability to execute in Turbonomic.

Default setting: Off

Ignore NVMe Constraints (AWS Only)

For AWS, Turbonomic recognizes when a VM instance includes an NVMe driver. To respect NVMe constraints, it will not recommend scaling to an instance type that does not also include an NVMe driver. If you ignore NVMe constraints, then Turbonomic is free to scale the instance to a type that does not include an NVMe driver.

Default setting: Off

Instance Store Aware Scaling (AWS Only)

Default setting: Off

The template for your workload determines whether the workload can use an instance store, and it determines the instance store capacity. As Turbonomic calculates a resize or move action, it can recommend a new template that does not support instance stores, or that does not provide the same instance store capacity.

To ensure that resize actions respect the instance store requirements for your workloads, turn on Instance Store Aware Scaling for a given VM or for a group of VMs. When you turn this setting on for a given scope of VMs, Turbonomic considers only templates that support instance stores as it calculates move and resize actions. In addition, it will not move a workload to a template that provides less instance store capacity.
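
As an illustration of the kind of filter this setting implies, the following sketch keeps only templates that provide an instance store with at least the workload's current instance store capacity. The workload values, template catalog, and field names are hypothetical, not Turbonomic's internals.

# Illustrative sketch of instance-store-aware filtering.
# The workload values and template catalog below are hypothetical.

workload = {"uses_instance_store": True, "instance_store_gb": 300}

candidate_templates = [
    {"name": "m5.2xlarge",  "instance_store": False, "instance_store_gb": 0},
    {"name": "m5d.2xlarge", "instance_store": True,  "instance_store_gb": 300},
    {"name": "m5d.xlarge",  "instance_store": True,  "instance_store_gb": 150},
    {"name": "i3.2xlarge",  "instance_store": True,  "instance_store_gb": 1900},
]

def respects_instance_store(template, workload):
    """Keep templates that support an instance store and do not reduce its capacity."""
    if not workload["uses_instance_store"]:
        return True   # no instance store in use, so no constraint applies
    return (template["instance_store"]
            and template["instance_store_gb"] >= workload["instance_store_gb"])

allowed = [t["name"] for t in candidate_templates if respects_instance_store(t, workload)]
print(allowed)   # ['m5d.2xlarge', 'i3.2xlarge']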