How to optimize Google Cloud Platform cloud costs with IBM Turbonomic

1 May 2023

Google Cloud Platform (GCP) enables customers to build, manage and deploy modern, scalable applications to achieve digital business success. However, due to its complexity, achieving operational excellence in the cloud is difficult. Fundamentally, as a Cloud Operator, you need to ensure great end-user experiences while staying within budget.

In this blog post, we review the main methods of GCP cloud cost management: the problems they address, how GCP users can best apply them and the native Compute Engine tooling available for each. Regardless of your cloud cost optimization strategy, however, achieving operational excellence at scale and taking advantage of the elasticity of the cloud requires software that optimizes your consumption for performance and cost simultaneously, and that makes it easy to automate those optimizations safely and confidently. Let’s review how IBM Turbonomic helps customers optimize their GCP cloud costs and make their environments more cost-effective.

Learn more about IBM Turbonomic and Google Cloud Platform (GCP).

Right-sizing instances

Google Cloud Platform’s operating expense (OPEX) model charges customers for the capacity provisioned for different Google Cloud resources, regardless of whether that capacity is fully utilized. GCP users can choose among different instance types and sizes, but they often buy the largest instance available to ensure performance. Right-sizing is the process of matching instance types and sizes to workload performance and capacity requirements. To operate at the lowest cost, right-sizing must be done continuously. However, cloud operators often right-size reactively, for example, after executing a “lift and shift” cloud migration or after development work.

Migrate for Compute Engine is a GCP tool with a right-sizing feature that recommends instance types for optimized cost and performance. It provides two types of right-sizing recommendations. Performance-based recommendations are based on the CPU and RAM currently allocated to the on-premises virtual machine (VM). Cost-based recommendations are based on the current CPU and RAM configuration of the on-premises VM and its average usage during a given period.
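The difference between the two recommendation types can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the machine-type catalog, the prices, the 25% headroom factor and the function names are hypothetical and do not reflect how Migrate for Compute Engine actually computes its recommendations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MachineType:
    name: str
    vcpus: int
    ram_gb: float
    hourly_usd: float  # illustrative price, not a real GCP rate


# Hypothetical catalog; real machine families and prices differ.
CATALOG = [
    MachineType("small-2", 2, 8, 0.10),
    MachineType("medium-4", 4, 16, 0.20),
    MachineType("large-8", 8, 32, 0.40),
    MachineType("xlarge-16", 16, 64, 0.80),
]


def cheapest_fit(vcpus_needed: float, ram_gb_needed: float) -> Optional[MachineType]:
    """Return the cheapest catalog entry that satisfies both requirements."""
    fits = [m for m in CATALOG
            if m.vcpus >= vcpus_needed and m.ram_gb >= ram_gb_needed]
    return min(fits, key=lambda m: m.hourly_usd) if fits else None


def performance_based(allocated_vcpus: int, allocated_ram_gb: float) -> Optional[MachineType]:
    """Size to what the on-premises VM currently has allocated."""
    return cheapest_fit(allocated_vcpus, allocated_ram_gb)


def cost_based(allocated_vcpus: int, allocated_ram_gb: float,
               avg_cpu_util: float, avg_ram_util: float,
               headroom: float = 1.25) -> Optional[MachineType]:
    """Size to average usage (as a 0..1 fraction) plus an assumed headroom,
    never asking for more than the current allocation."""
    cpu_needed = min(allocated_vcpus, allocated_vcpus * avg_cpu_util * headroom)
    ram_needed = min(allocated_ram_gb, allocated_ram_gb * avg_ram_util * headroom)
    return cheapest_fit(cpu_needed, ram_needed)
```

For an on-premises VM with 8 vCPUs and 32 GB of RAM that averages 30% CPU and 35% RAM usage, the performance-based path keeps the allocation (`large-8` in this toy catalog), while the cost-based path selects the smaller `medium-4`.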

If you are currently leveraging or looking to leverage preemptible VMs with Compute Engine, check out the GCP Preemptible VM instance page (link resides outside ibm.com) for more information.

How to use IBM Turbonomic to right-size instances

Let’s review how IBM Turbonomic helps GCP users right-size instances through percentile-based scaling. The diagrams below represent the IBM Turbonomic UI. Figure 1 shows the application stack. The supply chain on the left represents the resource relationships that Turbonomic maps out via API, from the business application down to the cloud region. It can include other components, such as container pods (of particular interest to DevOps users), storage volumes, virtual machines and much more, depending on the infrastructure that supports the application.

This full-stack understanding is what makes Turbonomic’s recommendations trustworthy and gives cloud engineering and operations the confidence to automate. For this GCP account, Turbonomic has identified 15 pending scaling actions:

After selecting SHOW ALL, customers are brought to Turbonomic’s Action Center, shown in Figure 2 below. This view lists all the scaling actions available for this GCP account, along with relevant information such as the account name, instance type, discount coverage and on-demand cost. Customers can select one or more actions and execute them by clicking EXECUTE ACTIONS in the top-right corner:

Customers looking for more details on a particular action can select DETAILS, and Turbonomic will provide the additional information that it considers in its recommendations. As shown below in Figure 3, this instance needs to be scaled down because its vCPU is underutilized. Other information for this action includes the cost impact of executing the action, the resulting CPU utilization and capacity, and net throughput:
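The general idea behind percentile-based scaling can be sketched in a few lines: instead of sizing to the single highest utilization sample, size to a high percentile so that one rare spike does not block a scale-down. This is a simplified illustration, not Turbonomic’s analytics; the nearest-rank percentile method, the 95th-percentile default and the 40%/80% thresholds are all assumptions.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile of the samples (pct is an integer in 1..100)."""
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def scaling_decision(cpu_util_samples: Sequence[float], pct: int = 95,
                     low: float = 40.0, high: float = 80.0) -> str:
    """Compare the chosen percentile of vCPU utilization (in percent)
    against assumed low and high thresholds."""
    observed = percentile(cpu_util_samples, pct)
    if observed < low:
        return "scale down"
    if observed > high:
        return "scale up"
    return "keep"
```

With twenty samples that hover between 11% and 18% plus a single 95% spike, the 95th percentile is 18%, so the sketch recommends scaling down, whereas a peak-based rule (the 100th percentile) would see 95% and recommend scaling up.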

Scaling instances

Public cloud environments are always changing, and to achieve performance and budget goals, Google Cloud Platform (GCP) users must scale their instances both vertically (right-sizing/scaling up) and horizontally (scaling out). To scale horizontally, GCP customers can monitor application load and then scale out instances as demand increases. Distributing load across multiple instances through horizontal scaling increases performance and reliability, but instances must be scaled back in as demand falls to avoid incurring unnecessary costs.

Learn more about cloud scalability and scaling up vs. scaling out.

Compute Engine also offers GCP customers autoscaling capabilities, automatically adding or deleting VM instances based on increases or decreases in load. However, this tool scales only within the constraints of user-defined policies and only for designated groups of VM instances called managed instance groups (MIGs).
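To make "scaling under user-defined policies" concrete, here is a minimal sketch of a proportional target-utilization rule with minimum and maximum bounds. It is a generic illustration of policy-constrained autoscaling and is not claimed to be Compute Engine’s actual algorithm; the function and parameter names are hypothetical.

```python
import math


def desired_instances(current_instances: int, avg_util_percent: float,
                      target_util_percent: float,
                      min_instances: int, max_instances: int) -> int:
    """Scale the group so average utilization moves toward the target,
    then clamp the result to the policy's minimum and maximum size."""
    if target_util_percent <= 0:
        raise ValueError("target utilization must be positive")
    raw = math.ceil(current_instances * avg_util_percent / target_util_percent)
    return max(min_instances, min(max_instances, raw))
```

For example, four instances averaging 90% utilization against a 60% target yield six instances, while the same group averaging 15% would shrink only to the policy’s minimum size. The clamp is what "constrained by policy" means here: the rule never goes below the floor or above the ceiling, whatever the load.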

The only way to optimize horizontal scaling is to do it in real-time through automation. IBM Turbonomic continuously generates scaling actions so applications can always perform at the lowest cost. Figure 4 below represents a GCP account that needs to be scaled out:

The horizontal scaling action for this GCP account can be executed in the Action Center under the Provision Actions subcategory found in Figure 5 below. Here, you can find information on the actions and the corresponding workload, such as the container cluster, the namespace and the risk posed to the workload (which, in this case, is transaction congestion):

In Figure 6 below, you can see how Turbonomic provides the rationale behind taking the action. In this case, a VM is experiencing vCPU congestion and needs additional CPU provisioned to improve performance. Turbonomic also specifies the relevant details, including the name, ID, account and age:

Suspending instances

Another significant way to optimize GCP cloud spend is to shut down idle instances. An organization may suspend an instance if it is not currently using it (such as during non-business hours) but expects to resume use in the near term. By contrast, when an instance is deleted, it is shut down and any data stored on the persistent disk is also deleted.

Suspending an instance, however, does not delete the underlying data contained in the attached persistent disk. When the instance is started again, the persistent disk is simply attached to a newly provisioned instance. GCP users can use Compute Engine to suspend instances, but instances that use GPUs cannot be suspended, and suspension must be executed manually through the Google Cloud console.
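The selection logic described above (idle, outside business hours, no GPU) can be sketched as a simple filter. This is only an illustration: the `Instance` record, the 5% idle threshold and the weekday 09:00 to 18:00 business window are assumptions, and the sketch only picks candidates; it does not call any Google Cloud or Turbonomic API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Sequence


@dataclass(frozen=True)
class Instance:
    name: str
    has_gpu: bool
    avg_cpu_percent: float  # recent average CPU utilization


def is_business_hours(now: datetime, start_hour: int = 9, end_hour: int = 18) -> bool:
    """Assumed business window: Monday to Friday, start_hour up to end_hour."""
    return now.weekday() < 5 and start_hour <= now.hour < end_hour


def suspension_candidates(instances: Sequence[Instance], now: datetime,
                          idle_cpu_percent: float = 5.0) -> List[str]:
    """Outside business hours, return the names of idle instances
    without GPUs (instances that use GPUs cannot be suspended)."""
    if is_business_hours(now):
        return []
    return [i.name for i in instances
            if not i.has_gpu and i.avg_cpu_percent < idle_cpu_percent]
```

Keeping the GPU check inside the filter means an idle GPU instance is never proposed for suspension, even if it would otherwise qualify.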

IBM Turbonomic automatically identifies and provides recommendations for suspending instances. To suspend an instance with Turbonomic, customers will need to first select a GCP account with a pending suspension action, as shown in Figure 7 below:

To execute a suspension action, Turbonomic customers need to go to the Action Center, select the corresponding action and execute. Under the Suspend Actions tab of the Action Center, as seen in Figure 8, customers can see the vMem, vCPU and vStorage capacity for each instance with a pending action:

If customers need additional details before executing, they can select DETAILS, as shown in Figure 9 below. The details provided for this action include the reasoning behind the action (in this case, to improve infrastructure efficiency), the cost impact, the age of the instance, the virtual CPU and memory, and the number of consumers for this instance:

Leveraging discounted pricing

To offset rising costs from the cloud’s variable pricing model, customers can leverage discounted pricing by optimizing committed-use discount (CUD) coverage and utilization. Compute Engine allows customers to purchase and renew resource-based committed-use contracts (commitments) in return for heavily discounted prices for VM usage. GCP users can also leverage the committed-use discount recommendations that Compute Engine generates by analyzing their VM usage patterns.
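Coverage and utilization are easy to confuse, so a small worked sketch may help. It assumes one common pair of definitions: coverage is the share of actual usage that falls under the commitment, and utilization is the share of the commitment that is actually used. These definitions, the vCPU-hour units and the function name are assumptions for illustration, not formulas taken from Google Cloud billing or from Turbonomic.

```python
from typing import Sequence, Tuple


def cud_metrics(hourly_vcpu_usage: Sequence[float],
                committed_vcpus: float) -> Tuple[float, float]:
    """Return (coverage, utilization) as fractions between 0 and 1.

    coverage    = usage covered by the commitment / total usage
    utilization = usage covered by the commitment / total commitment
    """
    total_usage = sum(hourly_vcpu_usage)
    total_committed = committed_vcpus * len(hourly_vcpu_usage)
    # In each hour, the commitment can cover at most the committed amount.
    covered = sum(min(u, committed_vcpus) for u in hourly_vcpu_usage)
    coverage = covered / total_usage if total_usage else 0.0
    utilization = covered / total_committed if total_committed else 0.0
    return coverage, utilization
```

With hourly usage of 8, 8, 16 and 16 vCPUs against a 10-vCPU commitment, 36 of 48 vCPU-hours are covered (75% coverage) and 36 of 40 committed vCPU-hours are used (90% utilization). Raising the commitment would increase coverage but lower utilization, which is why the two are optimized together.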

IBM Turbonomic’s analytics engine automatically ingests and displays negotiated rates with GCP and then generates specific committed-use discount scaling actions so customers can maximize CUD-to-instance coverage. Figure 10 represents a GCP account that has 15 pending actions to increase CUD utilization and coverage:

Figure 11 represents the scale actions that can be executed in the Action Center to increase CUD coverage. Some important details listed in the Action Center here are the resulting instance type, percent discount coverage and on-demand cost of taking the scaling action.

Figure 12 provides more details for this action, such as the vCPU and vMem utilization, throughput capacity and utilization, and total savings. All this information can again be found in the action’s corresponding DETAILS tab:

Deleting unattached resources

Finally, as previously discussed, Google Cloud Platform’s operating expense (OPEX) model charges customers not just for the resources that are actively in use, but for the entire pool of resources available. As organizations build and deploy new releases into their environment, some resources are left unattached. An unattached resource is one that a customer created but has since stopped using entirely.

After development, hundreds of resources of different types can be left unattached, and deleting them can significantly reduce wasted cloud spend. As with suspending idle instances, GCP users can leverage Compute Engine to delete unused instances manually. Figure 13 below shows a GCP account with five unattached resources identified for removal:

The delete actions for this account are listed in the Action Center in Figure 14. The information listed in the Delete category of the Action Center includes the size of the persistent disk, the cloud storage tier, the amount of time it has been unattached and the cost impact of removing it:

For additional insight on the impact of these delete actions, customers can select the DETAILS tab and find more information, as shown in Figure 15. Below, you can see the purpose of this action is to increase savings. Customers can also see additional information like the volume details, whether the action is disruptive and the resource and cost impact:
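The reasoning behind a delete action (disk size, storage tier, time unattached and cost impact) can be approximated with a short sketch. The `Disk` record, the per-GB prices and the 14-day grace period are hypothetical values chosen for illustration; real prices vary by region and disk type, and this is not how Turbonomic computes its actions.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Illustrative monthly prices per GB; not quoted from a price list.
PRICE_PER_GB_MONTH = {"standard": 0.04, "ssd": 0.17}


@dataclass(frozen=True)
class Disk:
    name: str
    size_gb: int
    tier: str                    # key into PRICE_PER_GB_MONTH
    attached_to: Tuple[str, ...]  # names of instances using the disk
    days_unattached: int


def delete_candidates(disks: Sequence[Disk],
                      min_days_unattached: int = 14) -> List[Disk]:
    """Disks with no attachments that have been idle past the grace period."""
    return [d for d in disks
            if not d.attached_to and d.days_unattached >= min_days_unattached]


def monthly_savings(disks: Sequence[Disk]) -> float:
    """Estimated monthly cost avoided by deleting the given disks."""
    return sum(d.size_gb * PRICE_PER_GB_MONTH[d.tier] for d in disks)
```

The grace period is a deliberate design choice in this sketch: a disk that was detached a few days ago may simply be between uses, so only disks unattached for longer are proposed for deletion.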

Trustworthy automation with IBM Turbonomic is the best way to maximize business value on Google Cloud Platform

For cloud engineering and operations teams looking to achieve budget goals, reduce their cloud bill, and pay for only what they need without negatively impacting customer experience, IBM Turbonomic offers a proven path that you can trust. Only Turbonomic can analyze your Google Cloud Platform (GCP) environment and continuously match real-time application demand to Google Cloud’s unprecedented number of configuration options across compute, storage, database and discounted Google Cloud pricing. In addition to optimizing GCP costs with Turbonomic, incorporating serverless cloud functions into your cloud cost management strategy can provide additional flexibility and cost savings.

Are you looking to reduce spend across your GCP environment and harness the full potential of cloud computing as soon as possible? IBM Turbonomic’s automation can be operationalized, allowing teams to see tangible results immediately and continuously, while achieving 247% ROI over three years. Read the Forrester Consulting commissioned study to see what outcomes and performance upgrades our customers have achieved with IBM Turbonomic.

Author

Spencer Mehm

Product Marketing Manager
