AWS Metrics Collection

It is highly recommended that you enable metrics collection in your environment. With metrics enabled, Turbonomic can generate scale actions that optimize VM resource usage. To make these metrics available to Turbonomic, you must enable their collection on each VM in your environment.

This topic describes the collection of the following metrics:

Some of the steps to do this are different depending on whether your VM is running a Linux or Windows OS. To enable metrics collection, you must meet the following requirements:

  • The VM image must have an SSM agent installed

    • Linux VMs:

      By default, Linux AMIs dated 2017.09 and later include an installed SSM Agent.

    • Windows VMs:

      You must install the SSM agent on the VMs. For more information, see Working with SSM Agent.

  • Access to the CloudWatch service

    Each AWS instance must have internet access or direct access to CloudWatch so that it can push data to CloudWatch.

  • Access from Turbonomic

    For Turbonomic to access metrics, the account that it uses to connect to the AWS target must include the correct permissions. If you configured the AWS target via an AWS key (not an IAM role), then you must include the permissions as specified in the section for configuring an AWS target.

    If you use an IAM role for the Turbonomic connection, then that role must include the following as a minimum:

    • AmazonEC2ReadOnlyAccess
    • AmazonS3ReadOnlyAccess
    • AmazonRDSReadOnlyAccess
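
    One way to confirm that a role carries these policies is to list its attached managed policies. The following is a minimal sketch, not an official procedure. It assumes a boto3-style IAM client is passed in, and the function name missing_policies is hypothetical. A role can also gain equivalent access through inline or custom policies, which this check does not detect.

```python
# Managed policy names taken from the list above.
REQUIRED_POLICIES = {
    "AmazonEC2ReadOnlyAccess",
    "AmazonS3ReadOnlyAccess",
    "AmazonRDSReadOnlyAccess",
}


def missing_policies(iam_client, role_name):
    """Return required managed policies that are not attached to the role.

    iam_client is assumed to behave like boto3.client("iam").
    """
    attached = set()
    paginator = iam_client.get_paginator("list_attached_role_policies")
    for page in paginator.paginate(RoleName=role_name):
        for policy in page["AttachedPolicies"]:
            attached.add(policy["PolicyName"])
    return sorted(REQUIRED_POLICIES - attached)
```

    With boto3 installed and credentials configured, you could call missing_policies(boto3.client("iam"), "MyTurbonomicRole"), where the role name is a placeholder. An empty list means that all three policies are attached.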

To set up the collection of metrics for your VMs:

  1. Attach an IAM role to each VM instance.

    Each EC2 instance must have an attached IAM role that grants CloudWatch access. To grant that access, include the AmazonSSMFullAccess policy in the role.

    Use AWS Systems Manager to attach the necessary roles to your VMs.

    Note:

    If you want to grant the role lesser access, you can use the AmazonEC2RoleforSSM policy. This is a custom policy that allows the action ssm:GetParameter to access the resource, arn:aws:ssm:*:*:parameter/*.
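
    If you prefer to script this step rather than use the console, the EC2 API can associate an instance profile with an instance. The sketch below is an alternative to the console flow described above, not a replacement for it. It assumes a boto3-style EC2 client, an instance profile that already wraps the IAM role, and instances that do not yet have a profile attached. The function name attach_instance_profile is hypothetical.

```python
def attach_instance_profile(ec2_client, instance_ids, profile_name):
    """Associate one instance profile with each instance.

    ec2_client is assumed to behave like boto3.client("ec2").
    Returns a dict that maps each instance ID to its association ID.
    """
    association_ids = {}
    for instance_id in instance_ids:
        response = ec2_client.associate_iam_instance_profile(
            IamInstanceProfile={"Name": profile_name},
            InstanceId=instance_id,
        )
        association = response["IamInstanceProfileAssociation"]
        association_ids[instance_id] = association["AssociationId"]
    return association_ids
```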

  2. Install the CloudWatch agent on your Linux VMs.

    Navigate to the AWS Systems Manager service for the account and region that you want to configure. In the service, navigate to the Run Command screen and set up the AWS-ConfigureAWSPackage command to install AmazonCloudWatchAgent on your VMs. For more information, see the AWS documentation.
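
    The same Run Command can be issued programmatically. This is a rough sketch that assumes a boto3-style SSM client and targets instances by ID. The helper names build_install_request and install_cloudwatch_agent are hypothetical.

```python
def build_install_request(instance_ids):
    """Build send_command arguments that install the CloudWatch agent package."""
    return {
        "DocumentName": "AWS-ConfigureAWSPackage",
        "InstanceIds": list(instance_ids),
        "Parameters": {
            "action": ["Install"],
            "name": ["AmazonCloudWatchAgent"],
        },
    }


def install_cloudwatch_agent(ssm_client, instance_ids):
    """Send the command and return its ID for later status checks.

    ssm_client is assumed to behave like boto3.client("ssm").
    """
    response = ssm_client.send_command(**build_install_request(instance_ids))
    return response["Command"]["CommandId"]
```

    Keeping the request builder separate from the call makes it possible to review the arguments before anything is sent to AWS.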

  3. Create configuration data for the CloudWatch agent.

    The configuration data is a JSON object that you will add as a parameter to the Parameter Store. The object must include the following, depending on whether it is for a Linux or a Windows VM instance.

    • Linux Configuration for Standard Memory

      {
        "agent": {
          "metrics_collection_interval": 60,
          "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"
        },
        "metrics": {
          "namespace": "custom",
          "metrics_collected": {
            "mem": {
              "measurement": [
                {"name": "mem_available", "rename": "MemoryAvailable", "unit": "Bytes"}
              ]
            }
          },
          "append_dimensions": {
            "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
            "ImageId": "${aws:ImageId}",
            "InstanceId": "${aws:InstanceId}",
            "InstanceType": "${aws:InstanceType}"
          }
        }
      }

    • Linux Configuration for Standard Memory and NVIDIA GPU Card/Memory Utilization

      {
        "agent": {
          "metrics_collection_interval": 60,
          "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log"
        },
        "metrics": {
          "namespace": "CWAgent",
          "metrics_collected": {
            "nvidia_gpu": {
              "measurement": [
                "utilization_gpu",
                "memory_used"
              ]
            },
            "mem": {
              "measurement": [
                {"name": "mem_available", "rename": "MemoryAvailable", "unit": "Bytes"}
              ]
            }
          },
          "append_dimensions": {
            "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
            "ImageId": "${aws:ImageId}",
            "InstanceId": "${aws:InstanceId}",
            "InstanceType": "${aws:InstanceType}"
          }
        }
      }

    • Linux Configuration for NVIDIA GPU Metrics (DCGM)

      Run the setup_aws_dcgm_exporter.py script to automate the collection of NVIDIA GPU metrics through Data Center GPU Manager (DCGM). Certain prerequisites must be met before you run the script. For more information, see this GitHub page.

      If you need assistance with the script, contact your Turbonomic representative.

    • Windows Configuration for Standard Memory

      {
        "metrics": {
          "namespace": "Windows System",
          "append_dimensions": {
            "InstanceId": "${aws:InstanceId}"
          },
          "aggregation_dimensions" : [ ["InstanceId"] ],
          "metrics_collected": {
            "Memory": {
              "measurement": [
                {"name" : "Available Bytes", "rename": "MemoryAvailable", "unit": "Bytes"}
              ],
              "metrics_collection_interval": 60
            },
            "Paging File": {
              "measurement": [
                {"name": "% Usage", "rename": "paging_used"}
              ],
              "metrics_collection_interval": 60,
              "resources": [
                "*"
              ]
            }
          }
        }
      }

    Note that you can optionally configure the CloudWatch namespace and region. If you configure CloudWatch to collect additional metrics, those metrics do not affect Turbonomic analysis and do not appear in the user interface.
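
    If you manage several configurations, generating the JSON from code can help avoid syntax errors when you paste it into the parameter. The following sketch rebuilds the Linux standard memory configuration shown above. The function names and their arguments are hypothetical, and the default namespace simply mirrors the Linux example.

```python
import json


def linux_memory_config(namespace="custom", interval_seconds=60):
    """Return the Linux standard memory agent configuration as a dict."""
    return {
        "agent": {
            "metrics_collection_interval": interval_seconds,
            "logfile": "/opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log",
        },
        "metrics": {
            "namespace": namespace,
            "metrics_collected": {
                "mem": {
                    "measurement": [
                        {
                            "name": "mem_available",
                            "rename": "MemoryAvailable",
                            "unit": "Bytes",
                        }
                    ]
                }
            },
            "append_dimensions": {
                "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
                "ImageId": "${aws:ImageId}",
                "InstanceId": "${aws:InstanceId}",
                "InstanceType": "${aws:InstanceType}",
            },
        },
    }


def to_parameter_value(config):
    """Serialize the configuration to the JSON text stored in the parameter."""
    return json.dumps(config, indent=2)
```

    Calling to_parameter_value(linux_memory_config()) produces JSON text that is equivalent to the Linux standard memory example.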

  4. Create a parameter in Parameter Store.

    1. Create a parameter.

      In AWS Systems Manager, navigate to Parameter Store and create a parameter. Copy and paste the JSON agent configuration (created in the preceding step) into the parameter Value field.

    2. Name the parameter.

      For example, AmazonCloudWatch-MyMemoryParam. You can use a different name, but per the Amazon documentation, the name must begin with AmazonCloudWatch. For more information, see Store the CloudwatchConfig File in Parameter Store.

      You must remember this parameter name.

    3. Click to create the parameter.
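
    The parameter can also be created through the SSM API. This sketch assumes a boto3-style SSM client. It checks the name prefix described above and rejects malformed JSON before anything is stored. The helper names are hypothetical, and storing the value as a String parameter with overwrite enabled is an assumption made for this example.

```python
import json


def build_put_parameter_request(name, config_json):
    """Validate inputs and build put_parameter arguments."""
    if not name.startswith("AmazonCloudWatch"):
        raise ValueError("parameter name must begin with AmazonCloudWatch")
    # json.loads raises ValueError (JSONDecodeError) on malformed input.
    json.loads(config_json)
    return {
        "Name": name,
        "Value": config_json,
        "Type": "String",
        "Overwrite": True,  # assumption: replace any existing value
    }


def store_agent_config(ssm_client, name, config_json):
    """Store the agent configuration and return the parameter name.

    ssm_client is assumed to behave like boto3.client("ssm").
    """
    ssm_client.put_parameter(**build_put_parameter_request(name, config_json))
    return name
```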

  5. Deploy the CloudWatch parameter to your VMs.

    In AWS Systems Manager, navigate to the Run Command screen to configure and run the AmazonCloudWatch-ManageAgent command. The configuration should include:

    • Action: configure
    • Mode: ec2
    • Optional Configuration Source: ssm
    • Optional Configuration Location: Give the name of the parameter that you created earlier.
    • Optional Restart: yes (this restarts the CloudWatch Agent, not the VM instance)
    • Targets: The VMs that you will deploy the CloudWatch configuration to

    When the command is configured, run it. This configures collection of metrics for your instances.
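
    The settings listed above map onto the parameters of the AmazonCloudWatch-ManageAgent document. The following is a sketch under the assumption that a boto3-style SSM client is used and that instances are targeted by ID. The helper names are hypothetical.

```python
def build_manage_agent_request(instance_ids, parameter_name):
    """Build send_command arguments that apply the stored agent configuration."""
    return {
        "DocumentName": "AmazonCloudWatch-ManageAgent",
        "InstanceIds": list(instance_ids),
        "Parameters": {
            "action": ["configure"],
            "mode": ["ec2"],
            "optionalConfigurationSource": ["ssm"],
            "optionalConfigurationLocation": [parameter_name],
            # Restarts the CloudWatch agent, not the VM instance.
            "optionalRestart": ["yes"],
        },
    }


def deploy_agent_config(ssm_client, instance_ids, parameter_name):
    """Send the command and return its ID.

    ssm_client is assumed to behave like boto3.client("ssm").
    """
    request = build_manage_agent_request(instance_ids, parameter_name)
    response = ssm_client.send_command(**request)
    return response["Command"]["CommandId"]
```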

  6. Verify that you are collecting metrics for your instances.

    Navigate to the CloudWatch page and display Metrics in the namespace that your agent configuration specifies (for example, CWAgent). Then inspect the instances by ID to verify that you can see the MemoryAvailable metric and, if you are collecting GPU metrics, the utilization_gpu and memory_used metrics.
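
    The same check can be scripted by listing the metrics that CloudWatch holds for an instance. This is a minimal sketch that assumes a boto3-style CloudWatch client. The function names are hypothetical, and the namespace must match the one in your agent configuration. Newly published metrics can take some time to appear in the listing, and the names under which GPU metrics are published may differ from the names in the configuration, so pass the names that you actually see in the console.

```python
def reported_metric_names(cloudwatch_client, instance_id, namespace="CWAgent"):
    """Return the set of metric names reported for one instance.

    cloudwatch_client is assumed to behave like boto3.client("cloudwatch").
    """
    names = set()
    paginator = cloudwatch_client.get_paginator("list_metrics")
    pages = paginator.paginate(
        Namespace=namespace,
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
    )
    for page in pages:
        for metric in page["Metrics"]:
            names.add(metric["MetricName"])
    return names


def missing_metrics(names, expected=("MemoryAvailable",)):
    """Return the expected metric names that are absent, in sorted order."""
    return sorted(set(expected) - set(names))
```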

For more information about enabling metrics collection for AWS, see the following Support articles: