Customizing a runtime to use a MIG-enabled profile

If you are deploying a predictive machine learning model that requires significant processing power for inferencing, you can optionally configure a GPU runtime for the deployment.

Complete this task if both of the following statements are true:

  • The machine learning model requires a GPU.
  • You want to use the runtime on a MIG-enabled GPU.

If you are configuring MIG for GPU-accelerated workloads, all GPU-enabled nodes in the cluster should adhere to the single strategy that was determined in the prior configuration steps. Using one strategy ensures consistent behaviour across those nodes. To configure MIG support, see the NVIDIA guide for configuring MIG support.
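The consistency requirement above can be checked with a small script. The following is a minimal sketch, not an official tool. It assumes that you saved the output of `kubectl get nodes -o json` and that your GPU nodes carry the `nvidia.com/mig.strategy` label (an assumption about how NVIDIA GPU feature discovery is set up on your cluster). The function name and the sample node names are hypothetical.

```python
# Label key assumed to be set on GPU nodes by NVIDIA GPU feature discovery.
MIG_STRATEGY_LABEL = "nvidia.com/mig.strategy"


def mig_strategies(node_list):
    """Map each MIG strategy value to the names of the nodes that report it.

    node_list is the parsed JSON produced by `kubectl get nodes -o json`.
    Nodes without the label (for example, CPU-only nodes) are skipped.
    """
    strategies = {}
    for node in node_list.get("items", []):
        metadata = node.get("metadata", {})
        labels = metadata.get("labels") or {}
        if MIG_STRATEGY_LABEL in labels:
            strategies.setdefault(labels[MIG_STRATEGY_LABEL], []).append(metadata.get("name"))
    return strategies


# Hypothetical sample in the shape of `kubectl get nodes -o json`.
sample_nodes = {
    "items": [
        {"metadata": {"name": "gpu-node-1", "labels": {MIG_STRATEGY_LABEL: "single"}}},
        {"metadata": {"name": "gpu-node-2", "labels": {MIG_STRATEGY_LABEL: "single"}}},
        {"metadata": {"name": "cpu-node-1", "labels": {}}},
    ]
}

found = mig_strategies(sample_nodes)
# More than one key means that the GPU nodes do not share one strategy.
is_consistent = len(found) <= 1
print(found, is_consistent)
```

To run the check against a real cluster, load the saved file with `json.load` and pass the result to `mig_strategies` in place of `sample_nodes`.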

Who needs to complete this task?

To complete this task, you must have the instance administrator role.

Prerequisites

Before you begin, complete these tasks:

Customizing a runtime to use a MIG-enabled profile

To update the runtime definition to use a MIG-enabled profile, follow these steps:

  1. Download the runtime definition for the GPU runtime (for example, runtime-23.1-py3.10-cuda). For more information, see Downloading the runtime configuration in the Cloud Pak for Data documentation.

  2. Create a copy of the runtime definition.

    Attention: Make your changes only in the copy of the runtime definition. Do not modify the original runtime definition.

  3. In the runtime definition copy, add the nodeAffinity property to specify the MIG profile:

    "nodeAffinity": {
      "requiredDuringSchedulingIgnoredDuringExecution": {
        "nodeSelectorTerms": [
          {
            "matchExpressions": [
              {
                "key": "nvidia.com/mig.config",
                "operator": "In",
                "values": ["all-1g.10g"]
              }
            ]
          }
        ]
      }
    }
    
  4. Update the runtime definition by using Zen API credentials:

    a. Generate a Zen API key for a CPD admin user.

    b. Update the runtime definition by sending a PUT request to /v2/runtime_definitions/<runtime_id>.

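    The following code shows how to update the runtime definition in Python. Replace <runtime_id> with the ID of the runtime definition that is being updated. new_rd is the updated JSON, and CPD_URL is the URL of your Cloud Pak for Data instance.

    import requests

    # CPD_URL and new_rd must already be defined; replace <TOKEN> and <runtime_id>.
    headers = {'Authorization': 'ZenApiKey <TOKEN>', 'Content-Type': 'application/json'}

    response = requests.put(
        f"{CPD_URL}/v2/runtime_definitions/<runtime_id>",
        json=new_rd,
        headers=headers,
        verify=False)  # disables TLS certificate verification
    response.raise_for_status()  # raise an error if the update failed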
    

    Use the token generated in step 4.a.
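Steps 2 and 3 can also be scripted. The following is a minimal sketch, assuming that the downloaded runtime definition is available as a Python dictionary and that the nodeAffinity property belongs at the top level of the definition. The placement is an assumption, so check it against your own runtime definition. The helper names `build_mig_node_affinity` and `with_mig_affinity` and the placeholder definition are hypothetical.

```python
import copy
import json


def build_mig_node_affinity(mig_config):
    """Return a nodeAffinity block that matches nodes with the given MIG config label."""
    return {
        "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
                {
                    "matchExpressions": [
                        {
                            "key": "nvidia.com/mig.config",
                            "operator": "In",
                            "values": [mig_config],
                        }
                    ]
                }
            ]
        }
    }


def with_mig_affinity(runtime_definition, mig_config):
    """Return a modified copy of the definition; the original is left untouched."""
    updated = copy.deepcopy(runtime_definition)
    # Assumption: nodeAffinity is a top-level property of the definition.
    updated["nodeAffinity"] = build_mig_node_affinity(mig_config)
    return updated


# Hypothetical stand-in for the downloaded runtime definition.
original_rd = {"name": "runtime-23.1-py3.10-cuda"}
new_rd = with_mig_affinity(original_rd, "all-1g.10g")
print(json.dumps(new_rd, indent=2))
```

The resulting `new_rd` dictionary is the kind of object that is passed as the JSON body of the PUT request in step 4.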

After the custom runtime definition is updated, you can create deployments that are scheduled on nodes that offer the MIG profile specified in the runtime definition.
