Customizing a runtime to use a MIG-enabled profile
If you are deploying a predictive machine learning model that requires significant processing power for inferencing, you can optionally configure a GPU runtime for the deployment.
Complete this task if both of the following statements are true:
- The machine learning model requires a GPU.
- You want to use the runtime on a MIG-enabled GPU.
If you are configuring MIG for GPU-accelerated workloads, all GPU-enabled nodes in the cluster must use the same MIG strategy, as determined in the prior configuration steps. This ensures consistent behavior across all GPU-enabled nodes. To configure MIG support, see the NVIDIA guide for configuring MIG support.
Who needs to complete this task?
To complete this task, you must have the instance administrator role.
Prerequisites
Before you begin, complete these tasks:
Customizing a runtime to use a MIG-enabled profile
To update the runtime definition to use a MIG-enabled profile, follow these steps:
1. Download the runtime definition for the GPU runtime (for example, runtime-23.1-py3.10-cuda). For more information, see Downloading the runtime configuration in the Cloud Pak for Data documentation.
2. Create a copy of the runtime definition.
   Attention: You can update only copies of the runtime definition by following this task.
3. In the runtime definition copy, add the nodeAffinity property to specify the MIG profile:

       "nodeAffinity": {
         "requiredDuringSchedulingIgnoredDuringExecution": {
           "nodeSelectorTerms": [
             {
               "matchExpressions": [
                 {
                   "key": "nvidia.com/mig.config",
                   "operator": "In",
                   "values": ["all-1g.10g"]
                 }
               ]
             }
           ]
         }
       }

4. Update the runtime definition by using Zen API credentials:
   a. Generate a Zen API key for a CPD admin user.
   b. Update the runtime definition with PUT to /v2/runtime_definitions/<runtime_id>.
      The following code shows how to update the runtime definition in Python. The <runtime_id> is the ID of the runtime definition that is being updated, and new_rd is the updated JSON.

          import requests

          # Replace <TOKEN> with the Zen API key token generated in step 4.a.
          headers = {
              'Authorization': 'ZenApiKey <TOKEN>',
              'Content-Type': 'application/json'
          }
          # verify=False skips TLS certificate verification.
          response = requests.put(
              f"{CPD_URL}/v2/runtime_definitions/<runtime_id>",
              json=new_rd,
              headers=headers,
              verify=False)

      Use the token generated in step 4.a.
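The copy, nodeAffinity, and update steps above can be sketched in Python as follows. This is a minimal illustration, not the product's own tooling: the helper names (build_mig_node_affinity, with_mig_profile, build_put_arguments) are hypothetical, and the sketch assumes that nodeAffinity is set at the top level of whatever dictionary you pass in. The topic does not state where the property belongs inside a real runtime definition, so check the downloaded file before relying on that placement.

```python
import copy
import json


def build_mig_node_affinity(mig_config):
    """Return the nodeAffinity block shown in this topic for one nvidia.com/mig.config value."""
    return {
        "requiredDuringSchedulingIgnoredDuringExecution": {
            "nodeSelectorTerms": [
                {
                    "matchExpressions": [
                        {
                            "key": "nvidia.com/mig.config",
                            "operator": "In",
                            "values": [mig_config],
                        }
                    ]
                }
            ]
        }
    }


def with_mig_profile(runtime_definition, mig_config):
    """Return a modified deep copy so the downloaded definition stays untouched."""
    new_rd = copy.deepcopy(runtime_definition)
    # Assumption: the property is set at the top level of the dict passed in.
    new_rd["nodeAffinity"] = build_mig_node_affinity(mig_config)
    return new_rd


def build_put_arguments(cpd_url, runtime_id, token, new_rd):
    """Assemble keyword arguments for requests.put(**arguments); nothing is sent here."""
    return {
        "url": f"{cpd_url.rstrip('/')}/v2/runtime_definitions/{runtime_id}",
        "json": new_rd,
        "headers": {
            "Authorization": f"ZenApiKey {token}",
            "Content-Type": "application/json",
        },
    }


if __name__ == "__main__":
    # Hypothetical stand-in for a downloaded runtime definition.
    downloaded = {"display_name": "example-gpu-runtime"}
    new_rd = with_mig_profile(downloaded, "all-1g.10g")
    print(json.dumps(new_rd, indent=2))
```

Building the request arguments separately from sending them makes it possible to inspect the URL, headers, and payload before calling requests.put with them.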
After the custom runtime definition is updated, deployments that you create with it are scheduled only on nodes that offer the MIG profile specified in the runtime definition.
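To illustrate how the nodeAffinity rule narrows down the eligible nodes, the following sketch mimics the matching logic in plain Python. It is a simplified model, not the Kubernetes scheduler: it handles only the In operator used in this topic, ignores matchFields, and the node names and labels are hypothetical. It relies on the standard Kubernetes semantics that nodeSelectorTerms are ORed and the matchExpressions inside one term are ANDed.

```python
def expression_matches(node_labels, expression):
    """Evaluate one match expression against a node's labels (In operator only)."""
    if expression["operator"] != "In":
        raise NotImplementedError("this sketch handles only the 'In' operator")
    return node_labels.get(expression["key"]) in expression["values"]


def node_matches_affinity(node_labels, node_affinity):
    """Return True if the labels satisfy the required node affinity."""
    required = node_affinity["requiredDuringSchedulingIgnoredDuringExecution"]
    for term in required["nodeSelectorTerms"]:  # terms are ORed
        expressions = term.get("matchExpressions", [])
        # Expressions within one term are ANDed; an empty term matches nothing.
        if expressions and all(expression_matches(node_labels, e) for e in expressions):
            return True
    return False


# The affinity block from this topic.
affinity = {
    "requiredDuringSchedulingIgnoredDuringExecution": {
        "nodeSelectorTerms": [
            {
                "matchExpressions": [
                    {
                        "key": "nvidia.com/mig.config",
                        "operator": "In",
                        "values": ["all-1g.10g"],
                    }
                ]
            }
        ]
    }
}

# Hypothetical nodes and labels, for illustration only.
nodes = {
    "gpu-node-a": {"nvidia.com/mig.config": "all-1g.10g"},
    "gpu-node-b": {"nvidia.com/mig.config": "all-disabled"},
    "cpu-node": {},
}

eligible = [name for name, labels in nodes.items() if node_matches_affinity(labels, affinity)]
print(eligible)  # ['gpu-node-a']
```

Only the node whose nvidia.com/mig.config label value appears in the values list is eligible; nodes with a different value or without the label are excluded.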