Hardware and software requirements
IBM Watson® Machine Learning Accelerator requires the following hardware and software.
Hardware requirements
- IBM® POWER8® with NVLink and NVIDIA GPUs
- IBM POWER9™ with NVLink and NVIDIA GPUs
- x86_64 servers with NVIDIA P100 or V100 GPUs
The following tables list the minimum system requirements for running IBM Watson Machine Learning Accelerator in a production environment. You might have extra requirements (such as extra CPU and RAM) depending on the Spark instance groups that will run on the hosts, especially for compute hosts that run workloads.
Requirement | Management hosts | Compute hosts | Notes |
---|---|---|---|
RAM | 64 GB | 32 GB | In general, the more memory your hosts have, the better the performance. |
Disk space to extract install files from the WML Accelerator install package | 16 GB (First management host only) | NA | |
Disk space to install IBM Spectrum Conductor™ | 12 GB | 12 GB | |
Disk space to install IBM Spectrum Conductor Deep Learning Impact | 11 GB | 11 GB | |
Additional disk space (for Spark instance group packages, logs, and so on) | Can be 30 GB for a large cluster | 1 GB*N slots + sum of service package sizes (including dependencies) | Disk space requirements depend on the number of Spark instance groups and the Spark applications that you run. Long-running applications, such as notebooks and streaming applications, can generate large amounts of data that are stored in Elasticsearch. What your applications log can also increase disk usage. Consider all these factors when estimating disk space requirements for your production cluster. For optimal performance, consider tuning how long application monitoring data is kept, based on your needs. |
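
The compute-host formula in the last row (1 GB*N slots + sum of service package sizes) can be combined with the fixed install sizes from the table to get a rough per-host minimum. The following Python sketch is only an illustration: the function names, the example slot count, and the example package sizes are hypothetical, and the 30 GB default for management hosts simply reuses the "large cluster" figure from the table. Treat the result as a lower bound, because log volume and Elasticsearch data are not modeled.

```python
# Minimum disk figures in GB, copied from the requirements table.
EXTRACT_GB = 16    # install package extraction, first management host only
CONDUCTOR_GB = 12  # IBM Spectrum Conductor
DLI_GB = 11        # IBM Spectrum Conductor Deep Learning Impact


def compute_host_disk_gb(slots, service_package_gb):
    """Rough minimum disk (GB) for one compute host.

    slots: number of slots on the host (the N in "1 GB*N slots").
    service_package_gb: sizes in GB of the service packages,
        including their dependencies (hypothetical inputs).
    """
    additional = 1 * slots + sum(service_package_gb)
    return CONDUCTOR_GB + DLI_GB + additional


def management_host_disk_gb(first_host=False, additional_gb=30):
    """Rough minimum disk (GB) for one management host.

    additional_gb defaults to the table's "large cluster" figure;
    this default is an assumption, so adjust it for smaller clusters.
    """
    total = CONDUCTOR_GB + DLI_GB + additional_gb
    if first_host:
        total += EXTRACT_GB  # only the first management host extracts the package
    return total


if __name__ == "__main__":
    # Example values are illustrative only.
    print(compute_host_disk_gb(slots=16, service_package_gb=[2.5, 1.5]))
    print(management_host_disk_gb(first_host=True))
```

Keeping the fixed sizes as named constants makes it easy to update the estimate if the install sizes change in a later release.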
Software requirements
The following software is required:
Hardware | Operating system | GPU software |
---|---|---|
POWER8 | Red Hat Enterprise Linux (RHEL) 7.6 (ppc64le) | |
POWER9 with this security fix: RHSA-2018:1374 - Security Advisory | RHEL 7.6 (ppc64le) | |
x86 | RHEL 7.6 | |
- Supported GPUs: NVIDIA P100 and V100
- Shared file system:
  - IBM Spectrum Scale 4.2.3, 4.2.2, 4.2.1, 4.1.1, or 5.0.1
  - Network file system (NFS) 2, 3, or 4

Note: If using NFSv4, only framework plugin functionality for IBM Spectrum Conductor Deep Learning Impact is available. For full functionality of IBM Spectrum Conductor Deep Learning Impact including the cluster management console, use NFSv3 or NFSv2.
Deep learning frameworks
By default, all of the frameworks included with PowerAI are installed. At least one supported framework must be installed. However, it is recommended that you install both TensorFlow and IBM Caffe. If either framework is missing, the option for the missing framework will not work in the cluster management console.
To determine which frameworks are included with WML Accelerator, see What's included.
Required additional repositories
See the following topic: Red Hat Enterprise Linux operating system and repository setup.