Getting started with Apex

This release of WML CE includes a Technology Preview of Apex. Apex is a PyTorch add-on package from NVIDIA with capabilities for automatic mixed precision (AMP) and distributed training.

Apex is currently provided only for Python 3.6.

WML CE includes Apex as a separate package, which you can install by following the steps below.

Note: PyTorch is installed automatically as a prerequisite of Apex.

Installing Apex

Follow these steps to install Apex:

  1. Create a conda virtual environment with Python 3.6:
    conda create -y -n my-py3-env python=3.6
  2. Activate the environment. The prompt changes to show the environment name:
    source activate my-py3-env
    (my-py3-env) $
  3. Install Apex into the virtual environment:
    (my-py3-env) $ conda install apex
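
After the installation steps, a quick check from inside the activated environment can confirm that the interpreter is Python 3.6 and that the package is visible to it. The following is a minimal sketch using only the Python standard library. It assumes the conda package provides an importable module named `apex`; the helper name `check_apex_environment` is hypothetical and is not part of WML CE or Apex.

```python
import importlib.util
import sys


def check_apex_environment(required=(3, 6), module_name="apex"):
    """Report whether this interpreter matches the documented setup.

    `required` and `module_name` are illustrative defaults: Python 3.6 comes
    from the text above, and `apex` is assumed to be the importable name.
    """
    version_ok = tuple(sys.version_info[:2]) == tuple(required)
    # find_spec looks the module up without importing it, so this check
    # does not need a GPU and does not load PyTorch.
    module_found = importlib.util.find_spec(module_name) is not None
    return {"python_version_ok": version_ok, "module_found": module_found}


if __name__ == "__main__":
    print(check_apex_environment())
```

If either value is False, the most likely causes are that the environment is not activated or that it was created with a different Python version.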

Validating the Apex installation

To verify the installation, run the following quick set of tests:

(my-py3-env) $ apex-test

Apex examples

The apex package includes examples. To copy them to a local directory where you can inspect and run them, use the following command:

(my-py3-env) $ apex-install-samples apex-samples

This command creates an apex-samples/examples directory and copies the examples into it. Each example subdirectory contains a README file; consult it for more information.
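
Because each example is documented by its own README, it can help to list those files before choosing an example to run. The sketch below walks the apex-samples/examples directory and returns the README files it finds. The function name `find_example_readmes` and the `README*` file name pattern are assumptions made for illustration.

```python
from pathlib import Path


def find_example_readmes(samples_dir="apex-samples"):
    """Return the README files found under <samples_dir>/examples, sorted.

    The `README*` pattern is an assumption; the text only says that each
    example subdirectory contains a README file.
    """
    examples_root = Path(samples_dir) / "examples"
    if not examples_root.is_dir():
        return []
    # rglob walks nested example directories such as simple/distributed.
    return sorted(p for p in examples_root.rglob("README*") if p.is_file())


if __name__ == "__main__":
    for readme in find_example_readmes():
        print(readme)
```

An empty result usually means the script was run from a directory other than the one that contains apex-samples.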

Running examples

To run the simple distributed example:

(my-py3-env) $ cd apex-samples/examples/simple/distributed
(my-py3-env) $ bash run.sh

To run the imagenet example:

(my-py3-env) $ cd apex-samples/examples/imagenet

From this point, the imagenet example requires additional steps, which are described in the README file in this directory. In short, the README explains how to obtain an imagenet data set, which you must copy into place. After that is done, run the main_amp.py script with the options of your choice, as described in the README.
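
The README remains the authority on how main_amp.py is invoked. For orientation only, the mixed precision training pattern documented by the upstream Apex project can be sketched as follows. This assumes that the bundled Apex exposes `amp.initialize` and `amp.scale_loss` as upstream does, and that a CUDA GPU is available. The names `amp_train_step` and `demo` are hypothetical, and the small linear model stands in for a real network; this is not the code of main_amp.py.

```python
def amp_train_step(model, optimizer, loss_fn, inputs, targets, amp):
    """Run one optimizer step with AMP loss scaling and return the loss."""
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    # scale_loss yields a scaled copy of the loss so that small FP16
    # gradients are less likely to underflow during backward().
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()
    return loss


def demo():
    """Illustrative driver; needs torch, apex, and a CUDA GPU."""
    import torch
    from apex import amp

    model = torch.nn.Linear(16, 4).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    # opt_level "O1" is the upstream mixed precision setting that patches
    # selected operations to run in FP16; other levels are described upstream.
    model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

    inputs = torch.randn(8, 16).cuda()
    targets = torch.randn(8, 4).cuda()
    loss = amp_train_step(
        model, optimizer, torch.nn.functional.mse_loss, inputs, targets, amp
    )
    print("loss:", loss.item())
```

Calling `demo()` on a GPU system inside the activated environment runs a single training step. The `amp` module is passed into `amp_train_step` as an argument so that the step logic can be exercised without a GPU by substituting a stand-in object.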

More information about Apex

Community resources

The GitHub repository for the Apex project contains a README with additional project links:

  • Description and webinar introduction of AMP
  • Imagenet example
  • Distributed Training documentation and walkthrough

Note: Community documentation resources are subject to change without notice.