Getting started with Apex
This release of WML CE includes a Technology Preview of Apex. Apex is a PyTorch add-on package from NVIDIA with capabilities for automatic mixed precision (AMP) and distributed training.
Apex is currently only provided for Python version 3.6.
WML CE includes Apex as a separate package which can be installed as shown below.
Installing Apex
Follow these steps to install Apex:
- Create a virtual conda environment with python=3.6:
conda create -y -n my-py3-env python=3.6
- Activate the environment. The prompt changes to (my-py3-env) once the environment is active:
source activate my-py3-env
- Install Apex into the virtual environment:
(my-py3-env) $ conda install apex
Validating the Apex installation
To verify the installation, run a quick set of tests with the following command:
(my-py3-env) $ apex-test
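As a complementary check from inside Python, the following minimal sketch reports whether the interpreter matches the supported Python version and whether a top-level apex module can be located in the active environment. It uses only the standard library. The helper name check_apex_environment is hypothetical, and the sketch does not replace apex-test; it only confirms that the package is visible to the interpreter.

```python
import importlib.util
import sys


def check_apex_environment(required=(3, 6)):
    """Return (python_ok, apex_found) for the current interpreter."""
    # This release provides Apex only for Python 3.6.
    python_ok = sys.version_info[:2] == tuple(required)
    # find_spec locates a module without importing it; None means not found.
    apex_found = importlib.util.find_spec("apex") is not None
    return python_ok, apex_found


if __name__ == "__main__":
    python_ok, apex_found = check_apex_environment()
    print("Python 3.6 interpreter:", python_ok)
    print("apex module found:", apex_found)
```

If either value is False, confirm that the my-py3-env environment is active before rerunning the check.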
Apex examples
The apex package includes examples. To copy them to a local directory where you can inspect and run them, use the following command:
(my-py3-env) $ apex-install-samples apex-samples
This command creates an apex-samples/examples directory and copies the examples into it. Each example subdirectory contains a README file; consult it for more information.
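To see which README files were copied, a short script along the following lines may help. It is a sketch that assumes the samples were installed into apex-samples as shown above and that each README file name starts with README; the function name find_example_readmes is hypothetical.

```python
from pathlib import Path


def find_example_readmes(examples_dir):
    """Return the relative paths of all README files under examples_dir."""
    root = Path(examples_dir)
    # rglob searches recursively, so nested examples are included as well.
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("README*")
        if p.is_file()
    )


if __name__ == "__main__":
    root = Path("apex-samples/examples")
    if root.is_dir():
        for readme in find_example_readmes(root):
            print(readme)
    else:
        print("Directory not found; run apex-install-samples first:", root)
```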
Running examples
To run the simple distributed example:
(my-py3-env) $ cd apex-samples/examples/simple/distributed
(my-py3-env) $ bash run.sh
To run the imagenet example:
(my-py3-env) $ cd apex-samples/examples/imagenet
From this point, the imagenet example requires additional steps, which are described in the README file in this directory. In short, the README explains how to obtain an imagenet data set, which you must copy into place. After that is complete, run the main_amp.py script with the options of your choice, as described in the README.
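For orientation before reading main_amp.py, the following minimal sketch shows the general shape of a mixed precision training step with the Amp API from the Apex project (amp.initialize and amp.scale_loss). It is not taken from main_amp.py: the toy model, tensor sizes, and the helper names amp_prerequisites_met and run_amp_step are assumptions made for illustration, and the step itself runs only when torch, apex, and a CUDA device are available.

```python
import importlib.util


def amp_prerequisites_met():
    """Return True when both torch and apex can be located."""
    return all(
        importlib.util.find_spec(name) is not None
        for name in ("torch", "apex")
    )


def run_amp_step():
    """Run one mixed precision training step on a toy model (needs a GPU)."""
    import torch
    from apex import amp

    # Toy model and optimizer, for illustration only.
    model = torch.nn.Linear(16, 4).cuda()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    # Let Amp prepare the model and optimizer; "O1" selects mixed precision.
    model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

    inputs = torch.randn(8, 16).cuda()
    targets = torch.randn(8, 4).cuda()
    # Compute the loss in FP32 regardless of the model's output type.
    loss = torch.nn.functional.mse_loss(model(inputs).float(), targets)

    optimizer.zero_grad()
    # Scale the loss so that small FP16 gradients do not underflow.
    with amp.scale_loss(loss, optimizer) as scaled_loss:
        scaled_loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    if not amp_prerequisites_met():
        print("torch and apex are required for this sketch.")
    else:
        import torch
        if torch.cuda.is_available():
            print("loss after one step:", run_amp_step())
        else:
            print("No CUDA device available; skipping the Amp step.")
```

The options accepted by main_amp.py itself are described in the README and are not reproduced here.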
More information about Apex
Community resources
The GitHub repository for the Apex project contains a README with additional project links:
- Description and webinar introduction of AMP
- Imagenet example
- Distributed Training documentation and walkthrough