Getting started with IBM Distributed Accelerated ML library

Find information about getting started with pai4sk and SnapML.

This release of WML CE includes a conda package named pai4sk, which contains the IBM accelerated Machine Learning library. The main component of this library is the set of Snap ML APIs. Snap ML is a library for training generalized linear models. It is being developed at IBM with the vision of removing training time as a bottleneck for machine learning applications. Snap ML supports many classical machine learning models and scales gracefully to data sets with billions of examples or features. Snap ML training can be performed on a single machine or distributed across a cluster of machines. It also offers GPU acceleration and supports sparse data structures. The library is exposed through a Python API that is compatible with scikit-learn and can be integrated seamlessly into existing Python applications. The following APIs are supported:

LogisticRegression
LinearRegression
SupportVectorMachine
DecisionTreeClassifier (single-threaded CPU version only)
RandomForestClassifier (CPU version only)
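Because the API is described as compatible with scikit-learn, training follows the usual fit and predict pattern. The following is a minimal sketch, not an authoritative reference. It assumes that LogisticRegression can be imported from the top level of the pai4sk package and that the default constructor arguments are acceptable. The toy data is made up for illustration. So that the sketch can also be tried where pai4sk is not installed, it falls back to the scikit-learn class of the same name.

```python
import importlib


def load_logistic_regression():
    """Return (class, module_name), preferring pai4sk over scikit-learn.

    Assumption: pai4sk exposes LogisticRegression at the package top level.
    Returns (None, None) when neither package can be imported.
    """
    for module_name in ("pai4sk", "sklearn.linear_model"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        estimator = getattr(module, "LogisticRegression", None)
        if estimator is not None:
            return estimator, module_name
    return None, None


def train_and_predict():
    """Fit on a tiny toy data set; return (backend, predictions) or None."""
    estimator, module_name = load_logistic_regression()
    if estimator is None:
        return None
    import numpy as np  # installed wherever either backend is installed

    # Toy data, made up for illustration: one feature, two classes.
    X = np.array([[0.0], [1.0], [2.0], [3.0]], dtype=np.float32)
    y = np.array([0, 0, 1, 1], dtype=np.float32)

    model = estimator()  # default hyperparameters (assumed to be acceptable)
    model.fit(X, y)      # same fit/predict calls as scikit-learn
    return module_name, model.predict(X)


if __name__ == "__main__":
    print(train_and_predict())
```

The other supported estimators can be expected to follow the same pattern, although their constructor arguments may differ.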

Snap ML uses a proprietary data format named snap for efficient data loading in both single-node and multi-node training. APIs are provided to load and store data sets in snap format.

Because pai4sk is built on scikit-learn version 0.20.1, it can be used as a replacement for scikit-learn. Some of its APIs are accelerated by using Snap ML and cuML under the hood. The module automatically falls back to the original scikit-learn behavior when Snap ML or cuML does not provide the necessary support.

Note: If you are using mpirun instead of snaprun, consider the following recommendations:

On a single system without an InfiniBand setup, use the --pami_noib option of mpirun.
On multiple systems without an InfiniBand setup, use -mca btl tcp,self instead of the -tcp option of mpirun.
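The two recommendations above can be captured in a small helper that assembles the mpirun argument list. This is a sketch under stated assumptions: the function name, its parameters, and the script name train.py are hypothetical, and only the --pami_noib and -mca btl tcp,self options come from the note above. The -np option is the standard mpirun process-count flag. Host selection is left out on purpose.

```python
def build_mpirun_command(num_processes, script, num_hosts=1, has_infiniband=True):
    """Assemble an mpirun argument list following the note above.

    Hypothetical helper for illustration; host selection is not handled.
    """
    command = ["mpirun", "-np", str(num_processes)]
    if not has_infiniband:
        if num_hosts == 1:
            # Single system without an InfiniBand setup.
            command.append("--pami_noib")
        else:
            # Multiple systems without InfiniBand (used instead of -tcp).
            command += ["-mca", "btl", "tcp,self"]
    command += ["python", script]
    return command


if __name__ == "__main__":
    # "train.py" is a placeholder script name.
    print(" ".join(build_mpirun_command(2, "train.py", has_infiniband=False)))
```

Returning a list rather than a single string lets the result be passed directly to subprocess.run without shell quoting concerns.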

Note: When running on CPU, run similarity search applications in distributed mode using MPI, because the CPU version of similarity search currently supports single-threaded execution only.

To run pai4sk applications in a distributed way, use snaprun to start the application. The snaprun tool does the following:

Determines the necessary arguments to pass to MPI based on the current environment and version of MPI.
Tests connections to the hosts, including the correct setup of ssh keys.
Verifies that pai4sk is installed across the hosts.
Detects the hardware configuration of the hosts, including GPU count, and generates a valid topology.
Generates the necessary rankfile, providing options to specify more specific topology details.
Constructs, displays, and executes the mpirun command needed to distribute jobs to each node.
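To make the topology step more concrete, the following sketch maps MPI ranks to GPUs once the GPU count of each host is known. It is only an illustration and does not reproduce the internal logic of snaprun. The function name, the host names, and the one-rank-per-GPU policy are all assumptions.

```python
def plan_ranks(gpus_per_host):
    """Assign consecutive MPI ranks to (host, gpu_index) pairs.

    Assumed policy: one rank per detected GPU, hosts taken in the order given.
    Returns a list of (rank, host, gpu_index) tuples.
    """
    plan = []
    for host, gpu_count in gpus_per_host.items():
        for gpu_index in range(gpu_count):
            plan.append((len(plan), host, gpu_index))
    return plan


if __name__ == "__main__":
    # Hypothetical hosts: hostA with two GPUs, hostB with one.
    for rank, host, gpu in plan_ranks({"hostA": 2, "hostB": 1}):
        print(f"rank {rank}: host={host} gpu={gpu}")
```

A plan of this kind is the sort of information a rankfile encodes. The actual file that snaprun generates may be laid out differently.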

Run snaprun -h to get the usage details of this tool.

Example programs for each of the APIs mentioned above are provided as part of the conda package. To find out how to run the sample programs, refer to the README files under $CONDA_PREFIX/pai4sk/local-examples/ and $CONDA_PREFIX/pai4sk/mpi-examples/.

Sample Jupyter notebooks are provided in this GitHub repository.

Note: The DecisionTreeClassifier and RandomForestClassifier APIs are in technology preview for this release, as are all APIs that depend on cuML.