Installing IBM Watson Machine Learning Accelerator with WML or Watson Studio on IBM Cloud Pak for Data 3.0.1

After you install WML Accelerator 1.2.2, you can connect it to IBM Watson® Machine Learning (WML) or IBM Watson Studio Local on IBM® Cloud Pak for Data 3.0.1.

About this task

Note: Connecting WML Accelerator 1.2.2 with WML and Watson Studio on IBM Cloud Pak for Data 3.0.1 is available as a technical preview.

Technical preview: The code and application programming interfaces herein are technology preview information that may not be made generally available by IBM as or in a product. You are permitted to use the information only for internal use for evaluation purposes and not for use in a production environment. IBM provides the information without obligation of support and "as is" without warranty of any kind.

If you plan to use WML Accelerator with IBM Watson Studio Local or WML, complete the following procedure after you install WML Accelerator.

Before you begin
  1. Ensure that you have installed WML Accelerator: Installing WML Accelerator.
    Note:
    • If you are installing WML Accelerator for the first time, or upgrading from WML Accelerator version 1.2.1 to 1.2.2, you must configure or rebuild your Anaconda environment to use WML CE 1.6.2: Configure a system for IBM Spectrum Conductor Deep Learning Impact
    • As part of the WML Accelerator installation, you created two instance groups: one for distributed training (using the wmla-ig-template-2.3.3 template) and one for elastic distributed training (using the wmla-ig-edt-template-2.3.3 template). WML uses these instance groups when it pushes training jobs to WML Accelerator.
  2. Ensure that you have installed WML: Installing IBM Watson Machine Learning.
  3. Review the list of known issues and limitations: Known issues in the WML integration
  4. Consider the following regarding user authentication and execution:
    • If LDAP is used with IBM Watson Studio Local, a common LDAP server can be used by both WML Accelerator and IBM Watson Studio Local for storing user credentials. To do so, configure LDAP for WML Accelerator. See: Configuring user authentication for PAM and default clients
    • If LDAP is not used, each IBM Watson Studio Local user who wants to run training jobs must be added to WML Accelerator. For each such user, create an OS user with the same username, and ensure that the user has the same UID, group ID (GID), and password on all hosts in the cluster.
    • In either case, users must be assigned to the instance groups and they must have a Data Scientist or Consumer user role. Roles can be assigned by the Consumer administrator or the cluster administrator. See: Adding a consumer user or Assigning roles to users or user groups
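
When LDAP is not used, the requirement that a user has the same UID and GID on all hosts can be checked with a short script. The following is a minimal sketch, not a tool that ships with WML Accelerator: the function name ids_consistent, the "host uid gid" input format, and the host and user names in the usage comment are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: verify that a user has identical UID and GID on every host.
# Input (stdin): one line per host in the form "<host> <uid> <gid>".
# The function name and the input format are illustrative assumptions.
ids_consistent() {
    awk '
        NF == 0 { next }                                  # skip blank lines
        NF < 3  { print "malformed line: " $0; bad = 1; next }
        !seen   { uid = $2; gid = $3; seen = 1; next }    # first host is the reference
        $2 != uid || $3 != gid {
            printf "mismatch on %s: uid=%s gid=%s (expected %s/%s)\n", $1, $2, $3, uid, gid
            bad = 1
        }
        END { if (bad || !seen) exit 1 }                  # fail on mismatch or empty input
    '
}

# Possible usage (host1, host2, and alice are hypothetical names):
#   for h in host1 host2; do
#       echo "$h $(ssh -n "$h" id -u alice) $(ssh -n "$h" id -g alice)"
#   done | ids_consistent
```

The function exits with a nonzero status if any host reports a different UID or GID than the first host, or if no input is supplied. It does not check passwords, which must still be verified separately.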

Procedure

  1. Update your instance groups to use the correct version of WML CE, which is WML CE 1.6.2.
    1. Select the Workload tab and click Instance Groups.
    2. Find your instance groups used for training and elastic distributed training and click Modify.
    3. Configure an additional Spark parameter:
      1. In the parameter drop-down list, select Additional Environment Variables and click Add an Environment Variable.
      2. Set the Name to DLI_DEFAULT_CONDA_ENV_NAME and the Value to dlipy36-wmlce162.
      3. Click Save.
  2. Update the public key for IBM Watson Studio Local.
    1. Get the standalone IBM Watson Studio Local certificate:
      wget https://ws_host:443/auth/jwtcert -O jwtcert
      where ws_host is the IBM Watson Studio Local host IP address. To get the value of ws_host, issue the following command:
      oc get routes | grep ibm-nginx | awk '{print $2}'
    2. Get the public PEM key from the certificate:
      openssl x509 -pubkey -in jwtcert -noout >new_pub_key.pem
    3. Set the location of the secret key by setting the DLI_JWT_SECRET_KEY value in $EGO_CONFDIR/../../dli/conf/dlpd/dlpd.conf to the location of the public PEM key:
      "DLI_JWT_SECRET_KEY": "/dlishared/public_key.pem",
    4. Update the cluster with the new public PEM key:
      cat new_pub_key.pem > /dlishared/public_key.pem
    5. Update permissions of the public PEM key.
      chown $CLUSTERADMIN $DLI_SHARED_FS/public_key.pem
    6. Source the environment and log on.
      source $EGO_TOP/profile.platform
      egosh user logon -u Admin -x Admin
    7. Stop the dlpd service.
      egosh service stop dlpd
    8. Start the dlpd service.
      egosh service start dlpd
  3. Set the following metric variables in $EGO_CONFDIR/../../dli/conf/dlpd/dlpd.conf.
    "DLI_METRICS_STREAMING_ENABLED": "Y",
    "METRICS_STREAMING": "Y",
    "EMETRICS_STREAMING": "on",
    "EMETRICS_STREAMING_GPU": "on",
    "EMETRICS_STREAMING_STDOUT": "on",
  4. Connect Watson Machine Learning with Watson Machine Learning Accelerator. Run the updateWMLClusterdetails.sh command-line utility, which allows IBM Watson Studio Local to locate and use a WML Accelerator instance.
    1. Use SSH to remotely access the Watson Machine Learning host from the master host.
    2. Navigate to the cpd-linux-workspace/modules/wml/x86_64/3.0.1 directory.
      cd cpd-linux-workspace/modules/wml/x86_64/3.0.1
    3. Extract the files from the wml-base-3.0.1-39.tgz.
      tar -zxvf wml-base-3.0.1-39.tgz
    4. Move the updateWMLClusterdetails.sh file to /ibm/InstallPackage/components/modules/wml.
      mv updateWMLClusterdetails.sh /ibm/InstallPackage/components/modules/wml
    5. Switch to the IBM Watson Machine Learning directory.
      cd /ibm/InstallPackage/components/modules/wml
    6. Run the following command:
      ./updateWMLClusterdetails.sh <wmla_host> <wmla_port> <wmla_ig> <wmla_edt_ig> <wml_external_host>
      where:
      • wmla_host is the IP address of the WML Accelerator cluster master host, which must be accessible from the WML cluster.
      • wmla_port is the port exposed by WML Accelerator for the deep learning REST API. By default, this is set to 9243.
      • wmla_ig is the name of the instance group created in WML Accelerator for single and distributed jobs. For example: wml-ig
      • wmla_edt_ig is the name of the instance group created in WML Accelerator for elastic distributed training jobs. For example: wml-ig-edt
      • wml_external_host is the external host name of a modified IBM Cloud Pak for Data console URL that can be accessed from WML Accelerator.

      For example:
      ./updateWMLClusterdetails.sh https://wmla-master.example.com 9243 wml-ig wml-ig-edt <wml_external_host>
      Learn more about running the command line utility: Setting up WML Accelerator with IBM Watson Studio Local
      Note:
      Specifying an incorrect hostname while running ./updateWMLClusterdetails.sh will lead to a known issue with metrics. When running the ./updateWMLClusterdetails.sh script, ensure that your fifth parameter is correct. To ensure that it is correct, run:
      oc get routes | grep ibm-nginx | awk '{print $2}'
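
Before running updateWMLClusterdetails.sh, it can help to confirm that steps 2 to 4 were completed as intended. The following pre-flight checks are a minimal sketch under stated assumptions: the function names are hypothetical and do not ship with WML Accelerator, the public key is assumed to carry the standard "BEGIN PUBLIC KEY" PEM header, and dlpd.conf is assumed to hold its settings as "NAME": "value" pairs, as shown in step 3.

```shell
#!/usr/bin/env bash
# Sketch of pre-flight checks for steps 2 to 4. All function names are
# illustrative assumptions; none of them are part of WML Accelerator.

# Step 2: succeed if the file is readable and starts with a PEM public-key header.
looks_like_public_pem() {
    [ -r "$1" ] && head -n 1 "$1" | grep -qxF -- '-----BEGIN PUBLIC KEY-----'
}

# Step 3: print each metric variable that is not set to the expected value
# in the given dlpd.conf. Returns 0 only if every variable is set.
missing_metric_settings() {
    local conf=$1 missing=0 key value
    while read -r key value; do
        if ! grep -Eq "\"$key\"[[:space:]]*:[[:space:]]*\"$value\"" "$conf"; then
            echo "$key"
            missing=1
        fi
    done <<'EOF'
DLI_METRICS_STREAMING_ENABLED Y
METRICS_STREAMING Y
EMETRICS_STREAMING on
EMETRICS_STREAMING_GPU on
EMETRICS_STREAMING_STDOUT on
EOF
    return "$missing"
}

# Step 4: check that exactly five non-empty arguments are given and that
# the second one (the port) is numeric.
valid_cluster_args() {
    local arg
    [ "$#" -eq 5 ] || { echo "expected 5 arguments, got $#" >&2; return 1; }
    for arg in "$@"; do
        [ -n "$arg" ] || { echo "empty argument" >&2; return 1; }
    done
    case $2 in
        *[!0-9]*) echo "port must be numeric: $2" >&2; return 1 ;;
    esac
    return 0
}

# Possible usage from a wrapper script that receives the five parameters:
#   valid_cluster_args "$@" && ./updateWMLClusterdetails.sh "$@"
```

These checks only catch formatting mistakes, such as a missing fifth parameter. They cannot confirm that the host name itself is correct, so the oc get routes command above remains the way to verify that value.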

Results

What to do next

Get started

After you have successfully connected WML Accelerator with Watson Machine Learning and Watson Studio, here are a few links to get you started: