After you install WML Accelerator 1.2.2, you can connect it to IBM Watson® Machine Learning or IBM Watson Studio Local on IBM® Cloud Pak for Data 3.0.1.
About this task
Note: Connecting WML Accelerator 1.2.2 with WML and Watson Studio on IBM Cloud Pak for Data
3.0.1 is available as a technical preview.
Technical preview: The code and application
programming interfaces herein are technology preview information that may not be made generally
available by IBM as or in a product. You are permitted to use the information only for internal use
for evaluation purposes and not for use in a production environment. IBM provides the information
without obligation of support and "as is" without warranty of any
kind.
If you plan to use WML Accelerator with IBM Watson Studio Local or WML, you must complete the following steps after installing WML Accelerator.
Before you begin
- Ensure that you have installed WML Accelerator: Installing WML Accelerator.
Note:
- If you are installing WML Accelerator for the first time, or upgrading from WML Accelerator
version 1.2.1 to 1.2.2, you must configure or rebuild your Anaconda environment to use WML CE 1.6.2:
Configure a system for IBM Spectrum Conductor Deep Learning Impact
- As part of the WML Accelerator installation, you created two instance groups: one for distributed
training (using the wmla-ig-template-2.3.3 template) and one for elastic distributed training (using
the wmla-ig-edt-template-2.3.3 template). WML uses these instance groups when it pushes
training jobs to WML Accelerator.
- Ensure that you have installed WML: Installing IBM Watson Machine Learning.
- Review the list of known issues and limitations: Known issues in the WML integration
- Consider the following regarding user authentication and execution:
- If LDAP is used with IBM Watson Studio Local, a common LDAP server can be used by both WML Accelerator and IBM Watson Studio Local for storing user credentials. To do so, you must configure LDAP for WML Accelerator. See: Configuring user authentication for PAM and default clients
- If LDAP is not used, each IBM Watson Studio Local user who wants to run training jobs must be added to WML Accelerator. For each of these users, you must create an OS user with the same username and ensure that the user has the same UID, group ID (GID), and password on all hosts in the cluster.
- In either case, users must be assigned to the instance groups and they must have a Data
Scientist or Consumer user role. Roles can be assigned by the Consumer administrator or the cluster
administrator. See: Adding a consumer user or Assigning roles to users or user groups
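When LDAP is not used, the requirement that each user has the same UID and GID on all hosts can be checked with a short script. The following is a minimal sketch, not part of the product: check_id_consistency is a hypothetical helper, and it assumes that you have already collected one line per host in the form "host uid gid" (for example, from the output of id -u and id -g on each host).

```shell
#!/bin/sh
# Sketch only: check_id_consistency is a hypothetical helper, not a product command.
# Input on stdin: one line per host in the form "host uid gid".
# Prints "consistent" when every host reports the same UID and GID,
# "mismatch" when any host differs, and "no input" when stdin is empty.
check_id_consistency() {
    awk '
        NR == 1 { uid = $2; gid = $3 }
        $2 != uid || $3 != gid { bad = 1 }
        END {
            if (NR == 0) print "no input"
            else if (bad) print "mismatch"
            else print "consistent"
        }
    '
}
```

For example, printf 'host1 1001 2001\nhost2 1001 2001\n' | check_id_consistency reports whether the two hosts agree. The check covers UID and GID only; passwords must still be verified separately.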
Procedure
- Update your instance groups to use the correct version of WML CE, which is WML CE 1.6.2.
- Select the Workload tab and click Instance
Groups.
- Find your instance groups used for training and elastic distributed training and click
Modify.
- Configure an additional Spark parameter:
- In the parameter drop down, select Additional Environment Variables and
click Add an Environment Variable.
- Set the Name to DLI_DEFAULT_CONDA_ENV_NAME and the Value to
dlipy36-wmlce162.
- Click Save.
- Update the public key for IBM Watson Studio Local.
- Get the standalone IBM Watson Studio Local certificate:
wget https://ws_host:443/auth/jwtcert -O jwtcert
where ws_host is the IBM Watson Studio Local host IP address. To get the value of ws_host, issue the following command:
oc get routes | grep ibm-nginx | awk '{print $2}'
- Get the public PEM key from the certificate:
openssl x509 -pubkey -in jwtcert -noout >new_pub_key.pem
- Set the location of the secret key by setting the DLI_JWT_SECRET_KEY value in $EGO_CONFDIR/../../dli/conf/dlpd/dlpd.conf to the location of the public PEM key.
"DLI_JWT_SECRET_KEY": "/dlishared/public_key.pem",
- Update the cluster with the new public PEM key:
cat new_pub_key.pem > /dlishared/public_key.pem
- Update permissions of the public PEM key.
chown $CLUSTERADMIN $DLI_SHARED_FS/public_key.pem
- Source the environment.
source $EGO_TOP/profile.platform
egosh user logon -u Admin -x Admin
- Stop the dlpd service.
- Start the dlpd service.
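Before restarting dlpd, it can help to confirm that the file referenced by DLI_JWT_SECRET_KEY really contains a PEM public key. The following is a minimal sketch: is_public_pem is a hypothetical helper, and the default path is the one used in the steps above. The restart commands in the comments assume the standard egosh service stop and egosh service start subcommands, run after you source the environment and log on.

```shell
#!/bin/sh
# Sketch only: is_public_pem is a hypothetical helper, not a product command.
# Succeeds when the file is readable and its first line is the PEM
# public-key header that "openssl x509 -pubkey" writes.
is_public_pem() {
    pem_file="${1:-/dlishared/public_key.pem}"
    [ -r "$pem_file" ] && [ "$(head -n 1 "$pem_file")" = "-----BEGIN PUBLIC KEY-----" ]
}

# Assumed restart sequence for the last two steps:
#   egosh service stop dlpd
#   egosh service start dlpd
```

A possible use is is_public_pem /dlishared/public_key.pem || echo "public key file looks wrong". The helper inspects only the header line, so it does not prove that the key matches the IBM Watson Studio Local certificate.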
- Set the following metric variables in
$EGO_CONFDIR/../../dli/conf/dlpd/dlpd.conf.
"DLI_METRICS_STREAMING_ENABLED": "Y",
"METRICS_STREAMING": "Y",
"EMETRICS_STREAMING": "on",
"EMETRICS_STREAMING_GPU": "on",
"EMETRICS_STREAMING_STDOUT": "on",
- Connect Watson Machine Learning with Watson Machine Learning Accelerator. Run the updateWMLClusterdetails.sh command line utility, which allows IBM Watson Studio Local to locate and use a WML Accelerator instance.
- Use SSH to remotely access the Watson Machine Learning host from the master
host.
- Navigate to the cpd-linux-workspace/modules/wml/x86_64/3.0.1
directory.
cd cpd-linux-workspace/modules/wml/x86_64/3.0.1
- Extract the files from the wml-base-3.0.1-39.tgz.
tar -zxvf wml-base-3.0.1-39.tgz
- Move the updateWMLClusterdetails.sh file to
/ibm/InstallPackage/components/modules/wml.
mv updateWMLClusterdetails.sh /ibm/InstallPackage/components/modules/wml
- Switch to the IBM Watson Machine Learning directory.
cd /ibm/InstallPackage/components/modules/wml
- Run the following
command:
./updateWMLClusterdetails.sh <wmla_host> <wmla_port> <wmla_ig> <wmla_edt_ig> <wml_external_host>
where:
- wmla_host is the IP address of the WML Accelerator cluster master host that can be accessed from the WML cluster.
- wmla_port is the port exposed by WML Accelerator for the deep learning REST API. By default, this is set to 9243.
- wmla_ig is the name of the instance group created in WML Accelerator for single and distributed jobs. For example: wml-ig
- wmla_edt_ig is the name of the instance group created in WML Accelerator for elastic distributed training jobs. For example: wml-ig-edt
- wml_external_host is the external host name of a modified IBM Cloud Pak for Data console URL that can be accessed from WML Accelerator.
For example:
./updateWMLClusterdetails.sh https://wmla-master.example.com 9243 wml-ig wml-ig-edt
Learn more about running the command line utility: Setting up WML Accelerator with IBM Watson Studio Local
Note: Specifying an incorrect hostname while running ./updateWMLClusterdetails.sh leads to a known issue with metrics. When you run the ./updateWMLClusterdetails.sh script, ensure that your fifth parameter is correct. To find the correct value, run:
oc get routes | grep ibm-nginx | awk '{print $2}'
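Because a wrong or missing parameter leads to the metrics issue described in the note, a small pre-check before the call can be useful. The following is a minimal sketch under stated assumptions: validate_wmla_args is a hypothetical helper, not part of the product, and it assumes that all five parameters in the syntax line are required and that the port is a plain number.

```shell
#!/bin/sh
# Sketch only: validate_wmla_args is a hypothetical helper, not a product command.
# Checks the argument count and that the second argument (the port) is numeric.
# Prints "ok" and returns 0 on success; prints a reason and returns 1 otherwise.
validate_wmla_args() {
    if [ "$#" -ne 5 ]; then
        echo "expected 5 arguments, got $#"
        return 1
    fi
    case "$2" in
        ''|*[!0-9]*)
            echo "port must be numeric: $2"
            return 1
            ;;
    esac
    echo "ok"
}
```

One way to use it is validate_wmla_args "$@" && ./updateWMLClusterdetails.sh "$@" from a wrapper script. The helper does not verify that the host names are reachable or that the fifth parameter matches the route returned by the oc command.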
What to do next
Get started
After you have successfully connected WML Accelerator with Watson Machine Learning and Watson Studio, here
are a few links to get you started: