Upgrading in a shared environment

Follow these steps for a rolling upgrade when your existing cluster is in a shared environment.

  • Back up any custom configuration files before you upgrade so that you can restore your customizations after the upgrade is complete. A backup sketch follows this list.
  • Ensure that you have read the requirements.
  • You require the IBM Spectrum Conductor Deep Learning Impact 1.2.3 installation package and the entitlement file.
  • Perform the rolling upgrade by using the same user account that you used to install the previous version of IBM Spectrum Conductor Deep Learning Impact.
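  A minimal backup sketch (the backup directory and configuration path shown are placeholders only; copy whichever files you customized in your environment):
    # Hypothetical example: preserve customized configuration files before the upgrade.
    mkdir -p /tmp/wmla-config-backup
    cp -a /opt/ibm/spectrumcomputing/kernel/conf /tmp/wmla-config-backup/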
  1. Log on to your master host by using the same user account that you used to install your previous version.
  2. If the WML Accelerator bin file was not extracted on this host, install the WML Accelerator license. Otherwise, skip this step.
    1. Copy ibm-wmla-license-1.2.1_*.tar.gz to the node.
    2. Set EGO_TOP to be the WML Accelerator destination directory, typically /opt/ibm/spectrumcomputing. Example:
      export EGO_TOP=/opt/ibm/spectrumcomputing
    3. Extract the package. Example:
      tar xvzf ibm-wmla-license-1.2.1_*.tar.gz -C $EGO_TOP --no-same-owner
    4. Accept the license by running this command:
      IBM_WMLA_LICENSE_ACCEPT=yes $EGO_TOP/ibm-wmla/1.2.1/bin/accept-ibm-wmla-license.sh
  3. (Optional) If you plan to perform a rollback, save the license for the previous version of WML Accelerator to a known location. For example:
    cp /opt/anaconda3/pkgs/ibm-wmla-license-version_number.tar.bz2 /tmpdir
  4. Remove the license for the previous version of WML Accelerator.
    . /opt/anaconda3/etc/profile.d/conda.sh
    conda remove ibm-wmla-license
  5. Copy the entitlement file dli_entitlement.dat to a location that is accessible from the master host.
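    For example, assuming the entitlement file is currently on your local workstation and that /opt/temp exists on the master host (master_host is a placeholder for your host name):
    scp dli_entitlement.dat root@master_host:/opt/temp/dli_entitlement.dat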
  6. Export the IBM Spectrum Conductor Deep Learning Impact 1.2.3 entitlement file.
    export ENTITLEMENTFILE=/opt/temp/dli_entitlement.dat
  7. Set the environment variables for version 1.2.3 to the same values that you used for your previous version.
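    A minimal sketch (the variables and values shown are examples only; reuse the exact variables and values from your previous installation):
    export CLUSTERADMIN=egoadmin           # cluster administrator account used for the previous version (example value)
    export DLI_SHARED_FS=/dli_shared_fs    # same shared file system as the previous version (example value)
    export DLI_RESULT_FS=/dli_result_fs    # same result file system as the previous version (example value)
    export DLI_CONDA_HOME=/opt/anaconda3   # same Anaconda installation path as the previous version (example value)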
  8. Install version 1.2.3 on your master host, making sure to install it to the same path as your previous version.
    For Linux (64-bit):
    ./dli-1.2.3.0_x86_64.bin --prefix previous_install_location --dbpath previous_dbpath_location
    For Linux on POWER LE (64-bit):
    ./dli-1.2.3.0_ppc64le.bin --prefix previous_install_location --dbpath previous_dbpath_location
    where:
    • --prefix previous_install_location specifies the absolute path to the previous installation directory.
    • --dbpath previous_dbpath_location specifies the absolute path to the previous RPM database.
  9. Upgrade the conda package:
    #Upgrade conda package to 4.6.11
    conda install conda=4.6.11
  10. Upgrade WML Accelerator and IBM Spectrum Conductor Deep Learning Impact dependencies by running the following commands:
    #Activate dlinsights and install dependencies.  
    conda activate dlinsights
    conda install --yes numpy=1.12.1
    conda install --yes pyopenssl==18.0.0 
    conda install --yes Flask==0.12.2 Flask-Cors==3.0.3 scipy==1.0.1 pathlib==1.0.1 SQLAlchemy==1.1.13 requests=2.21 alembic=1.0.5
    pip install --no-cache-dir warlock==1.3.0 elasticsearch==5.2.0 Flask-Script==2.0.5 Flask-HTTPAuth==3.2.2 mongoengine==0.11.0 python-heatclient==1.2.0 python-keystoneclient==3.17.0
    conda deactivate
    
    #Set up the WML CE Anaconda channel:
    conda config --system --add channels https://public.dhe.ibm.com/ibmdl/export/pub/software/server/ibm-ai/conda
    
    #WML CE DDL requires openssh-clients. Install it by running this command:
    yum install openssh-clients
    
    #Activate dlipy3 and install dependencies.  
    conda activate dlipy3
    conda install --yes powerai==1.6.1
    conda install --yes keras==2.2.4
    conda install --yes configparser==3.7.4
    conda install --yes ipython==5.3.0 python-lmdb==0.94 nose==1.3.7 requests==2.13.0 pathlib==1.0.1
    conda install --yes redis-py==2.10.5 chardet==3.0.4 flask==1.0.2
    conda install --yes python-gflags==3.1.2 pandas==0.24.2 pyzmq==17.1.2
    
    pip install --no-cache-dir easydict==1.9
    pip install --no-cache-dir hanziconv==0.3.2 gensim==3.6.0
    
    pip install --no-cache-dir asyncio==3.4.3 ipaddress==1.0.22 defusedxml==0.5.0
    conda deactivate
     
    #Activate dlipy2 and install dependencies.  
    conda activate dlipy2
    conda install --yes powerai==1.6.1
    conda install --yes keras==2.2.4
    conda install --yes configparser==3.7.4
    conda install --yes ipython==5.3.0 python-lmdb==0.94 nose==1.3.7 requests==2.13.0 pathlib==1.0.1
    conda install --yes redis-py==2.10.5 chardet==3.0.4 flask==1.0.2
    conda install --yes python-gflags==3.1.2 pandas==0.24.2 pyzmq==17.1.2
    conda install --yes trollius==2.2 cython==0.29.4
    
    pip install --no-cache-dir easydict==1.9
    pip install --no-cache-dir hanziconv==0.3.2 gensim==3.6.0
    pip install --no-cache-dir weave==0.16.0 ipaddress==1.0.22 defusedxml==0.5.0
    conda deactivate
     
    #Install elastic distributed training dependencies.
    yum install openblas-devel glog-devel gflags-devel hdf5-devel leveldb-devel libsodium-devel lmdb-devel
    
    #Install additional dependency packages.
    yum install sudo openssh-clients gcc-c++ gcc-gfortran openssl-devel bzip2 gettext which net-tools iproute zip perl
  11. Log on to each management and compute host and complete the following steps:
    1. Source the environment:
      . $EGO_TOP/profile.platform
    2. Restart EGO on the host:
      egosh ego shutdown
      egosh ego start
  12. Log on to your master host and complete the following:
    1. Source the environment:
      . $EGO_TOP/upgrade/conf/profile.upgrade
    2. Run the cluster upgrade command:
      egoupgrade cluster [-f][-u username][-x password]
      For syntax and usage details, see egoupgrade.
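      For example, assuming the default cluster administrator account (replace the credentials with your own):
      egoupgrade cluster -u Admin -x Admin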
  13. Migrate your datasets and model assets. When you run the upgrade migration script, your data is backed up and copied from DLI_SHARED_FS to DLI_RESULT_FS, and permissions are updated.
    sh $EGO_TOP/dli/1.2.3/dlpd/bin/dli-upgrade-1.2.3.sh
    Note: As datasets and models are copied from DLI_SHARED_FS to DLI_RESULT_FS, a copy remains in DLI_SHARED_FS for rollback purposes. If you do not intend to roll back to the previous version of IBM Spectrum Conductor Deep Learning Impact, you can delete $DLI_SHARED_FS/models and $DLI_SHARED_FS/datasets.
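    For example, once you are sure that you will not roll back:
    rm -rf $DLI_SHARED_FS/models $DLI_SHARED_FS/datasets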
Your hosts are upgraded to version 1.2.3. Your previous version configuration (including users, custom user roles, consumers, resource groups, resource plans, custom reports, and enabled features) is also migrated to version 1.2.3.

Clear your browser cache. Test your cluster and update your models to use the deep learning frameworks that are supported with version 1.2.3.
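As a quick sanity check, you can source the environment on the master host and confirm that EGO and its services are running (a sketch only; the services listed depend on your configuration):
  . $EGO_TOP/profile.platform
  egosh ego info
  egosh service list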