Installing IBM Spectrum Conductor to a shared environment

When you install to a shared environment, you install IBM® Spectrum Conductor once on a shared file system (such as IBM Spectrum Scale™) that every host in the cluster shares and to which every host has access.

  • Install the shared file system in your environment. For shared storage, you can use IBM Spectrum Scale.

    If you are going to set up a shared directory for failover that uses a different location or file system, then you have to manually create an environment file so that you can source the compute host environment separately.

  • IBM Spectrum Conductor that is installed to a shared environment does not support a mixed cluster that uses both Linux and Linux on POWER.
  • Install cURL 7.28 or higher for Elastic Stack on all management hosts, and all hosts that will be used to run notebooks.
  • (Optional) The ServiceDirector and WebServiceGateway services are, by default, set to start manually. If you plan to use these services on a management host, you must manually start them after installation. To start the ServiceDirector service, the management host must use glibc version 2.14 or higher.
Follow these steps to install IBM Spectrum Conductor to a shared file system. For this installation, you install IBM Spectrum Conductor once on a host and configure the setup for the different hosts in your cluster.

For production clusters, log in with root permissions (or sudo to root). For evaluation clusters, you can install as any user; that user becomes the cluster administrator. If you install as a non-root user, every execution user that is specified must be the cluster administrator.

  1. Log in to the host (root or sudo to root permission).
  2. Define the cluster properties by setting the following environment variables. If you do not set the optional environment variables, the default values are used.
    Option Description
    BASEPORT Optional. Set for the cluster. The cluster uses seven consecutive ports from the base port. The default port number is 7869. For example:
    export BASEPORT=14899
    Note: Before installation, make sure that the seven consecutive ports are not in use.
    CLUSTERADMIN Mandatory if you are installing as root. Set to any valid operating user account, which then owns all installation files. For example:
    export CLUSTERADMIN=egoadmin
    Note: You must create the egoadmin user if it does not already exist. When you set up users on all your hosts (both management and compute hosts), the execution user must use the same user ID (UID) and group ID (GID) on all the hosts.
    CLUSTERNAME Optional. Set to the name of the cluster. The default is cluster1. For example:
    export CLUSTERNAME=cluster123
    Important: You cannot change the cluster name after installation.
    IBM_SPECTRUM_CONDUCTOR_LICENSE_ACCEPT Mandatory if using quiet installation mode to accept the license agreement. For example:
    export IBM_SPECTRUM_CONDUCTOR_LICENSE_ACCEPT=Y
    SHARED_FS_INSTALL Mandatory when you install to a shared file system:
    export SHARED_FS_INSTALL=Y
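    For example, the cluster properties in the table above could be set in the installation shell as follows (the values here are illustrative only):

    ```shell
    # Example cluster properties (illustrative values only).
    export BASEPORT=14899                           # cluster base port; 7 consecutive ports must be free
    export CLUSTERADMIN=egoadmin                    # required when installing as root
    export CLUSTERNAME=cluster123                   # cannot be changed after installation
    export IBM_SPECTRUM_CONDUCTOR_LICENSE_ACCEPT=Y  # required for quiet installation mode
    export SHARED_FS_INSTALL=Y                      # required for a shared file system installation
    ```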
  3. Optional: If you want to modify the Elastic Stack configuration, set the following environment variables. If you do not set these environment variables, the default values are used. You cannot change these values after installation.
    Option Description
    ELASTIC_HARVEST_LOCATION Specifies the directory that holds the new logging structure: symbolic links to the real Spark log directory ($SPARK_HOME/logs), which Filebeat uses to harvest information and support queries. All Spark instance group logs are organized under one directory with human-readable names and structure. If IBM Spectrum Conductor is installed to a shared environment, the ELASTIC_HARVEST_LOCATION directory must be shared, and its value must differ from ELK_HARVEST_LOCATION. The default directory is ${EGO_TOP}/elastic_logs.
    ELK_ESHTTP_PORT Specifies the port that is used by the indexer service for communication with Elasticsearch client node, and on which the Elasticsearch RESTful APIs are accessible. The default port number is 9200.
    ELK_ESHTTP_MASTER_PORT Specifies the port that is used for communication to the Elasticsearch primary node. The default port number is 9201.
    ELK_ESHTTP_DATA_PORT Specifies the port that is used for communication to the Elasticsearch data node. The default port number is 9202.
    ELK_ESSYSCOMM_PORT Specifies the port that is used for communication to the Elasticsearch client node within the Elasticsearch cluster. The default port number is 9300.
    ELK_ESSYSCOMM_MASTER_PORT Specifies the port that is used for communication to the Elasticsearch primary node within the Elasticsearch cluster. The default port number is 9301.
    ELK_ESSYSCOMM_DATA_PORT Specifies the port that is used for communication to the Elasticsearch data node within the Elasticsearch cluster. The default port number is 9302.
    ELK_LOGSHIPPER_PORT Specifies the port that is used by the indexer service. The default port number is 5043.
    ELK_HARVEST_LOCATION Specifies the directory under which the old logging structure logs are stored that Filebeat uses to harvest information and support queries. The default directory is /var/tmp/elk_logs/.
    ELK_DATA_LOCATION Specifies the Elastic Stack data directory. The default value for this directory can be overridden during installation, or is configurable at $EGO_CONFDIR/../../integration/elk/conf/elk.conf.
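    For example, to shift the Elastic Stack ports away from their defaults, you could export overrides such as these before installation (the port values are illustrative, not recommendations):

    ```shell
    # Illustrative Elastic Stack port overrides (defaults are listed in the table above).
    export ELK_ESHTTP_PORT=9210           # default 9200
    export ELK_ESHTTP_MASTER_PORT=9211    # default 9201
    export ELK_ESHTTP_DATA_PORT=9212      # default 9202
    export ELK_ESSYSCOMM_PORT=9310        # default 9300
    export ELK_ESSYSCOMM_MASTER_PORT=9311 # default 9301
    export ELK_ESSYSCOMM_DATA_PORT=9312   # default 9302
    export ELK_LOGSHIPPER_PORT=5143       # default 5043
    ```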
  4. Optional: If you do not want to enable SSL communication for your web servers and Spark instance groups (enabled by default), set the DISABLESSL environment variable:
    export DISABLESSL=Y
    Check for port conflicts to ensure that the web server ports are free. The web servers are accessible on the following default ports:
    Web server With SSL Without SSL
    Web server for the cluster management console 8443 8080
    REST web server 8543 8180
    ascd web server 8643 8280
    Important: You must use the same SSL setting for the cluster management console and the RESTful web servers. If you disable SSL for one, you must disable SSL for the other as well. This setting also takes effect for cloud bursting with host factory. Ensure that SSL for all these functions is configured consistently in the cluster; without a uniform configuration, errors occur. Note, however, that when SSL is uniformly enabled, you can use different certificates and keys as required.
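    The default web server ports in the table above can be checked for conflicts with a short script. This is a minimal sketch that assumes a bash shell (it uses bash's /dev/tcp pseudo-device; a successful connect means something is already listening):

    ```shell
    # Check whether the default web server ports (SSL and non-SSL) are free
    # before installation. Port numbers are the documented defaults.
    for port in 8443 8080 8543 8180 8643 8280; do
      if (exec 3<>"/dev/tcp/127.0.0.1/$port") 2>/dev/null; then
        echo "port $port is already in use"
      else
        echo "port $port is free"
      fi
    done
    ```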
  5. Optional: If you want to use the non-production Derby database for reporting, set the DERBY_DB_HOST environment variable:
    export DERBY_DB_HOST=primary_host
    where primary_host is the primary host or another management host that serves as the database host.

    You cannot use the Derby database for production clusters. To produce regular reports for a production cluster, you must configure an external production database after installation. See Setting up an external database for production.

    Without the Derby database or an external database, you cannot generate reports or view the Rack View on the cluster management console. If you do not require these functions, you can manually disable individual data loaders after installation. See Data loaders.

  6. Run the IBM Spectrum Conductor installer package.
    • To install with default settings, enter the following commands:
      Note: IBM Spectrum Conductor is installed in its default directory: /opt/ibm/spectrumcomputing. This directory must be mounted on a shared file system.
      Entitled version:
      export IBM_SPECTRUM_CONDUCTOR_LICENSE_ACCEPT="Y"
      conductor2.3.0.0_ppc64le.bin --quiet
      export IBM_SPECTRUM_CONDUCTOR_LICENSE_ACCEPT="Y"
      conductor2.3.0.0_x86_64.bin --quiet
      Evaluation version:
      export IBM_SPECTRUM_CONDUCTOR_LICENSE_ACCEPT="Y"
      conductoreval2.3.0.0_ppc64le.bin --quiet
      export IBM_SPECTRUM_CONDUCTOR_LICENSE_ACCEPT="Y"
      conductoreval2.3.0.0_x86_64.bin --quiet
      Note: Alternative method: If you must use .rpm files instead of .bin, extract the .rpm files, install the ego*.rpm files, then install the ascd*.rpm and conductorspark*.rpm files.
      For example, first extract the .rpm files from the conductor2.3.0.0_x86_64.bin or conductor2.3.0.0_ppc64le.bin package by running one of these commands:
      conductor2.3.0.0_x86_64.bin --extract extract_directory
      conductor2.3.0.0_ppc64le.bin --extract extract_directory

      where extract_directory specifies the directory to extract .rpm files.

      Next, install each .rpm file in order by running:
      rpm -ivh rpm_file_name
      For example:
      rpm -ivh egocore-version.x86_64.rpm
      rpm -ivh egocore-3.7.0.0.ppc64le.rpm
    • To install to a custom location, enter the following commands:
      Entitled version:
      • For Linux 64-bit:
        export IBM_SPECTRUM_CONDUCTOR_LICENSE_ACCEPT="Y"
        conductor2.3.0.0_x86_64.bin --prefix install_location --dbpath dbpath_location --quiet
      • For Linux on POWER® LE:
        export IBM_SPECTRUM_CONDUCTOR_LICENSE_ACCEPT="Y"
        conductor2.3.0.0_ppc64le.bin --prefix install_location --dbpath dbpath_location --quiet
      Evaluation version:
      • For Linux 64-bit:
        export IBM_SPECTRUM_CONDUCTOR_LICENSE_ACCEPT="Y"
        conductoreval2.3.0.0_x86_64.bin --prefix install_location --dbpath dbpath_location --quiet
      • For Linux on POWER® LE:
        export IBM_SPECTRUM_CONDUCTOR_LICENSE_ACCEPT="Y"
        conductoreval2.3.0.0_ppc64le.bin --prefix install_location --dbpath dbpath_location --quiet
      where:
      • --prefix install_location specifies the absolute path to the installation directory. Specifying the installation path with the --prefix parameter is mandatory if you are installing on a shared file system, unless the default directory /opt/ibm/spectrumcomputing is already mounted from a shared file system. If you install without the --prefix option, IBM Spectrum Conductor is installed in its default directory: /opt/ibm/spectrumcomputing. Ensure that the path is set to a clean directory.
      • --dbpath dbpath_location sets the RPM database to a directory different from the default /var/lib/rpm. The --dbpath parameter is optional.
      For example:
      ./conductor2.3.0.0_x86_64.bin --prefix  /gpfs/test/platform4 --dbpath /gpfs/test/platform4/db
      ./conductor2.3.0.0_ppc64le.bin --prefix  /gpfs/test/platform4 --dbpath /gpfs/test/platform4/db
      Note: Alternative method: If you must use .rpm files instead of .bin, extract the .rpm files, install the ego*.rpm files, then install the ascd*.rpm and conductorspark*.rpm files.
      For example, first extract the .rpm files from the conductor2.3.0.0_x86_64.bin or conductor2.3.0.0_ppc64le.bin package by running the appropriate command:
      conductor2.3.0.0_x86_64.bin --extract extract_directory
      conductor2.3.0.0_ppc64le.bin --extract extract_directory

      where extract_directory specifies the directory to extract .rpm files.

      Next, install each .rpm file in order by running:
      rpm -ivh --prefix install_location --dbpath dbpath_location rpm_file_name
      For example:
      rpm -ivh --prefix /opt/mydir --dbpath /opt/mydir/mydb egocore-3.7.0.0.x86_64.rpm
    • To install without user interaction, enter one of the following commands:
      • For Linux 64-bit:
        conductor2.3.0.0_x86_64.bin --quiet
      • For Linux on POWER LE:
        conductor2.3.0.0_ppc64le.bin --quiet

      where --quiet suppresses prompts during installation.

  7. After installation is complete, source the environment:
    • (csh) source $EGO_TOP/cshrc.platform
    • (bash) . $EGO_TOP/profile.platform

    where $EGO_TOP is the path to your installation directory (the default path is /opt/ibm/spectrumcomputing).

    IBM Spectrum Conductor automatically creates this profile.platform (or cshrc.platform file when you use CSH) on management hosts during installation. The profile.platform (cshrc.platform) file sources other files, all of which together set the environment for management hosts in the cluster.

  8. Create the files that set up the compute host environment in your cluster.
    Default: files are automatically created
    IBM Spectrum Conductor automatically creates the profile.platform (or cshrc.platform file when you use CSH) on management hosts during installation. The profile.platform (cshrc.platform) file sources other files, all of which together set the environment for management hosts in the cluster.
    Use the profile.platform (cshrc.platform) file to set your environment, as follows:
    • (csh) source $EGO_TOP/cshrc.platform
    • (bash) . $EGO_TOP/profile.platform
    Manually create the files
    If you are going to set up a shared directory for failover that uses a different location or file system, then you have to manually create the profile.platform.comp (or cshrc.platform.comp file when you use CSH) on compute hosts. The profile.platform.comp (cshrc.platform.comp) file, along with other files, sets the environment for compute hosts in the cluster. To create this file and set up your environment, complete these steps:
    1. Copy the profile.platform file to a new profile.platform.comp file, and copy the profile.ego file to a new profile.ego.comp file. For example:
      cp profile.platform profile.platform.comp
      cp profile.ego profile.ego.comp
    2. In the profile.platform.comp file, append the .comp extension to the name of each file that it sources (for example, change profile.ego to profile.ego.comp).
    3. Source your compute host environment:
      • (csh) source $EGO_TOP/cshrc.platform.comp
      • (bash) . $EGO_TOP/profile.platform.comp
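    The manual steps above can be sketched as follows. This example runs against dummy files in a temporary directory; the sed pattern is an assumption about how the sourced file name appears inside profile.platform:

    ```shell
    # Sketch of the manual .comp file creation, using dummy files in a
    # temporary directory instead of the real $EGO_TOP.
    tmp=$(mktemp -d)
    printf '. %s/profile.ego\n' "$tmp" > "$tmp/profile.platform"        # dummy management-host profile
    printf 'export EGO_TOP=/opt/ibm/spectrumcomputing\n' > "$tmp/profile.ego"

    cp "$tmp/profile.platform" "$tmp/profile.platform.comp"
    cp "$tmp/profile.ego" "$tmp/profile.ego.comp"
    # Append the .comp extension to each file sourced by profile.platform.comp
    sed -i 's|profile\.ego|profile.ego.comp|' "$tmp/profile.platform.comp"

    grep 'profile.ego.comp' "$tmp/profile.platform.comp"
    ```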
  9. Link the Elastic Stack harvesting directory to a unique directory on your shared file system. This harvesting directory is the value of the ELK_HARVEST_LOCATION environment variable; by default, /var/tmp/elk_logs/.
    1. As the cluster administrator, create a unique directory in your shared file system for each host included in your cluster. For example, to create directories for hostA and hostB in your IBM Spectrum Scale file system, enter:
      mkdir /gpfs/conductor/var/tmp/elk_logs/hostA 
      mkdir /gpfs/conductor/var/tmp/elk_logs/hostB
    2. Determine your cluster administrator group:
      id -gn CLUSTERADMIN
      where CLUSTERADMIN is your cluster administrator account (for example, egoadmin).
    3. Ensure correct permissions for each directory that you created in step 9.a:
      chown -Rh $CLUSTERADMIN:$ADMINGROUP $DIRECTORY
      chmod g+s $DIRECTORY
      chmod 777 $DIRECTORY
      where:
      • $CLUSTERADMIN is your cluster administrator account (for example, egoadmin).
      • $ADMINGROUP is the operating system group for your cluster administrator (determined in step 9.b).
      • $DIRECTORY is the host's unique directory in your shared file system (created in step 9.a).
    4. On each host in the cluster, check if the directory defined by ELK_HARVEST_LOCATION exists. If it does, remove the directory (default /var/tmp/elk_logs).
    5. On each host, create a link to the host's harvesting location in your shared file system:
      ln -s DIRECTORY ELK_HARVEST_LOCATION
      where:
      • DIRECTORY is the host's unique directory in the shared file system, created in step 9.a.
      • ELK_HARVEST_LOCATION is the directory that is specified by the ELK_HARVEST_LOCATION environment variable (default is /var/tmp/elk_logs).
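    Steps 9.a through 9.e can be sketched as a script. This example uses temporary placeholder paths instead of a real shared file system and omits the chown step (which requires the real cluster administrator account):

    ```shell
    # Sketch of step 9 with temporary placeholder paths. On a real cluster,
    # SHARED would be a directory such as /gpfs/conductor/var/tmp/elk_logs and
    # HARVEST would be the ELK_HARVEST_LOCATION value (default /var/tmp/elk_logs).
    SHARED=$(mktemp -d)
    HARVEST=$(mktemp -d)/elk_logs

    for host in hostA hostB; do
      mkdir -p "$SHARED/$host"       # unique per-host directory (step 9.a)
      chmod g+s "$SHARED/$host"      # new files inherit the directory group (step 9.c)
      chmod 777 "$SHARED/$host"
    done

    rm -rf "$HARVEST"                   # remove any existing local directory (step 9.d)
    ln -s "$SHARED/hostA" "$HARVEST"    # on hostA, link to its shared directory (step 9.e)
    readlink "$HARVEST"
    ```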
  10. Optional: If you plan on scheduling batch applications, you can update the default algorithm and key size that is used to encrypt the scheduling user's token. The scheduling user token is maintained for all users who schedule batch applications.
    Note: This configuration can be updated only before the cluster is started for the first time. If you do not change the parameters, the default values are used.
    1. Open the ascd.conf configuration file at $EGO_CONFDIR/../../ascd/conf.
    2. Edit both parameters as required:
      Option Description
      CONDUCTOR_SPARK_SCHEDULED_APP_CIPHER_ALGORITHM Specifies the algorithm that is used to encrypt the token for the scheduling user. Valid values are AES (default) or DESede.
      CONDUCTOR_SPARK_SCHEDULED_APP_CIPHER_KEYSIZE Specifies the key size that is used to encrypt the token for the scheduling user. Valid values are as follows:
      • If CONDUCTOR_SPARK_SCHEDULED_APP_CIPHER_ALGORITHM=AES, set the key size to 128 (default), 192, or 256 bits.
      • If CONDUCTOR_SPARK_SCHEDULED_APP_CIPHER_ALGORITHM=DESede, set the key size to 112 or 168 bits.
    3. Save your changes.
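    For example, the edited parameters in ascd.conf might read as follows (the key size here is an illustrative choice among the documented valid values):

    ```
    CONDUCTOR_SPARK_SCHEDULED_APP_CIPHER_ALGORITHM=AES
    CONDUCTOR_SPARK_SCHEDULED_APP_CIPHER_KEYSIZE=256
    ```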

After you install IBM Spectrum Conductor, join the host and set entitlement. See Entitling IBM Spectrum Conductor.

If you want to modify the ELK_DATA_LOCATION parameter after installation, you must first complete the following steps. Note that most, if not all, prior Elasticsearch data will be lost:
  1. Stop all services and shut down the cluster:
    egosh service stop all
    egosh ego shutdown all
  2. Create the new Elastic Stack data directory. The new data directory must have the same permissions as the previous data directory.
  3. Change the ELK_DATA_LOCATION parameter to point to the new data directory that you created.
  4. Optional: If you are not going to revert these changes, delete the data that exists in the previous data directory.
  5. Optional: Back up Elastic Stack cluster data in the $EGO_CONFDIR/../../integration/elk/hosts/es.master directory.
  6. Delete Elastic Stack cluster data in the $EGO_CONFDIR/../../integration/elk/hosts/es.master directory:
    rm -rf $EGO_CONFDIR/../../integration/elk/hosts/es.master
  7. Restart the cluster:
    egosh ego start all
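Step 2 above, creating the new data directory with the same permissions as the previous one, can be sketched with GNU coreutils (temporary placeholder paths stand in for the real data directories):

```shell
# Create a new Elastic Stack data directory with the same mode as the old one.
# Placeholder temp paths; on a real cluster these are the old and new
# ELK_DATA_LOCATION directories. chmod --reference is a GNU coreutils option.
old=$(mktemp -d)/elk_data
new=$(mktemp -d)/elk_data
mkdir -p "$old" && chmod 750 "$old"   # stands in for the previous data directory

mkdir -p "$new"
chmod --reference="$old" "$new"       # copy the mode bits from the old directory
# chown --reference="$old" "$new"     # also copy ownership (requires root)
stat -c '%a' "$old" "$new"            # print both modes to confirm they match
```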