Installing IBM Z Platform for Apache Spark

You can install IBM® Z Platform for Apache Spark (Spark) by using CBPDO, SystemPac, or ServerPac.

Before you begin

Ensure that the following software requirements for IBM Z Platform for Apache Spark have been met:
  • IBM z/OS® V2.1 or later
  • The minimum required Java™ level is IBM 64-Bit SDK for z/OS, Java Technology Edition, V8 Service Refresh 6. However, if the RELEASE file in the Spark installation directory indicates that the product was built with a later Java level, IBM strongly recommends that you use that Java level.
  • Bourne Again Shell (bash) version 4.3.48. You can verify the installed Java and bash levels as shown in the example after this list.
For the latest list of requirements, see the information in the Preventive Service Planning (PSP) bucket.
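For example, you can check the installed levels from a z/OS UNIX shell session, assuming that java and bash are in your PATH. The RELEASE file path shown here is the default installation directory described in the migration notes that follow; adjust it for your installation:

  # Display the installed Java level
  java -version

  # Display the installed bash level
  bash --version

  # Display the Java level that the product was built with
  cat /usr/lpp/IBM/zspark/spark/spark32x/RELEASE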
Migration notes: If you already use IBM Open Data Analytics for z/OS, note the following differences in IBM Z Platform for Apache Spark:
  • IBM Z Platform for Apache Spark changes the level of Apache Spark. For more information, see Migrating to a new version of Apache Spark.
  • IBM Z Platform for Apache Spark changes the default z/OS Spark installation directory to /usr/lpp/IBM/zspark/spark/sparknnn (for instance, /usr/lpp/IBM/zspark/spark/spark32x).
  • IBM Z Platform for Apache Spark uses UTF-8 encoding. For details, see Setting up a user ID for use with IBM Z Platform for Apache Spark and Network port configurations.
  • The Spark REST server port is disabled by default. You can enable connections to the REST port (for example, when you use cluster deploy mode) in your local spark-defaults.conf file, but the port does not function properly until you complete the setup to secure and enable the REST port. For details, see Configuring networking for Apache Spark, and see the configuration example after these migration notes.
  • IBM Z Platform for Apache Spark introduces client authentication, which is enabled by default and requires additional setup. Apache Spark does not function properly until you complete the setup for client authentication or disable it. For details, see Configuring z/OS Spark client authentication.
  • The way in which you assign job names to executor and driver processes might differ from previous Spark products. For details, see Assigning job names to Spark processes.
  • The Spark master and worker daemons perform environment verification during initialization and fail to start if the verification fails. The reason for the termination is written to the daemon's log. You can disable this feature by setting the spark.zos.environment.verify property to false in spark-defaults.conf.
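For example, the following spark-defaults.conf entries sketch these settings. The spark.master.rest.enabled property is a standard Apache Spark property; spark.zos.master.authenticate is an assumed property name for disabling client authentication, so confirm it in Configuring z/OS Spark client authentication before you use it:

  # Allow connections to the REST server port; complete the REST port
  # security setup first (see Configuring networking for Apache Spark)
  spark.master.rest.enabled      true

  # Disable client authentication (assumed property name; confirm it in
  # Configuring z/OS Spark client authentication)
  spark.zos.master.authenticate  false

  # Disable environment verification by the master and worker daemons
  spark.zos.environment.verify   false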
Additional migration notes: A new service (PTF) level for IBM Z Platform for Apache Spark (FMID HSPK130) might provide a new version of Apache Spark. Before you install a new PTF, see Migrating to a new version of Apache Spark. At the current level, note the following:
  • The level of Apache Spark is 3.2.0. For more information, see Migrating to a new version of Apache Spark.
  • If you specify an incorrect job name prefix, the Spark worker daemon fails rather than ignoring the error. For more information, see Assigning job names to Spark processes.
  • If client authentication is enabled and you submit an application to the Spark master port in cluster deploy mode, the Spark driver runs under the user ID of the user who submitted the application.
Note: IBM Z Platform for Apache Spark currently has some restrictions on Spark functionality. For a list of the restrictions, see Restrictions.

About this task

IBM Z Platform for Apache Spark is supplied in a Custom-Built Product Delivery Offering (CBPDO, 5751-CS3). For installation instructions, see Program Directory for IBM Z Platform for Apache Spark (GI13-5806).

You can also install IBM Z Platform for Apache Spark with a SystemPac or ServerPac. For information about the various z/OS product installation offerings, see z/OS Planning for Installation.

Service updates for IBM Z Platform for Apache Spark are provided as PTFs that perform a full replacement of the product. Therefore, you can use a PTF to update your existing installation.
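For example, you might apply such a PTF with a standard SMP/E APPLY CHECK followed by an APPLY. The target zone name and SYSMOD ID shown here are placeholders for the values at your installation:

  SET BOUNDARY(TZONE).       /* placeholder target zone name */
  APPLY SELECT(UR00000)      /* placeholder PTF number       */
        CHECK.

After you review the APPLY CHECK output, rerun the APPLY without the CHECK operand to install the PTF.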

IBM recommends that you mount your IBM Z Platform for Apache Spark file system from the same system on which the Spark cluster will run.
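For example, a BPXPRMxx MOUNT statement similar to the following mounts the product file system under the default installation directory. The data set name is a placeholder; substitute the values that are used at your installation:

  MOUNT FILESYSTEM('HLQ.SPARK.ZFS')       /* placeholder data set name */
        MOUNTPOINT('/usr/lpp/IBM/zspark')
        TYPE(ZFS)
        MODE(READ)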

Procedure

Complete the following steps to install IBM Z Platform for Apache Spark on your system.

  1. Choose the most appropriate method for installing IBM Z Platform for Apache Spark.
  2. Use the information in Program Directory for IBM Z Platform for Apache Spark (GI13-5806) to install IBM Z Platform for Apache Spark on your system.

Results

IBM Z Platform for Apache Spark is installed on your z/OS system.

What to do next

Before you use IBM Z Platform for Apache Spark for the first time, follow the customization instructions in Customization.