Updating instance groups to use a new Spark version
When a new (higher) Apache Spark version becomes available, add it to your system for use with IBM® Spectrum Conductor. You can then update existing instance groups to use the new Spark version.
Before you begin
- You must be a cluster administrator, consumer administrator, or have the Instance Groups Configure permission.
- The new Spark version that you want to associate with your existing instance groups must be installed on the system (see Adding Spark versions).
- The instance group must be in the Registered, Ready, Register Error, or Deploy Error state. If the instance group is running workload, stop the instance group and all associated notebooks before you update it. See Stopping instance groups and Stopping notebooks in an instance group.
- Upgrading the Spark version is a permanent change to your instance group. After the upgrade, you cannot roll back to the earlier Spark version for the instance group. As a best practice, before you upgrade an instance group in production, back up necessary assets and test application compatibility with the new Spark version:
- The new Spark version must be higher than the current Spark version, and must support all notebooks that are enabled and all data connectors that are configured for the existing instance group.
- Upgrading the Spark version for an existing instance group removes all the logs under the current instance group deployment directory. To retain these logs, back them up before you upgrade the Spark version for the instance group.
- The Spark version upgrade can introduce new configuration parameters or new default values for existing parameters. Review the configuration for the new Spark version before you upgrade an instance group to use it.
- If you deploy a notebook with its base data directory configured within the instance group deployment directory, the Spark version upgrade undeploys the current Spark version and removes that directory. Back up this directory to a location outside of the instance group deployment directory before you upgrade the Spark version for the instance group. After the upgrade, you can restore this directory.
To avoid this step during future Spark version upgrades, reconfigure the base data directory for all notebooks to a location outside of the instance group deployment directory. You can complete this at the same time as the Spark version upgrade for the instance group, or as a separate step before or after the upgrade.
- Create a test instance group to test your applications on the new Spark version. Either create a new instance group, or copy an existing one to a template and create a new instance group from that template.
After you test and verify your applications on the new Spark version by using the test instance group, you can stop the production instance group and configure it to use the new Spark version.
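Two of the prerequisites above, the allowed instance group states and the requirement that the new Spark version be higher than the current one, can be expressed as a small pre-upgrade check. The following Python sketch is illustrative only: the function names are hypothetical, it does not call any IBM Spectrum Conductor API, and it assumes that you supply the state and version strings yourself and that versions are dotted numbers with the same number of components.

```python
from typing import List, Tuple

# States in which an instance group can be updated, per the prerequisites above.
UPDATABLE_STATES = {"Registered", "Ready", "Register Error", "Deploy Error"}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '2.4.3' into (2, 4, 3)."""
    return tuple(int(part) for part in version.split("."))


def update_blockers(state: str, current_version: str, new_version: str) -> List[str]:
    """Return the reasons an update is blocked; an empty list means none were found."""
    problems = []
    if state not in UPDATABLE_STATES:
        problems.append(
            f"state {state!r} does not allow an update; stop the instance group first"
        )
    if parse_version(new_version) <= parse_version(current_version):
        problems.append(
            f"new version {new_version} is not higher than current version {current_version}"
        )
    return problems
```

For example, `update_blockers("Ready", "2.3.1", "2.4.3")` returns an empty list. The check does not cover notebook or data connector compatibility, which still must be verified separately.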
Procedure
Results
What to do next
- Start the instance group. See Starting instance groups.
- Test your applications on the upgraded Spark version using the test instance group that you created before the upgrade. Once satisfied, stop the production instance group and configure it to use the new Spark version.
- If you backed up the base data directory to a location outside of the instance group deployment directory prior to upgrading the Spark version for the instance group, restore this directory for use with the upgraded Spark version.
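The backup before the upgrade and the restore afterward can be scripted. The following Python sketch uses only the standard library; the function names are hypothetical, and all paths are assumptions that you replace with your own deployment directory, base data directory (or log directory), and backup location. It refuses a backup location inside the instance group deployment directory, because the upgrade removes that directory's contents.

```python
import shutil
from pathlib import Path


def backup_directory(source: Path, backup_root: Path, deployment_dir: Path) -> Path:
    """Copy source into backup_root, which must lie outside deployment_dir."""
    backup_root = backup_root.resolve()
    deployment_dir = deployment_dir.resolve()
    if backup_root == deployment_dir or deployment_dir in backup_root.parents:
        raise ValueError("backup location must be outside the deployment directory")
    target = backup_root / source.name
    # copytree creates missing parent directories and fails if target already exists.
    shutil.copytree(source, target)
    return target


def restore_directory(backup: Path, destination: Path) -> None:
    """Copy a backed-up directory back into place after the upgrade (Python 3.8+)."""
    shutil.copytree(backup, destination, dirs_exist_ok=True)
```

Run `backup_directory` while the instance group is stopped and before the upgrade, then run `restore_directory` with the returned path after the upgrade completes. File ownership and permissions might need to be adjusted separately, depending on the user that runs the script.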