Installing and starting the appliance (multi-node setup)

Follow these steps after defining the accelerator (SSC) LPARs to complete the multi-node setup of Db2 Analytics Accelerator on Z.

Before you begin

Review the description and the different deployment options for a multi-node setup.

What you need:

  • Accelerator (SSC) LPARs organized in an LPAR group (cluster).
  • At least 1.5 TB of main memory.
  • At least 7 network connection points that can be reached from outside the HiperSocket network (through OSA-Express® cards or RoCE Express cards).
  • At least 18 individual disk drives (FICON® (ECKD) or FCP (SCSI)):
    • 40 GB or more are required for each boot disk in the cluster.
    • 80 GB or more are required for each runtime disk in the cluster. Six or a multiple of six runtime disks can be specified.
    • Each data disk needs as much storage as the data pool of a single-node installation. The number of data disks differs with regard to the hardware or the selected deployment option. See Table 1.
  • An absolute limit must be set for the number of Integrated Facilities for Linux (IFLs) that the cluster can use, for example 40 IFLs. The minimum is 30 IFLs. You can use as many IFLs for accelerator (SSC) LPARs as a Central Processing Complex (CPC) or drawer can be equipped with.
  • The accelerator (SSC) LPARs must belong to the same Central Processing Complex (CPC) or drawer.
  • The accelerator (SSC) LPARs must be connected by a HiperSocket or RoCE Express card network.
  • Every accelerator (SSC) LPAR in the cluster must be connected to a management network.
  • Jumbo frame support is required for parts of the network. See "network_interface_bindings".
  • The head node is deployed across all accelerator (SSC) LPARs or on a single accelerator LPAR in the cluster. See Table 1.

    The head node must have access to 25 percent of the processing capacity of the shared IFLs. The initial weight of the head node LPAR needs to be set to 100 in the HMC activation profile. Furthermore, the head node requires at least 256 GB of main memory.

  • The other accelerator (SSC) LPARs are used for the data nodes.

    Each of the data nodes claims 25 percent of the processing capacity of the shared IFLs. The initial weight of a data node LPAR needs to be set to 180 in the HMC activation profile. The data nodes require between 256 GB and 4 TB of main memory.

Differences

Requirements vary with regard to the hardware or the selected deployment option.

Table 1. Different requirements for multi-node deployments
                                        | IBM z15® cross-drawer head node | IBM z16™ cross-drawer head node | IBM z16 confined head node
  Number of LPARs in cluster            | 6                               | 5                               | 4
  Min. number of LPARs for data nodes   | 5                               | 4                               | 3
  Max. number of IFLs per LPAR          | 35                              | 50                              | 50
  Head node                             | Distributed across LPARs        | Distributed across LPARs        | Dual-purpose head node confined to one LPAR
Important:
  • In general, to change the number of LPARs, a complete reload of the accelerator and a fresh installation are required. For example, if you start with a three-drawer system with one LPAR each (as recommended), and then want to add a fourth drawer, create a new setup to reinstall the cluster with four drawers.
  • To extend a two-drawer system to a four-drawer system, you do not need to reinstall the entire system; you can migrate normally. So instead of having two LPARs that share one drawer, you end up with one LPAR per drawer.
  • To migrate from a setup on the IBM z15, you do not have to change the number of accelerator LPARs. You can continue with a setup of five LPARs in the cluster. However, the performance of such a setup is not as good as a setup with four nodes on the IBM z16.
  • For confined head node installations, the recommendation is to use dedicated IFLs only. However, if you want to use a confined head node setup for shared IFL workloads, make sure that the LPARs that provide the shared workloads have a significantly lower priority than the LPARs of the confined head node cluster. This way, the slowing impact on the performance is kept at a minimum.

Procedure

Restriction:
  • For the following steps, you need Mozilla Firefox or Google Chrome. Other browsers are not supported.
  • If the Login page does not show all the controls needed to log in successfully, or if the SSC installer window is not fully usable after logon because controls are missing, change the browser language to English and try again.

  1. For an IBM z16, define 4 dedicated Secure Service Container (SSC) LPARs as described in Defining an LPAR for Db2 Analytics Accelerator on Z. For an IBM z15, define 5 dedicated Secure Service Container (SSC) LPARs.
  2. Log on to the Admin UI and proceed to the Welcome page.
    For a description, see Logging on to the Admin UI.
  3. On the Welcome page, click First-Time Setup.
    You see the following page:
    Figure 1. Accelerator Configuration Definition page
    After clicking First-Time Setup, you reach this page, whose heading is Accelerator Configuration Definition
  4. Starting with product version 7.1.9, all configuration settings are made by uploading a configuration file in JavaScript Object Notation (JSON) format.
    Important: For a multi-node setup, a sample configuration file is not provided. This is because the setup is complex, and a sample file cannot cover all possibilities without becoming very confusing. Contact IBM support for help with the creation of a configuration file in JSON format.
  5. Create a JSON file in a text editor of your choice and include settings as shown in the following steps.
    An editor capable of validating JSON files is recommended because the configuration file must be valid JSON. If it cannot be parsed correctly, you will run into errors. Valid JSON means:
    • Quotes are required around attribute values, even if these are plain numbers.
    • Colons must be used to separate attribute names from their values.
    • Object definitions consisting of key/value pairs must be enclosed in curly braces.
    • Arrays or lists must be enclosed in brackets.
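    For orientation, the following sketch shows only the top-level structure of a multi-node configuration file. It is not a complete or supported sample: all names, addresses, and device IDs are placeholders, the "version" value must match the version of your accelerator exactly, and the node and storage blocks (indicated by ellipses) are described in the remainder of this step.
    {
      "version": "7.5.11",
      "accelerator_name": "ACCEL01",
      "accelerator_description": "Multi-node accelerator (placeholder)",
      "accelerator_type": "multi-node",
      "db2_pairing_ipv4": "10.20.1.33/24",
      "network_interface_bindings": {
        "db2_nw": "db2_conn",
        "cluster_nw": "my_hiper",
        "mgmt_nw": "activation-profile"
      },
      "runtime_environments": [
        {
          "cpc_name": "IBMZ1",
          "head": ...,
          "data1": ...,
          "data2": ...,
          "data3": ...,
          "data4": ...
        }
      ],
      "storage_environments": [
        {
          "head": ...,
          "data1": ...,
          "data2": ...,
          "data3": ...,
          "data4": ...
        }
      ]
    }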
    Most of the following attributes are required.
    "version" (required)
    The version of the accelerator. The value in the configuration file must match the version of the accelerator exactly; do not change it so that it deviates from the accelerator version.
    "accelerator_name" (required)
    The name of the accelerator. This attribute is used to identify the accelerator in the Admin UI and in log and trace files written to the logs/dumps/traces/panels directory. You can change the name online. A reset or restart is not required.
    "accelerator_description" (optional)
    Optional text description. You might want to add some information about the accelerator, especially if it is helpful for an administrator. You can change the description online. A reset or restart is not required.
    "accelerator_type" (required)
    As the name suggests, the type of the accelerator. Set this attribute to the value "multi-node". This means that the runtime and storage environments you define later contain definitions for each accelerator (SSC) LPAR in the cluster (four, five, or six LPARs, depending on the hardware and the selected deployment option; see Table 1). A single processing node of a multi-node accelerator can use up to 30 Integrated Facilities for Linux (IFLs) and 1.5 TB of memory.
    Important: Once this parameter has been set for and used by an accelerator, it cannot be changed anymore. If you want to switch to a single-node installation, you must provide an entirely new configuration. Also, if you switch from one mode to the other, the existing data is not migrated or preserved.
    "admin_ui_timeout" (optional)
    Defines the session time of the Admin UI. When the specified time has passed, the session expires, and the administrator has to log on again to continue. Values from 1 to 1440 minutes are allowed. The default time is 15 minutes.

    If you set or change this parameter, the change takes effect immediately, that is, without a restart of the accelerator.

    Example:
    "admin_ui_timeout": "60",

    This sets the session timeout period for the Admin UI to 60 minutes.

    Note: This parameter was introduced with product version 7.5.11. If you try to use it with an earlier product version, your configuration setup will fail, and an error message will be issued.
    "mln_distribution" (optional)
    The ratio of head-node MLNs to data-node MLNs. The first digit (M) stands for the head-node MLNs; the second (N) for the data-node MLNs. For multi-node, the only supported values are 4:4 and 1:4. The default setting is 1:4.
    Attention:
    • Before you can use this parameter, you must drop all active pairings between the accelerator and connected Db2 subsystems.
    • After you change this parameter, you must reset the accelerator configuration on the Accelerator Components Health Status page of the Admin UI. You also have to select the Wipe data option. This means that all your data on the accelerator will be erased. A complete reload of the configuration is required. You must also redefine and reload tables after loading the new configuration.
    • This parameter was introduced with product version 7.5.11. If you try to use it with an earlier product version, your configuration setup will fail, and an error message will be issued.
    Example:
     "mln_distribution": "4:4",

    This leads to a configuration with a confined dual-purpose head node. The head node accommodates 1 catalog MLN and 3 data MLNs; each data node accommodates 4 data MLNs. In total, the cluster will thus have 15 data MLNs and one catalog MLN if the deployment uses 4 LPARs.

    "cluster_manager_ram_drive_size" (optional)
    The amount of memory (RAM) that the cluster manager component can use. The value is an integer plus the unit GB. If you do not specify a value, the default value of 1 GB is used. This is sufficient in most cases. Allotted RAM that is not used by the cluster manager does not remain occupied or reserved; it can be used by other applications.

    Starting with version 7.5.11, it is possible to run the cluster manager component in a RAM drive. Before, it was always run on a disk drive. Having the cluster manager in a RAM drive adds to the overall system stability. Crashes of the accelerator are less likely under heavy workloads.

    If, for some reason, you prefer to have the cluster manager on disk, set the value to 0 GB.

    Example:
    "cluster_manager_ram_drive_size": "2 GB",
    Note: This parameter was introduced with product version 7.5.11. If you try to use it with an earlier product version, your configuration setup will fail, and an error message will be issued.
    "db2_pairing_ipv4" (required)
    The IP address used to pair your Db2 subsystem with the head node of the specified accelerator. This IP address uniquely identifies the head node and is used by Db2 for z/OS to connect to the accelerator.
    Changing this value requires a subsequent reset from the Accelerator Components Health Status page of the Admin UI. See the following figure.
    Figure 2. Resetting an accelerator configuration
    Screen capture of Reset button and wipe option in the Admin UI
    Important: To change the value, it is no longer necessary to select the Wipe data option and delete the existing data. This was only required in previous versions and involved a new pairing and upload of the tables.
    You can specify a netmask as part of the IPv4 address, like /24 for a subnet with 254 usable addresses. For example:
    "db2_pairing_ipv4": "10.108.16.184/24"
    This specifies the IP address 10.108.16.184 as the identifier of a subnet that comprises the address range from 10.108.16.1 to 10.108.16.254.
    Tip: All IP addresses in the configuration file can point to a subnet.
    "temp_working_space" (optional)
    The accelerator needs temporary storage for extensive sort operations that cannot be executed exclusively in the system memory. Increase the size if certain operations cannot be completed because the temporary storage and the system memory do not suffice. In addition, the accelerator needs temporary storage for the replication spill queue and for query results that tend to arrive faster than they can be picked up by the receiving client. The "temp_working_space" parameter specifies the size of the temporary storage as part of the data pool.

    Involved queries need a high amount of temporary storage. A single query can easily consume multiple terabytes of temporary storage during its execution. If your system processes several long-running queries of this type at the same time, some of these queries might have to be canceled if the temporary storage runs out.

    Transient storage devices (NVMe drives), which are highly recommended for various reasons, can help you avoid situations like this. See transient_storage for more information.

    You can also define a transient storage pool, and dedicate external devices (disk drives) to this pool. This option is not as fast as NVMe drives, but still much better than using a portion of your data pool for temporary data. It is the recommended solution if your accelerator is not deployed on a LinuxONE computer, but on a classic IBM Z computer. See "transient_devices" (optional) for more information.

    If you do not use any of the transient storage options, a portion of your data pool will be used for the temporary data.

    It is important to note that there is no swapping or failover. You either use disk space in your data pool for the placement of temporary data, or transient storage if that exists. If your transient storage runs out, processing does not fall back to temporary working space in your data pool.

    You can set the "temp_working_space" parameter to the following values:

    "unlimited"
    The advantage of this setting is that it works without sizing, that all storage resources can be shared, and that no storage is reserved as temporary working space if it is not needed. The disadvantage is a lower operation stability because space-intensive queries might lead to the cancellation of other jobs, irrespective of their types. A space-intensive query can even cause the cancellation of INSERT operations.

    If a situation is reached where nearly all temporary working space is used up by running queries, replication jobs, load jobs, or by the population of accelerator-shadow tables, any additional workload that claims more than the available disk space is canceled automatically.

    "automatic"
    Starting with product version 7.5.11, this is the default, that is, the value used if you omit the parameter. This setting results in the creation of a database-managed (DMS) table space based on the system size and the free space in the data pool. The size will be the smaller of the following two values:
    • 80 percent of the LPAR memory
    • 50 percent of the free space of the data pool
    Tip: In many cases, this is not an ideal allocation. If your queries require significant amounts of temporary storage, consider the use of transient_storage. If, in contrast, your query workload needs just small amounts or hardly any temporary storage at all, specify a fixed size with a low value or "none".

    The current default is recommended only if you do not know the storage requirements of your query workload very well, or if you do not want to spend much time on fine-tuning.

    "none"
    Just a very small temporary workspace is used. A query that includes an extensive sort which leads to a memory overflow will fail immediately.
    Fixed size
    A DMS table space (see "automatic") with a size you determine. For example:
    "temp_working_space": "500 GB"

    The specified size is reserved for the temporary work space and serves as a size limit at the same time.

    A change of the setting requires a restart of the accelerator. Changes do not take effect before the restart. If you use the setting "automatic", the size of the DMS table space is re-adjusted dynamically after each start of the accelerator, according to the memory and disk usage rates that are encountered.

    If you set "temp_working_space" to "automatic" or to a fixed value, and if the free storage space has shrunk during the time from one restart to the next, a smaller temporary workspace is provided after the latest restart. If you were close to 100 percent storage usage before a restart, the temporary workspace might be reduced so much during the restart that certain jobs cannot be executed anymore. In that case, consider adding storage to the accelerator. This can be done while your system is online.

    "dispatch_mode" (optional)
    This optional parameter determines how the workload is distributed across the available CPUs.
    Note: The "dispatch_mode" parameter has an effect only if you process shared IFL workloads. Otherwise, you can ignore it.

    If you use a distributed head node, you must configure all IFLs in the cluster as shared IFLs. This is because the head node shares resources with the data nodes in this mode, so that PR/SM virtualization capabilities can be exploited. With just a single node or a confined head node, you can use dedicated IFLs or shared IFLs. Dedicated IFLs probably work a little faster in these modes. However, to simplify this documentation, the use of shared IFLs is assumed throughout the text.

    You can set this parameter to the value "horizontal" or "vertical". The default value is "vertical".

    Vertical dispatch mode (HiperDispatch mode) means that the workload is processed by just a subset of the available CPUs, which reduces the scheduling overhead. It is the most efficient mode for systems with many logical processors.

    Horizontal dispatch mode means that the workload is spread across all available CPUs. In older versions of Db2 Analytics Accelerator on Z, you could not set a dispatch mode; the horizontal mode was always used.
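    Example (a minimal illustration; relevant only if you process shared IFL workloads):
    "dispatch_mode": "horizontal",

    This explicitly selects the horizontal dispatch mode. Omit the parameter to keep the default ("vertical").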

    "network_interface_bindings" (required)
    In the network interface bindings section, you map physical interfaces to network names that can be specified in a runtime environment. The settings in this section cannot be changed online. If you need to change them, you must restart the Db2 Analytics Accelerator on Z accelerator.
    "mgmt_nw"
    This network, which is used by the Admin UI and other support interfaces, is defined by the HMC activation profile of the accelerator (SSC) LPAR. It is not part of the Db2 Analytics Accelerator on Z configuration. Therefore, use the attribute value "activation-profile". Note that the name of the management network might change if someone updates the activation profile of the accelerator (SSC) LPAR on the HMC.
    Note: Jumbo frame support is not required for the switches or network interface controllers in this network.
    "db2_nw"
    This network name points to the IP address of your Db2 subsystem (counterpart of the "db2_pairing_ipv4"). It is used during the pairing process, and all network traffic between your Db2 subsystem and the accelerator will run through this interface.

    The attribute value must be the same as one of the "name:" attributes in your "network_interfaces" definitions further down in the configuration file. The value must be an alphanumeric character string no longer than 8 characters. Compare this with the sample code.

    In the "network_interfaces" section, which is described below, you find the details of all networks, including the network used for the pairing process. In the example, this is the network device with the ID 0.0.4b00.
    Important: The switches and network interface controllers in this network must support jumbo frames.
    "cluster_nw"
    The name of the HiperSocket or RoCE Express card network that connects your nodes.

    The attribute value must be the same as one of the "name:" attributes in your "network_interfaces" definitions further down in the configuration file.

    In the "network_interfaces" section, which is described below, you find the details of all networks, including the HiperSocket or RoCE Express card network. In the example, this is the network device with the ID 0.0.7f00.

    An example of the "network_interface_bindings" block:

    "network_interface_bindings": {
      "db2_nw": "db2_conn",
      "cluster_nw": "my_hiper",
      "mgmt_nw": "activation-profile"
    }
    
    "runtime_environments" (required)
    This block defines the network interfaces and other LPAR-specific settings of the accelerator (SSC) LPARs. Each LPAR is identified by the CPC name and the LPAR name. A set of networks must be defined for each accelerator (SSC) LPAR. This is usually the Db2 network, and the HiperSocket or RoCE Express card network of your cluster. Specify the following attributes to identify an accelerator (SSC) LPAR:
    "cpc_name"
    The name of the CPC as defined on the Hardware Management Console (HMC).

    You cannot change this name online. If you must change it because the CPC name changes or because you want to move the accelerator to a different CPC, first create an additional runtime environment that contains the new value. Then shut down the accelerator, change the old value and reactivate the accelerator.
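    For example, during a move to a different CPC, the configuration might temporarily contain two runtime environments, as in the following sketch (the CPC names are placeholders, and the node definitions are elided):
    "runtime_environments": [
      {
        "cpc_name": "IBMZ1",
        "head": ...,
        "data1": ...
      },
      {
        "cpc_name": "IBMZ2",
        "head": ...,
        "data1": ...
      }
    ]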

    "head"
    The definition of the head node. For example:
    "head": {
      "lpar_name": "LPAR0",
      "network_interfaces": [
        {
          "name": "my_hiper",
          "device": "0.0.7f00",
          "ipv4": "172.84.0.180/23"
        },
        {
          "name": "my_db2_network",
          "device": "0.0.4b00"
        }
      ]
    }
    
    "lpar_name"
    The name of an accelerator (SSC) LPAR as defined on the HMC.

    You cannot change an "lpar_name" online. If you must change it because the LPAR name changes or because you want to use different LPARs for the accelerator, first create an additional runtime environment that contains the new values. Then shut down the accelerator, change the old values and reactivate the accelerator.

    Continue with the remaining settings of the accelerator (SSC) LPAR, such as transient storage and the network interfaces:

    "transient_storage": "NVMe" (optional)
    This is a highly recommended option for owners of an IBM LinuxONE system. NVMe storage is local storage of the LinuxONE, which can be accessed directly. It offers much better performance than external storage devices, and might help free up capacity on your external devices. When in use, all data on the NVMe storage is encrypted. The encryption keys are kept in the system memory only. This ensures that all data on the NVMe drives is securely removed when the accelerator is stopped or restarted.

    NVMe storage is ideal for the placement of temporary data, especially if you need a lot of this space because you run involved queries that execute sort operations or produce result sets so large that they do not fit into the accelerator memory. The database engine of the accelerator can store and read significant amounts of data from NVMe storage in a short period of time.

    NVMe storage is also good for the replication spill queue and for query results that tend to arrive faster than they can be picked up by the receiving client.

    Configuration example:

    "runtime_environments": [
            {
                "cpc_name": "Z16_4",
                "head": {
                    "lpar_name": "METIS",
                    "zfcp_devices": [
                        "0.0.7800",
                        "0.0.7900",
                        "0.0.7a00",
                        "0.0.7b00"
                    ],
                    "transient_storage": "NVMe",
                    "network_interfaces": [
                             .
                             .
    
    Important:
    • If you specify "transient_storage":"NVMe", you don't have to set the "temp_working_space" parameter at all. Any setting of this parameter will become ineffective if transient storage is used.
    • If NVMe drives do not provide enough storage for your temporary data, consider using a transient data pool. See "transient_devices" (optional) for more information.

    For an in-depth discussion of the transient storage options, see Configuring and using transient storage for IBM Db2 Analytics Accelerator for z/OS on Z.

    "network_interfaces" (required)
    This keyword defines the physical network interfaces (OSA, RoCE, or HiperSocket) that are used by a runtime environment. Each network name defined in the "network_interface_bindings" section must be mapped to a physical network interface in the corresponding runtime environment. You can change all of the definitions in this block online. The following attributes must be specified for each physical network:
    "name" (required)
    The name of the network interface in the "network_interface_bindings" section.
    "ipv4"
    The IPv4 address and the subnet used for an interface, for example:
    10.4.1.101/16
    Important: The IPv4 address that the "db2_nw" network uses is specified as the "db2_pairing_ipv4" address. Therefore, do not specify an additional "ipv4" address for the "db2_nw" network.
    "device" (required)
    The identifier of an OSA-Express card or a HiperSocket. A device can only be used once. This includes the device specified in the activation profile for the HMC.
    "port" (optional)
    The network port to be used. If this value is omitted, the port number defaults to "0". Most OSA cards have a single physical port "0".
    "vlan" (optional)
    If a virtual LAN (VLAN) has been defined for the accelerator (SSC) LPAR and you want to use this VLAN as an interface for Db2 Analytics Accelerator on Z, you can specify the VLAN name here.
    Example:
    "runtime_environments": [
      {
        "cpc_name": "IBMZ1",
        "head": {
          "lpar_name": "LPAR0",
          "network_interfaces": [
            {
              "name": "my_hiper",
              "device": "0.0.7f00",
              "ipv4": "172.84.0.180/23"
            },
            {
              "name": "my_db2_network",
              "device": "0.0.4b00"
            }
          ]
        },
    
    "static_routes" (optional)

    This option is used to define additional network routes for an interface.

    If the IP address of a Db2 for z/OS LPAR or the GDPS® keys LPAR is in a different subnet than the IP address assigned to the accelerator, an additional route definition is needed to establish the connection. An additional static route also helps to avoid undesired network traffic through a default gateway, which might have been defined in the HMC activation profile of the accelerator (SSC) LPAR.

    "ipv4"
    The IPv4 address or the IPv4 address and subnet of the target network.
    "via"
    The IPv4 address of the routing device.

    Example: The accelerator's pairing IP address is 10.20.1.33/24 and there are two Db2 for z/OS LPARs with the IP addresses 10.1.1.47/24 and 10.1.1.48/24.

    One or more gateways connect both subnets. One gateway is accessed through IP address 10.20.1.1, the other through 10.1.1.1.

    To allow traffic from one network to the other, the TCPIP.PROFILE definition in z/OS defines a route to 10.20.1.0/24, which uses the gateway 10.20.1.1. The accelerator uses the following configuration to enable traffic to the 10.1.1.0 network using the corresponding gateway at 10.1.1.1:

    {
      "accelerator_name": "S1",
      "db2_pairing_ipv4": "10.20.1.33/24",
      "network_interface_bindings": {
        "db2_nw": "db2_conn",
        "mgmt_nw": "activation-profile"
      },
      "runtime_environments": [
        {
          "cpc_name": "IBMZ1",
          "head": {
            "lpar_name": "LPAR0",
            "network_interfaces": [
              {
                "name": "db2_conn",
                "device": "0.0.0440",
                "vlan": "552"
                "static_routes": [ { "ipv4": "10.1.1.0/24", "via": "10.20.1.1" } ],
              }
            ]
          },
    
    

    This way, all traffic to an IPv4 address that starts with 10.1.1 uses the OSA-Express card with device ID 0.0.0440 via gateway 10.20.1.1. All network traffic between the accelerator and destinations in the 10.1.1.0/24 subnet is thus bound to that OSA device.

    "bond_settings" (optional)
    This attribute allows you to define several network cards (OSA-Express cards) as a single device. Bonding is usually employed in a high-availability setup, as the remaining network cards in the setup can take over if one network card fails. It is also possible to run all available network cards simultaneously.
    Example:
    "network_interfaces": [
      {
        "name": "db2_conn",
        "vlan": "700",
        "bond_settings": {
          "mode": "active-backup",
          "workers": [
            {
              "device": "0.0.0a00",
              "port": "0"
            },
            {
              "device": "0.0.1b00",
              "port": "1"
            }
          ]
        }
      }
    ]
    

    In this example, two OSA cards (devices 0a00 and 1b00) are combined into one bonding device called "db2_conn". The device works in "active-backup" mode, meaning that at any time, just one of the network cards is active. The other card takes over when the active card fails.

    You can alternatively specify "mode": "802.3ad", in which case all network cards of the device will be active at the same time. "802.3ad" stands for the IEEE 802.3ad link aggregation mode.

    In 802.3ad mode, you need at least two physical devices. Specify these in the same way as you specify the devices for active-backup mode. That is, use a "workers" list as shown in the previous example.
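    A sketch of a bonding definition in 802.3ad mode (device IDs are placeholders):
    "bond_settings": {
      "mode": "802.3ad",
      "workers": [
        {
          "device": "0.0.0a00",
          "port": "0"
        },
        {
          "device": "0.0.1b00",
          "port": "0"
        }
      ]
    }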

    "options" (optional)
    It is not necessary to specify "options" for "bond_settings". If the options are omitted, default values are used. Whether an option applies to a particular setup depends on the selected mode ("active-backup" or "802.3ad"). For a detailed description of these options, see Chapter 7. Configure Network Bonding in the Red Hat® Enterprise Linux 7: Networking Guide. A link is provided at the end of this topic.
    Restriction: Currently, you cannot specify just a subset of the available options. You either have to specify no options at all, in which case default values are used, or specify all options pertaining to a particular mode.

    All of the following values can be changed online.

    "primary"
    Valid in active-backup mode only. The first physical device to be used. This is "0.0.0a00" according to the previous example. The primary device is the first of the bonding interfaces. It will be used as the active device unless it fails.
    "primary-reselect": "always"
    Valid in active-backup mode only. Determines how the active physical device is selected after a failure. Specify "always", which means that an attempt will be made to make the first physical device (labeled "primary") active again.

    Other allowed options are "better", which means that the fastest device will be used as the active device, or "failure", which means that the active physical device is only changed if the currently active device fails.

    "failover-MAC": "none"
    Valid in active-backup mode only. Allows you to set all physical devices to the same MAC address or to determine these addresses according to a policy. Specify the value "none", which means that the same MAC address will be used for all physical devices.
    "no-gratuitous-ARPs": "0"
    Valid in active-backup mode only. Determines the number of peer notifications after a failover event. Specify "0", which means no notifications. This option corresponds to the num_grat_arp or num_unsol_na option in the Red Hat Enterprise Linux 7: Networking Guide.
    "transmit-hash-policy": "layer2"
    Valid in 802.3ad mode only. Selects the policy that determines which physical device is used for outgoing traffic. Specify "layer2", which means that traffic to a particular network peer is always assigned to the same network device, selected solely by its MAC address. Other allowed options are "layer3+4" and "layer2+3". The option "layer3+4" means that multiple network devices can be used to reach a single network peer, even if a single network connection does not span multiple network devices. The option "layer2+3" is similar to "layer2", but the network device is selected by its IP address in addition to its MAC address.
    Note: In the Red Hat Enterprise Linux 7: Networking Guide, this option is called "xmit-hash-policy".
    "LACP-rate": "slow" | "fast"
    Valid in 802.3ad mode only. The rate at which physical devices transmit Link Aggregation Control Protocol Data Units (LACPDUs). Specify "slow", which means every 30 seconds, or "fast", which means every 1 second.
    "link-monitoring": "MII"
    Selects the method to be used for monitoring the physical device's ability to carry network traffic. Select "MII", which stands for media-independent interface. With this setting, the driver, the MII register, or the ethtool can be queried for monitoring information about a physical device. Alternatively, you can specify "ARP" to use the ARP monitor.
    "monitoring-frequency": "100"
    The time interval that passes between two monitoring events. It is an integer value that stands for milliseconds. Use a value of "100".
    "link-up-delay": "0"
    Delay that needs to pass before network traffic is sent to a physical device after link monitoring has reported the device to be up. Specify "0", which means no delay.
    "link-down-delay": "0"
    Delay that needs to pass before network traffic is routed to the failover device after link monitoring has reported the failure of the previously active device. Specify "0", which means no delay.
    Example (active-backup mode):
    "options": {
      "primary": "0.0.0a00",
      "primary-reselect": "always",
      "failover-MAC": "none",
      "no-gratuitous-ARPs": "0",
      "link-monitoring": "MII",
      "monitoring-frequency": "100",
      "link-up-delay": "0",
      "link-down-delay": "0"
    }
    Example (802.3ad mode):
    "options": {
      "LACP-rate": "slow",
      "transmit-hash-policy": "layer2",
      "link-monitoring": "MII",
      "monitoring-frequency": "100",
      "link-up-delay": "0",
      "link-down-delay": "0"
    }
    "zfcp_devices" (required if ZFCP drives are used)
    The FICON Express ports of your devices must be listed in the "runtime_environments" section. This is required for ZFCP storage devices only, as the ports of ECKD devices are handled by the firmware of the CPC. For ZFCP devices, however, you must list the port names. See the following example:
    "runtime_environments": [
      {
        "cpc_name": "CPC001",
        "lpar_name": "IGOR01",
        "network_interfaces": [
          {
            "name": "osa2db2",
            "device": "0.0.0a00"
          }
        ],
        "zfcp_devices": [
          "0.0.1b10",
          "0.0.1b40",
          "0.0.2c80"
        ]
      }
    ]
    

    The accelerator uses multiple paths, that is, it tries to use all specified ports. For that reason, a list of different ports rather than just one port can increase the performance.

    You can use a single port identifier for a ZFCP device or for an ECKD device, but not for both.

    You can change the ZFCP port names online.

    "data1"
    The definition of the first data node. It has only one network interface: the HiperSocket or RoCE Express card network of the cluster.
    Example:
    "data1": {
      "lpar_name": "LPAR1",
      "network_interfaces": [
        {
          "name": "my_hiper",
          "ipv4": "172.84.0.181/23",
          "device": "0.0.7f00"
        }
      ]
    },
    

    Specify the other data nodes in the same manner, that is, create the sections "data2", "data3", and "data4". If you configure an accelerator on an IBM z15, also add "data5".

    Complete example:
    "runtime_environments": [
      {
        "cpc_name": "IBMZ1",
        "head": {
          "lpar_name": "LPAR0",
          "network_interfaces": [
            {
              "name": "my_hiper",
              "device": "0.0.7f00",
              "ipv4": "172.84.0.180/23"
            },
            {
              "name": "my_db2_network",
              "device": "0.0.4b00"
            }
          ]
        },
        "data1": {
          "lpar_name": "LPAR1",
          "network_interfaces": [
            {
              "name": "my_hiper",
              "device": "0.0.7f00",
              "ipv4": "172.84.0.181/23"
            }
          ]
        },
        "data2": {
          "lpar_name": "LPAR2",
          "network_interfaces": [
            {
              "name": "my_hiper",
              "device": "0.0.7f00",
              "ipv4": "172.84.0.182/23"
            }
          ]
        },
        "data3": {
          "lpar_name": "LPAR3",
          "network_interfaces": [
            {
              "name": "my_hiper",
              "device": "0.0.7f00",
              "ipv4": "172.84.0.183/23"
            }
          ]
        },
        "data4": {
          "lpar_name": "LPAR4",
          "network_interfaces": [
            {
              "name": "my_hiper",
              "device": "0.0.7f00",
              "ipv4": "172.84.0.184/23"
            }
          ]
        },
        "data5": {
          "lpar_name": "LPAR5",
          "network_interfaces": [
            {
              "name": "my_hiper",
              "device": "0.0.7f00",
              "ipv4": "172.84.0.185/23"
            }
          ]
        }
      }
    ]
    
    "storage_environments" (required)
    This block lists all storage devices, that is, disks or disk enclosures. At a minimum, this block contains the name of the storage device on which the accelerator is initially deployed. In a multi-node setup, a storage environment defines storage devices for the head node and for each data node. You can add storage devices or define multiple storage environments for mirroring or failover purposes. See Figure 3. Note, however, that the accelerator does not manage the mirroring or the copying of data to additional devices or storage environments.
    Figure 3. Definition of multiple storage environments in a multi-node setup
    The graph is a schematic overview that shows how to define multiple storage environments in a multi-node setup.

    The "storage_environments" section combines the "primary_storage" and "storage_maps" sections found in configuration files of earlier releases. During the first-time deployment, these devices are formatted, which means that the existing data on these devices is erased.

    Migration from an older release:

    If your current JSON configuration file still shows the "primary_storage" and "storage_maps" keywords, you need to update your storage environment configuration. In Table 2, you find an example of an old configuration block and a corresponding new configuration block for a setup with two storage environments. Use this table as a reference to make the required changes.

    Table 2. Old and new storage environment configurations
    Old configuration (version 7.5.2 and lower):
    "primary_storage": [
      {
        "head": {
          "boot_device": {
            "type": "dasd",
            "device": "0.0.9b11"
          },
          "runtime_devices": {
            "type": "dasd",
            "devices": [
              "0.0.9b12",
              "0.0.9b13"
            ]
          },
          "data_devices": {
            "type": "dasd",
            "devices": [
              [ "0.0.9c00", "0.0.9cff" ]
            ]
          }
        },
        "data1": {
          "boot_device": {
            "type": "dasd",
            "device": "0.0.9b11"
          },
          "runtime_devices": {
            "type": "dasd",
            "devices": [
              "0.0.9b15",
              "0.0.9b16"
            ]
          },
          "data_devices": {
            "type": "dasd",
            "devices": [
              [ "0.0.9e00", "0.0.9eff" ]
            ]
          }
        },
        "data2": ...
    
        "data3": ...
    
        "data4": ...
    
        "data5": ...
    
      ]
    
    "storage_environments": [
      {
        "head": {
          "boot_device": {
            "type": "dasd",
            "device": "0.0.9b11"
          },
          "runtime_devices": {
            "type": "dasd",
            "devices": [
              "0.0.9b12",
              "0.0.9b13"
            ]
          },
          "data_devices": {
            "type": "dasd",
            "devices": [
              [ "0.0.9c00", "0.0.9cff" ]
            ]
          }
        },
        "data1": {
          "boot_device": {
            "type": "dasd",
            "device": "0.0.9b11"
          },
          "runtime_devices": {
            "type": "dasd",
            "devices": [
              "0.0.9b15",
              "0.0.9b16"
            ]
          },
          "data_devices": {
            "type": "dasd",
            "devices": [
              [ "0.0.9e00", "0.0.9eff" ]
            ]
          }
        },
        "data2": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        },
        "data3": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        },
        "data4": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        },
        "data5": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        }
      },
      
    Table 2. Old and new storage environment configurations (continued)
    Old configuration (version 7.5.2 and lower):
    
    
      "storage_maps": [
        {
          "boot_device": "0.0.9b11", "0.0.9b14", ...
          "map": [
            {
              "primary": "0.0.9b12",
              "copy":    "0.0.1b25"        },
            {
              "primary": "0.0.9b13",
              "copy":    "0.0.1b26"
            },
            {
              "primary": ["0.0.9c00","0.0.9cff"],
              "copy":    ["0.0.1d00","0.0.1d0f"]
            },
            {
              "primary": "0.0.9b15",
              "copy":    "0.0.9b27"
            },
            {
              "primary": "0.0.9b16",
              "copy":    "0.0.9b28"
            },
            {
              "primary": ["0.0.9e00","0.0.9eff"],
              "copy":    ["0.0.1f00","0.0.1f0f"]
            },
    
            .
            .
            .
    
          ]
        }
      ]
    
    
    New configuration (version 7.5.3 and higher):
      {
        "head": {
          "boot_device": {
            "type": "dasd",
            "device": "0.0.9b14"
          },
          "runtime_devices": {
            "type": "dasd",
            "devices": [
              "0.0.9b25",
              "0.0.9b26"
            ]
          },
          "data_devices": {
            "type": "dasd",
            "devices": [
              [ "0.0.1d00", "0.0.1d0f" ]
            ]
          }
        },
        "data1": {
          "boot_device": {
            "type": "dasd",
            "device": "0.0.9b14"
          },
          "runtime_devices": {
            "type": "dasd",
            "devices": [
              "0.0.9b27",
              "0.0.9b28"
            ]
          },
          "data_devices": {
            "type": "dasd",
            "devices": [
              [ "0.0.1f00", "0.0.1f0f" ]
            ]
          }
        },
        "data2": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        },
        "data3": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        },
        "data4": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        },
        "data5": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        }
      }
    ]
    

    An accelerator can use up to four types of storage: the boot device, the runtime data pool, the data pool for operative data, and, optionally, a transient pool for temporary data. Each storage device or storage pool can use DASD (ECKD) or ZFCP (SCSI) devices. The mixing of different device types is not supported. However, the size of individual devices in a pool is not restricted.

    Important: You can add devices to a storage pool at any time while the accelerator is online. New devices are automatically integrated into a storage pool. This does not require a reset or a restart. However, you cannot remove devices from a storage pool in this way. A removal requires an entirely new installation.
    "head" (required)
    This section lists the storage devices for the head node.

    Each node uses three categories of storage: the boot device, the runtime data pool, and the data pool for operative data. You must define these storage devices by using the following attributes in the configuration file:

    "boot_device" (required)
    A boot device contains the software image that is written by the Secure Service Container (SSC) installer. The cluster will be started from the boot devices of its nodes. The boot devices are also the target devices for uploading the SSC installer image before the initial deployment or before an update. A boot device must be a single device with at least 40 GB net storage capacity.

    A boot device uniquely identifies a storage environment. The storage environment (definition) that lists the currently active boot device of a node will be used. A node without a valid storage environment is invalid.

    Important: It is not possible to change the boot device during an update.
    "runtime_devices" (required)
    A runtime device is used by the Docker container that runs the accelerator software on each node. It does not contain user data and its size is fixed. The size does not depend on the amount of user data processed by the accelerator. Specify a list of devices with a total net capacity of at least 80 GB for each node. During normal operation, the utilization rate of a single device should not exceed 80 percent. If it does exceed 80 percent most of the time, consider adding devices.
    "data_devices" (required)
    A data device is used to store user data and the temporary data of the accelerator (table data). It is typically the largest storage area of an individual node. Its size is determined by the amount of data that the node has to handle. During normal operation, the utilization rate of a single device should not exceed 80 percent. If it does exceed 80 percent most of the time, consider adding devices.
    "transient_devices" (optional)
    You can use a transient data pool for the storage of temporary data. If you use a transient pool, temporary data does not occupy disk space in your data pool. When in use, a transient storage pool behaves in the same way as transient_storage, except that NVMe drives deliver better performance than a transient pool on disks. Nevertheless, separating temporary data from the rest of the data pool leads to a significant performance gain compared with keeping all this data in a single data pool.

    A transient storage pool is recommended if your accelerator is not deployed on a LinuxONE computer (where you could use NVMe drives), but on a classic IBM Z computer.

    For example:

    "transient_devices": {
          "type": "dasd",
          "devices": [
            "0.0.9b12"
          ]
        }
    Important:
    • In a multi-node environment, you must define a transient data pool on all the LPARs that are used by your data pool. You must also assign the same amount of storage to the transient pool on each LPAR.
    • A transient storage pool and transient storage with NVMe drives cannot be used at the same time. A configured transient storage pool is therefore ignored if transient storage with NVMe drives exists.
    • The existence of transient storage (pool or NVMe drives) invalidates any custom setting of the "temp_working_space" parameter because this parameter applies to temporary data in the data pool only. As soon as temporary data is processed on transient storage, the "temp_working_space" parameter becomes ineffective.
    • In a Geographically Dispersed Parallel Sysplex® (GDPS), you can have different transient storage configurations for the different sites. For example, it is possible to use internal NVMe transient storage with 150 TB capacity on the primary site, NVMe transient storage with 10 TB capacity on the secondary site, and temporary working space of the data pool on the third site.

    For an in-depth discussion of the transient storage options, see Configuring and using transient storage for IBM Db2 Analytics Accelerator for z/OS on Z.

    "type" (required)
    This is the type of storage to be used (disk type). Possible values are "dasd" for extended count key data (ECKD) volumes and "zfcp" for Small Computer System Interface (SCSI) volumes. You must specify the type for each device category (that is, the boot device, the runtime device, and the data device).
    Important:
    • It is not possible to mix ECKD and SCSI devices in a single device category or device pool.
    • If you use DASD (ECKD) storage, HyperPAV aliases are strongly recommended because they increase the processing speed.
      Note: The "PAV" in HyperPAV stands for Parallel Access Volumes. It is a concept of using multiple devices or aliases to address a single DASD (ECKD) disk device.
    • DASD (ECKD) devices are formatted by the dasdfmt program of the Linux operating system on the accelerator. This can take a long time, sometimes even hours for large devices or storage pools. HyperPAV aliases also help speed up formatting. Therefore, define HyperPAV aliases also in your initial JSON configuration file.
    • You can use DASD (ECKD) devices of different sizes in a single pool.
    • You can use ZFCP devices of different sizes in a single pool.
    • For ZFCP devices, use FICON Express (FE) ports because these ports provide much better performance and availability. FE ports are defined in the runtime environment.
    • Although the adding of devices is supported while the accelerator is online, the type cannot be changed after the initialization of the storage pool. To change the type, you must remove the accelerator and reinstall it.
    Example (ECKD or "dasd"):
    "type": "dasd",
    "devices": [
      "0.0.9c00",
      "0.0.9c01",
      "0.0.9c02"
    ]
    
    Example (SCSI or "zfcp"):
    "type": "zfcp",
    "udids": [
      "0c984712545423523614b8d812345632",
      "0c0a0b5c9d15555a545545b46456c4d6"
    ]
    
    "device" or "devices" (required)
    This attribute is used to list the devices by their names or identifiers. You must specify a device or a list of devices for each device category (that is, the boot device, the runtime device, and the data device).
    Example:
    "storage_environments": [
      {
        "head": {
          "boot_device": {
            "type": "dasd",
            "device": "0.0.9b11"
          },
          "runtime_devices": {
            "type": "dasd",
            "devices": [
              "0.0.9b12",
              "0.0.9b13"
            ]
          },
          "data_devices": {
            "type": "dasd",
            "devices": [
              [ "0.0.9c00", "0.0.9cff" ]
            ]
          }
        },
    
    Note: To change the ID of a boot device, a few extra steps are required:
    1. Copy the current storage environment in your JSON file. That is, create a duplicate block in the file.
    2. Change the ID in the copied block.
    3. Restart the accelerator.
    4. Remove the old storage environment from the JSON file when the accelerator is online again.
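    The following sketch illustrates steps 1 and 2 in shortened form; the new boot device ID 0.0.9b21 is a placeholder, and the remaining definitions are elided. While the accelerator restarts, both environments are listed; after the restart, remove the block that contains the old ID.
    "storage_environments": [
      {
        "head": {
          "boot_device": {
            "type": "dasd",
            "device": "0.0.9b11"
          },
          "runtime_devices": ...,
          "data_devices": ...
        },
        "data1": ...
      },
      {
        "head": {
          "boot_device": {
            "type": "dasd",
            "device": "0.0.9b21"
          },
          "runtime_devices": ...,
          "data_devices": ...
        },
        "data1": ...
      }
    ]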
    "data1" ... "data5"
    Your data nodes. Specify a "boot_device", "runtime_devices", and "data_devices" for each of your data nodes in the same way as for the head node in the "storage_environments" block.
    Definition of a single storage environment (partly shortened):
    "storage_environments": [
      {
        "head": {
          "boot_device": {
            "type": "dasd",
            "device": "0.0.9b11"
          },
          "runtime_devices": {
            "type": "dasd",
            "devices": [
              "0.0.9b12",
              "0.0.9b13"
            ]
          },
          "data_devices": {
            "type": "dasd",
            "devices": [
              [ "0.0.9c00", "0.0.9cff" ]
            ]
          }
        },
        "data1": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        },
        "data2": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        },
        "data3": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        },
        "data4": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        },
        "data5": {
          "boot_device": ...,
          "data_devices": ...,
          "runtime_devices": ...
        }
      }
    ]
    

    In this example, the system might start from an LPAR with the boot device 0.0.9b11, which has the runtime devices 0.0.9b12 and 0.0.9b13, plus the data device ["0.0.9c00", "0.0.9cff"], which is an array or enclosure consisting of two disks. What follows is the definition of the storage devices for the data nodes (data1 through data5). If more than one storage environment were defined, you could use one of the environments for failover purposes. However, in this case, you would have to replicate the disk content.

    Attention: If you use GDPS integration, GDPS will manage the storage replication. In this case, do not define more than one storage environment. Multiple storage environments are not supported in the GDPS context.

    HyperPAV aliases:

    HyperPAV aliases can be used in the following ways:

    • Automatic HyperPAV aliases
    • Explicitly listed HyperPAV aliases

    In automatic HyperPAV mode, all HyperPAV alias devices that are visible to an LPAR and that are connected to the same control-unit image (LCU) are used automatically for that LPAR. To enable the automatic HyperPAV mode, you must add a definition to the storage environments section in the JSON configuration file.

    Important:
    • HyperPAV aliases can be used with DASD (ECKD) storage only.
    • Make sure that only the volumes and HyperPAV aliases you want to use on a particular LPAR are visible to that LPAR. This is even more important if you use automatic alias devices because in that case, your accelerator (SSC) LPAR has to sift through all visible devices just to determine and activate the alias devices.
    • The use of HyperPAV aliases requires a change in the input/output definition file (IODF). See Input/output definition file (IODF) for more information.
    • After adding HyperPAV devices to an existing configuration, you must shut down and restart the affected accelerator (SSC) LPAR. For more information, see Shutting down and restarting a cluster (multi-node setup).
    Example:
    "storage_environments": [
      {
        "boot_device": {
              .
              .
        },
        "runtime_devices": {
              .
              .
        },
        "data_devices": {
              .
              .
        },
        "hyperpav": "auto"
      }
    ]

    Mind that in "auto" mode, the system uses all available HyperPAV alias devices.

    To use just a subset of the available devices, it is preferable to use explicitly listed HyperPAV aliases, as in the following example:

    "hyperpav": [
      [
        "0.1.4000",
        "0.1.4007"
      ],
      "0.1.1234"
    ]

    In this particular case, the system uses a range of HyperPAV aliases from 0.1.4000 to 0.1.4007 plus a single HyperPAV alias with the ID 0.1.1234.

  6. When you have finished your configuration file, upload it to the Admin UI.
    On the Accelerator Configuration Definition page, click the upload button (see also Figure 1):
    The upload button on the Accelerator Configuration Definition page
    If something is wrong with the file you uploaded, an error message is displayed on the page:
    Figure 4. Error message after uploading a faulty configuration file
    An error message is displayed on top of the page if something is wrong with the uploaded file.
  7. If errors occurred, fix these and repeat the upload (steps 5 and 6).
    If no errors occurred, the Accelerator Configuration Definition page shows the settings of your configuration file in expandable sections. You can expand each section to display its settings by clicking the downward-pointing arrows.
    Figure 5. Accelerator Configuration Definition after a successful configuration file upload
    Accelerator Configuration Definition page after a successful configuration file upload. Settings in the file are represented by a folder structure. The folders have been expanded to show the settings.
  8. Click Node Credentials.
  9. In the Multi-Node Credentials window, enter the IP addresses of the accelerator's data-node LPARs. Also provide the user ID and the password of each accelerator (SSC) LPAR, as defined by the HMC. This is required during the initial installation or during an upgrade of the accelerator.
    The HMC credentials are needed to deploy the images on the LPARs. When an initial installation or upgrade has been completed, the HMC credentials are no longer required. The HMC credentials are not stored or otherwise saved by the accelerator. Therefore, a change of the user IDs or passwords on the HMC has no impact on running accelerators.
    See Figure 6.
    Figure 6. The Multiple Node Deployment Credentials window
    The Multiple Node Deployment Credentials window, which opens after you click Node Credentials.
  10. Click Validate.
    Check the validation results. The Valid column on the very right of the window must show green dots for each node. Here is an example of a successful validation:
    Figure 7. The Multiple Node Deployment Credentials window after a successful validation
    The window shows green icons indicating that the validation has been successful. It also shows an Apply button instead of the Validate button.
  11. If the credentials of all data nodes are valid (indicated by green icons), you can click Apply.
    A Status window is displayed, which indicates the progress:
    Status indicator window, which comes up after clicking the Apply button

Results

When these processes have finished, the Accelerator Components Health Status page is displayed automatically. The page should now give you the following information:
Figure 8. The Accelerator Components Health Status page is displayed after a successful configuration
This image shows the information on the Accelerator Components Health Status page when the appliance has finally been started. You see a table that shows the status of all components: appliance infrastructure, appliance runtime, appliance authentication service, appliance data service, and Db2 accelerator service. The rows that deal with the appliance infrastructure, appliance runtime, and appliance data service show the status of the head node and all data nodes. The status of the appliance authentication service and the Db2 accelerator service is shown for the head node only. The status of all these components is green. On the right, you find buttons to reset, update, and shut down the appliance.

The message Accelerator status: ready on the top right indicates that all installation steps have been completed and that components have been started for Db2 Analytics Accelerator on Z.

1 RoCE: Remote Direct Memory Access over Converged Ethernet