IBM Support

IBM Spectrum Scale: GPUDirect Storage (GDS) is available as a Technical Preview feature in Spectrum Scale 5.1.1.

Flashes (Alerts)


Abstract

GPUDirect Storage (GDS) is available as a Technical Preview feature in the Spectrum Scale 5.1.1 release.

Content

This web page describes the use of GPUDirect Storage (GDS) with Spectrum Scale version 5.1.1. In this release, GDS is provided as a Technical Preview only. The succeeding version, Spectrum Scale 5.1.2, was released in October 2021 and includes the supported version of GPUDirect Storage. See https://www.ibm.com/docs/en/spectrum-scale/5.1.2?topic=architecture-gpudirect-storage-support-spectrum-scale for details (supported functionality, software versions, limitations, and so on). IBM Spectrum Scale 5.1.2 is the recommended version for GPUDirect Storage. Please upgrade to this level if you are using GPUDirect Storage.
Before using this feature, one should be familiar with the NVIDIA GDS online documentation: docs.nvidia.com/gpudirect-storage
The NVIDIA GDS driver that supports Spectrum Scale 5.1.1 can be downloaded from partners.nvidia.com
The remainder of this document lists the requirements and restrictions for the Spectrum Scale 5.1.1 GDS Technical Preview and is intended as an addendum to the NVIDIA GDS documentation.
Requirements:
  • The storage must not be locally attached to the GDS clients, because GDS requires the NSD path for storage access. ESS and non-ESS storage servers are supported. The storage servers may be in the same cluster as the GDS clients or in a different cluster.
  • GDS requires Mellanox RDMA over InfiniBand between GDS clients and storage servers (RoCE is not supported).
  • Hardware:
    • GDS clients: x86 with a GPU model that supports GDS (refer to the NVIDIA GDS documentation)
    • Network: EDR or HDR InfiniBand
    • InfiniBand adapter: Mellanox CX4, CX5 or CX6 (CX4 firmware must be 12.27.4000 or higher)
  • Software:
    • MOFED:
      • GDS clients: MOFED 5.2.1.0.4.0
      • Storage Servers:
        • ESS storage servers: reinstall MOFED with the "--upstream-libs" option (use the MOFED version that ships with the ESS)
          • "ofed_uninstall.sh --force"
          • mount MOFED iso found in "/install/ess/sync/rhels8/x86_64/mofed" on IO servers
          • "mlnxofedinstall --add-kernel-support --disable-kmp --without-fw-update --upstream-libs"
        • non-ESS storage servers:  install MOFED 5.2.1.0.4.0
    • Spectrum Scale 5.1.1 PTF 1 (GDS clients and storage servers)
    • NVIDIA (GDS clients)
      • CUDA: 11.0
      • GDS driver: refer to partners.nvidia.com
    • OS (GDS clients): RHEL 8.3, Ubuntu 20.04
  • Before starting the Spectrum Scale daemons, set the verbs configuration as follows using "mmchconfig" on the GDS clients and storage servers:
    • verbsRdma enable 
    • verbsRdmaSend yes
    • verbsPorts (see "Additional Considerations" section below)
    • verbsRdmaCm disable
  • Before building the Spectrum Scale Linux portability layer on the GDS clients, remove the comment delimiters around "#define GPU_DIRECT_STORAGE" in "/usr/lpp/mmfs/src/gpl-linux/verdep.h"
     Before: /* #define GPU_DIRECT_STORAGE */
     After:         #define GPU_DIRECT_STORAGE
  • Set "mmchconfig IgnoreNonDioInstCount=yes" before starting the Spectrum Scale daemons.
  • In the NVIDIA GDS config file cufile.json on the GDS clients, set allow_compat_mode to false.
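
The verdep.h change described in the requirements above can be scripted. The following is a minimal Python sketch, not an IBM-provided tool: the function name enable_gds_define is hypothetical, and the code assumes that the define sits on a single line wrapped in a C block comment, as in the Before/After example.

```python
import re

# Matches "/* #define GPU_DIRECT_STORAGE */" on one line, keeping any indentation.
# Assumption: the commented-out define is not split across several lines.
_COMMENTED_DEFINE = re.compile(
    r"^([ \t]*)/\*[ \t]*(#define[ \t]+GPU_DIRECT_STORAGE)[ \t]*\*/[ \t]*$",
    re.MULTILINE,
)


def enable_gds_define(source: str) -> str:
    """Return the header text with the GPU_DIRECT_STORAGE define uncommented.

    Text that already has the define enabled is returned unchanged, so the
    function can be applied more than once without harm.
    """
    return _COMMENTED_DEFINE.sub(r"\1\2", source)


if __name__ == "__main__":
    sample = "/* #define GPU_DIRECT_STORAGE */\n"
    print(enable_gds_define(sample), end="")
```

To apply it, read "/usr/lpp/mmfs/src/gpl-linux/verdep.h", pass the text through enable_gds_define, write the result back (keeping a backup copy), and then build the portability layer as usual.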

Restrictions:
  • Spectrum Scale does not support GDS write (cuFileWrite).
  • Spectrum Scale does not support GDS on files less than 4096 bytes in length.
  • Spectrum Scale does not support GDS on sparse files or files with pre-allocated storage (for example, files prepared with fallocate() or gpfs_prealloc()).
  • Spectrum Scale does not support GDS on files that are encrypted.
  • Spectrum Scale does not support GDS on memory-mapped files.
  • Spectrum Scale does not support GDS on files that are compressed or marked for deferred compression (for more information on compression, refer to www.ibm.com/docs/en/spectrum-scale/5.1.0?topic=reference-file-compression).
  • Spectrum Scale does not support GDS if the mmchconfig option "disableDIO" is set to "true" (the default value of "disableDIO" is "false").
The cases listed above are handled in "compatibility mode", which provides correct operation but with lower performance than GDS. In this mode, the NVIDIA GDS library issues a non-GDS Direct IO operation and then copies the data between the system memory buffer and the user application's GPU buffer.
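
Two of the restrictions above (minimum file length and sparse files) can be screened for before a workload starts. The following Python sketch is illustrative only: the function names are hypothetical, the sparse check is a heuristic based on allocated blocks, and it does not detect pre-allocated, encrypted, compressed, or memory-mapped files. A compressed file may also be reported as sparse by this heuristic, because fewer bytes are allocated than its length.

```python
import os
from typing import List

MIN_GDS_FILE_SIZE = 4096  # bytes, taken from the restriction list above


def gds_precheck(size_bytes: int, allocated_bytes: int) -> List[str]:
    """Return the reasons a file is likely to be served in compatibility mode.

    An empty list means neither of the two checked restrictions applies.
    """
    reasons = []
    if size_bytes < MIN_GDS_FILE_SIZE:
        reasons.append("file is smaller than 4096 bytes")
    if allocated_bytes < size_bytes:
        # Heuristic: fewer bytes allocated than the file length suggests holes.
        reasons.append("file appears sparse (fewer bytes allocated than its length)")
    return reasons


def gds_precheck_path(path: str) -> List[str]:
    """Run gds_precheck on a file, using os.stat for size and allocation."""
    st = os.stat(path)
    # On Linux, st_blocks is counted in 512-byte units.
    return gds_precheck(st.st_size, st.st_blocks * 512)
```

A caller might run gds_precheck_path over the input files of a job and log any file that returns a non-empty list, since reads of those files are not expected to deliver GDS performance.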
Additional Restrictions:
  • Spectrum Scale does not support GDS on files that use data tiering, including Transparent Cloud Tiering.
  • Spectrum Scale does not support the NVIDIA GDS asynchronous "poll" mode. The NVIDIA GDS lib implicitly converts a poll mode request on a file in a Spectrum Scale mount to a synchronous GDS IO request.
  • Spectrum Scale does not support GDS on files in snapshots or clones. If a GDS read is issued on a file in a snapshot or clone, -EIO is returned to the user application.
Other Considerations:
  • Reading a file with a GDS read concurrently with a buffered read will not deliver GDS performance for the GDS thread. This limitation holds whether the concurrent threads belong to the same user application or to different ones. In this context, "buffered" is defined as non-GDS, non-DirectIO.
  • The NVIDIA GDS utility "gdscheck -p" should be run prior to GDS workloads to verify the environment. Particular attention should be given to ACS and the IOMMU, as these affect GDS function and performance. Refer to the NVIDIA GDS documentation for more information.
  • GDS clients and the storage servers that serve the target GDS filesystems must have Spectrum Scale 5.1.1 PTF 1 installed.
  • GDS clients and storage servers must be configured to use one InfiniBand (IB) network for verbs and RDMA. For clusters with two IB networks, adjust the mmchconfig option "verbsPorts" to restrict verbs and RDMA to one of the two IB networks. Note that this requirement does not limit the number of IB links (connected to the one IB network) at the storage servers and GDS clients.
  • The IP over IB addresses specified in the NVIDIA GDS config file cufile.json must be consistent with the setting of verbsPorts on the GDS clients. Refer to the NVIDIA GDS documentation for more information on the NVIDIA config file.
  • Tuning GDS performance is dependent on the workload and machine architecture of the GDS client. It is generally advised to contact IBM for tuning assistance with the 5.1.1 GDS tech preview, but when using DGX-A100 machines as GDS clients, the following two settings in the NVIDIA GDS config file cufile.json are typically sufficient to achieve good performance when using the IB NICs attached directly to the CPU complex, sometimes referred to as the "storage" NICs:
    • "rdma_load_balancing_policy": "RoundRobin"
    • "rdma_access_mask": "0x1f"
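
The cufile.json settings named in this document (allow_compat_mode from the requirements, plus the two DGX-A100 settings above) can be verified with a short script. The following Python sketch rests on two assumptions that should be checked against the NVIDIA GDS documentation: that these keys live under a top-level "properties" object, and that any comments in the file have been removed before parsing, because json.loads rejects them. The names check_cufile, REQUIRED, and DGX_A100_SUGGESTED are hypothetical.

```python
import json

# Setting required by this document on all GDS clients.
REQUIRED = {"allow_compat_mode": False}

# Settings suggested by this document for DGX-A100 clients using the "storage" NICs.
DGX_A100_SUGGESTED = {
    "rdma_load_balancing_policy": "RoundRobin",
    "rdma_access_mask": "0x1f",
}


def load_cufile(text: str) -> dict:
    """Parse cufile.json text. Assumes comments were already stripped."""
    return json.loads(text)


def check_cufile(config: dict, expected: dict) -> list:
    """Return one message per expected setting that is missing or different.

    Assumption: the settings are stored under the "properties" object.
    """
    props = config.get("properties", {})
    return [
        f"{key}: expected {value!r}, found {props.get(key)!r}"
        for key, value in expected.items()
        if props.get(key) != value
    ]
```

For example, check_cufile(load_cufile(text), REQUIRED) returns an empty list when allow_compat_mode is set to false, and a list with one message otherwise.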

Known Issues:
  • Running a GDS workload concurrently with a non-GDS workload on the same client is not recommended. Mixed workloads on a client can eliminate the GDS performance benefit and, in rare cases, cause a kernel crash in GDS driver de-registration.


Document Information

Modified date:
17 November 2021

UID

ibm16444075