May 4, 2023 By Augie Mena 5 min read

Summarizing key functional and performance updates in the latest releases of the IBM Spectrum LSF offering on IBM Cloud.

The IBM Spectrum LSF on IBM Cloud offering lets customers deploy a cluster of compute nodes on IBM Cloud and run their High-Performance Computing (HPC) workloads there using the IBM Spectrum LSF scheduling software. The offering was initially released in 2021. In this blog post, we summarize the features that we have added in recent releases.

Intel oneAPI HPC Toolkit

The Message Passing Interface (MPI) is provided as a software library for communication between processes that are located either on the same virtual machine or on different virtual machines. MPI has been widely adopted by the HPC community, and it plays a key role in achieving scalable performance across many nodes or VMs in a cluster. HPC users can choose from several MPI implementations, including Intel MPI and open-source versions, such as OpenMPI and MPICH.

In a recent release of the Spectrum LSF offering, we added support for the Intel MPI library, part of the Intel oneAPI HPC Toolkit, as an alternative to OpenMPI. To evaluate the performance of the library, we chose SNAP, a communication-intensive HPC application. SNAP is commonly used by the U.S. Department of Energy labs (Livermore, Los Alamos, and Sandia) to evaluate commodity HPC clusters. We used the version of SNAP that can be found here.

A white paper, available here, describes the LSF cluster environment on IBM Cloud, how the SNAP benchmark was executed, and our observations and analysis of the results. The following summary of those results demonstrates good scalability of the benchmark in an LSF cluster of up to 63 compute nodes.

SNAP output includes multiple metrics, but the primary one is the Figure of Merit (FOM), which is based on the solve time for a particular problem, the number of iterations, and the total number of unknowns (parts of the problem to solve). The FOM is a direct indicator of the performance of the system. If you solve the same problem in half the time, the FOM doubles, and if you solve a problem with twice as many unknowns in the same time per iteration, the FOM also doubles. We studied weak scaling, where the size of the global domain increases linearly with the number of MPI ranks. For ideal scaling, the solve time per iteration would remain constant and the FOM would increase linearly with the number of MPI ranks.
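The two doubling rules above can be checked with a small sketch. It assumes the FOM is simply proportional to unknowns times iterations divided by solve time; SNAP's exact formula and units may differ, and the function name and input values here are hypothetical, chosen only for illustration.

```python
def figure_of_merit(unknowns: int, iterations: int, solve_time_s: float) -> float:
    """Assumed proportional form of the FOM: work completed per second.

    This is an illustrative model, not SNAP's actual definition.
    """
    if solve_time_s <= 0:
        raise ValueError("solve_time_s must be positive")
    return unknowns * iterations / solve_time_s


# Hypothetical inputs, not measurements from the benchmark.
base = figure_of_merit(unknowns=1_000_000, iterations=10, solve_time_s=20.0)

# Same problem solved in half the time.
half_time = figure_of_merit(unknowns=1_000_000, iterations=10, solve_time_s=10.0)

# Twice as many unknowns in the same time per iteration.
double_unknowns = figure_of_merit(unknowns=2_000_000, iterations=10, solve_time_s=20.0)

print(half_time / base)        # 2.0
print(double_unknowns / base)  # 2.0
```

Under this assumed form, both changes double the FOM, which matches the behavior described in the text.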

A summary of the performance measurements is provided in Table 1, which shows the solve time and the Figure of Merit for SNAP, scaling from 8 cores (1 compute node) to 504 cores (63 compute nodes) on IBM Cloud. When using a single compute node, all communication is through shared memory, which is very efficient. As one scales out to an increasing number of compute nodes, more of the communication is over the Ethernet interface using TCP, and a larger fraction of the solve time is spent on MPI communication, resulting in somewhat lower performance per node. However, the aggregate performance, as indicated by the Figure of Merit, continues to improve significantly over the full range of compute nodes available in our LSF cluster and stays close to linear scaling, as can be seen in Figure 2.

Table 1. Performance results for SNAP on IBM Cloud using a local domain with dimensions 640x4x4 grid cells.

Figure 2. Scaling curve for SNAP on IBM Cloud using a local domain with dimensions 640x4x4 grid cells. The dashed lines indicate perfect linear scaling relative to measured values at 64 cores or 256 cores.
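A perfect-linear-scaling reference line of the kind described in the caption can be computed from one measured baseline point. The sketch below shows that calculation and the resulting scaling efficiency. The function names are hypothetical, and the FOM values are made-up placeholders, not figures from Table 1.

```python
def ideal_fom(baseline_cores: int, baseline_fom: float, cores: int) -> float:
    """FOM on the perfect-linear-scaling line through the baseline point."""
    return baseline_fom * cores / baseline_cores


def scaling_efficiency(baseline_cores: int, baseline_fom: float,
                       cores: int, measured_fom: float) -> float:
    """Measured FOM as a fraction of the ideal FOM at the same core count."""
    return measured_fom / ideal_fom(baseline_cores, baseline_fom, cores)


# Placeholder values for illustration only (NOT from the benchmark results).
baseline_cores, baseline_fom = 64, 100.0
cores, measured_fom = 504, 700.0

print(ideal_fom(baseline_cores, baseline_fom, cores))  # 787.5
eff = scaling_efficiency(baseline_cores, baseline_fom, cores, measured_fom)
print(f"{eff:.3f}")  # 0.889
```

An efficiency close to 1.0 means the measured point lies near the dashed line; choosing a different baseline (for example, 256 cores instead of 64) shifts the reference line and therefore the efficiency.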

LSF Application Center

IBM Spectrum LSF Application Center provides a flexible and easy-to-use interface for cluster users and administrators. It lets users work through intuitive, self-documenting interfaces, and it is now available as an option in our LSF offering.

The LSF Application Center web-based UI provides the ability to easily do the following:

  • Create and manage cluster users and access permissions.
  • Select the types of notifications and alerts to receive about jobs.
  • Submit, monitor and control jobs.
  • Monitor usage of compute nodes in the cluster.

Screenshots of the LSF Application Center login screen as well as views of some of the capabilities mentioned above are included here:

Figure 3. LSF Application Center login screen.

Figure 4. LSF user role and permission management.

Figure 5. LSF job status notification options.

Figure 6. LSF job resource requirement and other options.

Figure 7. LSF job status monitoring.

For more detailed information on LSF Application Center and its usage, see the official documentation here.

Custom image creation

The IBM Spectrum LSF offering includes two default custom images that are used to provision the VSIs for the clusters:

  • LSF worker and management nodes
  • Storage nodes (if Spectrum Scale storage is selected)

However, users can supply their own custom images that may include, for instance, additional software required by their HPC applications.

In a recent release, we added scripts and documentation that make it simple for users to create their own custom images. The scripts use Packer, a popular tool for automating the creation of virtual machine images.

Summary of new Spectrum LSF on IBM Cloud features

Since the initial release of Spectrum LSF on IBM Cloud, we have continued to improve its usability by introducing new functional and performance-related features. In this blog post, we described a few of the features that have been added in recent releases:

  • Inclusion of the Intel oneAPI HPC toolkit for use by applications running on the cluster worker nodes
  • An option to deploy LSF Application Center within the cluster and provide an easy-to-use interface for cluster user administration and job submission and monitoring
  • Scripts and documentation that simplify the process for custom image creation

To evaluate whether your HPC applications could benefit from the offering, see the IBM Spectrum LSF on IBM Cloud documentation.
