Portable hardware locality

Portable Hardware Locality (hwloc) is an open source software package that is distributed under the BSD license. It provides a portable abstraction (across operating systems, versions, architectures, and so on) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores, and simultaneous multithreading (SMT). hwloc is integrated into LSF to detect hardware information, and it supports most of the platforms that LSF supports.

Functionality

The hwloc package gathers system attributes such as cache and memory information, as well as the locality of I/O devices such as network interfaces. Its primary aim is to help applications gather information about computing hardware.

hwloc also detects the hardware topology of each host when the LIM starts and whenever the host topology changes. The management host LIM detects the topology of the management host. Each server host LIM detects the topology of its local host, and sends that topology information to the management host LIM when it joins the cluster and when it reports host configuration. Host topology information is updated as soon as the hardware topology changes. The hardware topology changes if any NUMA memory node, cache, socket, core, PU, and so on, changes. The topology information can therefore change even when the number of cores stays the same.
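The point that a topology can change while the core count stays the same can be illustrated with a small sketch. This is not how the LIM compares topologies internally; the tuple-based node layout and the helper names (node, count, topology_changed) are assumptions made only for this illustration.

```python
# Hypothetical representation: each node is (type, id, children).
def node(kind, ident, *children):
    return (kind, ident, tuple(children))


def count(tree, kind):
    """Count nodes of the given type anywhere in the tree."""
    node_kind, _, children = tree
    own = 1 if node_kind == kind else 0
    return own + sum(count(child, kind) for child in children)


def topology_changed(old, new):
    """Treat any structural difference as a change, not only a core count change."""
    return old != new


# Four cores under a single NUMA node ...
before = node("host", "hostA",
              node("numa", 0, *[node("core", i) for i in range(4)]))

# ... versus the same four cores split across two NUMA nodes.
after = node("host", "hostA",
             node("numa", 0, node("core", 0), node("core", 1)),
             node("numa", 1, node("core", 2), node("core", 3)))

same_core_count = count(before, "core") == count(after, "core")
changed = topology_changed(before, after)
```

Here the core count is identical in both snapshots, yet the comparison reports a change because the NUMA layout differs.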

Use the lim -T and lshosts -T commands to display host topology information. The lim -t command displays the total number of NUMA nodes, total number of processors, total number of cores, and total number of threads.

Structure of topology

A NUMA node contains sockets. Each socket contains cores (processes), which contain threads. Some AMD CPUs have the opposite structure, where socket nodes contain NUMA nodes. If there is no hwloc library, LSF uses the PCT logic. The topology hierarchy resembles a tree. Therefore, the host topology information (NUMA memory nodes, caches, sockets, cores, PUs, and so on) from hwloc is organized as a tree. Each tree node has a type, which is one of host, NUMA, socket, cache, core, or pu. Each tree node also has its own attributes.
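As a rough illustration of a typed tree with per-node attributes, the following sketch models the node types listed above. The TopoNode class, its field names, and the attribute keys are hypothetical; they are not LSF or hwloc data structures.

```python
from dataclasses import dataclass, field

# Node types named in the text above.
NODE_TYPES = ("host", "numa", "socket", "cache", "core", "pu")


@dataclass
class TopoNode:
    """Hypothetical tree node: a type, free-form attributes, and children."""
    type: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def __post_init__(self):
        if self.type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {self.type}")

    def add(self, child):
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self):
        """Yield this node and all descendants in depth-first order."""
        yield self
        for child in self.children:
            yield from child.walk()


# Build one branch: host -> socket -> NUMA -> core -> two PUs.
host = TopoNode("host", {"name": "hostA", "memory": "64G"})
socket = host.add(TopoNode("socket", {"id": 0}))
numa = socket.add(TopoNode("numa", {"id": 0, "memory": "32G"}))
core = numa.add(TopoNode("core", {"id": 0}))
core.add(TopoNode("pu", {"id": 0}))
core.add(TopoNode("pu", {"id": 16}))

pu_ids = [n.attrs["id"] for n in host.walk() if n.type == "pu"]
```

Because each node only holds a list of children, the same class can represent either nesting order (NUMA inside socket, or socket inside NUMA).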

In the following example, hostA has 64 GB of memory and two NUMA nodes. Each socket node has one NUMA node, eight cores, 16 PUs (two PUs per core), and 32 GB of memory. Both the NUMA nodes and the PUs are numbered in the sequence that is provided by the system. LSF displays NUMA information based on the level that it detects from the system. The output is displayed as a tree, and the NUMA information is displayed as NUMA[ID: memory]. The PUs are displayed as parent_node(ID ID ...), where parent_node can be host, NUMA, socket, or core.

In the following example, NUMA[0: 32G] means that the NUMA node ID is 0 and that the node has 32 GB of memory. core0(0 16) means that there are two PUs under the parent core node, and that the IDs of the two PUs are 0 and 16.
Host[64G] hostA 
Socket0
  NUMA[0: 32G]
   core0(0 16)
   core1(1 17)
   core2(2 18)
   core3(3 19)
   core4(4 20)
   core5(5 21)
   core6(6 22)
   core7(7 23)
Socket1
  NUMA[1: 32G]
   core8(8 24)
   core9(9 25)
   core10(10 26)
   core11(11 27)
   core12(12 28)
   core13(13 29)
   core14(14 30)
   core15(15 31)
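The display format described above is regular enough to read back programmatically. The following is a minimal sketch that parses a shortened copy of the hostA output into a dictionary; the parse_topology function and its return layout are assumptions for illustration, and the sketch only handles NUMA[ID: memory] and parent_node(ID ID ...) lines that appear under a NUMA node.

```python
import re

# Shortened copy of the hostA example output.
SAMPLE = """\
Host[64G] hostA
Socket0
  NUMA[0: 32G]
   core0(0 16)
   core1(1 17)
Socket1
  NUMA[1: 32G]
   core8(8 24)
   core9(9 25)
"""

NUMA_RE = re.compile(r"^NUMA\[(\d+):\s*(\S+)\]$")  # NUMA[ID: memory]
PUS_RE = re.compile(r"^(\w+)\(([\d ]+)\)$")         # parent_node(ID ID ...)


def parse_topology(text):
    """Return {numa_id: {"memory": str, "cores": {name: [pu_ids]}}}."""
    numas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = NUMA_RE.match(line)
        if match:
            current = numas.setdefault(
                int(match.group(1)),
                {"memory": match.group(2), "cores": {}},
            )
            continue
        match = PUS_RE.match(line)
        if match and current is not None:
            pu_ids = [int(pu) for pu in match.group(2).split()]
            current["cores"][match.group(1)] = pu_ids
    return numas


topo = parse_topology(SAMPLE)
```

Lines that match neither pattern, such as the Host and Socket lines, are skipped in this sketch.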

Some CPUs, especially older ones, do not report complete hardware topology: information about the NUMA, socket, or core level might be missing. For these CPUs, LSF displays only the levels that it can detect.

For example,

  • hostB (with one Intel Pentium 4 CPU) has 2 GB of memory, one socket, one core, and two PUs per core. Information on hostB is displayed as follows:
    Host[2G] hostB 
    Socket
          core(0 1)
  • hostC (with one Intel Itanium CPU) has 4 GB of memory, and two PUs. Information on hostC is displayed as follows:
    Host[4G] (0 1) hostC

Some platforms or operating system versions report only a subset of the topology information.

For example, hostD has the same CPU as hostB, but hostD is running Red Hat Linux 4, which does not supply core information. Therefore, information on hostD is displayed as follows:
Host[1009M] hostD
Socket (0 1)

Dynamically load the hwloc library

You can configure LSF to dynamically load the hwloc library from the system library paths to detect newer hardware. This allows you to use the latest supported version of hwloc (2.11.1) with the LSF integration at any time, provided that there are no compatibility issues between that version of the hwloc library and the hwloc header file. If LSF fails to load the library, LSF falls back to the hwloc functions in the static library.

To enable dynamic loading of the hwloc library, enable the LSF_HWLOC_DYNAMIC parameter in the lsf.conf file.
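The load-then-fall-back behavior can be pictured with the following sketch, which looks for a shared hwloc library on the system library paths and falls back when none can be loaded. It is only an analogy written with Python's ctypes module, not LSF's implementation: the helper names and the "built-in fallback" label are hypothetical, while hwloc_get_api_version is a function exported by the hwloc library.

```python
import ctypes
import ctypes.util


def load_hwloc():
    """Try to load a shared hwloc library from the system library paths.

    Returns the ctypes handle, or None if no usable library is found.
    """
    name = ctypes.util.find_library("hwloc")
    if name is None:
        return None
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def hwloc_api_version(lib):
    """Return the API version reported by the loaded library, or None."""
    try:
        fn = lib.hwloc_get_api_version
    except AttributeError:
        return None
    fn.restype = ctypes.c_uint
    fn.argtypes = []
    return fn()


lib = load_hwloc()
if lib is not None:
    source = "dynamic"
    version = hwloc_api_version(lib)
else:
    # Stands in for the static library path described above.
    source = "built-in fallback"
    version = None
```

On a host without a shared hwloc library, source is "built-in fallback" and version is None; otherwise version holds the integer that the loaded library reports.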