Portable hardware locality
Portable Hardware Locality (hwloc) is an open source software package that is distributed under the BSD license. It provides a portable abstraction (across operating systems, versions, architectures, and so on) of the hierarchical topology of modern architectures, including NUMA memory nodes, sockets, shared caches, cores, and simultaneous multithreading (SMT). hwloc is integrated into LSF to detect hardware information, and supports most of the platforms that LSF supports.
Functionality
The hwloc package gathers various system attributes, such as cache and memory information, as well as the locality of I/O devices such as network interfaces. Its primary aim is to help applications gather information about computing hardware.
In LSF, the hardware topology of each host is detected when the LIM starts and whenever the host topology information changes. The management host LIM detects the topology of the management host, and each server host LIM detects the topology of its local host. A server host LIM sends its topology information to the management host LIM when it joins the cluster or when it sends host configuration information. Host topology information is updated whenever the hardware topology changes, that is, whenever any NUMA memory node, cache, socket, core, PU, and so on, changes. Topology information can change even when the number of cores does not.
Use the lim -T and lshosts -T commands to display host topology information. The lim -t command displays the total number of NUMA nodes, total number of processors, total number of cores, and total number of threads.
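As an illustration only, the following Python sketch wraps the lshosts -T command named above so that a script can capture the topology listing. The function name show_topology is hypothetical, and the sketch assumes that lshosts is on the PATH of an LSF host; it returns None where the command is not available or fails.

```python
import shutil
import subprocess


def show_topology():
    """Return the text printed by `lshosts -T`, or None if it cannot be run.

    Hypothetical helper: it only shells out to the command named in the text.
    """
    # Outside an LSF environment the command is not installed.
    if shutil.which("lshosts") is None:
        return None
    result = subprocess.run(
        ["lshosts", "-T"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Treat a non-zero exit status as "no topology available".
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    listing = show_topology()
    print(listing if listing is not None else "lshosts -T is not available on this host")
```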
Structure of topology
A NUMA node contains sockets. Each socket contains cores (processes), which contain threads. If there is no hwloc library, LSF uses the PCT logic. Some AMD CPUs have the opposite structure, where sockets contain NUMA nodes. The topology hierarchy is similar to a tree, so the host topology information (NUMA memory nodes, caches, sockets, cores, PUs, and so on) from hwloc is organized as a tree. Each tree node has a type, which is one of host, NUMA, socket, cache, core, or pu, and a set of attributes.
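The tree organization can be pictured with a minimal Python sketch. The TopoNode class, its fields, and the small hostX example are hypothetical illustrations of a typed tree with per-node attributes; they are not the LSF or hwloc data structures.

```python
from dataclasses import dataclass, field

# Node types named in the text.
NODE_TYPES = ("host", "NUMA", "socket", "cache", "core", "pu")


@dataclass
class TopoNode:
    """One node of a host topology tree (hypothetical, for illustration)."""
    type: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def __post_init__(self):
        if self.type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {self.type}")

    def add(self, child):
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child

    def count(self, node_type):
        """Count the nodes of the given type in this subtree."""
        own = 1 if self.type == node_type else 0
        return own + sum(c.count(node_type) for c in self.children)


def build_example():
    """Build a made-up host: one socket, one NUMA node, two cores, two PUs per core."""
    host = TopoNode("host", {"name": "hostX", "memory": "8G"})
    socket = host.add(TopoNode("socket", {"id": 0}))
    numa = socket.add(TopoNode("NUMA", {"id": 0, "memory": "8G"}))
    for core_id in range(2):
        core = numa.add(TopoNode("core", {"id": core_id}))
        for pu_id in (core_id, core_id + 2):
            core.add(TopoNode("pu", {"id": pu_id}))
    return host
```

Because every node carries its own type, the same structure can represent both the socket-inside-NUMA layout and the NUMA-inside-socket layout; only the nesting order changes.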
In the following example, hostA has 64 GB of memory and two NUMA nodes. Each socket node has one NUMA node, eight cores, 16 PUs (two PUs per core), and 32 GB of memory. Both the NUMA nodes and the PUs are numbered in the sequence that the system provides. LSF displays NUMA information based on the level that it detects from the system. The output is displayed as a tree, and the NUMA information is displayed as NUMA[ID: memory]. The PUs are displayed as parent_node(ID ID ...), where parent_node can be host, NUMA, socket, or core.
Host[64G] hostA
    Socket0
        NUMA[0: 32G]
            core0(0 16)
            core1(1 17)
            core2(2 18)
            core3(3 19)
            core4(4 20)
            core5(5 21)
            core6(6 22)
            core7(7 23)
    Socket1
        NUMA[1: 32G]
            core8(8 24)
            core9(9 25)
            core10(10 26)
            core11(11 27)
            core12(12 28)
            core13(13 29)
            core14(14 30)
            core15(15 31)
Some CPUs, especially older ones, might report an incomplete hardware topology that is missing NUMA, socket, or core information. In that case, the topology that is displayed for the host is also incomplete.
For example:
- hostB (with one Intel Pentium 4 CPU) has 2 GB of memory, one socket, one core, and two PUs per core. Information on hostB is displayed as follows:
  Host[2G] hostB
      Socket
          core(0 1)
- hostC (with one Intel Itanium CPU) has 4 GB of memory and two PUs. Information on hostC is displayed as follows:
  Host[4G] (0 1) hostC
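The display notation described earlier, NUMA[ID: memory] and parent_node(ID ID ...), is regular enough to read back with a short script. The following Python sketch is one possible reading of that notation; the function names and regular expressions are assumptions based only on the sample lines in this topic, not an LSF interface.

```python
import re

# "(0 16)" style PU list attached to a host, NUMA, socket, or core entry.
PU_LIST = re.compile(r"\(([\d\s]+)\)")
# "NUMA[0: 32G]" style entry: NUMA node ID followed by its memory size.
NUMA_ENTRY = re.compile(r"NUMA\[(\d+):\s*(\w+)\]")


def pu_ids(line):
    """Return the PU IDs listed in parentheses on one line, or an empty list."""
    match = PU_LIST.search(line)
    return [int(token) for token in match.group(1).split()] if match else []


def numa_info(line):
    """Return (numa_id, memory) for a NUMA[ID: memory] entry, or None."""
    match = NUMA_ENTRY.search(line)
    return (int(match.group(1)), match.group(2)) if match else None


def all_pus(text):
    """Collect every PU ID found in a multi-line topology listing, sorted."""
    ids = []
    for line in text.splitlines():
        ids.extend(pu_ids(line))
    return sorted(ids)
```

The same helpers work on incomplete topologies such as hostC, because they look only for the parenthesized PU list and do not depend on which parent node carries it.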
Some platforms or operating system versions report only a subset of topology information, as in the following output for hostD:
Host[1009M] hostD
    Socket (0 1)
Dynamically load the hwloc library
You can configure LSF to dynamically load the hwloc library from the system library paths to detect newer hardware. This allows you to use the latest supported version of hwloc (2.11.1) with the LSF integration at any time, provided that there are no compatibility issues between this version of the hwloc library and the header file for hwloc. If LSF fails to load the library, LSF falls back to the hwloc functions in the static library.
To enable dynamic loading of the hwloc library, enable the LSF_HWLOC_DYNAMIC parameter in the lsf.conf file.
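As a rough check, a script can scan the contents of lsf.conf for this parameter. The following Python sketch rests on assumptions that this topic does not confirm: that lsf.conf uses NAME=VALUE lines with # comments, and that a value of Y (or y) enables the parameter. The function name is hypothetical; confirm the accepted values in the lsf.conf reference before relying on it.

```python
def hwloc_dynamic_enabled(conf_text):
    """Return True if LSF_HWLOC_DYNAMIC is set to Y in the given lsf.conf text.

    Assumptions: NAME=VALUE syntax, '#' starts a comment, and Y/y means enabled.
    """
    value = None
    for raw_line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, val = line.split("=", 1)
        if key.strip() == "LSF_HWLOC_DYNAMIC":
            # If the parameter appears more than once, the last one wins here.
            value = val.strip().strip('"')
    return value is not None and value.upper() == "Y"


if __name__ == "__main__":
    sample = "LSF_LOGDIR=/var/log/lsf\nLSF_HWLOC_DYNAMIC=Y\n"
    print(hwloc_dynamic_enabled(sample))
```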