IBM Spectrum Scale: multi-connection over TCP (MCOT): tuning may be required

News


Abstract

Starting with IBM Spectrum Scale 5.1.1, multiple connections over TCP (MCOT) is enabled by default. This feature enhances the IBM Spectrum Scale communication subsystem so that it can establish multiple TCP connections between each pair of daemons. Using more connections may improve performance and resiliency, particularly for bonded configurations, where multiple TCP connections allow more than one physical interface per node to be used. Depending on factors such as network speed and the number of nodes, the value of the new maxTcpConnsPerNodeConn parameter may need to be adjusted up or down.

Content

maxTcpConnsPerNodeConn config parameter

Though this feature cannot be disabled, maxTcpConnsPerNodeConn can be used to control the maximum number of TCP connections that the GPFS mmfsd daemon establishes to another node. Valid values are 1-8, with the default being 2. For any given pair of nodes, the number of established TCP connections is the smaller of the maxTcpConnsPerNodeConn values defined on the two nodes. For more information about how to configure the maxTcpConnsPerNodeConn parameter, see the mmchconfig command.
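
A minimal sketch of how the parameter might be set cluster-wide and then verified, using the mmchconfig and mmlsconfig commands referenced above. The value 4 is purely illustrative; consult the mmchconfig documentation for whether the change takes effect immediately or requires a daemon restart in your environment.

    # Set the per-node TCP connection limit cluster-wide (illustrative value)
    mmchconfig maxTcpConnsPerNodeConn=4
    # Display the currently configured value
    mmlsconfig maxTcpConnsPerNodeConn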

Tuning guide for maxTcpConnsPerNodeConn

Choose the value of the maxTcpConnsPerNodeConn parameter after considering the following factors:

  • The overall bandwidth of the cluster network.
  • The number of nodes in the cluster.
  • The value that is configured for the maxReceiverThreads parameter.
  • Memory resource implications of setting a higher value for the maxTcpConnsPerNodeConn parameter.

Tuning based on the overall bandwidth of the cluster network

The bandwidth that one TCP connection can achieve depends on a number of factors, such as tuning, CPU, memory, and network adapter performance. However, a good starting estimate is to assume that each TCP connection can carry up to about 25 Gbps. You can then set a larger value for the maxTcpConnsPerNodeConn parameter on faster networks so that the full network bandwidth can be used.
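
As a rough worked example under the 25 Gbps-per-connection assumption above (the link speed and resulting value are illustrative, not a fixed rule):

    # Estimate a starting value for maxTcpConnsPerNodeConn from link speed
    LINK_GBPS=100                  # aggregate bandwidth available per node
    PER_CONN_GBPS=25               # rough per-connection estimate from above
    SUGGESTED=$(( (LINK_GBPS + PER_CONN_GBPS - 1) / PER_CONN_GBPS ))
    echo "Starting point for maxTcpConnsPerNodeConn: ${SUGGESTED}"   # prints 4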

Tuning based on the number of nodes in the cluster

As the number of nodes in a cluster increases, you should decrease the value of the maxTcpConnsPerNodeConn parameter, because each node maintains connections to many other nodes and the additional connections consume more network bandwidth and resources. For example, large clusters with many client connections to NSD server nodes might not need to set maxTcpConnsPerNodeConn higher than 1, unless there are other traffic patterns, such as the use of HDFS transparency, that can benefit from this tuning.
The following table provides guidelines for the value of maxTcpConnsPerNodeConn based on the number of nodes in the cluster; a sketch that combines these guidelines with the bandwidth-based estimate follows the table.

Number of nodes in the cluster    Network bandwidth    Value of maxTcpConnsPerNodeConn
Less than 100                     100 Gbps             2 - 8
100 - 1000                        100 Gbps             1 - 4
1000 - 2000                       100 Gbps             1 - 2
More than 2000                    100 Gbps             1

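The following sketch combines the two guidelines: it starts from the bandwidth-based estimate and then caps the result according to the node-count rows of the table for a 100 Gbps network. The helper name, thresholds, and example inputs are illustrative only, not part of the product.

    # Hypothetical helper: suggest a starting maxTcpConnsPerNodeConn value
    suggest_max_tcp_conns() {
        local link_gbps=$1 nodes=$2
        local by_bw=$(( (link_gbps + 24) / 25 ))   # ~25 Gbps per connection
        local cap=8                                # "Less than 100" row
        if   [ "$nodes" -gt 2000 ]; then cap=1     # "More than 2000" row
        elif [ "$nodes" -gt 1000 ]; then cap=2     # "1000 - 2000" row
        elif [ "$nodes" -ge 100 ];  then cap=4     # "100 - 1000" row
        fi
        local v=$(( by_bw < cap ? by_bw : cap ))
        [ "$v" -lt 1 ] && v=1                      # stay inside the valid 1-8 range
        [ "$v" -gt 8 ] && v=8
        echo "$v"
    }

    suggest_max_tcp_conns 100 1500    # 1500-node cluster, 100 Gbps: prints 2
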
Implications on receiver threads

Configuring maxTcpConnsPerNodeConn has a potential impact on the maxReceiverThreads parameter because the additional network connections might require more receiver threads. Each receiver thread can typically monitor up to 128 TCP connections, but optimal performance is achieved when each receiver thread monitors fewer connections. Select the value of maxReceiverThreads after considering the value of the maxTcpConnsPerNodeConn parameter and the number of nodes in your cluster. Some large clusters need to increase maxReceiverThreads based on the number of TCP connections that will be needed to other nodes, in both the local cluster and any remote clusters that are joined. The total number of TCP connections that a node requires can be estimated with the following formula: maxTcpConnsPerNodeConn * (number of nodes - 1).
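
For example, a rough sizing sketch using that formula (the node count and parameter value below are illustrative):

    # Total TCP connections a node may need, and receiver threads to cover them
    NODES=500
    MAX_TCP_CONNS_PER_NODE_CONN=2
    TOTAL_CONNS=$(( MAX_TCP_CONNS_PER_NODE_CONN * (NODES - 1) ))   # 998
    MIN_RECEIVERS=$(( (TOTAL_CONNS + 127) / 128 ))                 # 8, at up to 128 connections per thread
    echo "connections: ${TOTAL_CONNS}, minimum receiver threads: ${MIN_RECEIVERS}"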

The maximum number of receiver threads that are created on any node is defined to be the minimum of the number of logical CPUs on the node and the value of the maxReceiverThreads parameter. You can specify a value in the 1-128 range for the maxReceiverThreads parameter, with the default value being 32. For more information about how to configure maxReceiverThreads, see the mmchconfig command.
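
A minimal sketch of adjusting the parameter with the mmchconfig command referenced above. The value 64 is illustrative; as noted, the effective number of receiver threads is still bounded by the number of logical CPUs on the node.

    # Raise the receiver thread limit for a large cluster (illustrative value)
    mmchconfig maxReceiverThreads=64
    mmlsconfig maxReceiverThreads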

Implications on memory resources

Setting a higher value for the maxTcpConnsPerNodeConn parameter can require more memory for allocations such as kernel socket buffers, and it can put more pressure on memory-related resources in IBM Spectrum Scale.

Compatibility

IBM Spectrum Scale 5.1.1 can coexist with prior 5.0.x and 5.1.0 (pre-MCOT) levels, but those older levels need to be upgraded to 5.0.5.6 (IJ30634) (ESS 5.3.7) or later, or to 5.1.0.3 (IJ30606) or later, to include the fix for a problem where an RPC may be handled twice after a TCP reconnect.

Fix Central link for IBM Spectrum Scale V5.1.1 or later:

https://www.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%20defined%20storage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=5.1.1&platform=All&function=all

Fix Central link for IBM Spectrum Scale V5.1.0.3 or later:

https://www.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%20defined%20storage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=5.1.0&platform=All&function=all

Fix Central link for IBM Spectrum Scale V5.0.5.6 (ESS 5.3.7) or later:

https://www.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%20defined%20storage&product=ibm/StorageSoftware/IBM+Spectrum+Scale&release=5.0.5&platform=All&function=all

https://www.ibm.com/support/fixcentral/swg/selectFixes?parent=Software%20defined%20storage&product=ibm/StorageSoftware/IBM+Elastic+Storage+Server+(ESS)&release=5.3.0&platform=All&function=all

[{"Type":"SW","Line of Business":{"code":"LOB26","label":"Storage"},"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"STXKQY","label":"IBM Spectrum Scale"},"ARM Category":[{"code":"a8m50000000Kzw8AAC","label":"Classifications->Cause->Network"}],"ARM Case Number":"","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF016","label":"Linux"},{"code":"PF033","label":"Windows"}],"Version":"5.1.1"}]

Document Information

Modified date:
28 April 2021

UID

ibm16446651