IBM Support

10 Gbit Ethernet Bad Assumptions and Best Practice

How To


Summary

10 Gbit networking is nominally 10 times faster than 1 Gbit - but don't assume you will get that in practice with no effort.

Objective


Steps

This Gareth Coates article originally appeared on DeveloperWorks, where it received 84,652 visits - hence the references to POWER6 and POWER7, but the tuning is still valid, even though newer networks go much faster.
Below is from your Power Systems Advanced Technical Support team in Europe (EMEA), with lots of input from many people.
 
We find many people are over-optimistic and make assumptions which can catch them out - we learnt the hard way too.
 
Don't Assume 1 - The same boost as last time
  • It may be that you migrated from 2Mbit to 10Mbit and saw approximately a 5 fold increase in performance.
  • Similarly when you went from 10Mbit to 100Mbit, and from 100Mbit to 1Gbit; you saw approximately a 10 fold improvement.
  • Don't assume that when you go from 1Gbit to 10Gbit that you will automatically see another 10 fold improvement.
 
Don't Assume 2 - Your old Ethernet cables will work
  • Your structured cabling may be good enough to carry 1Gbit, but will it support 10Gbit?
  • If you are using copper:
  • Category 6A or better balanced twisted-pair cables, as specified in ISO 11801 amendment 2 or ANSI/TIA-568-C.2, are needed to carry 10GBASE-T up to distances of 100 metres.
  • Category 6 cables can carry 10GBASE-T over shorter distances when qualified according to the guidelines in ISO TR 24750 or TIA TSB-155-A.
  • For fibre, it is not quite so simple; the best advice is to check, rather than assume, that your existing infrastructure is of a high enough specification.
 
Don't Assume 3 - This will not take CPU and memory resources
  • Don't assume that if you simply replace a 1Gbit card with a 10Gbit card there will immediately be a 10 fold improvement with no change to the LPAR's resources - driving 10Gbit takes noticeably more CPU cycles and memory for network buffers.
  • This applies to Shared Ethernet Adapters (SEA) in VIOS as well - see the quick checks after this list.
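If throughput is disappointing, it is worth checking whether the LPAR itself is the bottleneck before blaming the network. A minimal sketch using standard AIX tools (the adapter name ent0 is an assumption):

    # Is the LPAR short of CPU? Watch entitlement consumed (%entc) under load:
    lparstat 5 3
    # Are receive buffers overflowing? A growing "No Resource Errors" count
    # suggests the LPAR cannot drain packets as fast as they arrive:
    entstat -d ent0 | grep -i "no resource"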
 
Don't Assume 4 - Everything can be Max-ed out at the same time
  • It is not safe to assume that all the ports of a switch can run at full speed at the same time - the switch backplane does not necessarily have the capacity to handle every port at line rate.
  • Check the specifications. The size of the buffers can be important too.
  • The same goes for multiple-port adapters - don't expect all four ports to max out at the same time, as the on-board CPU can be a limit, especially with small packets. A rough way to watch this is sketched below.
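One rough way to see whether the ports really sustain full speed together is to sample each port's cumulative byte counters over time. A sketch, assuming a four-port adapter seen as ent0 to ent3:

    # Print the transmit/receive byte counters for each port every 10 seconds;
    # compare successive samples to estimate per-port throughput.
    while true; do
      date
      for a in ent0 ent1 ent2 ent3; do
        echo "$a: $(entstat $a | grep 'Bytes:')"
      done
      sleep 10
    done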
 
Don't Assume 5 - A faster network fixes everything
  • Imagine a scenario as follows:
  • A company has some POWER6 based servers.
  • They are consolidating to POWER7 and are considering adopting 10Gbit technology.
  • A particular LPAR runs a production database and is using 36 rPerfs.
  • They want to have about 2X the database performance, so plan an LPAR with 72 rPerfs on the POWER7 server, and expect a 10X performance increase on the LAN.
  • Well, they have doubled the rPerfs, which is fine for the database, but they are also expecting the network device drivers and TCP/IP stack to do 10 times as much work - and they have not taken that into consideration.
  • In addition, to see a 10 fold increase in bandwidth, the data needs to be available. It is quite possible that applications running on smaller POWER7 LPARs, and their attached storage, will not be able to generate that sort of data flow. If the performance is not as expected, it is not necessarily the network causing a bottleneck: it could be a single-threaded process, application locking on logical resources, or internal buffering of the arriving data.
 
It can work really well - so here is our first set of hot tips ...
 
It is possible to achieve respectable bandwidth when using 10Gbit Ethernet on IBM Power Systems. With some careful consideration, planning, configuration and tuning, it is possible to achieve very good results.
 
Top Tip 1 - Flow Control
  • Turn on Flow Control - everywhere.
  • With 10 Gbit Ethernet it is very useful to turn on flow control to avoid the need for retransmission.
  • At high bandwidth it is very easy to completely fill the buffers on switches and adapters, so that transmitted packets are dropped. This leads to time-outs and retransmits, which in turn mean delays, wasted bandwidth, wasted compute cycles and higher energy use.
  • If flow control is enabled, the packets will flow as efficiently as possible.
  • There are multiple places to turn on flow control - see the following tips.
 
Top Tip 2 - Flow Control on the network switches
  • Turn on flow control on the network switch.
  • The method depends upon the manufacturer of the switch.
  • Don't assume it is on - you can at least check from the AIX side, as shown below.
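The switch-side commands are vendor specific, but you can verify from the AIX end whether the adapter reports flow control as active. A minimal check; the exact wording in the driver output varies, so the grep pattern and adapter name ent0 are assumptions:

    # Look for the flow control status reported by the adapter driver:
    entstat -d ent0 | grep -i flow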
 
Top Tip 3 - Flow Control in AIX
  • Turn on flow control in the operating system.
  • For AIX, you can use chdev -l ent# -a flow_ctrl=yes to turn on flow control.
  • Replace # with the number of the adapter - a fuller example follows.
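A fuller sketch, assuming the adapter is ent0. Note that chdev cannot change an adapter that is in use; in that case the -P flag defers the change to the next reboot:

    # Check the current setting:
    lsattr -El ent0 -a flow_ctrl
    # Enable flow control (this fails if the adapter is busy):
    chdev -l ent0 -a flow_ctrl=yes
    # If the adapter is in use, queue the change for the next reboot instead:
    chdev -l ent0 -a flow_ctrl=yes -P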
 
Top Tip 4 - Flow Control on the Host Ethernet Adapter (HEA/IVE) - now called a vNIC
  • If the adapter is part of a Host Ethernet Adapter (HEA), you need to turn on flow control there too.
  • You can do this via the HMC:
  • Select the Managed System -> Hardware Information -> Adapters -> Host Ethernet
  • Select the appropriate adapter and click "Configure"
 
Top Tip 5 - If possible, use the HEA directly rather than via an SEA
  • It is completely possible and supported to use a 10Gbit Host Ethernet Adapter (HEA) port as part of a Shared Ethernet Adapter (SEA).
  • Ideally, though, don't do it.
  • It is much better to map a logical HEA port directly to each of the client LPARs - you will see much better performance with less tuning.
  • For LPARs which need a lot of bandwidth, you can dedicate the adapter to the particular LPAR.
  • On its own, that would mean Live Partition Mobility (LPM) is not possible.
  • However, it is possible to dedicate a network adapter to an LPAR as desired and use it as part of an EtherChannel with a virtual adapter (backed by an SEA) as the backup device.
  • LPM is then possible with the use of a shell script to temporarily remove the HEA - see the rough sketch after this list.
  • There is some smitty support for this type of thing - see Using Live Partition Mobility with a HEA for more detail.
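The referenced article has the full procedure; the basic idea is sketched below. The device name is an assumption, and the EtherChannel must already have the virtual adapter configured as its backup so traffic fails over to the SEA path:

    # Before migration: remove the logical HEA port so the LPAR holds only
    # virtual devices (traffic fails over to the SEA backup channel).
    rmdev -dl ent1        # ent1 = logical HEA port (assumption)
    # ... perform the Live Partition Mobility operation ...
    # After migration: rediscover the devices on the target system.
    cfgmgr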
 
Top Tip 6 - Set the network option largesend
  • Turn on the network option largesend.
  • When you are using a 10Gbit adapter in AIX, you should turn on largesend using the ifconfig or chdev command, and on the VIOS set it on the SEA - see the examples and the persistence note below.
  • AIX Example: ifconfig en0 largesend
  • AIX Example: chdev -l en0 -a largesend=on
  • VIOS Example: chdev -dev ent2 -attr largesend=1
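Note that an ifconfig change does not survive a reboot. On recent AIX levels (6.1 TL7 / 7.1 and later) the interface attribute mtu_bypass makes largesend persistent - a small sketch, assuming interface en0:

    # Check whether this AIX level supports the persistent largesend attribute:
    lsattr -El en0 -a mtu_bypass
    # If it does, enable largesend so that it survives reboots:
    chdev -l en0 -a mtu_bypass=on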
 
Top Tip 7 - Large packets with MTU
  • We have found that increasing MTU to 64K can have a massive benefit when using 10Gbit adapters in Power servers.
  • This can drastically reduce the number of transactions (one per packet) on the CPUs and adapters.
  • We use, for quickness: smitty chif
  • This uses chdev, as in this example: chdev -l en0 -a mtu=65535
  • The actual number you can use depends on the device, so if the above does not work try 65536 (the full 64K), 65394 (64K minus overhead), 65390 (64K minus VLAN overhead), or at least try 9000 (jumbo frames). A quick way to confirm the result follows.
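After changing it, confirm the MTU actually in effect, and remember that both ends of a conversation (and everything in between, including any SEA path and switches) must support the larger size, or large packets will be dropped or fragmented. A quick check, assuming interface en0:

    # Show the configured MTU attribute:
    lsattr -El en0 -a mtu
    # Show the MTU the interface is really using:
    netstat -in | grep en0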
 
Top Tip 8 - Multiple virtual switches within the POWER6 and POWER7 machines
  • Most people haven't heard of multiple virtual switches but they are worth considering, so don't worry if you are uncertain :-)
  • The "IBM PowerVM Virtualization Introduction and Configuration" Redbook, talks about them in section 2.10.3.
  • At first it describes the concept of one virtual switch and then goes on to talk about multiple virtual switches, which was introduced in POWER6 based servers, multiple virtual switches offer performance and resilience benefits.
  • Also see the few notes in Infocenter Configuring Details Virtual Ethernet Switch.
  • Here is an excellent Whitepaper: Using Virtual Switches in PowerVM to Drive Maximum Value of 10 Gb Ethernet ( format PDF 1.5 MB)  PowerVM-VirtualSwitches-091010.pdf
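Virtual switches are managed from the HMC. As a small sketch of the command-line route (the managed system name MySystem and switch name ETHSW1 are assumptions):

    # List the virtual switches already defined on a managed system:
    lshwres -m MySystem -r virtualio --rsubtype vswitch
    # Add a second virtual switch:
    chhwres -m MySystem -r virtualio --rsubtype vswitch -o a --vswitch ETHSW1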
 
If you have any hot tips, please pass them on via the comments below.
Thanks, Gareth M Coates

Additional Information


Other places to find content from Nigel Griffiths IBM (retired)

Document Location

Worldwide

[{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SWG10","label":"AIX"},"Component":"","Platform":[{"code":"PF002","label":"AIX"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"LOB08","label":"Cognitive Systems"}},{"Business Unit":{"code":"BU054","label":"Systems w\/TPS"},"Product":{"code":"HW1W1","label":"Power -\u003EPowerLinux"},"Component":"","Platform":[{"code":"PF016","label":"Linux"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"","label":""}},{"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Product":{"code":"SWG60","label":"IBM i"},"Component":"","Platform":[{"code":"PF012","label":"IBM i"}],"Version":"All Versions","Edition":"","Line of Business":{"code":"LOB57","label":"Power"}}]

Document Information

Modified date:
14 June 2023

UID

ibm11120143