10 Gbit Ethernet Bad Assumptions and Best Practice
How To
Summary
Just because 10 Gbit networking is 10 times faster than 1 Gbit, don't assume you get that in practice with no effort.
Objective
Steps
This Gareth Coates article was first published on DeveloperWorks, where it got 84652 visits. Hence the references to POWER6 and POWER7, but the tuning is still valid, even if newer networks go much faster.
Below is from your Power Systems, Advanced Technical Support team in Europe (EMEA) with lots of input from many people.
We find many people are over-optimistic and make assumptions that can catch them out. We learnt the hard way too.
Don't Assume 1 - The same boost as last time
It may be that you migrated from 2Mbit to 10Mbit and saw approximately a 5 fold increase in performance.
Similarly when you went from 10Mbit to 100Mbit, and from 100Mbit to 1Gbit; you saw approximately a 10 fold improvement.
Don't assume that when you go from 1Gbit to 10Gbit that you will automatically see another 10 fold improvement.
Don't Assume 2 - Your old Ethernet cables will work
Your structured cabling may be good enough to carry 1Gbit, but will it support 10Gbit?
If you are using copper:
Category 6A or better balanced twisted pair cables specified in ISO 11801 amendment 2 or ANSI/TIA-568-C.2 are needed to carry 10GBASE-T up to distances of 100 metres.
Category 6 cables can carry 10GBASE-T for shorter distances when qualified according to the guidelines in ISO TR 24750 or TIA-155-A.
For fibre, it is not quite so simple. The best advice is to check, rather than assume, that your existing infrastructure is of a high enough specification.
Don't Assume 3 - This will not take CPU and memory resources
Don't assume that simply replacing a 1Gbit card with a 10Gbit card will immediately give a 10 fold improvement with no change to the LPAR's resources.
This applies to Shared Ethernet Adapters (SEA) in VIOS as well.
Don't Assume 4 - Everything can be Max-ed out at the same time
It is not safe to assume that a switch can handle all of its ports going at full speed all of the time.
Check the specifications. The size of the buffers can be important too.
The same goes for multiple port adapters: don't expect all four ports to max out at the same time, as the on-board CPU can be a limit, especially with small packets.
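As a rough illustration of the check described above, here is a minimal sketch that compares the total demand of all ports with a rated switching capacity. The function names and the figures (24 ports, 160 Gbit/s) are made up for the example and are not taken from any real data sheet. It also only counts one direction of traffic, which is an assumption, so check how your vendor quotes capacity.

```shell
#!/bin/sh
# Hypothetical helpers: the names and numbers are illustrative only.

port_demand_gbit() {
    # $1 = number of ports, $2 = speed of each port in Gbit/s
    echo $(( $1 * $2 ))
}

can_run_all_ports() {
    # $1 = ports, $2 = port speed (Gbit/s), $3 = rated capacity (Gbit/s)
    demand=$(port_demand_gbit "$1" "$2")
    if [ "$demand" -le "$3" ]; then
        echo "ok: demand ${demand} Gbit/s fits within ${3} Gbit/s"
    else
        echo "oversubscribed: demand ${demand} Gbit/s exceeds ${3} Gbit/s"
    fi
}

# Example: 24 ports of 10 Gbit against an assumed 160 Gbit/s capacity.
can_run_all_ports 24 10 160
```

The same arithmetic applies to a four port adapter, although there the limit is more likely to be the on-board CPU and packet rate than raw bandwidth.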
Don't Assume 5 - A faster network fixes everything
Imagine a scenario as follows:
A company has some POWER6 based servers.
They are consolidating to POWER7 and are considering adopting 10Gbit technology.
A particular LPAR runs a production database and is using 36 rPerfs.
They want to have about 2X the database performance, so plan an LPAR with 72 rPerfs on the POWER7 server, and expect a 10X performance increase on the LAN.
Well, they have doubled the rPerfs which is OK for the database, but they are expecting the network device drivers and TCP/IP stack to be able to do 10 times as much work and they have not taken that into consideration.
In addition, to be able to see a 10 fold increase in bandwidth, the data needs to be available. It is quite possible that applications running on smaller POWER7 LPARs, and their attached storage, will not be able to generate that sort of data flow. If the performance is not as expected, it is not necessarily the network causing a bottleneck. A bottleneck could be a single threaded process, internal application locking on logical resources, or internal buffering of the data arriving.
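To make the bottleneck argument concrete, here is a small sketch that treats the end-to-end rate as the minimum of each stage. All three figures are invented for illustration; measure your own storage and application rates before drawing conclusions.

```shell
#!/bin/sh
# Hypothetical figures in Mbit/s: the achievable rate is capped by the
# slowest stage, not by the link speed alone.

min_of() {
    # Print the smallest of the integer arguments.
    lowest=$1
    for value in "$@"; do
        if [ "$value" -lt "$lowest" ]; then
            lowest=$value
        fi
    done
    echo "$lowest"
}

link=10000        # 10 Gbit link
storage=3000      # assumed rate the storage can feed the application
application=2500  # assumed rate a single threaded process can produce

echo "achievable: $(min_of "$link" "$storage" "$application") Mbit/s"
```

With these made-up numbers the faster link changes nothing, because the application is the limit.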
It can work really well - so here is our first set of hot tips ...
It is possible to achieve respectable bandwidth when using 10Gbit Ethernet on IBM System Power equipment. With some careful consideration, planning, configuration and tuning, it is possible to achieve very good results.
Top Tip 1 - Flow Control
Turn on Flow Control - everywhere.
With 10 Gbit Ethernet it is very useful to turn on flow control to stop the need for retransmission.
It is very easy at high bandwidth to completely fill buffers on switches and adapters so that transmitted packets are dropped. This leads to time-outs and retransmits. Together, these cause delays, waste bandwidth and compute cycles, and use more energy.
If flow control is enabled, a receiver whose buffers are filling up can ask the sender to pause, so packets are held back rather than dropped.
There are multiple places to turn on flow control - see following tips....
Top Tip 2 - Flow Control on the network switches
Turn on flow control on the network switch.
The method depends upon the manufacturer of the switch.
Don't assume it is on.
Top Tip 3 - Flow Control on AIX networks
Turn on flow control in the Operating System.
For AIX, you can use chdev -l ent# -a flow_ctrl=yes to turn on flow control.
Replace # with the number of the adapter.
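If you have several adapters, a dry run like the sketch below can save typing. It only prints the chdev commands from the tip above and does not run them. The function name and the adapter names ent0 and ent1 are placeholders for this example.

```shell
#!/bin/sh
# Dry run: print the chdev command for each adapter instead of running it.
# The adapter names are placeholders; substitute the ones on your system.

flow_ctrl_commands() {
    for adapter in "$@"; do
        echo "chdev -l ${adapter} -a flow_ctrl=yes"
    done
}

flow_ctrl_commands ent0 ent1
```

Review the printed lines before running them by hand. On AIX, lsattr -El ent0 -a flow_ctrl should show the current setting. If the adapter is in use, chdev may refuse the change, in which case the -P flag defers it to the next boot.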
Top Tip 4 - Flow Control on the Host Ethernet Adapter (HEA/IVE) - now called a vNIC
If the adapter is part of a Host Ethernet Adapter (HEA), you need to turn on flow control there too.
You can do this via the HMC:
Select the Managed System-> Hardware Information -> Adapters-> Host Ethernet
Select the appropriate adapter and click "Configure"
Top Tip 5 - If possible, directly use the HEA rather than via a SEA
You can use a 10 Gbit Host Ethernet Adapter as part of a Shared Ethernet Adapter (SEA). It is completely possible and supported.
Ideally, don't do it.
It is much better to map a logical HEA port directly to each of the client LPARs, and you will see much better performance with less tuning.
For LPARs which need a lot of bandwidth, you can dedicate the adapter to the particular LPAR.
This by itself would mean that Live Partition Mobility would not be possible.
It is possible to dedicate a network adapter to an LPAR as desired and use it as part of an Ether-channel with an SEA as a backup device.
LPM is possible with the use of a shell script to temporarily remove the HEA.
There is some smitty support for this type of thing - see "Using Live Partition Mobility with a HEA" for more detail.
Top Tip 6 - Set the network option Largesend
Turn on the network option largesend.
When you are using a 10Gbit adapter, you should turn on largesend: in AIX with the ifconfig or chdev command, and on the VIOS with chdev on the SEA.
AIX Example: ifconfig en0 largesend
AIX Example: chdev -l en0 -a largesend=on
VIOS Example: chdev -dev ent2 -attr largesend=1
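To see why largesend helps, here is a back-of-envelope sketch. Without it, the host's TCP stack has to cut each large write into packet-sized segments itself. With it, the stack can hand one large buffer (up to 64 KB) to the adapter, which does the cutting. The 1460 byte segment size assumes a 1500 byte MTU with no TCP options, and the 64 KB write is just an example.

```shell
#!/bin/sh
# Ceiling division: how many pieces of size $2 are needed to carry $1 bytes.
segments_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

payload=65536   # one 64 KB application write (illustrative)
mss=1460        # assumed TCP payload per packet at MTU 1500, no TCP options

echo "without largesend: $(segments_needed "$payload" "$mss") segments built by the host"
echo "with largesend:    1 large buffer handed to the adapter"
```

The number of packets on the wire is the same either way. The saving is in host CPU work per byte sent.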
Top Tip 7 - Large packets with MTU
We have found that increasing MTU to 64K can have a massive benefit when using 10Gbit adapters in Power servers.
This can drastically reduce the number of transactions (one per packet) on the CPUs and adapters.
We use, for quickness: smitty chif
This uses chdev as in this example: chdev -l en0 -a mtu=65535
The actual number you can use can depend on the device you are using, so if the above does not work try 65536 (the full 64K), 65394 (64K minus overhead), 65390 (64K minus VLAN overhead) or at least try 9000.
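The effect of MTU on the number of packets can be estimated with simple arithmetic. The sketch below ignores header overhead for simplicity and uses a 1 GiB transfer as an arbitrary example, so treat the results as rough comparisons rather than measurements.

```shell
#!/bin/sh
# Packets needed to move a payload, ignoring header overhead for simplicity.
packets_for() {
    # $1 = payload bytes, $2 = MTU in bytes
    echo $(( ($1 + $2 - 1) / $2 ))
}

payload=1073741824   # 1 GiB, an illustrative transfer size

# Compare the default 1500 with the values suggested in this tip.
for mtu in 1500 9000 65390 65394 65535; do
    echo "mtu=${mtu}: $(packets_for "$payload" "$mtu") packets"
done
```

Each packet costs a transaction on the CPUs and adapters, so fewer packets for the same data means less work per byte moved.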
Top Tip 8 - Multiple virtual switches within the POWER6 and POWER7 machines
Most people haven't heard of multiple virtual switches, so don't worry if you are uncertain :-) They are worth considering.
The "IBM PowerVM Virtualization Introduction and Configuration" Redbook talks about them in section 2.10.3.
At first it describes the concept of one virtual switch and then goes on to talk about multiple virtual switches, which were introduced in POWER6 based servers. Multiple virtual switches offer performance and resilience benefits.
Also see the few notes in Infocenter Configuring Details Virtual Ethernet Switch.
Here is an excellent whitepaper: Using Virtual Switches in PowerVM to Drive Maximum Value of 10 Gb Ethernet (PDF, 1.5 MB, PowerVM-VirtualSwitches-091010.pdf)
If you have any hot tips, please pass them on via the comments below.
Thanks, Gareth M Coates
Additional Information
Other places to find content from Nigel Griffiths IBM (retired)