IBM Support

Power E850 - Removing the CPU & RAM Airflow Cover=Power-Off

How To


Summary

Warning: raising the server's lid and removing the internal air flow cover on this model powers off the server.

Objective

Steps

We (Advanced Technology Support) had our Early Ship Program (ESP) prototype E850 server refurbished at the Power Development Lab in the USA to bring it up to the Generally Available (GA) hardware level.  Many parts were replaced.  The early parts seemed to work fine during our tests, but they needed to be brought up to the official level.  Removing early and test parts is a perfectly normal part of the ESP process, so that the server can then be used as normal in a computer room.  What changed, apart from the internal electronics, which look the same?  We noticed the following:

  1. We now have a lovely high-quality front cover rather than the hand-made early one.  The cover has a fancy stripe in the correct shade of green.  The early prototype one was a horrible green.
  2. There are more large diagrams on the server labels showing the layout and location codes/numbers, which is useful during maintenance.
  3. There are loads more "health and safety" and legal stickers.
  4. A new cable management arm - the old ESP one was impressively solid and robust, but the new one is even stronger and has no position locking pin - it now has a simpler locking mechanism and knob.
  5. We now know where the front I/O cage locking screws are placed.  These screws are used to release the cage as part of the weight reduction process when installing the server on the rack rails.
  6. The front SSDs have a slightly different mechanism for removing them - it does work better.
  7. The inside airflow cover for the CPU and memory card area changed from solid black to semi-transparent black, so you can see whether any light-path diagnostics LEDs are lit to indicate a failed device.

The new POWER8 E850 server is so nice that we like to show it off to customers and run regular "show and tell" sessions for graduates.  We did the same for a few years with our POWER8 S824 and E870 servers.

Oops!  The server powered off as the cover over the CPU and memory cards was lifted out of the way.  This cover increases the air flow across the CPU and memory cards by not allowing the air to flow through the gap above them.  This power-off feature was not working on our earlier server and took us by surprise!  Fortunately, we were not running too much - just a few copies of the new VIOS and the latest AIX.

We checked the E850 internal design documents, and they confirmed it: yes, this was in the design.  So we examined the actual server to find the pressure switch.  It is under the knurled blue release knob that you undo to remove the air flow cover.

[Image: internal pressure switch]

The pressure switch is spring-loaded, so it pushes the cover up once the knurled knob is unscrewed.

When this happens, the Service Processor (FSP2) reports the issue to the HMC, which raises a Problem Management Hardware (PMH) record.  The PMH is reported back to IBM Hardware Support.  Here are the details:

On the HMC for this E850 server you see the following:

[Image: HMC alert raised when you remove the air flow cover]

That LED code shows you:

[Image: alert details, part 2]

We assume LDSWTCH is short for Lid Switch - it is a guess.

If you follow the HMC procedure for replacing, adding, or removing a POWER8 CPU, memory card, or VRM, you are instructed to power off the server before removing the air flow cover.

The Event Details look like this:

[Image: event details, part 2]

And

[Image: event details, part 3]
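
As a rough mental model only, the sequence described above (knob unscrewed, spring-loaded switch rises, server powers off, service processor reports to the HMC, PMH goes to IBM) can be written as a small Python sketch.  Nothing here is IBM firmware or an HMC API: the class, method, and event strings are made up for illustration, and the "already powered off means no event" branch is an assumption drawn from the HMC procedure telling you to power off first.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Toy model of the behaviour described in this article (not IBM firmware)."""
    powered_on: bool = True
    cover_fitted: bool = True
    events: list = field(default_factory=list)

    def remove_airflow_cover(self) -> None:
        # Unscrewing the knurled knob lets the spring-loaded switch rise.
        self.cover_fitted = False
        if not self.powered_on:
            # Assumption: a server that is already powered off raises nothing.
            return
        # The open switch powers the server off ...
        self.powered_on = False
        # ... and the issue is reported onward (hypothetical event strings).
        self.events.append("service processor -> HMC: lid switch open")
        self.events.append("HMC -> IBM Hardware Support: PMH raised")


demo = Server()
demo.remove_airflow_cover()
print("powered on:", demo.powered_on)
for event in demo.events:
    print(event)
```

The point of the sketch is simply that the power-off and the support call are both side effects of lifting the cover on a running server, which is why the supported procedure powers the server off first.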

So Watch Out - Don't remove the E850 internal air flow cover "just to take a look"

The S822 and S824 type scale-out servers do not have this power-off feature, but the Enterprise E850 does.

Is there some logic to that difference?  Yes, I think there is.

  • The E850 has up to 48 CPU cores in the same space in which the S824 has up to 24 CPU cores - other scale-out models have fewer CPU cores.
  • The E850 has 32 memory cards and the S series models have up to 16 memory cards.
  • The E850 also needs more voltage regulator modules (VRMs) to power the cores and RAM.
  • As a result, on the S series servers you might get away with temporarily raising the air flow cover to take a look.  I strongly recommend you don't.  The cover has blue handles, indicating the server must be powered off before you touch the processor chips, memory cards, or voltage regulators.
  • In the E850, the air flow cover is more important for efficient cooling.  I would guess that if it were removed while the server is hot under heavy load, we could damage components through overheating.
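
To put numbers on the first two bullets, here is a minimal Python sketch that compares the maximum core and memory card counts quoted above.  The figures come straight from this article; the dictionary layout and names are mine, and treating both servers as occupying the same space follows the first bullet rather than any measured dimension.

```python
# Maximum counts quoted in the bullets above.
e850 = {"cpu_cores": 48, "memory_cards": 32}
s824 = {"cpu_cores": 24, "memory_cards": 16}

# Ratio of the E850 maximum to the S824 maximum for each component type.
ratios = {part: e850[part] / s824[part] for part in e850}

for part, ratio in ratios.items():
    print(f"{part}: E850 has {ratio:.0f}x the S824 maximum")
```

Both ratios come out at two, which fits the reasoning that the E850 packs roughly twice the heat sources under its air flow cover.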

We hope that helps you avoid our mistakes and keep your E850s running.

    Additional Information


    Other places to find content from Nigel Griffiths IBM (retired)

    Document Location

    Worldwide

    Document Information

    Modified date:
    14 June 2023

    UID

    ibm11165672