APAR status
Closed as program error.
Error description
On a seemingly idle Windows IDS server, a cpu vp can consume 100% cpu. For instance, on a 12 cpu Windows IDS 12.10.TC11 server, we were able to get stacks for the cpu vps from a memory dump.

The stacks for 1cpu, 8cpu, 9cpu, 10cpu, 11cpu, 12cpu, 13cpu, 14cpu, 16cpu, 17cpu, 18cpu:

  oninit.exe!net_aio_poll(void *hPort, int timeout) Line 173
  oninit.exe!NT_P(_VP *v) Line 1640
  oninit.exe!NT_idle_loop(_VP*i_vp, unsigned int bz, int wakeup) Line 5210
  oninit.exe!NT_idle_processor() Line 5124
  oninit.exe!startup() Line 177

The stack for 15cpu is slightly different:

  oninit.exe!net_aio_poll(void *hPort, int timeout) Line 173
  oninit.exe!NT_yield_processor_mvp() Line 18070
  oninit.exe!NT_idle_processor() Line 5107
  oninit.exe!startup() Line 177

In Process Explorer, we could see that the oninit.exe thread for 15cpu was running at 100%.

The underlying issue is that the vp struct associated with 15cpu has a positive num_ready_threads value, but there are no threads in its ready queue(s). This keeps the idle vp from ever sleeping, because it constantly thinks there is a thread ready to run when there isn't.

To identify this on an idle system, you can first observe the 100% cpu usage, but you can also look at "onstat -g sch" output. The cpu vp that is using up the cpu cycles will have a positive number in the Q-ln column while the ready queue ("onstat -g rea") is empty.
For instance, from "onstat -g sch" you can see the value 1 in the Q-ln column for 15cpu below:

Thread Migration Statistics:
 vp    pid     class   steal-at  steal-sc  idlvp-at  idlvp-sc  inl-polls  Q-ln
 1     9568    cpu     0         0         0         0         0          0
 2     8184    adm     0         0         0         0         0          0
 3     8156    lio     0         0         0         0         0          0
 4     7212    pio     0         0         0         0         0          0
 5     7156    aio     0         0         0         0         0          0
 6     11088   msc     0         0         0         0         0          0
 7     816     fifo    0         0         0         0         0          0
 8     7476    cpu     0         0         0         0         0          0
 9     10904   cpu     0         0         0         0         0          0
 10    10940   cpu     0         0         0         0         0          0
 11    10936   cpu     0         0         0         0         0          0
 12    11096   cpu     0         0         0         0         0          0
 13    8064    cpu     0         0         0         0         0          0
 14    6256    cpu     0         0         0         0         0          0
 15    5996    cpu     0         0         0         0         0          1
 16    5984    cpu     0         0         0         0         0          0
 17    6928    cpu     0         0         0         0         0          0
 18    8056    cpu     0         0         0         0         0          0
 19    924     soc     0         0         0         0         0          0
 20    920     soc     0         0         0         0         0          0
 21    10960   soc     0         0         0         0         0          0
 22    10932   soc     0         0         0         0         0          0

This defect is being entered for defensive purposes. We should be able to identify this case and address it, returning the idle cpu vp to normal behavior.
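The Q-ln check described above can be scripted. The following is a minimal sketch only, not an IBM-supplied tool: it assumes the "Thread Migration Statistics" section of "onstat -g sch" has the nine-column layout shown in this APAR, and the function name find_suspect_cpu_vps and the SAMPLE_SCH text are hypothetical names introduced for illustration.

```python
# Sketch: flag cpu-class VPs whose Q-ln column is positive in "onstat -g sch" output.
# Assumption: the migration table has 9 whitespace-separated columns, as in this APAR.

SAMPLE_SCH = """\
Thread Migration Statistics:
 vp    pid     class   steal-at  steal-sc  idlvp-at  idlvp-sc  inl-polls  Q-ln
 1     9568    cpu     0         0         0         0         0          0
 2     8184    adm     0         0         0         0         0          0
 14    6256    cpu     0         0         0         0         0          0
 15    5996    cpu     0         0         0         0         0          1
 16    5984    cpu     0         0         0         0         0          0
 19    924     soc     0         0         0         0         0          0
"""


def find_suspect_cpu_vps(sch_output):
    """Return a list of (vp, pid, q_len) for cpu VPs with a positive Q-ln."""
    suspects = []
    in_table = False
    for line in sch_output.splitlines():
        fields = line.split()
        if not in_table:
            # The table starts after the header row that begins "vp" and ends "Q-ln".
            in_table = bool(fields) and fields[0] == "vp" and fields[-1] == "Q-ln"
            continue
        # A data row is: vp pid class + six counters (9 fields in total).
        if len(fields) != 9:
            continue
        vp, pid, vp_class, q_len = fields[0], fields[1], fields[2], fields[-1]
        if not (vp.isdigit() and pid.isdigit() and q_len.isdigit()):
            continue
        if vp_class == "cpu" and int(q_len) > 0:
            suspects.append((int(vp), int(pid), int(q_len)))
    return suspects


suspects = find_suspect_cpu_vps(SAMPLE_SCH)
```

On a live server the text could be captured with subprocess.run(["onstat", "-g", "sch"], capture_output=True, text=True).stdout and passed to the same function. A positive Q-ln is expected on a busy system, so a reported vp only matches this APAR when the server is otherwise idle and "onstat -g rea" shows no ready threads.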
Local fix
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* Users of Informix Server prior to 12.10.xC14 and 14.10.xC4.  *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
Problem conclusion
Problem fixed in Informix Server versions 12.10.xC14 and 14.10.xC4.
Temporary fix
Comments
APAR Information
APAR number
IT31694
Reported component name
INFORMIX SERVER
Reported component ID
5725A3900
Reported release
C10
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-01-29
Closed date
2020-02-24
Last modified date
2020-02-24
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
INFORMIX SERVER
Fixed component ID
5725A3900
Applicable component levels
Document Information
Modified date:
24 February 2020