APAR status
Closed as program error.
Error description
A problem can occur when a primary and an RSS node connect. It can leave one or more bad SMX pipes that cannot be used to transfer data between the servers. Because pipes are picked randomly, whenever the bad pipe is picked there is a delay of approximately 30 seconds before the server picks a new pipe and tries to send the data again. This greatly impacts performance.

In onstat -g smx output, a bad pipe is listed for the node but has no data transmitted on it. It looks something like this:

    Peer server name: Rssnode1

    SMX connection address: 0x700000116143480
    Encryption status: Disabled
    Total bytes sent: 0
    Total bytes received: 0
    Total buffers sent: 0
    Total buffers received: 0
    Total write calls: 0
    Total read calls: 0
    Total retries for write call: 0
    Data compression level: 0
    Data sent: compressed 0 bytes by 0%
    Data received: compressed 0 bytes by 0%

    SMX connection address: 0x70000090f1f3a30
    Encryption status: Disabled
    Total bytes sent: 7110578
    Total bytes received: 76965
    Total buffers sent: 11990
    Total buffers received: 4853
    Total write calls: 3545
    Total read calls: 4853
    Total retries for write call: 0
    Data compression level: 1
    Data sent: compressed 37158220 bytes by 80%
    Data received: compressed 92260 bytes by 16%

In the above onstat -g smx output, the first SMX connection is bad and is not being used, while the second appears to be functioning.

Additionally, an SMX pipe connection between servers involves spawning multiple threads: smxsnd <servername>, smxrcv <servername>, and smxRecvSnd. For the bad pipe, the setup has not finished, so in onstat -g cpu or onstat -g ath output you see only an smxsnd <servername>, an smx, and an smxRecvSnd thread. The smx thread has not yet renamed itself to smxrcv <servername>.
Those threads would have the following stacks:

    Stack for thread: 370 smxsnd <servername>
    0x0000000100062b94 (oninit)yield_processor_mvp
    0x000000010006e25c (oninit)mt_wait
    0x000000010067aba0 (oninit)smx_send_thread
    0x0000000100fea500 (oninit)th_init_initgls
    0x00000001017c2860 (oninit)startup

    smxRecvSnd
    0x0000000100062b94 (oninit)yield_processor_mvp
    0x0000000100069b6c (oninit)mt_yield
    0x00000001002b6b8c (oninit)cdrTimerWait
    0x000000010067c358 (oninit)smx_send_from_recv_thread
    0x0000000100fea500 (oninit)th_init_initgls
    0x00000001017c2860 (oninit)startup

    Stack for thread: 363 smx
    0x0000000100062b94 (oninit)yield_processor_mvp
    0x000000010006e25c (oninit)mt_wait
    0x0000000100680964 (oninit)smx_thread
    0x0000000100c62718 (oninit)listen_verify
    0x0000000100c6133c (oninit)spawn_thread
    0x0000000100fea500 (oninit)th_init_initgls
    0x00000001017c2860 (oninit)startup

Thread states in onstat -g ath:

    363 700000116120028 70000011356f728 1 cond wait smx pipe1 11cpu smx
    369 7000001161ff850 700000113572b78 1 sleeping secs: 1 31cpu smxRecvSnd
    370 700000114a56148 700000113573430 3 cond wait smx pipe1 11cpu smxsnd hdr_ausp3

Next, examine the condition "smx pipe1". There can be multiple conditions with the same name, so use the thread IDs of the waiters to find the correct one:

    Conditions with waiters:
    cid   addr             name       waiter  waittime
    ...
    1568  7000001161e6648  smx pipe1  363     1048602
                                      370     1048602
                                      321     1048602

A different thread, thread 321, is also waiting on this "smx pipe1" condition.
If we look at what that thread is and at its stack, we see the following:

    Stack for thread: 321 Notification
    0x0000000100062b94 (oninit)yield_processor_mvp
    0x000000010006e25c (oninit)mt_wait
    0x0000000100674524 (oninit)smx_connect
    0x00000001006b0de8 (oninit)cloneWakeupRSSRetry
    0x00000001002bbf80 (oninit)cdrExstmt
    0x00000001006b0fe0 (oninit)cloneRSSRetry
    0x00000001006927b0 (oninit)cloneNotificationThread
    0x0000000100fea500 (oninit)th_init_initgls
    0x00000001017c2860 (oninit)startup

The presence of this "Notification" thread waiting in smx_connect is likely a key indicator that this problem has been hit.
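The zero-traffic signature described above can be checked mechanically against a saved copy of the onstat -g smx output. The following is a minimal sketch, not an IBM-supplied tool: the function name find_idle_smx_pipes and the SAMPLE text are hypothetical, and the parser assumes the output has one "label: value" pair per line with the labels shown in this APAR. A pipe that has only just been established would also show zero counters, so a match is a hint to inspect the threads, not proof of the defect.

```python
# Counters that all remain zero on a pipe that never carried data.
TRAFFIC_COUNTERS = (
    "Total bytes sent",
    "Total bytes received",
    "Total buffers sent",
    "Total buffers received",
    "Total write calls",
    "Total read calls",
)


def find_idle_smx_pipes(onstat_text):
    """Return (peer, address) pairs for SMX connections with zero traffic."""
    connections = []
    peer = None
    current = None
    for raw in onstat_text.splitlines():
        label, sep, value = raw.strip().partition(":")
        if not sep:
            continue  # not a "label: value" line
        label, value = label.strip(), value.strip()
        if label == "Peer server name":
            peer = value
            current = None
        elif label == "SMX connection address":
            current = {"peer": peer, "address": value, "counters": {}}
            connections.append(current)
        elif current is not None and label in TRAFFIC_COUNTERS:
            if value.isdigit():
                current["counters"][label] = int(value)

    idle = []
    for conn in connections:
        counters = conn["counters"]
        # Require every counter to be present and zero, so that a
        # truncated capture is not misreported as an idle pipe.
        if all(counters.get(name) == 0 for name in TRAFFIC_COUNTERS):
            idle.append((conn["peer"], conn["address"]))
    return idle


# Hypothetical capture, reduced to the fields the check needs and
# using the values quoted in the error description.
SAMPLE = """\
Peer server name: Rssnode1
SMX connection address: 0x700000116143480
Total bytes sent: 0
Total bytes received: 0
Total buffers sent: 0
Total buffers received: 0
Total write calls: 0
Total read calls: 0
SMX connection address: 0x70000090f1f3a30
Total bytes sent: 7110578
Total bytes received: 76965
Total buffers sent: 11990
Total buffers received: 4853
Total write calls: 3545
Total read calls: 4853
"""

print(find_idle_smx_pipes(SAMPLE))
# prints [('Rssnode1', '0x700000116143480')]
```

If a connection is reported, the onstat -g ath and stack checks described above remain the way to confirm that the pipe setup is actually stuck.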
Local fix
Problem summary
****************************************************************
* USERS AFFECTED:                                              *
* Users of Informix Server prior to 12.10.xC15 and 14.10.xC5.  *
****************************************************************
* PROBLEM DESCRIPTION:                                         *
* See Error Description                                        *
****************************************************************
* RECOMMENDATION:                                              *
* Upgrade to Informix Server 12.10.xC15 (when available) or    *
* 14.10.xC5.                                                   *
****************************************************************
Problem conclusion
Fixed in Informix Server 12.10.xC15 and 14.10.xC5.
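To check an installed level against the fix levels named above, a small comparison can be scripted. This is a sketch under stated assumptions: the function name is_affected is hypothetical, and it assumes the server reports its version in a form such as 12.10.FC14, where the letter before "C" stands in for the "x" used in this APAR. Release families that this APAR does not name are reported as undetermined, not guessed.

```python
import re

# First fixed level per release family, taken from the APAR text.
FIRST_FIXED = {(12, 10): 15, (14, 10): 5}

# Matches e.g. "12.10.FC14", "14.10.FC4W1", or the generic "12.10.xC15".
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.[A-Za-z]C(\d+)")


def is_affected(version):
    """Return True or False for known families, None if undetermined."""
    match = _VERSION_RE.search(version)
    if match is None:
        return None  # version string not in the assumed format
    major, minor, level = (int(group) for group in match.groups())
    fixed_level = FIRST_FIXED.get((major, minor))
    if fixed_level is None:
        return None  # family not covered by this APAR's fix statement
    return level < fixed_level


for candidate in ("12.10.FC14", "12.10.FC15", "14.10.FC4W1", "14.10.FC5"):
    print(candidate, is_affected(candidate))
```

Returning None for unrecognized input keeps the check from silently classifying a release that the fix statement does not mention.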
Temporary fix
Comments
APAR Information
APAR number
IT33316
Reported component name
INFORMIX SERVER
Reported component ID
5725A3900
Reported release
C10
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2020-06-24
Closed date
2021-01-06
Last modified date
2021-01-06
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
INFORMIX SERVER
Fixed component ID
5725A3900
Applicable component levels
Document Information
Modified date:
11 January 2021