APAR status
Closed as program error.
Error description
A failover occurs from the primary to the standby queue manager in an HA/DR environment. This is soon followed by a failover back to the primary queue manager. The IBM MQ Appliance system log contains entries such as these:

Dec 16 16:50:51 (none) pengine[24787]: warning: unpack_rsc_op: Processing failed op monitor for QMGR on x.y.a.b: not running (7)
Dec 16 16:50:51 (none) pengine[24787]: error: color_instance: Pre-allocation failed: got x.y.a.b instead of x.y.a.b
Dec 16 16:50:51 (none) pengine[24787]: notice: LogActions: Demote QMGR_drbd:0#011(Master -> Slave x.y.a.b)
Dec 16 16:50:51 (none) pengine[24787]: notice: LogActions: Promote QMGR_drbd:1#011(Slave -> Master x.y.a.b)
Dec 16 16:50:51 (none) pengine[24787]: notice: LogActions: Stop QMGR_fs#011(x.y.a.b)
Dec 16 16:50:51 (none) pengine[24787]: notice: LogActions: Stop QMGR#011(x.y.a.b)
Dec 16 16:50:51 (none) pengine[24787]: notice: LogActions: Move QMGR_DR_IP#011(Started x.y.a.b -> x.y.a.b)
Dec 16 16:50:51 (none) pengine[24787]: notice: LogActions: Demote QMGR_DR_drbd:0#011(Master -> Slave x.y.a.b - blocked)
Dec 16 16:50:51 (none) pengine[24787]: notice: LogActions: Move QMGR_DR_drbd:0#011(Slave x.y.a.b -> x.y.a.b)
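To check whether a saved copy of the system log contains this pattern, the LogActions entries can be extracted with a short script. The following is a minimal sketch, not an IBM-supplied tool: the function name `find_failover_actions` is hypothetical, and the regular expression assumes only the entry format shown in the log excerpt above, where `#011` is the syslog escape for a tab character.

```python
import re

# Matches pengine "LogActions" entries of the form shown in this APAR, e.g.
#   pengine[24787]: notice: LogActions: Promote QMGR_drbd:1#011(Slave -> Master x.y.a.b)
# The separator before "(" may be the literal text "#011" (an escaped tab)
# or real whitespace, so both are accepted.
LOG_ACTION = re.compile(
    r"pengine\[\d+\]:\s+notice:\s+LogActions:\s+"
    r"(?P<action>\w+)\s+(?P<resource>[^\s#(]+)(?:#011|\s)*\((?P<detail>[^)]*)\)"
)

def find_failover_actions(log_text):
    """Return (action, resource, detail) tuples for resource state changes.

    Hypothetical helper: it scans the whole text, so it works whether the
    entries are on separate lines or run together on one line.
    """
    wanted = {"Promote", "Demote", "Move", "Stop"}
    return [
        (m.group("action"), m.group("resource"), m.group("detail"))
        for m in LOG_ACTION.finditer(log_text)
        if m.group("action") in wanted
    ]

sample = (
    "Dec 16 16:50:51 (none) pengine[24787]: notice: LogActions: "
    "Demote QMGR_drbd:0#011(Master -> Slave x.y.a.b) "
    "Dec 16 16:50:51 (none) pengine[24787]: notice: LogActions: "
    "Promote QMGR_drbd:1#011(Slave -> Master x.y.a.b)"
)

for action, resource, detail in find_failover_actions(sample):
    print(action, resource, detail)
```

A Demote of one DRBD resource instance paired with a Promote of another at the same timestamp, as in the sample, is the signature of the queue manager moving between appliances.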
Local fix
It may be possible to work around this problem, or to reduce the likelihood of it occurring, by fixing any network issues that affect connectivity between the IBM MQ Appliances.
Problem summary
****************************************************************
USERS AFFECTED:
Users of the IBM MQ Appliance who have configured a combination of HA and DR and who have unreliable network connectivity may be affected by this problem.

Platforms affected:
MultiPlatform
****************************************************************
PROBLEM DESCRIPTION:
A transient DR ping failure could have resulted in a queue manager briefly starting on the HA secondary appliance before switching back to the HA primary appliance. If a network is particularly unreliable then this failover behaviour may have happened frequently.
Problem conclusion
The code that detects whether the DR secondary IBM MQ Appliance can be contacted was modified so that it is more tolerant of transient network failures.

---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:

Version    Maintenance Level
v8.0       8.0.0.7
v9.0 CD    9.0.2

The latest available maintenance can be obtained from 'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037

If the maintenance level is not yet available, information on its planned availability can be found in 'WebSphere MQ Planned Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------
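This APAR does not describe how the detection code was changed. As a general illustration only, one common way to tolerate transient failures is to treat a peer as unreachable only after several consecutive failed probes, so that a single lost ping does not trigger a failover. The sketch below is not IBM's implementation: the class name `ReachabilityMonitor` and the threshold of 3 are assumptions chosen for the example.

```python
class ReachabilityMonitor:
    """Illustrative debounce for probe results (hypothetical, not IBM code).

    The peer is reported unreachable only after `failure_threshold`
    consecutive failed probes; any successful probe resets the count.
    """

    def __init__(self, failure_threshold=3):
        if failure_threshold < 1:
            raise ValueError("failure_threshold must be at least 1")
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0

    def record(self, probe_succeeded):
        """Record one probe result and return True while the peer counts as reachable."""
        if probe_succeeded:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
        return self.consecutive_failures < self.failure_threshold


monitor = ReachabilityMonitor(failure_threshold=3)
probes = [True, False, True, False, False, False]
states = [monitor.record(ok) for ok in probes]
print(states)  # [True, True, True, True, True, False]
```

With a threshold of 1 the same sequence would report the peer unreachable at the first failed probe, which corresponds to the behaviour described in the problem summary, where one transient DR ping failure was enough to move the queue manager.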
Temporary fix
Comments
APAR Information
APAR number
IT19169
Reported component name
IBM MQ APPL M20
Reported component ID
5725S1400
Reported release
800
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2017-02-08
Closed date
2017-02-28
Last modified date
2017-06-01
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
IBM MQ APPL M20
Fixed component ID
5725S1400
Applicable component levels
R800 PSY
UP
Document Information
Modified date:
01 June 2017