Reorg replay may cause performance issue on a HADR standby

Question & Answer

Question

Why is there a huge HADR standby log gap and the replay is very slow?

Cause

In an HADR configuration, DB2 on the standby server can suffer from slow replay of REORG TABLE operations on tables with clustered indexes or tables where the reorg command on the primary had an index specified.
Both of these REORG operations use Row ID(RID) LIST processing to reproduce the REORG result on the standby instance. When tables get larger the RID LIST processing can begin to show performance degradation.
For example, you may have the following trace output.
58 0.004625687 sqldReorgRID entry [eduid 64657 eduname db2agent]
59 0.004625976 | sqldFetchDirect entry [eduid 64657 eduname db2agent]
60 0.004626843 | | sqlbfix entry [eduid 64657 eduname db2agent]
61 0.004627103 | | sqlbfix data [probe 100]
62 0.004628500 | | sqlbfix exit
63 0.004628937 | sqldFetchDirect exit
64 0.004629113 | sqldReorgRecord entry [eduid 64657 eduname db2agent]
65 0.004629281 | | sqldReorgFormat entry [eduid 64657 eduname db2agent]
66 0.004629406 | | sqldReorgFormat exit
67 0.004629593 | | sqldCompressRecInternal entry [eduid 64657 eduname db2agent]
68 0.004629771 | | | SPCompressRec entry [eduid 64657 eduname db2agent]
69 0.004633664 | | | SPCompressRec exit
70 0.004633957 | | sqldCompressRecInternal exit
71 0.004634140 | sqldReorgRecord exit
72 0.004634316 sqldReorgRID exit

Answer

It's not defect. Here are some possible workarounds that can effectively result in a reorganized table on the standby without the performance impacts of attempting to do a REORG TABLE as described across the HADR pair.

1) We make sure that a standby machine is at least the same as the primary machine and that especially, the standby Buffer Pool (BP) is at least the same size as the primary BP. This is critical. If the standby BP size is less than the primary BP size then this will always present a bottle neck for RIDLIST replay processing. This configuration adjustment is clearly the best and most tested way to improve upon the RIDLIST replay processing.

2) The large tables that they are REORGing do NOT have a clustering index or that an Index is NOT specified on the REORG command. If there is no index specified in the reorg command and there is no clustering index defined on the table then there will not be any RIDLIST processing since there will be no RIDLIST log records associated with the reorg.

3) Use OTM (online table move) instead of REORG when we need to REORG a table that has a clustering index or where an index would be specified on the REORG command. This also avoids the RIDLIST log records. For example, the following command.
db2 "call SYSPROC.ADMIN_MOVE_TABLE('DB2INST1','SALES','','','MOVE')"

Here are additional documents on OTM(ADMIN_MOVE_TABLE)

- ADMIN_MOVE_TABLE procedure - Move tables online
http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.sql.rtn.doc/doc/r0055069.html?cp=SSEPGG_10.5.0%2F3-5-1-3-0-16

- Distributed DBA: Table movement made easy
http://www.ibm.com/developerworks/data/library/dmmag/DMMag_2011_Issue4/DistributedDBA/index.html?ca=..

[{"Product":{"code":"SSEPGG","label":"Db2 for Linux, UNIX and Windows"},"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Component":"HADR - Other","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF010","label":"HP-UX"},{"code":"PF016","label":"Linux"},{"code":"PF027","label":"Solaris"},{"code":"PF033","label":"Windows"}],"Version":"9.7;9.5;10.1;10.5","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Tips

Reorg replay may cause performance issue on a HADR standby

Question & Answer

Question

Cause

Answer

Was this topic helpful?

Document Information

UID

Share your feedback

Need support?