Technical Blog Post
Abstract
Why DB2 LUW online reorg is getting paused on it's own ?
Body
A common question comes about why an online reorg is getting paused even if nobody manually tried to pause it.
There is a manual way to pause and resume online reorg with the help of commands,
reorg table <tab-name> inplace pause
reorg table <tab-name> inplace resume
When an online table reorg operation is paused, one cannot run a new reorganization of that table. One must either resume or stop the paused operation before beginning a new reorganization process
There are situations when an online reorg could be paused on it's own.
If command "db2pd -reorgs -db <db-name>" is run it might show tables in "Online Paused" instead of "Online Started"
Sometimes, it's thought to be paused as part of auto-maintenance window limit reached. But, that is not usually the case as the data/index reorg happens during offline windows. Only index reorgs are run during online windows.
So the reason of reorg pause is due to some errors or, similar inconveniences, if experienced by the reorg.
One common example is lock timeout.
Checking db2diag.log might show following kind of messages,
2016-06-10-20.27.59.185456-240 I13042820031A481 LEVEL: Severe
PID : 18481200 TID : 90311 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : MYDB
APPHDL : 0-4450 APPID: *LOCAL.DB2.170718000001
AUTHID : DB2INST1 HOSTNAME: host1
EDUID : 80211 EDUNAME: db2reorg (MYDB) 0
FUNCTION: DB2 UDB, data management, sqldOnlineTableReorg, probe:95
MESSAGE : Table: MYSCHEMA.MYTABLE
2016-06-10-20.27.59.185770-240 I13042820513A888 LEVEL: Severe
PID : 18481200 TID : 90311 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : MYDB
APPHDL : 0-4450 APPID: *LOCAL.DB2.170718000001
AUTHID : DB2INST1 HOSTNAME: host1
EDUID : 80211 EDUNAME: db2reorg (MYDB) 0
FUNCTION: DB2 UDB, trace services, sqlt_logerr_data (secondary logging
function), probe:30
DATA #1 : SQLCA, PD_DB2_TYPE_SQLCA, 136 bytes
sqlcaid : SQLCA sqlcabc: 136 sqlcode: -911 sqlerrml: 2
sqlerrmc: 68
sqlerrp : SQLDOLRT
sqlerrd : (1) 0x80100044 (2) 0x00000044 (3) 0x00000000
(4) 0x00000000 (5) 0x00000000 (6) 0x00000000
sqlwarn : (1) (2) (3) (4) (5) (6)
(7) (8) (9) (10) (11)
sqlstate:
It's an SQL0911 rc=68 which is a lock timeout.
Consider increasing the lock timeout value in the database if it's too low provided increasing lock timeout time is permitted.
Otherwise, try to reduce concurrency during the run of the online reorg.
If paused, just resume manually.
UID
ibm13286743