IBM Support

BACKUP might be affected by fragmentations in table space

Troubleshooting


Problem

With too many fragmentations in some table space, the performance of BACKUP command on the table space or the entire database would be affected. The problem is particularly serious in table spaces defined with small extent size, for example DB2 database bundled in SAP systems where the default extent size is 2. The problem could happen to both "offline" backup and "online" backup.

Symptom

The BACKUP is affected because of slow "read" from table space containers.
By monitoring the IO activities ('iostat', 'topas' etc) on table space containers, a relatively low "read" speed will be observed.

Cause

The problem is caused by fragmented reads on table space containers, which is the result of "fragmentations" on the table space.
Here is an example of "fragmentations" in table space, which is a snippet of 'db2dart <DBName> /DHWM /TSI <TbSpaceID>' output:
===========================================
[100] == EMPTY == [101] 1251 0x00 [102] == EMPTY == [103] 406 0x00
[104] == EMPTY == [105] 1578 0x00 [106] == EMPTY == [107] == EMPTY ==
[108] == EMPTY == [109] == EMPTY == [110] 971 0x00 [111] == EMPTY ==
[112] == EMPTY == [113] 1578 0x00 [114] == EMPTY == [115] 1578 0x00
===========================================

The fragmentations were caused by free extents (e.g. the extent #102 in above example) among the data extents.
Data extents are extents that belong to some object, for example the extent #101 (which belongs to object ID 1251) in above example.

In above example, when BACKUP command retrieves data between extent #100 and extent #115 from this table space, it needs to split the range into multiple slots and read each of them individually:
1) read data from extent #101
2) read data from extent #103
3) read data from extent #105
4) read data from extent #110
5) read data from extent #113
6) read data from extent #115
So, DB2 needs 6 individual reads to retrieve all the data.
However if all free extents were data extents, all extents between #100 and #115 would be regarded as contiguous data extents and could be retrieved in a single read operation, which is much more efficient than 6 individual reads.

Please note: not all "EMPTY" extents appear in 'db2dart /DHWM' output are "free" extents.
"EMPTY" extents could actually be in "pending delete" state. More details about the "pending delete" state can be found in following TechNote:
How is free space in a DMS tablespace managed?
The "pending delete" state is an internal flag. End users usually issue the "list tablespaces show detail" command to guarantee the "EMPTY" extents are freed, as described in above document.
Only "free" extents will cause fragmentations. Extents in "pending delete" state will not cause fragmentations.

Diagnosing The Problem

1. Collect "db2diag.log" file that covered the slow BACKUP, which contains backup statistics similar to the following:

===========================================
2014-01-01-01.01.01.123456+480 E246213A1722 LEVEL: Info
PID : 1234567 TID : 12345 PROC : db2sysc 0
INSTANCE: db2inst1 NODE : 000 DB : SAMPLE
APPHDL : 0-123 APPID: *LOCAL.db2cip.140412121211
AUTHID : DB2INST1
EDUID : 12345 EDUNAME: db2agent (SAMPLE) 0
FUNCTION: DB2 UDB, database utilities, sqluxLogDataStats, probe:281
MESSAGE : Performance statistics
DATA #1 : String, 1261 bytes

Number of buffers = 11
Buffer size = 16777216 (4096 4K pages)

BM# Total I/O MsgQ WaitQ Buffers MBytes
--- -------- -------- -------- -------- -------- --------
000 64685.97 45779.56 18630.33 17.18 185355 2963126
001 64672.37 35003.42 17322.15 12199.02 108901 1741089
002 64685.31 17266.52 16280.71 30993.29 102360 1636193
003 64685.97 17177.64 16160.14 31206.20 100025 1598866
004 64672.36 3985.99 9901.85 50761.61 15806 252758
005 64672.37 2351.99 6683.25 55617.14 15190 242813
006 64672.36 1415.86 6826.28 56410.19 14984 239537
007 64672.37 1274.24 6956.77 56411.07 15197 242925
--- -------- -------- -------- -------- -------- --------
TOT 517419.11 124255.26 98761.51 293615.72 557818 8917311

MC# Total I/O MsgQ WaitQ Buffers MBytes
--- -------- -------- -------- -------- -------- --------
000 64672.36 36039.70 28623.02 0.00 283292 4532656
001 64692.52 35974.11 28621.21 81.78 274529 4392416
--- -------- -------- -------- -------- -------- --------
TOT 129364.89 72013.81 57244.23 81.78 557821 8925072
===========================================

If above statistics report was not seen, please set up DB2 registry variable "DB2_BAR_STATS" by following step:
db2 terminate
db2set DB2_BAR_STATS=ON
db2stop force
db2start

The average speed for "reads" can be calculated by "MBytes/I/O" for each "BM".

2. Collect "high water mark" information of the table spaces:
db2dart <DBName> /DHWM /TSI <TbSpaceID> /RPT . /RPTN <OutputFile>
where:
"<DBName>" is the name of the database
"<TbSpaceID>" is the ID of the table space
"<OutputFile>" is the name of the output file

In the output file if many "EMPTY" extents were seen among data extents, there could be fragmentations.

3. Collect DB2 trace data for 10 seconds when the BACKUP is running:
db2trc on -t -i 1G
sleep 10
db2trc stop
db2trc dump db2trc.dmp
db2trc off
db2trc flw -t db2trc.dmp db2trc.flw
db2trc fmt db2trc.dmp db2trc.fmt

The DB2 trace data need to be analyzed by DB2 support team to identify if the slowness was caused by fragmentations.

Resolving The Problem

The fragmentations can be eliminated by lowering the high water mark:
-- For table spaces enabled with "Reclaimble Storage" (table spaces created in DB2 V9.7 or higher versions), it can be done via "ALTER TABLESPACE LOWER HIGH WATER MARK" statement.
More details about "Reclaimble Storage" and "ALTER TABLESPACE LOWER HIGH WATER MARK" statement can be found in following links:
"Reclaimable storage"
"ALTER TABLESPACE statement"
"Reclaiming unused storage in automatic storage table spaces"
-- For table spaces without "Reclaimble Storage" (table spaces created in version V9.5 or older versions), it can be done via "db2dart /LHWM" command which will come up with ways to reduce the high water mark.


Or, the fragmentations can also be eliminated by rebuilding the table space.

[{"Product":{"code":"SSEPGG","label":"Db2 for Linux, UNIX and Windows"},"Business Unit":{"code":"BU058","label":"IBM Infrastructure w\/TPS"},"Component":"Recovery - Backup","Platform":[{"code":"PF002","label":"AIX"},{"code":"PF010","label":"HP-UX"},{"code":"PF016","label":"Linux"},{"code":"PF027","label":"Solaris"},{"code":"PF033","label":"Windows"}],"Version":"10.1;10.5;9.5;9.7;9.8","Edition":"","Line of Business":{"code":"LOB10","label":"Data and AI"}}]

Document Information

Modified date:
16 June 2018

UID

swg21678274