Technical Blog Post
Abstract
DB2 Insert Performance due to FSCR and how to identify that
Body
It's well documented that one of the major reasons for DB2's slow insert performance is how the Free Space Cache Record (FSCR) are searched before a record is inserted.
This Technote explains it very well,
Also, a nice description in the DB2's Knowledge Center,
But, how to identify if this FSCR issue is being experienced ?
A collection of db2's stacks will help to identify that the issue could be related to FSCR.
Focusing on a specific INSERT might not help.
That is the reason a wider stack "db2pd -stack all" should be collected and reviewed.
Look for stacks showing up with function sqldSearchPageCache() in large number of sessions.
Any stack with sqldSearchPageCache is "normal", and means that we're trying to find free space, likely for an insert. The problem is when lots of agents are in the same stack, then it likely means that they're contending on the actual page that has free space. (ie, many agents trying to insert into the same page). This can be fixed by enabling append mode as pointed out in the documents referred.
Another scenario is similar, but all the agents are spread out over different pages. This is more of an efficiency statement, in that we're constantly scanning for (and not finding) free space, and will end up in temporary append mode.
The following stack indicates a first case -- we've found free space but the page with the free space isn't in memory. Agents will be waiting on the I/O.
0x090000000003AD18 pread64 + 0x38
0x090000001899DD00 sqloReadBlocks + 0x1BC
0x090000001899B4D4 sqlbReadPageInternal + 0x530
0x090000001899A314 sqlbReadPageInternal + 0xEC8
0x0900000018990AA0 sqlbGetPageFromDisk__FP11SQLB_FIX_CBi + 0xBDC
0x090000001899073C sqlbGetPageFromDisk__FP11SQLB_FIX_CBi + 0x878
0x09000000185A8838 sqlbfix__FP11SQLB_FIX_CB + 0x26E4
0x0900000018730744 sqldSearchPageCache__FP13SQLD_DFM_WORKP15SQLD_PAGE_CACHEP16SQLD_TABLE_CACHET3 + 0x5218
0x09000000187476D8 sqldSearchPageCache__FP13SQLD_DFM_WORKP15SQLD_PAGE_CACHEP16SQLD_TABLE_CACHET3 + 0x14
0x09000000186F7DD8 sqldInsertRow__FP13SQLD_DFM_WORKi + 0x30C
The following stack is similar to the first, but we are one step behind -- we are stuck trying to find a victim slot to use when we read the page from disk,
0x0900000000110E94 thread_wait + 0x94
0x0900000017BC4998 getConflictComplex__17SQLO_SLATCH_CAS64FCUl + 0x290
0x09000000184C8A08 getConflict__17SQLO_SLATCH_CAS64FCUl + 0x118
0x09000000189797BC sqlbGetVictimSlot__FP11SQLB_FIX_CBiPUiPP8SQLB_BPDUlCP13SQLB_PAGE_KEY + 0x2CF0
0x090000001899E6A0 sqlbGetPageFromDisk__FP11SQLB_FIX_CBi + 0x3C8
0x09000000185A8838 sqlbfix__FP11SQLB_FIX_CB + 0x26E4
0x090000001870F228 sqldSearchPageCache__FP13SQLD_DFM_WORKP15SQLD_PAGE_CACHEP16SQLD_TABLE_CACHET3 + 0x2A4
0x0900000018710B3C sqldSearchPageCache__FP13SQLD_DFM_WORKP15SQLD_PAGE_CACHEP16SQLD_TABLE_CACHET3 + 0x16C
0x09000000186F7DD8 sqldInsertRow__FP13SQLD_DFM_WORKi + 0x30C
0x09000000186F6404 sqldRowInsert__FP8sqeAgentUsT2UcUliPP10SQLD_VALUEP8SQLZ_RIDPPv + 0x26C
These are just potential indicators of "slow insert" - waiting on I/O and waiting on victim slot to read a page that might end up getting used for the insert. And, actions suggested against FCDR documentations above should help.
UID
ibm11140664