Technical Blog Post
Abstract
[One-page problem in plan optimization] Will an access plan remain unchanged after statistics are collected, if no further statistics are collected and auto_runstats is disabled?
Body
We are sometimes asked questions about plan stability, for example:
- We can collect statistics and check the access plan, but if we never collect statistics again, will the plan stay stable?
- Does applying db2look -m output generate the same plan on another database?
The answer is no, not always. Even after statistics have been collected, DB2 may fabricate statistics, and that can change the access plan if the data initially fits into the table's single allocated page and grows beyond it later.
Here is an outline of a repro scenario for the symptom. Note that it is reproducible as far back as v8.2 and is still reproducible on DB2 v11.1.1.
1. Create a table (or truncate an existing one).
2. Import/insert/load data into the table so that only one page is used to store the data, then run RUNSTATS against the table.
3. Collect db2exfmt output for an update query's access plan and see that IXSCAN is chosen.
4. Import much more data, say 100,000 rows.
5. Collect db2exfmt output for the same update query again and see that TBSCAN is now chosen: the plan has changed without any additional RUNSTATS.
The plan change is caused by statistics fabrication: DB2 maintains an insert/update/delete (IUD) counter and fabricates the table cardinality based on the initially collected statistics. The filter factor, however, is preserved in this scenario, so the optimal plan may drift away from the initial one.
To avoid the problem, collect statistics again after loading/importing/inserting enough rows to grow the table beyond its initial single page.
You can recognize the situation from the Extended Diagnostic Information section of the db2exfmt output, whose diagnostic message reports fabricated statistics: "This can lead to poor cardinality and predicate filtering estimates. The size of the table changed significantly since the last time the RUNSTATS command was run."
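The mechanism can be sketched in a few lines of Python. The fabrication formula below (catalog cardinality plus the net IUD count) is a deliberate simplification for illustration, not DB2's documented algorithm:

```python
# Simplified sketch of statistics fabrication; the formula here is an
# illustrative assumption, not DB2's actual algorithm.

def fabricate_card(runstats_card: int, iud_counter: int) -> int:
    """Cardinality the optimizer assumes once the table has grown well
    past what the last RUNSTATS recorded."""
    return runstats_card + iud_counter

def row_estimate(card: int, filter_factor: float) -> float:
    """The filter factor from the original statistics is reused as-is,
    so the estimate scales linearly with the fabricated cardinality."""
    return card * filter_factor

# With fresh stats the qualifying-row estimate is tiny; after heavy
# inserts it balloons, which can flip the plan from IXSCAN to TBSCAN.
print(row_estimate(fabricate_card(5, 0), 0.6))        # small estimate
print(row_estimate(fabricate_card(5, 100_000), 0.6))  # large estimate
```

The key point the sketch captures is that only the cardinality is refreshed from the IUD counter; the filter factor stays frozen at its RUNSTATS-time value.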
// repro steps: two parts
// step 1: insert five rows into a brand-new table called a.a and collect db2exfmt output with the db2caem utility
db2 connect to sample
db2 -tvf $HOME/sqllib/misc/EXPLAIN.DDL
db2 update db cfg for sample using AUTO_SAMPLING off AUTO_STATS_VIEWS off AUTO_STMT_STATS off AUTO_RUNSTATS off AUTO_TBL_MAINT off AUTO_MAINT off
db2 terminate
db2stop force
db2start
db2 connect to sample
// exp1.sql exports expa.del, the data file imported in step 2
db2 -tvf exp1.sql
db2 "drop table a.a"
db2 "create table a.a ( key integer not null, val char(254) not null with default) in userspace1"
db2 "create index a.ix1a on a.a (key)"
db2 "insert into a.a values ( 0,'1'),( 1,'1'),( 2,'2'),(3,'3'),(4,'4')"
db2 "runstats on table a.a and indexes all"
db2look -d sample -e -z a -t a -m -o db2look.1.out
mkdir u1caemout
db2caem -d sample -sf u1.sql -o u1caemout
// The access plan of the UPDATE is an index scan (IXSCAN), as shown in the first db2caem output.
                        Rows
                     Rows Actual
                       RETURN
                       (   1)
                        Cost
                         I/O
                         |
                          3
                          0
                       UPDATE
                       (   2)
                       31.3216
                         NA
                     /---+----\
                    3          5
                    3          NA
                 FETCH     TABLE: A
                 (   3)        A
                 9.27544       Q1
                   NA
               /---+----\
              3          5
              3          NA
           IXSCAN    TABLE: A
           (   4)        A
           1.80484       Q2
             NA
             |
             5
             NA
         INDEX: A
           IX1A
             Q2
Predicates:
----------
2) Stop Key Predicate,
        Comparison Operator:             Less Than (<)
        Subquery Input Required:         No
        Filter Factor:                   0.6

        Predicate Text:
        --------------
        (Q2.KEY < 3)
// step 2: import 100,000 additional rows and collect db2exfmt output with the db2caem utility, without running RUNSTATS
db2 -tvf exp1.sql
db2 "import from expa.del of del commitcount 10000 insert into a.a"
db2look -d sample -e -z a -t a -m -o db2look.2.out
mkdir u1caemout2
db2caem -d sample -sf u1.sql -o u1caemout2
// The access plan of the UPDATE becomes a table scan (TBSCAN), as shown in the second db2caem output, and the Extended Diagnostic Information section now appears.
                 Rows
              Rows Actual
                RETURN
                (   1)
                 Cost
                  I/O
                  |
               60027.6
                  0
                UPDATE
                (   2)
                453339
                  NA
              /---+----\
        60027.6        100046
           3             NA
        TBSCAN       TABLE: A
        (   3)           A
        12568.6          Q1
          NA
          |
        100046
          NA
       TABLE: A
          A
          Q2
Extended Diagnostic Information:
--------------------------------
Diagnostic Identifier: 1
Diagnostic Details: EXP0045W The table named "A "."A" has
fabricated statistics. This can lead to poor
cardinality and predicate filtering estimates. The
size of the table changed significantly since the
last time the RUNSTATS command was run.
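The estimates in both plans can be cross-checked with plain arithmetic on the numbers reported above; nothing DB2-specific is involved:

```python
# Filter factor 0.6 is preserved from the first plan; only the table
# cardinality changes, from the 5 rows seen by RUNSTATS to the
# fabricated 100046 reported once the table has grown.
filter_factor = 0.6

first_estimate = 5 * filter_factor        # the 3-row IXSCAN/FETCH estimate
second_estimate = 100046 * filter_factor  # the 60027.6-row TBSCAN estimate
print(first_estimate, second_estimate)
```

The match between 100046 × 0.6 and the 60027.6 in the TBSCAN plan shows that the filter factor was carried over unchanged while the cardinality was fabricated.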
// supplemental data
// u1.sql, the file containing the UPDATE statement
update a.a set val='a'||substr(val,2,253) where key between 1 and 2
;
// exp1.sql, the file containing an export statement that generates the 100,000 rows to import
export to expa.del of del
with w1 as (
select row_number() over( order by a.tabname ) number from syscat.columns a, syscat.columns b
fetch first 100000 rows only
)
, w2 as (
select
(number + 4 ) key
,char(int(number + 4)) val
from w1
)
select key, val from w2
;
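If you would rather not route the data generation through the database, an equivalent file can be produced with a few lines of plain Python. This is a sketch matching exp1.sql's logic (keys 5 through 100004); the CHAR blank-padding produced by the SQL version is omitted here, since the CHAR(254) column pads on insert anyway:

```python
# Generate the 100,000-row delimited file expa.del without the SQL
# export: integer key unquoted, character val quoted, one row per line.
with open("expa.del", "w") as f:
    for n in range(1, 100001):       # row_number() 1..100000
        key = n + 4                  # matches "number + 4" in exp1.sql
        f.write(f'{key},"{key}"\n')
```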
UID: ibm11140418