Technical Blog Post
Abstract
Why does event monitor for statistics only work for workload at position 1 and not for all the other workloads ?
Body
Here is an example of the problem description:
Create Workload:
--
db2 "SET WORKLOAD TO SYSDEFAULTADMWORKLOAD"
db2 "CREATE WORKLOAD WL_A SESSION_USER('DB2V10')"
db2 "CREATE WORKLOAD WL_B SESSION_USER('JKGERA')"
db2 "ALTER WORKLOAD WL_A COLLECT AGGREGATE ACTIVITY DATA BASE"
db2 "ALTER WORKLOAD WL_B COLLECT AGGREGATE ACTIVITY DATA BASE"
db2 "GRANT USAGE ON WORKLOAD WL_A TO PUBLIC"
db2 "GRANT USAGE ON WORKLOAD WL_B TO PUBLIC"
db2 "SET WORKLOAD TO AUTOMATIC"
Create an event monitor :
---
db2 "create event monitor mon_wlm_statistics
for statistics
write to table
qstats (table db.mon_statistics_qstats_pmr in userspace1),
scmetrics (table db.mon_statistics_scmetrics_pmr in userspace1),
wcstats (table db.mon_statistics_wcstats_pmr in userspace1),
wlmetrics (table db.mon_statistics_wlmetrics_pmr in userspace1),
scstats (table db.mon_statistics_scstats_pmr in userspace1),
wlstats (table db.mon_statistics_wlstats_pmr in userspace1),
control (table db.mon_statistics_control_pmr in userspace1)
nonblocked autostart"
db2 "set event monitor mon_wlm_statistics state 1"
db2 "UPDATE DATABASE CONFIGURATION USING WLM_COLLECT_INT 5 IMMEDIATE"
Run workload:
--
As both users 'DB2V10' and 'JKGERA' running:
db2 "select count(*) from syscat.tables, syscat.tables, syscat.tables"
1
-----------
267089984
1 record(s) selected.
We can see all the sessions that are available:
$ db2 "select substr(workload_name,1,10) as workload_name, system_auth_id from table (wlm_get_service_class_workload_occurrences('', '',-2)) as scinfo"
WORKLOAD_NAME SYSTEM_AUTH_ID
------------- -----------------
WL_A DB2V10
WL_A DB2V10
WL_B JKGERA
But when we query the event monitor, only the workload at the lowest position shows up:
$ db2 "select statistics_timestamp, substr(workload_name,1,10) as workload_name, total_cpu_time from db.mon_statistics_wlstats_pmr order by statistics_timestamp desc"
STATISTICS_TIMESTAMP WORKLOAD_NAME TOTAL_CPU_TIME
-------------------------- ------------- --------------------
2016-02-15-15.50.00.949172 WL_A 128297
1 record(s) selected.
>><<
$ db2 "select statistics_timestamp, substr(workload_name,1,10) as workload_name, total_cpu_time from db.mon_statistics_wlstats_pmr order by statistics_timestamp desc"
STATISTICS_TIMESTAMP WORKLOAD_NAME TOTAL_CPU_TIME
-------------------------- ------------- --------------------
2016-02-15-15.55.01.023037 WL_A 202158290
2016-02-15-15.50.00.949172 WL_A 128297
2 record(s) selected.
Now if we alter the position for WL_B to 1:
$ db2 "alter workload WL_B position at 1"
$ db2 "select statistics_timestamp, substr(workload_name,1,10) as workload_name, total_cpu_time from db.mon_statistics_wlstats_pmr order by statistics_timestamp desc"
STATISTICS_TIMESTAMP WORKLOAD_NAME TOTAL_CPU_TIME
-------------------------- ------------- --------------------
2016-02-15-16.00.00.077792 WL_B 76213905
2016-02-15-15.55.01.023037 WL_A 202158290
2016-02-15-15.50.00.949172 WL_A 128297
If I alter the Positions as WL_A as 10 and WL_B as 20:
db2 "alter workload WL_A position at 10"
DB20000I The SQL command completed successfully.
db2 "alter workload WL_B position at 20"
DB20000I The SQL command completed successfully.
The event monitor only lists the one at the lower position:
$ db2 "select statistics_timestamp, substr(workload_name,1,10) as workload_name, total_cpu_time from db.mon_statistics_wlstats_pmr order by statistics_timestamp desc"
STATISTICS_TIMESTAMP WORKLOAD_NAME TOTAL_CPU_TIME
-------------------------- ------------- --------------------
2016-02-15-16.05.00.133662 WL_A 16678
2016-02-15-16.00.00.077792 WL_B 76213905
2016-02-15-15.55.01.023037 WL_A 202158290
2016-02-15-15.50.00.949172 WL_A 128297
=====
This is because the event monitor is created with "nonblocked" option which can loss record if the buffer is not big enough to hold all the records.
db2 "create event monitor mon_wlm_statistics
for statistics
write to table
qstats (table db.mon_statistics_qstats_pmr in userspace1),
scmetrics (table db.mon_statistics_scmetrics_pmr in userspace1),
wcstats (table db.mon_statistics_wcstats_pmr in userspace1),
wlmetrics (table db.mon_statistics_wlmetrics_pmr in userspace1),
scstats (table db.mon_statistics_scstats_pmr in userspace1),
wlstats (table db.mon_statistics_wlstats_pmr in userspace1),
control (table db.mon_statistics_control_pmr in userspace1)
nonblocked autostart"
If you look at the event monitor control table, you should see the overflows message in it.
$ db2 "select * from db.mon_statistics_control_pmr"
PARTITION_KEY EVENT_MONITOR_NAME MESSAGE MESSAGE_TIME PARTITION_NUMBER
------------- -------------------------------------------------------------------------------------------------------------------------------- ---------------------------------- ---------------------------------------------------------------------------------------------- -------------------------- ----------------
0 MON_WLM_STATISTICS FIRST_CONNECT 2016-02-25-15.26.47.388091 0
0 MON_WLM_STATISTICS EVMON_START 2016-02-25-15.33.26.881491 0
0 MON_WLM_STATISTICS OVERFLOWS:129 2016-02-25-15.35.00.226871 0
In v105, the statistics event monitor was changed to use fast writer to write out the monitor record, which is always blocking so the "nonblocked" option will be ignore in v10.5
Changing the event monitor to 'blocked" will resolve the issue and we should be able to get all the records.
UID
ibm11140874