Monitoring and troubleshooting using db2pd command
The db2pd command is used for troubleshooting because it can return quick and immediate information from the Db2® memory sets.
Overview
The tool collects information without acquiring any latches or using any engine resources. It is therefore possible (and expected) to retrieve information that is changing while db2pd is collecting information; hence the data might not be completely accurate. If changing memory pointers are encountered, a signal handler is used to prevent db2pd from ending abnormally. This can result in messages such as "Changing data structure forced command termination" to appear in the output. Nonetheless, the tool can be helpful for troubleshooting. Two benefits to collecting information without latching include faster retrieval and no competition for engine resources.
If you want to capture information about the database management system when a specific SQLCODE, ZRC code or ECF code occurs, this can be accomplished using the db2pdcfg -catch command. When the errors are caught, the db2cos (callout script) is launched. The db2cos script can be dynamically altered to run any db2pd command, operating system command, or any other command needed to resolve the problems. The template db2cos script file is located in sqllib/bin on UNIX and Linux®. On the Windows operating system, db2cos is located in the $DB2PATH\bin directory.
When adding a new node, you can monitor the progress of the operation on the
database partition server, that is adding the node, using the db2pd -addnode
command with the optional oldviewapps
and detail
parameters for
more detailed information.
If you require a list of event monitors that are currently active or have been, for some reason, deactivated, run the db2pd -gfw command. This command also returns statistics and information about the targets, into which event monitors write data, for each fast writer EDU.
Examples
- Example 1: Diagnosing a lockwait
- Example 2: Using the -wlocks parameter to capture all the locks being waited on
- Example 3: Displaying the table name and schema name of locks
- Example 4: Using the -apinfo parameter to capture detailed runtime information about the lock owner and the lock waiter
- Example 5: Using the callout scripts when considering a locking problem
- Example 6: Mapping an application to a dynamic SQL statement
- Example 7: Monitoring memory usage
- Example 8: Determine which application is using up your table space
- Example 9: Monitoring recovery
- Example 10: Determining the amount of resources a transaction is using
- Example 11: Monitoring log usage
- Example 12: Viewing the sysplex list
- Example 13: Generating stack traces
- Example 14: Viewing memory statistics for a database partition
- Example 15: Monitoring the progress of index reorganization
- Example 16: Displaying the top EDUs by processor time consumption and displaying EDU stack information
- Example 17: Displaying agent event metrics
- Example 18: Displaying the extent movement status
The results text show in the examples is an extract of the db2cmd command ouput for better readability.
Example 1: Diagnosing a lockwait
Locks:
TranHdl Lockname Type Mode Sts Owner Dur HldCnt Att ReleaseFlg
3 00020002000000040000000052 Row ..X G 3 1 0 0x0000 0x40000000
2 00020002000000040000000052 Row ..X W* 2 1 0 0x0000 0x40000000
For
the database that you specified using the -db database name option, the first
results show the locks for that database. The results show that TranHdl 2 is waiting on a lock held
by TranHdl 3.Transactions:
AppHandl [nod-index] TranHdl Locks State Tflag Tflag2 Firstlsn Lastlsn Firstlso Lastlso LogSpace SpaceReserved TID AxRegCnt GXID
11 [000-00011] 2 4 READ 0x00000000 0x00000000 0x000000000000 0x000000000000 0x000000000000 0x000000000000 0 0 0x0000000000B7 1 0
12 [000-00012] 3 4 WRITE 0x00000000 0x00000000 0x00000002AC04 0x00000002AC04 0x000000FA000C 0x000000FA000C 113 154 0x0000000000B8 1 0
We
can see that TranHdl 2 is associated with AppHandl 11 and TranHdl 3 is associated with AppHandl
12.Applications:
AppHandl [nod-index] NumAgents CoorPid Status C-AnchID C-StmtUID L-AnchID L-StmtUID Appid
12 [000-00012] 1 1073336 UOW-Waiting 0 0 17 1 *LOCAL.burford.060303225602
11 [000-00011] 1 1040570 UOW-Executing 17 1 94 1 *LOCAL.burford.060303225601
We
can see that AppHandl 12 last ran dynamic statement 17, 1. AppHandl 11 is currently running dynamic
statement 17, 1 and last ran statement 94, 1.Dynamic SQL Statements:
AnchID StmtUID NumEnv NumVar NumRef NumExe Text
17 1 1 1 2 2 update pdtest set c1 = 5
94 1 1 1 2 2 set lock mode to wait 1
We
can see that the text column shows the SQL statements that are associated with the lock timeout.Example 2: Using the -wlocks parameter to capture all the locks being waited on
venus@boson:/home/venus =>db2pd -wlocks -db pdtest
Database Partition 0 -- Database PDTEST -- Active -- Up 0 days 00:01:22
Locks being waited on :
AppHandl TranHdl Lockname Type Mode Conv Sts CoorEDU AppName AuthID AppID
47 8 00020004000000000840000652 Row ..X G 5160 db2bp VENUS *LOCAL.venus.071207213730
46 2 00020004000000000840000652 Row .NS W 5913 db2bp VENUS *LOCAL.venus.071207213658
Example 3: Displaying the tables name and schema name of locks
db2pd -locks showlocks
command to display the table name and schema
name of locks that are held by applications. You can use this information to diagnose the table and
schema that contain the application lock. The table name is displayed in the TableNm column and the
schema name is displayed in the SchemaNm column as shown in the following
output.Database Member 0 -- Database PDTEST -- Active -- Up 0 days 00:00:10 -- Date 2012-11-06-10.57.18.025767
Locks:
Address TranHdl Lockname Type Mode Sts Owner Dur HoldCount Att ReleaseFlg rrIID TableNm SchemaNm
0x00002AAAFFFA5F68 3 02000400000020000000000062 MdcBlockLock ..X G 3 1 0 0x00200000 0x40000000 0 T1 YUQZHANG 02000400000020000000000062 SQLP_MDCBLOCK (obj={2;4}, bid=d(0;32;0), x0000200000000000)
0x00002AAAFFFA7198 3 41414141414A4863ADA1ED24C1 PlanLock ..S G 3 1 0 0x00000000 0x40000000 0 N/A N/A 41414141414A4863ADA1ED24C1 SQLP_PLAN ({41414141 63484A41 24EDA1AD}, loading=0)
db2pd -wlocks detail
command to display the table name, schema name, and
application node of locks that are being waited on as shown in the following
output.Database Member 0 -- Database PDTEST -- Active -- Up 0 days 00:00:35 -- Date 2012-11-06-11.11.32.403994
Locks being waited on :
AppHandl [nod-index] TranHdl Lockname Type Mode Conv Sts CoorEDU AppName AuthID AppID TableNm SchemaNm AppNode
19 [000-00019] 3 02000400000000000000000054 TableLock ..X G 18 db2bp YUQZHANG *LOCAL.yuqzhang.121106161112 PDTEST YUQZHANG hotel71
21 [000-00021] 15 02000400000000000000000054 TableLock .IS W 45 db2bp YUQZHANG *LOCAL.yuqzhang.121106161114 PDTEST YUQZHANG hotel71
Example 4: Using the -apinfo parameter to capture detailed runtime information about the lock owner and the lock waiter
venus@boson:/home/venus =>db2pd -apinfo 47 -db pdtest
Database Partition 0 -- Database PDTEST -- Active -- Up 0 days 00:01:30
Application :
Address : 0x0780000001676480
AppHandl [nod-index] : 47 [000-00047]
Application PID : 876558
Application Node Name : boson
IP Address: n/a
Connection Start Time : (1197063450)Fri Dec 7 16:37:30 2007
Client User ID : venus
System Auth ID : VENUS
Coordinator EDU ID : 5160
Coordinator Partition : 0
Number of Agents : 1
Locks timeout value : 4294967294 seconds
Locks Escalation : No
Workload ID : 1
Workload Occurrence ID : 2
Trusted Context : n/a
Connection Trust Type : non trusted
Role Inherited : n/a
Application Status : UOW-Waiting
Application Name : db2bp
Application ID : *LOCAL.venus.071207213730
ClientUserID : n/a
ClientWrkstnName : n/a
ClientApplName : n/a
ClientAccntng : n/a
List of inactive statements of current UOW :
UOW-ID : 2
Activity ID : 1
Package Schema : NULLID
Package Name : SQLC2G13
Package Version :
Section Number : 203
SQL Type : Dynamic
Isolation : CS
Statement Type : DML, Insert/Update/Delete
Statement : insert into pdtest values 99
venus@boson:/home/venus =>db2pd -apinfo 46 -db pdtest
Database Partition 0 -- Database PDTEST -- Active -- Up 0 days 00:01:39
Application :
Address : 0x0780000000D77A60
AppHandl [nod-index] : 46 [000-00046]
Application PID : 881102
Application Node Name : boson
IP Address: n/a
Connection Start Time : (1197063418)Fri Dec 7 16:36:58 2007
Client User ID : venus
System Auth ID : VENUS
Coordinator EDU ID : 5913
Coordinator Partition : 0
Number of Agents : 1
Locks timeou t value : 4294967294 seconds
Locks Escalation : No
Workload ID : 1
Workload Occurrence ID : 1
Trusted Context : n/a
Connection Trust Type : non trusted
Role Inherited : n/a
Application Status : Lock-wait
Application Name : db2bp
Application ID : *LOCAL.venus.071207213658
ClientUserID : n/a
ClientWrkstnName : n/a
ClientApplName : n/a
ClientAccntng : n/a
List of active statements :
*UOW-ID : 3
Activity ID : 1
Package Schema : NULLID
Package Name : SQLC2G13
Package Version :
Section Number : 201
SQL Type : Dynamic
Isolation : CS
Statement Type : DML, Select (blockable)
Statement : select * from pdtest
Example 5: Using the callout scripts when considering a locking problem
Lock Timeout Caught
Thu Feb 17 01:40:04 EST 2006
Instance DB2
Database: SAMPLE
Partition Number: 0
PID: 940
TID: 2136
Function: sqlplnfd
Component: lock manager
Probe: 999
Timestamp: 2006-02-17-01.40.04.106000
AppID: *LOCAL.DB2...
AppHdl:
...
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:06:53
Locks:
Address TranHdl Lockname Type Mode Sts Owner Dur HldCnt Att Rlse
0x402C6B30 3 00020003000000040000000052 Row ..X W* 3 1 0 0 0x40
In the output, W*
indicates the lock that experienced the timeout. In this case,
a lockwait has occurred. A lock timeout can also occur when a lock is being converted to a higher
mode. This is indicated by C*
in the output.
You can map the results to a transaction, an application, an agent, or even an SQL statement with the output provided by other db2pd commands in the db2cos file. You can narrow down the output or use other commands to collect the information that you need. For example, you can use the db2pd -locks wait parameters to print only locks with a wait status. You can also use the -app and -agent parameters.
Example 6: Mapping an application to a dynamic SQL statement
The command db2pd -applications -dynamic reports the current and last anchor ID and statement unique ID for dynamic SQL statements. This allows direct mapping from an application to a dynamic SQL statement.
Applications:
Address AppHandl [nod-index] NumAgents CoorPid Status
0x00000002006D2120 780 [000-00780] 1 10615 UOW-Executing
C-AnchID C-StmtUID L-AnchID L-StmtUID Appid
163 1 110 1 *LOCAL.burford.050202200412
Dynamic SQL Statements:
Address AnchID StmtUID NumEnv NumVar NumRef NumExe Text
0x0000000220A02760 163 1 2 2 2 1 CREATE VIEW MYVIEW
0x0000000220A0B460 110 1 2 2 2 1 CREATE VIEW YOURVIEW
Example 7: Monitoring memory usage
The db2pd -memblock command can be useful when you are trying to understand memory usage, as shown in the following sample output:
All memory blocks in DBMS set.
Address PoolID PoolName BlockAge Size(Bytes) I LOC File
0x0780000000740068 62 resynch 2 112 1 1746 1583816485
0x0780000000725688 62 resynch 1 108864 1 127 1599127346
0x07800000001F4348 57 ostrack 6 5160048 1 3047 698130716
0x07800000001B5608 57 ostrack 5 240048 1 3034 698130716
0x07800000001A0068 57 ostrack 1 80 1 2970 698130716
0x07800000001A00E8 57 ostrack 2 240 1 2983 698130716
0x07800000001A0208 57 ostrack 3 80 1 2999 698130716
0x07800000001A0288 57 ostrack 4 80 1 3009 698130716
0x0780000000700068 70 apmh 1 360 1 1024 3878879032
0x07800000007001E8 70 apmh 2 48 1 914 1937674139
0x0780000000700248 70 apmh 3 32 1 1000 1937674139
...
This is followed by the sorted 'per-pool' output:
Memory blocks sorted by size for ostrack pool:
PoolID PoolName TotalSize(Bytes) TotalCount LOC File
57 ostrack 5160048 1 3047 698130716
57 ostrack 240048 1 3034 698130716
57 ostrack 240 1 2983 698130716
57 ostrack 80 1 2999 698130716
57 ostrack 80 1 2970 698130716
57 ostrack 80 1 3009 698130716
Total size for ostrack pool: 5400576 bytes
Memory blocks sorted by size for apmh pool:
PoolID PoolName TotalSize(Bytes) TotalCount LOC File
70 apmh 40200 2 121 2986298236
70 apmh 10016 1 308 1586829889
70 apmh 6096 2 4014 1312473490
70 apmh 2516 1 294 1586829889
70 apmh 496 1 2192 1953793439
70 apmh 360 1 1024 3878879032
70 apmh 176 1 1608 1953793439
70 apmh 152 1 2623 1583816485
70 apmh 48 1 914 1937674139
70 apmh 32 1 1000 1937674139
Total size for apmh pool: 60092 bytes
...
The final section of output sorts the consumers of memory for the entire memory set:
All memory consumers in DBMS memory set:
PoolID PoolName TotalSize(Bytes) %Bytes TotalCount %Count LOC File
57 ostrack 5160048 71.90 1 0.07 3047 698130716
50 sqlch 778496 10.85 1 0.07 202 2576467555
50 sqlch 271784 3.79 1 0.07 260 2576467555
57 ostrack 240048 3.34 1 0.07 3034 698130716
50 sqlch 144464 2.01 1 0.07 217 2576467555
62 resynch 108864 1.52 1 0.07 127 1599127346
72 eduah 108048 1.51 1 0.07 174 4210081592
69 krcbh 73640 1.03 5 0.36 547 4210081592
50 sqlch 43752 0.61 1 0.07 274 2576467555
70 apmh 40200 0.56 2 0.14 121 2986298236
69 krcbh 32992 0.46 1 0.07 838 698130716
50 sqlch 31000 0.43 31 2.20 633 3966224537
50 sqlch 25456 0.35 31 2.20 930 3966224537
52 kerh 15376 0.21 1 0.07 157 1193352763
50 sqlch 14697 0.20 1 0.07 345 2576467555
...
You can also report memory blocks for private memory on UNIX and Linux operating systems. For example, if you run db2pd -memb pid=159770, results similar to the following ones are generated:
All memory blocks in Private set.
PoolID PoolName BlockAge Size(Bytes) I LOC File
88 private 1 2488 1 172 4283993058
88 private 2 1608 1 172 4283993058
88 private 3 4928 1 172 4283993058
88 private 4 7336 1 172 4283993058
88 private 5 32 1 172 4283993058
88 private 6 6728 1 172 4283993058
88 private 7 168 1 172 4283993058
88 private 8 24 1 172 4283993058
88 private 9 408 1 172 4283993058
88 private 10 1072 1 172 4283993058
88 private 11 3464 1 172 4283993058
88 private 12 80 1 172 4283993058
88 private 13 480 1 1534 862348285
88 private 14 480 1 1939 862348285
88 private 80 65551 1 1779 4231792244
Total set size: 94847 bytes
Memory blocks sorted by size:
PoolID PoolName TotalSize(Bytes) TotalCount LOC File
88 private 65551 1 1779 4231792244
88 private 28336 12 172 4283993058
88 private 480 1 1939 862348285
88 private 480 1 1534 862348285
Total set size: 94847 bytes
Example 8: Determine which application is using up your table space
Using db2pd -tcbstats command, you can identify the number of inserts for a table. The following example shows sample information for a user-defined global temporary table called TEMP1:
TCB Table Information:
TbspaceID TableID PartID MasterTbs MasterTab TableName SchemaNm ObjClass DataSize LfSize LobSize XMLSize
3 2 n/a 3 2 TEMP1 SESSION Temp 966 0 0 0
TCB Table Stats:
TableName Scans UDI PgReorgs NoChgUpdts Reads FscrUpdates Inserts Updates Deletes OvFlReads OvFlCrtes
TEMP1 0 0 0 0 0 0 43968 0 0 0 0
You can then obtain the information for table space 3 by using the db2pd -tablespaces command. Sample output is as follows:
Tablespace 3 Configuration:
Type Content PageSz ExtentSz Auto Prefetch BufID BufIDDisk FSC RSE NumCntrs MaxStripe LastConsecPg Name
DMS UsrTmp 4096 32 Yes 32 1 1 On Yes 1 0 31 TEMPSPACE2
Tablespace 3 Statistics:
TotalPgs UsablePgs UsedPgs PndFreePgs FreePgs HWM State MinRecTime NQuiescers
5000 4960 1088 0 3872 1088 0x00000000 0 0
Tablespace 3 Autoresize Statistics:
AS AR InitSize IncSize IIP MaxSize LastResize LRF
No No 0 0 No 0 None No
Containers:
ContainNum Type TotalPgs UseablePgs StripeSet Container
0 File 5000 4960 0 /home/db2inst1/tempspace2a
MinRecTime
column returns a value that is a UNIX timestamp in a UTC timezone
format. To convert the value to a GMT time zone format you can use the Db2 timestamp function.
For example, if MinRecTime
returns a value of 1369626329, to convert this value to
a GMT format run the following
statement:db2 "values timestamp('1970-01-01-00.00.00') + 1369626329 seconds"
The
query will return a GMT value of 2013-05-27-03.45.29.000000.You
can see if the reclaimable space feature is enabled in the Reclaimable Space Enabled (RSE) column.
The FreePgs
column shows that space is filling up. As the free pages value
decreases, there is less space available. Notice also that the value for FreePgs
plus the value for UsedPgs
equals the value of UsablePgs
.
Once this is known, you can identify the dynamic SQL statement that is using the table TEMP1 by running the db2pd -db sample -dyn:
Database Partition 0 -- Database SAMPLE -- Active -- Up 0 days 00:13:06
Dynamic Cache:
Current Memory Used 1022197
Total Heap Size 1271398
Cache Overflow Flag 0
Number of References 237
Number of Statement Inserts 32
Number of Statement Deletes 13
Number of Variation Inserts 21
Number of Statements 19
Dynamic SQL Statements:
AnchID StmtUID NumEnv NumVar NumRef NumExe Text
78 1 2 2 3 2 declare global temporary table temp1 (c1 char(6)) not logged
253 1 1 1 24 24 insert into session.temp1 values('TEST')
Finally, you can map the information from the preceding output to the applications output to identify the application by running db2pd -db sample -app.
Applications:
AppHandl [nod-index] NumAgents CoorPid Status C-AnchID C-StmtUID
501 [000-00501] 1 11246 UOW-Waiting 0 0
L-AnchID L-StmtUID Appid
253 1 *LOCAL.db2inst1.050202160426
You can use the anchor ID (AnchID
) value that identified the dynamic SQL
statement to identify the associated application. The results show that the last anchor ID
(L-AnchID
) value is the same as the anchor ID (AnchID
) value. You
use the results from one run of db2pd in the next run of
db2pd.
Rowsread
column) and rows written (in the Rowswrtn
column) by the
application. These values give you an idea of what the application has completed and what the
application still has to complete, as shown in the following sample
output:AppHandl [nod-index] AgentPid Priority Type DBName
501 [000-00501] 11246 0 Coord SAMPLE
State ClientPid Userid ClientNm Rowsread Rowswrtn LkTmOt
Inst-Active 26377 db2inst1 db2bp 22 9588 NotSet
You can map the values for AppHandl
and AgentPid
resulting from
running the db2pd -agent command to the corresponding values for
AppHandl
and CoorPiid
resulting from running the db2pd
-app command.
The steps are slightly different if you suspect that an internal temporary table is filling up the table space. You still use db2pd -tcbstats to identify tables with large numbers of inserts, however. Following is sample information for an implicit temporary table:
TCB Table Information:
Address TbspaceID TableID PartID MasterTbs MasterTab TableName SchemaNm ObjClass DataSize ...
0x0780000020CC0D30 1 2 n/a 1 2 TEMP (00001,00002) <30> <JMC Temp 2470 ...
0x0780000020CC14B0 1 3 n/a 1 3 TEMP (00001,00003) <31> <JMC Temp 2367 ...
0x0780000020CC21B0 1 4 n/a 1 4 TEMP (00001,00004) <30> <JMC Temp 1872 ...
TCB Table Stats:
Address TableName Scans UDI PgReorgs NoChgUpdts Reads FscrUpdates Inserts ...
0x0780000020CC0D30 TEMP (00001,00002) 0 0 0 0 0 0 43219 ...
0x0780000020CC14B0 TEMP (00001,00003) 0 0 0 0 0 0 42485 ...
0x0780000020CC21B0 TEMP (00001,00004) 0 0 0 0 0 0 0 ...
In this example, there are a large number of inserts for tables with the naming convention
TEMP (TbspaceID, TableID)
. These are implicit temporary tables. The values in the
SchemaNm
column have a naming convention of the value for AppHandl
concatenated with the value for SchemaNm
, which makes it possible to identify the
application doing the work.
You can then map that information to the output from db2pd -tablespaces to see
the used space for table space 1. Take note of the relationship between the UsedPgs
and UsablePgs
values in the table space statistics in the following output:
Tablespace Configuration:
Id Type Content PageSz ExtentSz Auto Prefetch BufID BufIDDisk FSC RSE NumCntrs MaxStripe LastConsecPg Name
1 SMS SysTmp 4096 32 Yes 320 1 1 On Yes 10 0 31 TEMPSPACE1
Tablespace Statistics:
Id TotalPgs UsablePgs UsedPgs PndFreePgs FreePgs HWM State MinRecTime NQuiescers
1 6516 6516 6516 0 0 0 0x00000000 0 0
Tablespace Autoresize Statistics:
Id AS AR InitSize IncSize IIP MaxSize LastResize LRF
1 No No 0 0 No 0 None No
Containers:
...
You can then identify application handles 30 and 31 (because you saw them in the -tcbstats output) by using the command db2pd -app:
Applications:
AppHandl [nod-index] NumAgents CoorPid Status C-AnchID C-StmtUID L-AnchID L-StmtUID Appid
31 [000-00031] 1 4784182 UOW-Waiting 0 0 107 1 *LOCAL.db2inst1.051215214142
30 [000-00030] 1 8966270 UOW-Executing 107 1 107 1 *LOCAL.db2inst1.051215214013
Finally, map the information from the preceding output to the Dynamic SQL output obtained by running the db2pd -dyn command:
Dynamic SQL Statements:
AnchID StmtUID NumEnv NumVar NumRef NumExe Text
107 1 1 1 43 43 select c1, c2 from test group by c1,c2
Example 9: Monitoring recovery
If you run the command db2pd -recovery, the output shows several counters that
you can use to verify that recovery is progressing, as shown in the following sample output. The
Current Log
and Current LSO
values provide the log position. The
CompletedWork
value is the number of bytes completed thus far.
Recovery:
Recovery Status 0x00000401
Current Log S0000005.LOG
Current LSN 0000001F07BC
Current LSO 000002551BEA
Job Type ROLLFORWARD RECOVERY
Job ID 7
Job Start Time (1107380474) Wed Feb 2 16:41:14 2005
Job Description Database Rollforward Recovery
Invoker Type User
Total Phases 2
Current Phase 1
Progress:
Address PhaseNum Description StartTime CompletedWork TotalWork
0x0000000200667160 1 Forward Wed Feb 2 16:41:14 2005 2268098 bytes Unknown
0x0000000200667258 2 Backward NotStarted 0 bytes Unknown
Example 10: Determining the amount of resources a transaction is using
If
you run the command db2pd -transactions, the output shows the number of locks,
the first log sequence number (LSN), the last LSN, the first LSO, the last LSO, the log space that
is used, and the space reserved. The output also displays the total number of application commits
and the total number of application rollbacks. Knowing the total number of application commits and
rollbacks can be useful for understanding the behavior of a transaction. The following is a sample
output of the db2pd -transactions
command.
Transactions:
Address AppHandl [nod-index] TranHdl Locks State Tflag
0x000000022026D980 797 [000-00797] 2 108 WRITE 0x00000000
0x000000022026E600 806 [000-00806] 3 157 WRITE 0x00000000
0x000000022026F280 807 [000-00807] 4 90 WRITE 0x00000000
Tflag2 Firstlsn Lastlsn Firstlso Lastlso
0x00000000 0x0000001A4212 0x0000001C2022 0x000001072262 0x0000010B2C8C
0x00000000 0x000000107320 0x0000001S3462 0x000001057574 0x0000010B3340
0x00000000 0x0000001BC00C 0x0000001X2F03 0x00000107CF0C 0x0000010B2FDE
LogSpace SpaceReserved TID AxRegCnt GXID
4518 95450 0x000000000451 1 0
6576 139670 0x0000000003E0 1 0
3762 79266 0x000000000472 1 0
Total Application commits : 23
Total Application rollbacks : 39
Example 11: Monitoring log usage
The command db2pd -logs is useful for monitoring log usage for a database. By
using thePages Written
value, as shown in the following sample output, you can
determine whether the log usage is increasing:
Logs:
Current Log Number 2
Pages Written 846
Cur Commit Disk Log Reads 0
Cur Commit Total Log Reads 0
Method 1 Archive Status Success
Method 1 Next Log to Archive 2
Method 1 First Failure n/a
Method 2 Archive Status Success
Method 2 Next Log to Archive 2
Method 2 First Failure n/a
Log Chain ID 0
Extraction Status n/a
Current Log to Extract n/a
Current LSO 28672000
Current LSN 0x00000022F032
Address StartLSN StartLSO State Size Pages Filename
0x000000023001BF58 0x00000022F032 0x000001B58000 0x00000000 1000 1000 S0000002.LOG
0x000000023001BE98 0x000000000000 0x000001F40000 0x00000000 1000 1000 S0000003.LOG
0x0000000230008F58 0x000000000000 0x000002328000 0x00000000 1000 1000 S0000004.LOG
- If the most recent log archive fails,
Archive Status
is set to a value ofFailure
. If there is an ongoing archive failure, preventing logs from being archived at all,Archive Status
is set to a value ofFirst Failure
. - If log archiving is proceeding very slowly, the
Next Log to Archive
value is lower than theCurrent Log Number
value. If archiving is very slow, space for active logs might run out, which in turn might prevent any data changes from occurring in the database.
Example 12: Viewing the sysplex list
Without the db2pd -sysplex command showing the following sample output, the only other way to report the sysplex list is by using a Db2 trace.
Sysplex List:
Alias: HOST
Location Name: HOST1
Count: 1
IP Address Port Priority Connections Status PRDID
1.2.34.56 400 1 0 0
Example 13: Generating stack traces
You can use the db2pd -stack all command for Windows operating systems or the -stack command for UNIX operating systems to produce stack traces for all processes in the current database partition. You might want to use this command iteratively when you suspect that a process or thread is looping or hanging.
You can obtain the current call stack for a particular engine dispatchable unit (EDU) by issuing the command db2pd -stack eduid, as shown in the following example:
Attempting to dump stack trace for eduid 137.
See current DIAGPATH for trapfile.
If the call stacks for all of the Db2 processes are desired, use the command db2pd -stack all, for example (on Windows operating systems):
Attempting to dump all stack traces for instance.
See current DIAGPATH for trapfiles.
If you are using a partitioned database environment with multiple physical nodes, you can obtain the information from all of the partitions by using the command db2_all "; db2pd -stack all". If the partitions are all logical partitions on the same machine, however, a faster method is to use db2pd -alldbp -stacks.
You can also redirect the output of the db2pdb -stacks command for db2sysc processes to a specific directory path with the dumpdir parameter. The output can be redirected for a specific duration only with the timeout parameter. For example, to redirect the output of stack traces for all EDUs in db2sysc processes to /home/waleed/mydir for 30 seconds, issue the following command:
db2pd -alldbp -stack all dumpdir=/home/waleed/mydir timeout=30
Example 14: Viewing instance memory statistics for a database partition
The db2pd -dbptnmem command shows how much memory the Db2 server is currently consuming and, at a high level, which areas of the server are using that memory. Instance memory usage includes not only actual system memory consumption/commitment but also configured allowances that may not be in use/committed. Therefore, memory usage statistics for the Db2 instance are not directly comparable with memory usage statistics reported by operating system monitoring tools.
The following example shows the output from running db2pd -dbptnmem on an AIX® machine:
Database Partition Memory Controller Statistics
Controller Automatic: Y
Memory Limit: 122931408 KB
Current usage: 651008 KB
HWM usage: 651008 KB
Cached memory: 231296 KB
- Controller Automatic
- Indicates the memory controller setting. When set to "Y" (instance_memory configuration parameter is set to AUTOMATIC), the calculated limit is only enforced when the product license contains a memory limit. When set to "N" the stated limit is enforced.
- Memory Limit
- If an instance memory limit is enforced, the value of the instance_memory configuration parameter is the upper bound limit of instance memory that can be consumed.
- Current usage
- The amount of instance memory the server is currently consuming.
- HWM usage
- The high water mark (HWM) or peak instance memory usage that has been consumed since the activation of the database partition (when the db2start command was run).
- Cached memory
- The amount of the current usage that may be reclaimed . This applies when instance memory usage is approaching an enforced limit, and cached usage by one or more consumers may need to be reduced to allow some other consumer to grow.
Following is the continuation of the sample output from running db2pd -dbptnmem on an AIX operating system:
Individual Memory Consumers:
Name Mem Used (KB) HWM Used (KB) Cached (KB)
===========================================================
APPL-DBONE 160000 160000 159616
DBMS-name 38528 38528 3776
FMP_RESOURCES 22528 22528 0
PRIVATE 13120 13120 740
FCM_RESOURCES 10048 10048 0
LCL-p606416 128 128 0
DB-DBONE 406656 406656 67200
consumersof instance memory within the Db2 server are listed with the amount of the total instance memory they are consuming. The column descriptions are as follows:
- Name
- A short, distinguishing name of a consumer of instance memory, such as the following ones:
- APPL-dbname
- Application memory consumed for database dbname
- DBMS-name
- Global database manager memory requirements
- FMP_RESOURCES
- Memory required to communicate with db2fmps
- PRIVATE
- Miscellaneous private memory requirements
- FCM_RESOURCES
- Fast Communication Manager resources
- LCL-pid
- The memory segment used to communicate with local applications
- DB-dbname
- Database memory consumed for database dbname
- Mem Used (KB)
- The amount of instance memory that is currently allotted to the consumer
- HWM Used (KB)
- The high-water mark (HWM) or the peak instance memory, that the consumer has used
- Cached (KB)
- Of the Mem Used (KB), the amount of instance memory that may be reclaimed for this consumer.
Example 15: Monitoring the progress of index reorganization
- The db2pd -reorgs index command reports index reorg progress for partitioned indexes (Fix Pack 1 introduced support for only non-partitioned indexes).
- The db2pd -reorgs index command supports the monitoring of index reorg at the partition level (that is, during reorganization of a single partition).
- The reorg progress for non-partitioned and partitioned indexes is reported in separate outputs. One output shows the reorg progress for non-partitioned indexes, and the following outputs show the reorg progress for partitioned indexes on each table partition; the index reorg statistics of only one partition is reported in each output.
- Non-partitioned indexes are processed first, followed by partitioned indexes in serial fashion.
- The db2pd -reorgs index command displays the following additional information
fields in the output for partitioned indexes:
- MaxPartition - Total number of partitions for the table being processed. For partition-level reorg, MaxPartition will always have a value of 1 since only a single partition is being reorganized.
- PartitionID - The data partition identifier for the partition being processed.
Index Reorg Stats:
Retrieval Time: 02/08/2010 23:04:21
TbspaceID: -6 TableID: -32768
Schema: ZORAN TableName: BIGRPT
Access: Allow none
Status: Completed
Start Time: 02/08/2010 23:03:55 End Time: 02/08/2010 23:04:04
Total Duration: 00:00:08
Prev Index Duration: -
Cur Index Start: -
Cur Index: 0 Max Index: 2 Index ID: 0
Cur Phase: 0 ( - ) Max Phase: 0
Cur Count: 0 Max Count: 0
Total Row Count: 750000
Retrieval Time: 02/08/2010 23:04:21
TbspaceID: 2 TableID: 5
Schema: ZORAN TableName: BIGRPT
PartitionID: 0 MaxPartition: 2
Access: Allow none
Status: Completed
Start Time: 02/08/2010 23:04:04 End Time: 02/08/2010 23:04:08
Total Duration: 00:00:04
Prev Index Duration: -
Cur Index Start: -
Cur Index: 0 Max Index: 2 Index ID: 0
Cur Phase: 0 ( - ) Max Phase: 0
Cur Count: 0 Max Count: 0
Total Row Count: 375000
Retrieval Time: 02/08/2010 23:04:21
TbspaceID: 2 TableID: 6
Schema: ZORAN TableName: BIGRPT
PartitionID: 1 MaxPartition: 2
Access: Allow none
Status: Completed
Start Time: 02/08/2010 23:04:08 End Time: 02/08/2010 23:04:12
Total Duration: 00:00:04
Prev Index Duration: -
Cur Index Start: -
Cur Index: 0 Max Index: 2 Index ID: 0
Cur Phase: 0 ( - ) Max Phase: 0
Cur Count: 0 Max Count: 0
Total Row Count: 375000
Example 16: Displaying the top EDUs by processor time consumption and displaying EDU stack information
If you issue the db2pd command with the -edus parameter option, the output lists all engine dispatchable units (EDUs). Output for EDUs can be returned at the level of granularity you specify, such as at the instance level or at the member. On Linux and UNIX operating systems only, you can also specify the interval parameter suboption so that two snapshots of all EDUs are taken, separated by an interval you specify. When the interval parameter is specified, two additional columns in the output indicate the delta of processor user time (USR DELTA column) and the delta of processor system time (SYS DELTA column) across the interval.
In the following example, the deltas for processor user time and processor system time are given across a five-second interval:
$ db2pd -edus interval=5
Database Partition 0 -- Active -- Up 0 days 00:53:29 -- Date 06/04/2010 03:34:59
List of all EDUs for database partition 0
db2sysc PID: 1249522
db2wdog PID: 2068678
EDU ID TID Kernel TID EDU Name USR SYS USR DELTA SYS DELTA
===================================================================================================
6957 6957 13889683 db2agntdp (SAMPLE ) 0 58.238506 0.820466 1.160726 0.014721
6700 6700 11542589 db2agent (SAMPLE) 0 52.856696 0.754420 1.114821 0.015007
5675 5675 4559055 db2agntdp (SAMPLE ) 0 60.386779 0.854234 0.609233 0.014304
3088 3088 13951225 db2agntdp (SAMPLE ) 0 80.073489 2.249843 0.499766 0.006247
3615 3615 2887875 db2loggw (SAMPLE) 0 0.939891 0.410493 0.011694 0.004204
4900 4900 6344925 db2pfchr (SAMPLE) 0 1.748413 0.014378 0.014343 0.000103
7986 7986 13701145 db2agntdp (SAMPLE ) 0 1.410225 0.025900 0.003636 0.000074
2571 2571 8503329 db2ipccm 0 0.251349 0.083787 0.002551 0.000857
7729 7729 14168193 db2agntdp (SAMPLE ) 0 1.717323 0.029477 0.000998 0.000038
7472 7472 11853991 db2agnta (SAMPLE) 0 1.860115 0.032926 0.000860 0.000012
3358 3358 2347127 db2loggr (SAMPLE) 0 0.151042 0.184726 0.000387 0.000458
515 515 13820091 db2aiothr 0 0.405538 0.312007 0.000189 0.000178
7215 7215 2539753 db2agntdp (SAMPLE ) 0 1.165350 0.019466 0.000291 0.000008
6185 6185 2322517 db2wlmd (SAMPLE) 0 0.061674 0.034093 0.000169 0.000100
6442 6442 2756793 db2evmli (DB2DETAILDEADLOCK) 0 0.072142 0.052436 0.000092 0.000063
4129 4129 15900799 db2glock (SAMPLE) 0 0.013239 0.000741 0.000064 0.000001
2 2 11739383 db2alarm 0 0.036904 0.028367 0.000009 0.000009
4386 4386 13361367 db2dlock (SAMPLE) 0 0.015653 0.001281 0.000014 0.000003
1029 1029 15040579 db2fcms 0 0.041929 0.016598 0.000010 0.000004
5414 5414 14471309 db2pfchr (SAMPLE) 0 0.000093 0.000002 0.000000 0.000000
258 258 13656311 db2sysc 0 8.369967 0.263539 0.000000 0.000000
5157 5157 7934145 db2pfchr (SAMPLE) 0 0.027598 0.000177 0.000000 0.000000
1543 1543 2670647 db2fcmr 0 0.004191 0.000079 0.000000 0.000000
1286 1286 8417339 db2extev 0 0.000312 0.000043 0.000000 0.000000
2314 2314 14360813 db2licc 0 0.000371 0.000051 0.000000 0.000000
5928 5928 3137537 db2taskd (SAMPLE) 0 0.004903 0.000572 0.000000 0.000000
3872 3872 2310357 db2lfr (SAMPLE) 0 0.000126 0.000007 0.000000 0.000000
4643 4643 11694287 db2pclnr (SAMPLE) 0 0.000094 0.000002 0.000000 0.000000
1800 1800 5800175 db2extev 0 0.001212 0.002137 0.000000 0.000000
772 772 7925817 db2thcln 0 0.000429 0.000072 0.000000 0.000000
2057 2057 6868993 db2pdbc 0 0.002423 0.001603 0.000000 0.000000
2828 2828 10866809 db2resync 0 0.016764 0.003098 0.000000 0.000000
To provide information only about the EDUs that are the top consumers of processor time and to reduce the amount of output returned, you can further include the top parameter option. In the following example, only the top five EDUs are returned, across an interval of 5 seconds. Stack information is also returned, and can be found stored separately in the directory path specified by DUMPDIR, which defaults to diagpath.
$ db2pd -edus interval=5 top=5 stacks
Database Partition 0 -- Active -- Up 0 days 00:54:00 -- Date 06/04/2010 03:35:30
List of all EDUs for database partition 0
db2sysc PID: 1249522
db2wdog PID: 2068678
EDU ID TID Kernel TID EDU Name USR SYS USR DELTA SYS DELTA
===============================================================================================================
3358 3358 2347127 db2loggr (SAMPLE) 0 0.154906 0.189223 0.001087 0.001363
3615 3615 2887875 db2loggw (SAMPLE) 0 0.962744 0.419617 0.001779 0.000481
515 515 13820091 db2aiothr 0 0.408039 0.314045 0.000658 0.000543
258 258 13656311 db2sysc 0 8.371388 0.264812 0.000653 0.000474
6700 6700 11542589 db2agent (SAMPLE) 0 54.814420 0.783323 0.000455 0.000310
$ ls -ltr
total 552
drwxrwxr-t 2 vbmithun build 256 05-31 09:59 events/
drwxrwxr-t 2 vbmithun build 256 06-04 03:17 stmmlog/
-rw-r--r-- 1 vbmithun build 46413 06-04 03:35 1249522.3358.000.stack.txt
-rw-r--r-- 1 vbmithun build 22819 06-04 03:35 1249522.3615.000.stack.txt
-rw-r--r-- 1 vbmithun build 20387 06-04 03:35 1249522.515.000.stack.txt
-rw-r--r-- 1 vbmithun build 50426 06-04 03:35 1249522.258.000.stack.txt
-rw-r--r-- 1 vbmithun build 314596 06-04 03:35 1249522.6700.000.stack.txt
-rw-r--r-- 1 vbmithun build 94913 06-04 03:35 1249522.000.processObj.txt
Example 17: Displaying agent event metrics
The db2pd command supports returning event metrics for agents. If you need to determine whether an agent changed state during a specific period of time, use the event option together with the -agents parameter. The AGENT_STATE_LAST_UPDATE_TIME(Tick Value) column that is returned shows the last time that the event being processed by the agent was changed. Together with a previously obtained value for AGENT_STATE_LAST_UPDATE_TIME(Tick Value), you can determine whether an agent has moved on to a new task or whether it continues to process the same task over an extended period of time.
db2pd -agents event
Database Partition 0 -- Active -- Up 0 days 03:18:52 -- Date 06/27/2011 11:47:10
Agents:
Current agents: 12
Idle agents: 0
Active coord agents: 10
Active agents total: 10
Pooled coord agents: 2
Pooled agents total: 2
AGENT_STATE_LAST_UPDATE_TIME(Tick Value) EVENT_STATE EVENT_TYPE EVENT_OBJECT EVENT_OBJECT_NAME
2011-06-27-14.44.38.859785(5622972377924968075) IDLE WAIT REQUEST n/a
Example 18: Displaying the extent movement
db2pd -extentmovement -db
dbName
command.$ db2pd -extentmovement -db PDTEST
Database Member 0 -- Database PDTEST -- Active -- Up 0 days 00:04:33 -- Date 2012-10-26-11.19.52.056414
Extent Movement:
Address TbspName Current Last Moved Left TotalTime
0x00002AAB356D4BA0 DAVID 1168 1169 33 426 329636