The db2fodc utility captures symptom-based
data about the DB2® instance
to help in problem determination situations. It is intended to collect
information about potential hangs, severe performance issues, and
various types of errors.
Purpose
The db2fodc command
is used to collect performance data on issues that do not trigger
automatic FODC (first occurrence data collection).
The command
is used in two main ways. The first method collects data immediately
by running the db2fodc command as the issue occurs.
The second method, available in Version 9.7 Fix Pack 5 and later fix
packs, triggers data collection when the environment reaches the state
you described with threshold parameters.
You build the command
with three basic components: one main data collection parameter, secondary
data collection parameters, and threshold parameters.
Begin
by choosing the main data collection parameter and its data collection
mode: either basic or full. You can run the command in this state
to immediately collect data.
Or you can add secondary data collection
parameters to choose what part or parts of the system you want to
scan. In addition, you can specify where the output goes and set a
timeout value. Data is collected immediately, if you run the command
this way.
Or you can add threshold parameters, available in
Version 9.7 Fix Pack 5 and later fix packs. Specify the -detect parameter,
along with one or more threshold rules, to set a conditional threshold.
The system is monitored and data is collected when the thresholds
are met.
If you choose to add threshold parameters, the command
continues to run until the user ID the command is running against
is logged off or the environment reaches the state that is described
by the threshold parameters. To keep the command active in the background
regardless of user's subsequent logout, add nohup to
the start of the command and & to the end. For
example, nohup db2fodc -memory basic -detect "avm>=5242880" &.
Regardless
of the collection method, the captured data is placed inside an FODC
package directory. This directory is created either in the default
diagnostic path or in an FODC directory path you specify with the -fodcpath parameter.
You
can review the output or you can send the directory, with the collected
diagnostic data, to IBM support
for analysis.
Authorization
One of the following authority
levels is required:
- On Linux and UNIX systems, the SYSADM authority level. You
must also be the instance owner.
- On Windows operating
systems, the SYSADM authority level.
Command syntax
>>-db2fodc----| Main data collection options |------------------>
>--+---------------------------------------+-------------------->
'-| secondary data collection options |-'
>--+-------------------------------------------------------------------------------------------------------------------------------+-><
| .-,------------------------------------------------------------------------------------. |
| V | |
'- -detect----threshold_rule" <comparison_operator> threshold_value"-+---------------------------+-+--+-----------------------+-'
'-condition=condition_value-' '-| threshold_options |-'
Main data collection options
.-basic-.
|--+- -hang--+-full--+--------------------------------------------+--|
+- -profile--profileName---------------------------------------+
| .-basic-. |
+- -perf--+-full--+--------------------------------------------+
| .-basic-. |
+- -cpu--+-full--+---------------------------------------------+
| .-basic-. |
+- -memory--+-full--+------------------------------------------+
| .-basic-. |
+- -connections--+-full--+-------------------------------------+
+- -clp--------------------------------------------------------+
+- -preupgrade--+-----------------------------+----------------+
| '- -par DYN_SQL_FILE--sqlFile-' |
+- -hadr-------------------------------------------------------+
| .-basic-. |
+- -fodcerror--FODC_[Index|Data|Col]Error_Directory--+-full--+-+
'- -help-------------------------------------------------------'
Secondary data collection options
.- -alldb--------------.
|--+----------------------+------------------------------------->
| .------------. |
| V | |
'- -db------dbName---+-'
>--+--------------------------------------------------+--------->
+- -dbp-- -dbpartitionnum -------------------------+
+- -alldbp-- -alldbpartitionnum -------------------+
| .-,--------. |
| V | |
+- -host--+---hostname-+-+-------------------------+
| '-all----------' |
| .-,------------------------------. |
| V | |
+- -member--+---member_number--|--member_range-+-+-+
| '-all--------------------------------' |
'- -allmembers-------------------------------------'
>--+------------------------+--+--------------------------+-----|
'-timeout--timeout_value-' '-fodcpath--fodc_path_name-'
Main data collection parameters
Choose only
one main data collection parameter per command.
- -hang
- Collects FODC data that is related to a potential hang situation
or a serious performance issue. The -hang parameter
is intended for use when the instance is considered unusable and needs
restarting. Data is collected as quickly as possible, although, the
process might not complete if the instance or database is already
hanging. The full or basic collection mode can be used without user
interaction.
- A new directory that is prefixed with FODC_Hang_ is
created under the current diagnostic path. The script, db2cos_hang,
is executed to collect FODC data into one or more files and deposited
into the directory.
- For example, db2fodc -hang
- -perf
- Collects data that is related to a performance issue. The -perf parameter
is similar to the-hang parameter, but uses less
resources. Therefore, this option is employed when the instance is
still usable and restarting is not needed. The full or the basic collection
mode can be run without user interaction.
- A new directory that is prefixed with FODC_Perf_ is
created under the current diagnostic path. The script, db2cos_perf,
is executed to collect FODC data into one or more files and deposited
into the directory.
- For example, db2fodc -perf
- -profile profileName
- Collects FODC data on a potential hang situation. Data is collected
based on the parameters that you specify in the db2fodc.profile file
in the ~/sqllib/cfg directory. You can modify
one of the existing profiles or add a new one. The full list of parameters
that can be specified is in the sqllib/bin/db2cos_hang script.
It is recommended that you use the -profile parameter
only under the guidance of IBM® support.
- A new directory that is prefixed with FODC_Hang_ is
created under the current diagnostic path. The customized profile
is executed to collect FODC data into one or more files and deposited
into the directory.
- This parameter is not supported on Windows NT operating system.
- For example, db2fodc -profile sample_profile.
- -cpu
- In Version 9.7 Fix Pack 5 and later fix packs, collects processor-related
performance and diagnostic data. The data can be used to diagnose
problems that are related to high processor use, a high number of
running processes, or high processor wait times. The full or the basic
collection mode can be run without user interaction.
- A new directory that is prefixed with FODC_Cpu_ is
created under the current diagnostic path. The script, db2cos_threshold,
is executed to collect FODC data into one or more files and deposited
into the directory.
- For example, db2fodc -cpu
- -memory
- In Version 9.7 Fix Pack 5 and later fix packs, collects memory-related
diagnostic data. Problems such as no free memory available, swap space
that is used at a high rate, excessive paging or a suspected a memory
leak can be diagnosed. The full or the basic collection mode can be
run without user interaction.
- A new directory that is prefixed with FODC_Memory_ is
created under the current diagnostic path. The script, db2cos_threshold,
is executed to collect FODC data into one or more files and deposited
into the directory.
- For example, db2fodc -memory
- -connections
- In Version 9.7 Fix Pack 5 and later fix packs, collects connection-related
diagnostic data. The data can be used to diagnose problems such as
spikes in the number of applications in the executing or compiling
state and new database connections that were denied.
- A new directory that is prefixed with FODC_Connections_ is
created under the current diagnostic path. The script, db2cos_threshold,
is executed to collect FODC data into one or more files and deposited
into the directory.
- For example, db2fodc -connections
- -clp
- In Version 9.7 Fix Pack 5 and later fix packs, collects operating
system and configuration information that is related to instance creation.
The command does not support the -member parameter,
but does support the -host parameter. The -clp parameter
is supported only on Linux and UNIX operating systems. If you issue
this command on a Windows operating
system, no data is collected.
- A new directory that is prefixed with FODC_Clp_ is
created under the current diagnostic path. The script, db2cos_clp,
is executed to collect FODC data into one or more files and deposited
into the directory.
- For example, db2fodc -clp
- -preupgrade
- In Version 9.7 Fix Pack 5 and later fix packs, collects performance-related
information before a critical upgrade or update. The use of the -preupgrade parameter
is precautionary. After the upgrade or update, any problems that might
occur can be potentially diagnosed with the assistance of the collected
data and IBM support. To obtain
sufficient performance data to troubleshoot any future problems issue
the command several times, both at peak and idle usage times. This
parameter must be specified with a database and can take a long time
to complete.
- A new directory that is prefixed with FODC_Preupgrade_ is
created under the current diagnostic path. The script, db2cos_preupgrade,
is executed to collect FODC data into one or more files and deposited
into the directory.
- For example, db2fodc -preupgrade -dbdbname
- -par DYN_SQL_FILE=sqlFile
- In Version 9.7 Fix Pack 5 and later fix packs, collects FODC data
that is related to an SQL file that you specify. Make sure the SQL
file contains the SQL statements that are most representative of the
workload your system performs. Run the command before an upgrade or
update. After the upgrade, run the command again. Compare the outputs
to determine the impact of the upgrade.
- This option is only available with the -preupgrade parameter.
If you do not specify the -par DYN_SQL_FILE=sqlFile option
with the -upgrade parameter, 20 dynamic queries
are retrieved from the dynamic SQL cache.
- For example, db2fodc -preupgrade -db dbname -par
DYN_SQL_FILE=sqlFile
- -hadr
- In Version 9.7 Fix Pack 7 and later fix packs, collects diagnostic
data that is related to HADR problems. You can use this parameter
with the -detect option to detect HADR congestion
and automatically collect the related diagnostic information. If
the -hadr and -detect options
are both specified, you cannot use any threshold rules. Threshold
options for the -hadr option have different defaults
and only the following are available:
Table 1. . Default
values for -HADR parameter threshold optionsAvailable threshold options |
Default value |
iteration= |
1 |
interval= |
30 |
sleeptime= |
0 |
triggercount= |
10 |
-nocollect |
n/a |
off |
n/a |
- The collected information is stored in a new directory named FODC_Hadr_timestamp_hostname_Primary|Standby|Standard in
the DIAGPATH directory, where timestamp is
the time when the command was run, hostname is
the host that the collection was performed upon, and Primary, Standby,
and Standard denote the HADR role of the database
at the time of the collection. The script, db2cos_HADR,
is executed to collect FODC data into one or more files and deposited
into the directory.
- For example, db2fodc -hadr -dbdbname -detect
- -fodcerror FODC_[Index|Data|Col]Error_directory
- FODC_[Index|Data|Col]Error_directory collects
data that is related to:
- an index error (FODC_IndexError_directory),
- a database manager error (FODC_DataError_directory),
or
- a column-organized table error (FODC_ColError_directory).
(FODC_DataError_directory and FODC_ColError_directory were
added in DB2 Cancun
Release 10.5.0.4.) The FODC_[Index|Data|Col]Error_directory folder
is required and must contain the db2cos_[index|data|col]error_short(.bat) script
or the db2cos_[index|data|col]error_long(.bat) script.
For example, the db2cos_[index|data|col]error_short(.bat) script
or the db2cos_[index|data|col]error_long(.bat).
The
BASIC mode invokes db2cos_[index|data|col]error_short script.
The FULL mode invokes db2cos_[index|data|col]error_short and db2cos_[index|data|col]error_long scripts.
If the mode is not specified, the BASIC mode is the default.
Do
not rename or move the FODC_[Index|Data|Col]Error_directory directory.
The db2dart commands in the scripts need this directory
path to correctly generate reports.
- If you must run this command parameter manually,
check the directory for any existing db2dart reports.
Reports have the extension .rpt and .rpthex.
If there are reports in the directory, before you start the command
manually, rename the reports or move them to a subdirectory under
the FODC_[Index|Data|Col]Error_directory directory.
The full or the basic collection mode can be run without user interaction.
The output and log files are in the db2diag log
file.
- -help
- Displays help information. When this option is specified, all
other options are ignored, and only the help information is displayed.
Main data collection parameter modes
You
can specify the collection mode as a suboption for some of the main
data collection parameters.
- basic
- The basic collection mode is run, without user interaction. Less
data is collected than in the full mode, but with less resources used.
- full
- The full collection mode is run, without user interaction. This
option requires more resources and time to run than basic collection
mode, but gathers more data.
Secondary data collection parameters
One
or more of the following secondary data collection parameters can
be specified with one of the main data collection parameters.
- -db dbname
- Collects FODC data that is related to the specified database or
databases. Multiple databases can be specified in a comma-separated
list.
- For example, db2fodc -hang -db sample,dbsample
- -alldbs
- Collects FODC data that is related to all active databases. This
option is active by default.
- -member member_number|member_range
- In Version 9.7 Fix Pack 5 and later fix packs, specifies the member
or members on which the command is issued. If this option is not specified,
the command is issued on the current member. Multiple members can
be specified as a comma-separated list, as a range of members, or
any combination thereof.
- For example, db2fodc -hang -member 1-3,5-7
- all
- Specifies that the command is issued on all members that are defined
in db2nodes.cfg file. This option cannot be combined
with the -host option.
- -allmembers
- Specifies that this command is to run on all members of the local
host. To illustrate the difference between the -allmember parameter
and the -all suboption for the -member parameter,
consider the following example:
- Members 1 and 2 are on host A
- Members 3 and 4 are on host B
- Members 1, 2, 3, and 4 are all defined in the db2nodes.cfg file
- If you run the command with the -allmember parameter
specified on host A, data is collected on members 1 and 2
- If you run the command with the -all suboption
for the -member parameter, data is collected
on all four members
- The -allmember parameter is equal to the -alldbp parameter,
except that the -allmember parameter is suitable
for use in pureScale® environments.
- -dbp-dbpartitionnum
- Collects FODC data that is related to all the specified database
partition numbers. The -dbp parameter is equal
to the -member parameter, except that the -dbp parameter
is suitable for use in non-pureScale environments.
- -alldbp-alldbpartitionnums
- Specifies that the command is to run on all active database partition
servers in the instance. Data is collected from the database partition
servers on the same physical computer that the db2fodc command
is being run. The -alldbp parameter is equal
to the -allmember parameter, except that the -alldbp parameter
is suitable for use in non-pureScale environments.
- If you use the -alldbp parameter, data is
collected from only local members.
- -timeout timeout_value
- Specifies a timeout period for the callout script that is started
by the db2fodc command. If the timeout is reached
before the callout script completes diagnostic data collection, the
script process is stopped. There is no default timeout. Therefore,
if no timeout value is specified, the command runs in perpetuity.
The timeout is specified as nh ym xs,
where n represents hours, y represents
minutes, and x represents seconds. If no h, m,
or n suffix is specified, the timeout is in seconds.
- For example, -timeout 2h 30m 45s and -timeout
600.
- -fodcpath fodc_path_name
- In Version 9.7 Fix Pack 4 and later fix packs, specifies the full
path to the directory where the FODC data package is created. The
path that you specify must be writable by the members on the database
and by the fmp processes running on the member or partition. If you
do not specify the -fodcpath parameter and do
not specify a list of partitions or members in your command, the -fodcpath parameter
setting for the current partition or member is used. If this value
is not set, the instance level setting is used. If this value is not
set, FODC data is sent to the current diagnostic directory path (diagpath or alt_diagpath).
- -host hostname
- In Version 9.7 Fix Pack 4 and later fix packs, specifies the host
or hosts on which the command is issued. The command is issued for
all members that are on the specified host. If a host name is not
specified, the command is issued on the local host for the default
member. Multiple host can be specified as a comma-separated list of
hosts. If you run the command on a remote host, the collection mode
(basic or full) must be specified. Also, ensure that $HOME/.rhosts is
set up between hosts. The -host option cannot
be combined with the -member option.
- For example, db2fodc -hang basic -host hostA,hostB
- all
- Specifies that the command is issued on all hosts that are defined
in db2nodes.cfg.
- For example, db2fodc -hang basic -host all
Threshold parameters
To collect data when
the environment reaches certain thresholds, use the -detect parameter
and one or more threshold parameters.
- -detect threshold_rule "<comparison_operator> threshold_value" threshold_options
- In Version 9.7 Fix Pack 5 and later fix packs, specifies a set
of conditions that must exist before data collection is triggered.
The -detect parameter can be combined with one
or more variables of a threshold rule that is combined with a threshold
value and separated by a comparison operator. Data collection is further
specified with the addition of threshold options. For example of threshold
options is the number of times a threshold must be detected or the
length of time for which a threshold must be detected before data
collection is triggered. At least one valid threshold rule must be
specified, except for -hadr data collection.
Detection is logged in the db2diag log files.
- The -detect parameter is compatible with
the following main data collection parameters:
- -cpu
- -memory
- -connections
- -hadr
- -hang
- -perf
- -cpl
- -preupgrade
- threshold_rule
- The condition of the system that the threshold is to detect. You
can specify multiple threshold rules. The following are supported
threshold rules:
- swapused
- On AIX® operating systems,
percentage value that is located under the Percent Used column
from the output of "lsps -s command.
- On Linux operating systems,
used swap space that is divided by the total swap space, which is
multiplied by 100%.
- On HP-UX operating systems, total percentage value that is located
under the PCT USED column of the swapinfo
-tam command output.
- On Solaris operating systems, used swap space that is divided
by the total swap space (available swap space plus used swap space),
multiplied by 100%.
- Not available on Windows operating
systems.
- rqueue
- The number of processes that are currently in the run queue.
- bqueue
- The number of processes that are currently in the block queue.
This option is not available on the Windows operating
systems.
- avm
- On AIX and HP-UX operating
systems, the number of active virtual pages.
- On Linux and Windows operating systems, the amount of
active memory.
- Not available on Solaris operating system.
- free
- On AIX operating systems,
the amount of idle memory. A large portion of real memory is used
as a cache for file system data. It is not unusual for the size of
the free list to remain small. A page is 4096 bytes.
- On Linux operating system,
the amount of idle memory. All Linux blocks
are 10247 bytes.
- On HP-UX, Solaris, and Windows operating
systems, the amount of idle memory.
- pi
- On AIX and Windows operating systems, the number of
pages that are paged in from the paging space.
- Not available on Linux operating
systems.
- On HP-UX operating systems, the number of pages that are paged
in.
- On Solaris operating systems, the amount of kb paged in.
- po
- On AIX and Windows operating systems, the number of
pages that are paged out to the paging space.
- Not available on Linux operating
systems.
- On HP-UX operating systems, the number of pages that are paged
out.
- On Solaris operating systems, the amount of kb paged out.
- si
- Not available on AIX and Windows operating systems.
- On Linux operating systems,
the amount of memory that is swapped in from the disk per second.
- On HP-UX operating systems, the number of processes that are swapped
in.
- On Solaris operating systems, process swap ins that are measured
in kbps.
- so
- Not available on AIX and Windows operating systems.
- On Linux operating systems,
the amount of memory that is swapped to the disk per second.
- On HP-UX operating systems, the number of processes that are swapped
out.
- On Solaris operating systems, process swap outs that are measured
in kbps.
- sr
- On AIX operating systems,
pages that are scanned by page replacement algorithm.
- Not available on Linux and Windows operating systems.
- On HP-UX operating systems, the number of pages that are scanned
by clock algorithm per second.
- On Solaris operating systems, the number of pages that are scanned
by clock algorithm.
- us
- Time spent running user (non-kernel) code, expressed in processor
ticks
- sy
- Time spent running kernel code, expressed in processor ticks
- us_sy
- Time spent running kernel and user (non-kernel) code, expressed
in processor ticks
- id
- Processor idle time, expressed in processor ticks.
- CS
- Number of context switches.
- connections
- Number of connected applications in a status that is specified
by the connstatus option. The db2pd -application command
is invoked to determine the number of database connections.
- comparison_operator
- One of the supported comparison operators, either >= (greater
than or equal to) or <= (less than or equal
to).
- threshold_value
- A numerical value for the specified threshold rule. Only non-negative
integers can be specified.
The current value for a threshold_rule parameter
can help you decide where to set thethreshold_value.
For example,
To determine the current number of connections (connected
applications), run the db2pd -application command.
To determine the current value for swapused,
refer to Table 2.
For all other threshold_rule values,
refer to the command in the second row of the table 1, based on your
operating system. The first column refers to the threshold rule that
you are including in the db2fodc command. The proceeding
columns display the code to look for in the output of the command.
For example, if you believe that the amount of active memory is affecting
the performance of your AIX system,
run the vmstat command. The current amount of active
memory is in the output of the command, represented by avm.
You can use this number to determine what is an appropriate threshold
to detect.
- condition=condition_value
- If you specify more than one threshold rule, you can use a logical
operator to join them into one threshold. The default is to AND threshold
rules. The following are valid condition values:
- AND
- Data collection is triggered if all the threshold rules are true.
For example, db2fodc -memory -detect free"<=10" connections">=1000"
condition="AND". In this example, free memory must be equal
to or less than 10 and the number of connections must be greater than
or equal to 1000. Both conditions must be true for data collection
to be triggered. Because AND is the default,
it is not needed in the example.
- OR
- Data collection is triggered if just one of the threshold rules
is true. For example, db2fodc -memory -detect free"<=10"
connections">=1000" condition="OR". In this example, free
memory must be equal to or less than 10 or the number of connections
must be greater than or equal to 1000. Only one condition must be
true for data collection to be triggered.
- threshold_options
- duration=duration_value
- Specifies the length of time, in hours, during which threshold
detection and diagnostic data collection is enabled. In other words,
the maximum amount of time that the db2fodc command
runs. The clock starts as soon as the command is issued.
- iteration=iteration_value
- Specifies the maximum number of times to perform threshold detection
and diagnostic data collection. The default is one iteration.
- sleeptime=sleeptime_value
- Specifies the time to wait, in seconds, before the next iteration
is started. The default is 1 second.
- triggercount=triggercount_value
- Specifies the consecutive number of times the threshold condition
must be detected in one iteration before diagnostic data collection
is triggered. The default is five times. After each detection of the
trigger condition, detection pauses for the interval value
that is specified.
- interval=interval_value
- Specifies the time, in seconds, between each triggercount,
within one iteration. The default value is 1 second. The interval multiplied
by the triggercount equals the total length of
time the condition must exist to trigger data collection.
- For example, the parameters are set astriggercount=3 and interval=5.
For data collection to start, the condition that is specified by the
threshold rule must be met three consecutive times with 5 seconds
between each detection. So, the condition must exist for 15 seconds
(3 x 15) to trigger data collection.
- connstatus=status_value
- Specifies the status of the connection in the connections threshold
rule. The default is to count all connection statuses for applications
that are connected to the database. Or, choose one of the following
valid connection statuses:
- CommitActive
- The unit of work is committing its database changes.
- Compiling
- The database manager is compiling an SQL statement or precompiling
a plan on behalf of the application.
- ConnectCompleted
- The application has initiated a database connection and the request
has completed.
- ConnectPending
- The application has initiated a database connection but the request
has not yet completed.
- CreatingDatabase
- The agent has initiated a request to create a database and that
request has not yet completed.
- Decoupled
- There are no agents that are currently associated with the application.
This is a normal state. When the Connection Concentrator is enabled,
there is no dedicated coordinator agent, so an application can be
decoupled on the coordinator partition. In non-concentrator environments,
an application cannot be decoupled on the coordinator partition as
there will always be a dedicated coordinator agent.
- DisconnectPending
- The application has initiated a database disconnect but the command
has not yet completed running. The application may not have explicitly
run the database disconnect command. The database manager disconnects
from a database if the application ends without disconnecting.
- FederatedRequestPending
- The application is waiting for results from a federated data source.
- HeuristicallyCommitted
- The unit of work is part of a global transaction that has been
heuristically committed.
- HeuristicallyAborted
- The unit of work is part of a global transaction that has been
heuristically rolled-back.
- Lock-wait
- The unit of work is waiting for a lock. After the lock is granted,
the status is restored to its previous value.
- PendingRemoteRequest
- The application is waiting for a response from a remote partition
in a partitioned database instance.
- PerformingLoad
- The application is performing a load of data into the database.
- PerformingBackup
- The application is performing a backup of the database.
- PerformingUnload
- The application is performing an unload of data from the database.
- QuiescingTablespace
- The application is performing a quiesce table space request.
- RequestInterrupted
- An interrupt of a request is in progress.
- Recompiling
- The database manager is recompiling (that is, rebinding) a plan
on behalf of the application.
- RestartingDatabase
- The application is restarting a database in order to perform crash
recovery.
- RestoringDatabase
- The application is restoring a backup image to the database.
- RollbackActive
- The unit of work is rolling back its database changes.
- RollbackToSavepoint
- The application is rolling back to a savepoint.
- TransactionEnded
- The unit of work is part of a global transaction that has ended
but has not yet entered the prepared phase of the two-phase commit
protocol.
- TransactionPrepared
- The unit of work is part of a global transaction that has entered
the prepared phase of the two-phase commit protocol.
- UOW-Executing
- The database manager is waiting on behalf of the unit of work
in the application. This status typically means that the system is
running in the application's code.
- UOW-Waiting
- The database manager is waiting on behalf of the unit of work
in the application. This status typically means that the system is
running in the application's code.
- UOWQueued
- The unit of work is queued waiting for another activity to complete
execution. The unit of work is queued because the threshold for the
number of concurrently running activities has been reached.
- Unknown
- Wait-Autonomous
- The application is waiting for an autonomous routine to complete.
- WaitToDisableTablespace
- The application has detected an I/O error and is attempting to
disable a particular table space. The application must wait for all
other active transactions on the table space to complete before it
can disable the table space.
- off
- Stops all threshold detection and turns off currently active threshold
rules. If other options are also specified when off is
specified, the other options are ignored. Turning off threshold detection
requires up to 60 seconds to take full effect and shuts down all running db2fodc
-detect commands.
- -nocollect
- Specifies that diagnostic data is not collected. Threshold detection
is logged in the db2diag log files. This option
is often used if you want to know whether the threshold is being met
but do want to congest the system with data collection. If you want
to start collecting data, use the off parameter
to stop the command and then reissue the command without the -nocollect parameter.
Table 2. Command to determine threshold valueOperating system |
AIX |
Linux |
HP-UX |
Solaris |
Windows |
Command used |
vmstat |
vmstst -a |
vmstat -S |
vmstat -S |
db2pd -vmstat |
Run queue (rqueue) |
r |
r |
r |
r |
r |
Block queue (bqueue) |
b |
b |
b |
b |
Not applicable |
Active memory (avm) |
avm |
active |
avm |
Not applicable |
used |
Free memory (free) |
fre |
free |
free |
free |
free |
Paging in (pi) |
pi |
Not applicable |
pi |
pi |
pi |
Paging out (po) |
po |
Not applicable |
po |
po |
po |
Swapping in (si) |
Not applicable |
si |
si |
si |
Not applicable |
Swapping out (so) |
Not applicable |
so |
so |
so |
Not applicable |
Page scanned (sr) |
sr |
Not applicable |
sr |
sr |
Not applicable |
User CPU (us) |
us |
us |
us |
us |
usr |
System CPU (sy) |
sy |
sy |
sy |
sy |
sys |
User and system CPU (us_sy) |
us+sy |
us+sy |
us+sy |
us+sy |
us+sy |
Idle CPU (id) |
id |
id |
id |
id |
idl |
Context switches (cs) |
cs |
cs |
cs |
cs |
cs/s |
Table 3. Command to determine used swap spaceOperating system |
Command used |
Used swap space (swapused) |
AIX |
lsps -s |
Percentage value that is located under the Percent
Used column |
Linux |
free |
(total swap space/used swap space)*100% |
HP-UX |
swapinfo -tam |
Total percentage value that is located under
the PCT USED column |
Solaris |
swap -s |
(used swap space/(available swap space+used
swap space))*100% |
Windows |
Not applicable |
Not applicable |
Figure 1. DB2FODC command example
Immediate collection examples
These basic examples
of the db2fodc command illustrate how to manually
collect diagnostic data, as the problems occurs.
-hang
Consider a potential
hang situation. DB2 software
is running stable, but when you update or select multiple records
from a particular table, the application hangs. You restart DB2 software and again the system
is stable. However, one week later the same situation occurs.
To
troubleshoot the problem yourself or with IBM Support's help, collect diagnostic data with
the
db2fodc command. To collect data on all active
databases, run the following command, while the potential hang is
occurring:
db2fodc -hang –alldbs
By adding
the
-hang parameter basic operating system, configuration,
and diagnostic information is collected that can assist IBM support in analyzing the potential hang.
The
following examples illustrate alternative methods to collect diagnostic
information about a potential hang situation.
-profile
The
-profile parameter
is an advanced option. It allows more control on how the data is collected
than using the
-hang parameter. By customizing
the profile in the
db2fodc.profile file you can
tweak various parameters that you would otherwise not be able to control.
The following example is a customized profile:
[collectstack]
db2pdstack_iterations=9
db2pdstack_sleep=30
<end>
After you create your profile, specify the name
of the profile in the command while the performance issue is occurring,
such as in the following example:
db2fodc -profile collectstack
As
a result, a stack trace is generated with 9 iterations and 30 seconds
of sleep time. The full list of parameters that can be specified in
the profile can be found in the
sqllib/bin/db2cos_hang script.
It is recommended that the profile parameters be used only with the
guidance of IBM support.
-perf
Consider a situation
in which your application processes are progressing slowly or a resource
is being heavily used on one particular database. Because the database
is still usable and not hanging, the
-perf main
data collection parameter can be used to collect diagnostic information
before you contact support. While the performance issue is occurring,
run the following command:
db2fodc -db SAMPLE -perf full
Snapshots,
stacktraces, virtual memory, input and output information, and traces
are some of the data that is collected when you include the
-perf parameter.
In this example, the full data collection mode is started and is restricted
to the database
SAMPLE. The full mode requires
more resources and can take longer to run.
The following command
is an example of how to limit the data collection to specific members:
db2fodc -perf -member 10-13,15
in
this example, the
db2fodc -perf command is started
in the default basic collection mode on members 10, 12, 13, and 15.
-cpu
Consider a situation,
in Version 9.7 Fix Pack 5 and later fix packs, in which you suspect
the processor has an unusual number of running processes. To collect
diagnostic data for problems that are related to processor usage,
issue the following command:
db2fodc –cpu full
You
run the command several more times during peak use and during idle
time to generate an accurate conclusion about whether the symptoms
are persisting over time. The full data collection mode is started
and the default
DB2FODC registry variables and
parameters are used.
-memory
If you suspect,
in Version 9.7 Fix Pack 5 and later fix packs, that there is no free
memory available, that swap space is being used at a high rate, that
excessive paging is occurring or that a memory leak is occurring,
use the
-memory parameter to collect memory-related
diagnostic data. The following command is an example:
db2fodc –memory full -member 3
In
this example, the full data collection mode is started and is restricted
to the third member.
-connections
In Version
9.7 Fix Pack 5 and later fix packs, for performance problems related
to database connections, you might observe sudden spikes in the number
of applications in the executing or compiling state or new database
connections are being denied. If these symptoms are observed, you
can run the following command:
db2fodc –connections
-clp
You might encounter
an error after you upgrade or create an instance. This error might
be, for instance, DBI1281E. This error code does not provide a root
cause of the problem and further diagnostic information is needed.
To further troubleshoot the problem, in Version 9.7 Fix Pack 5 and
later fix packs, run the following command:
db2fodc –clp full
This
command collects environment and configuration-related information
that is targeted to diagnosing an instance creation problem. After
collection is completed the information is stored in a newly created
directory named
FODC_Clp_timestamp_member,
which can be sent to IBM support
for analysis.
-preupgrade
In Version
9.7 Fix Pack 5 and later fix packs, before you create or upgrade an
instance and before you update to the next fix pack, gather diagnostic
information to help troubleshoot any problem, including any impact
to your SQL statements, that might arise after the upgrade. To collect
performance-related information before an update or upgrade, run the
following command:
db2fodc -preupgrade -db SAMPLE -par DYN_SQL_FILE=sqlFile
Where,
SAMPLE is
the name of the database from which you are collecting information
from. Where
sqlFile is a file that contains the
SQL statements that are most representative of your workload. If you
do not include
-par DYN_SQL_FILE=sqlFile option
with the
-upgrade parameter, 20 dynamic queries
are retrieved from the dynamic SQL cache. To gather optimal performance
information, you can issue the
db2fodc -preupgrade command
multiple times, at high usage times and at idle times. After the upgrade,
run the same command again. If there is a performance issue after
you upgrade, compare the output of the command before and after the
upgrade. For more assistance, contact IBM Support.
-hadr
In Version 9.7
Fix Pack 7 and later fix packs, you can use the
db2fodc command
to collect data on HADR congestion. If you suspect that there is HADR
congestion, issue one or more of following
db2fodc commands
with the
–hadr parameter, as the HADR congestion
is happening. Assume that the system consists of a primary host
hostA and
a standby host
hostB.
- To manually collect HADR-related data, run the following command:
db2fodc -hadr -db sample
This
command starts the db2cos_hadr (db2cos_hadr.bat on Windows) script and places the
collected data in the FODC_Hadr_timestamp_hostname_Primary|Standby|Standard directory,
which is created in the DIAGPATH directory.
- To collect HADR diagnostic data on all hosts, run the following
command:
db2fodc –hadr –db sample –host all
This
command collects data on all hosts and places the collected data in
the DIAGPATH directory on each host.
- To collect HADR diagnostic data on specifically the primary and
standby hosts, run the following command:
db2fodc –hadr –db sample –host hostA,hostB
- To collect HADR diagnostic data on hostB and
places the FODC_Hadr package into DIAGPATH directory
on hostB, run the following command on hostA:
db2fodc –hadr –db sample –host hostB
- To place the FODC_Hadr package into another directory,
the /TMP/hadrdata/ directory for example, run the
following command:
db2fodc –hadr –db sample –host all –fodcpath /TMP/hadrdata/
-fodcerror
FODC_IndexError_directory
Consider a situation
in which you receive an index error. The error informs you that an
index used all the free space. To collect data on the index error,
without stopping the database manager, issue the following command:
db2fodc -fodcerror FOCE_IndexError_directory
The
basic data collection mode is started, as it is the default. Data
is collected into one or more files that are deposited into the
FODC_IndexError_directory directory.
Review the output for possible factors that can lead to an index error
or send the directory to IBM support
for analysis.
Threshold collection examples
By specifying
the -detect parameter, along with one or more threshold
rules, you can set a value against cpu performance, memory, and connections.
The system is monitored and data is collected when the threshold rules
are met.
The following examples illustrate the use of the -detect parameter
to collect diagnostic information:
-cpu
In Version 9.7
Fix Pack 5 and later fix packs, the
-cpu main
data collection parameter collects processor-related performance and
diagnostic data.
- To detect an intermittent issue with the processor that might
be tied to the number of processes in the run queue, run the following
command:
db2fodc -cpu basic -detect us">=90" rqueue">=1000" condition="AND" triggercount="3" interval="2" iteration="4" sleeptime="100" duration="500" -member all
You
can specify your own rules for the -detect parameter
to determine when to start diagnostic data collection. In this example,
both (condition="AND" is the default) trigger conditions
(us">=90" rqueue">=1000"rqueue">=1000") must exist
for 6 seconds (triggercount="3" X interval="2" =
6 seconds) to trigger diagnostic data collection on all members. If
the trigger conditions occur, then diagnostic data is collected. If
the trigger conditions do not occur, trigger condition detection continues
within the iteration. The iteration option is set to four to specify
that trigger condition detection followed by diagnostic data collection
is performed four times, with a sleep time of 100 seconds in between.
The command exits after all four iterations are successfully completed
or after 500 hours. (duration="500")
- If you suspect that your processor is not performing well because
of an issue with processor time spent running kernel and user code,
run the following command:
db2fodc -cpu full -detect us”>=20” sy”>=10” condition=”OR”
The -detect parameter,
which is combined with the threshold rules, delays the collection
of processor-related information until the trigger conditions specified
by the threshold rule are detected. The operator OR is
chosen, which means only one of the thresholds must be tripped to
trigger data collection. Because the trigger count value and interval
value are not specified, the default values (triggercount=5 and interval=1)
are used. Therefore, if one of the threshold rules is met five consecutive
times in 5 seconds (triggercount=5 X interval=1),
CPU-related data collection is triggered. If the threshold rules are
never met, the command runs indefinitely. To stop the process, run
the following command:db2fodc -detect off
The db2fodc
-detect off command stops all threshold detection and turns
off any currently active threshold rules. This process can take up
to 60 seconds to complete and stops all db2fodc -detect commands
that are running on the server.
-memory
In Version
9.7 Fix Pack 5 and later fix packs, the
–memory parameter
can also be useful to help debug memory spikes, paging issues, or
memory over-commits.
- Consider a situation in which your system is performing poorly
because the total number of virtual-memory working segment pages on
your AIX operating system might
be too high. You can run the following command:
nohup db2fodc -memory basic -detect "avm>=5242880" duration=1000 &
In
this example, the nohup mode enables the command
to ignore the HUP (hang up) signal so that the subsequent logout does
not stop the command from running in the background. For that reason,
the duration of 1000 hours is specified in the command. The duration parameter
does not have a default, so, if duration is not specified, the command
runs forever, if the conditions are never met.
- You can detect a threshold condition and trigger automatic diagnostic
data collection when the threshold condition is exceeded multiple
times. Consider the following command example:
db2fodc -memory basic -detect free"<=10" connections">=1000" interval=10 triggercount=4 duration=5 sleeptime=30 iteration=10 -member 3
This
example monitors the number of free memory blocks (free"<=10")
AND the number of application connections to the database (connections">=1000").
The operator is AND by default. Only member 3
is monitored for the conditions (-member 3). There
are 10 iterations with 30 seconds of rest between each iteration.
Data collection is tripped if both conditions are met four consecutive
times over 40 seconds (triggercount=4 X interval=10).
- To trigger data collection when the amount of free memory drops
to a specified amount, run the following example command:
db2fodc -memory basic -detect free"<=386236" so">=0" sleeptime=30 iteration=10 interval=10 triggercount=4 duration=5
If
the number of free memory pages drops to or below 386236, and the
amount of memory that is swapped out is greater than zero, the following
output is an example of the data collection:> db2fodc -memory basic -detect free"<=386236" so">=0" sleeptime=30 iteration=10 interval=10 triggercount=4 duration=5
"db2fodc": List of active databases: "SAMPLE"
"db2fodc": Starting detection ...
"db2fodc": "4" consecutive threshold hits are detected.
"db2fodc": Triggering collection "1".
Script is running with following parameters
COLLECTION_MODE : LIGHT
COLLECTION_TYPE : MEMORY
COLLECTION_DURATION : 5
COLLECTION_ITERATION : 5
DATABASE/MEMBER : -alldbs
FODC_PATH : /home/inst1/sqllib/db2dump/FODC_Memory_2013-04-02-15.47.56.969013_0000
db2pd_options : -agent -apinfo -active -tran -locks -bufferpools -dbptnmem -memset -mempool -sort -fcm hwm -dyn
SNAPSHOT : 2
STACKTRACE : 2
TRACELIMIT : 20
SNAPSHOT_TYPE : ALL
...
In db2diag.log:
2013-04-02-15.47.55.154348-240 I200475E548 LEVEL: Event
PID : 8944 TID : 46912890796352 KTID : 8944
PROC : db2fodc
INSTANCE: inst1 NODE : 000
HOSTNAME: coralxib11
FUNCTION: DB2 UDB, RAS/PD component, pdFodcDetectAndRunCollection, probe:100
CHANGE :
Hostname: coralxib11 Member(s): 0 Iteration: 1
Thresholds hit 0: so(0)>=0 free(159972)<=386236
Thresholds hit 1: so(0)>=0 free(157872)<=386236
Thresholds hit 2: so(0)>=0 free(129572)<=386236
Thresholds hit 3: so(0)>=0 free(142952)<=386236
.....
2013-04-02-15.47.56.969683-240 E201708E703 LEVEL: Warning
PID : 9519 TID : 46912890796352 KTID : 9519
PROC : db2fodc
INSTANCE: inst1 NODE : 000
HOSTNAME: coralxib11
FUNCTION: DB2 UDB, RAS/PD component, pdDb2FODCMain, probe:30
MESSAGE : ADM14003W FODC has been invoked by the user from db2fodc tool for
symptom "memory" and diagnostic information has been recorded in
directory
"/home/inst1/sqllib/db2dump/FODC_Memory_2013-04-02-15.47.56.969013_0
000". Please look in this directory for detailed evidence about what
happened and contact IBM support if necessary to diagnose the
problem.
-hadr
To monitor HADR
congestion and start automatic diagnostic data collection, specify
the
-hadr parameter with the
-detect option
for the
db2fodc command. In the following examples,
assume that the system consists of a primary host
hostA and
a standby host
hostB.
- To automatically start diagnostic data collection if HADR congestion
is detected, run the following command:
db2fodc –hadr –db sample –detect
The
command starts a process that monitors the HADR database to see whether
there is enough HADR congestion to start data collection. If there
is enough HADR congestion, the db2cos_hadr (db2doc_hadr.bat on Windows operating systems) script
is started and diagnostic data collection occurs. The process ends
after diagnostic data collection completes. If not enough HADR congestion
is ever detected, the monitor runs until the detection duration exceeds
the duration parameter value, or the user ends it by issuing the following
command:db2fodc -detect off
- To automatically start diagnostic data collection on all hosts
if a certain amount of HADR congestion is detected on the local host,
run the following command:
db2fodc –hadr –db sample -detect -host all
If
congestion is detected, HADR diagnostic data is collected on all hosts.
- Consider that you might want to know whether HADR congestion is
occurring, however, you do not want to slow down your system by collecting
diagnostic data. Run the following command example:
db2fodc -hadr -db sample -detect -nocollect
Diagnostic
data is not collected. However, if HADR congestion is detected, the
event is logged in the db2diag log files.
- To check for HADR congestion over a specific amount of time, run
the following command:
db2fodc -hadr -db sample -detect duration=24
HADR
congestion is monitored for 24 hours (duration=24).
Because the default for iteration is 1, for triggercount is
10, and for interval is 30, if HADR congestion
is detected 10 consecutive times over 300 seconds, data is collected
and the command exits.
- Threshold options can be applied to the –hadr parameter.
For example, run the following command:
db2fodc -hadr -db sample -detect iteration=2 sleeptime=3600 triggercount=8 interval=15 duration=24
The
effect of this threshold rule is as follows:- HADR congestion is monitored for at most two iterations, with
sleep time for 1 hour between each iteration.
- If the threshold rules are met eight consecutive times, every
15 seconds, then data is collected and the iteration exits. If it
is the first iteration, then the monitoring process sleeps for 1 hour
before next iteration. If it is the second iteration, then the monitoring
process exits.
- The detection monitoring process runs for 24 hours maximum.
- If HADR congestion is not detected after 24 hours, the detection
process stops and exits the monitoring process.
- If no HADR congestion is detected, the first iteration runs for
24 hours and then exits. If that happens, the 1 hour sleeptime and
the second iteration never run.
-connections
In Version
9.7 Fix Pack 5 and later fix packs, to trigger data collection when
certain types of connections reach a threshold, issue the
db2pdfodc
-connections command with the
-detect parameter,
as in the following example:
db2fodc –connections –db sample –detect connections">=10" connstatus=Compiling
If
the number of applications that are connected to the SAMPLE database
in a compiling state is equal to or greater than 10, then connection-related
diagnostic data is collected.
Multi-partitioned environments
The
db2fodc
-hang and
db2fodc -perf commands can
be run in a multi-partitioned environment with multiple physical nodes.
In this environment, the following example runs successfully:
db2fodc -perf full -member all other_options
Or
db2fodc -perf -alldbpartitionnums other_options
During
a potential hang or severe performance issue, in a partitioned database
environment, these parameters can be used to start the
db2fodc command
at all nodes in a single invocation. Options
-alldbpartitionnums and
-dbpartitionnum are
suitable for only logical partition numbers. If the
-dbp or
-member options
are not specified, by default, only information from the current partition
number is collected.
The
db2fodc -fodcerror FODC_IndexError_directorycan
also be run in a multi-partitioned environment with multiple physical
or logical nodes. To collect information for an index error at a specific
partition number in a partitioned database environment, the following
example runs successfully:
db2_all "<<+node#< db2fodc -fodcerror FODC_IndexError_directory [basic | full]
Where
the
node# is the number of the specific node. This
number is the last number in the directory
FODC_IndexError_timestamp_PID_EDUID_Node#.
An absolute path must be used, if you ran the
db2fodc -fodcerror FODC_IndexError_directory command
with the
db2_all command. The output and log files
are in the
db2diag log file. The
FODC_IndexError_directory folder
is required and must contain the
db2cos_indexerror_short(.bat) script
or the
db2cos_indexerror_long(.bat) script. Do not
rename or move the
FODC_IndexError_directory directory.
The
db2dart commands in the scripts need this directory
path to correctly generate reports. If you must run
db2fodc
-fodcerror FODC_IndexError_directory manually,
check the
FODC_IndexError_directory directory for
any existing
db2dart reports. Reports have the
extension
.rpt and
.rpthex. If there
are reports in the directory, rename them or move them to a subdirectory
under the
FODC_IndexError_directory directory before
you start
db2fodc -fodcerror FODC_IndexError_directory manually.