You can enable advanced pruning of the Datamart database to control the size of the
database.
About this task
Transaction tracking data is stored in multiple tables in the Datamart database. Pruning
automatically occurs for the transaction instance data in the database tables whose names start with
F_ and for the aggregated transaction data in the tables whose names start with
AGG_. However, pruning does not automatically occur for the static data
stored in the tables whose names start with D_. In some environments, the data
in a subset of the D_xxx tables is dynamic and causes the size of the
Datamart database to grow continually. You can run a script to enable advanced pruning, which deletes
unreferenced rows in the D_xxx tables to control the size of the Datamart
database.
Before you begin
The enable_dmart_pruning.sh
script can be run at any time to enable pruning of
the unreferenced rows in the D_xxx tables in the Datamart database. The
pruning occurs nightly when the Datamart daily cron job runs between 3 am and 4 am based on the time
zone of the Cloud
APM server.
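If you want to confirm that the Datamart daily cron job is in place before you begin, you can list the cron script itself (this assumes the default location used later in this topic):
ls -l /etc/cron.daily/datamart.daily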
Note: Before you run the
enable_dmart_pruning.sh
script, you need to install
IBM® Cloud Application Performance Management V8.1.4.0 interim fix 10 or a later server
interim fix. You can download the interim fixes for
IBM Cloud Application Performance Management V8.1.4 from
IBM Fix Central.
If there is not a large amount of data in the Datamart database tables, the
enable_dmart_pruning.sh
script automatically enables pruning of the
D_xxx tables and no further action is required on your part. However, if
there is a large amount of transaction tracking data in the Datamart database, the
enable_dmart_pruning.sh
script does not enable pruning and instead displays a message
indicating that you should plan a maintenance window to execute the initial pruning, to prevent a
performance impact to your Cloud
APM server and database
server.
The enable_dmart_pruning.sh
script indicates that the Datamart database contains a
large amount of data if the D_xxx tables contain over 100 million rows, or
if the D_xxx tables contain over 10 million rows and the
F_xxx tables contain over 100 million rows. The script displays the number
of rows in each D_xxx table and also writes the information to the latest
/opt/ibm/dtl/datalayer/logs/enable_dmart_pruning.sh-*.log file.
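If you want to spot-check a row count yourself, a minimal sketch (assuming the default db2apm instance user and the ITMUSER table schema, both of which appear later in this topic) is:
su - db2apm
db2 connect to DATAMART
db2 "select count(*) from ITMUSER.D_REQUEST"
db2 connect reset; db2 terminate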
During the pruning of the D_xxx tables, the D_xxx
table and the F_xxx tables that reference the D_xxx
table are locked, which prevents these tables from being updated with new data until the pruning is
complete. By default, the pruning deletes up to 1 million rows in the D_xxx
tables to prevent the tables from being locked for too long. Pruning 1 million rows might take 30 to 60
minutes. If there are more than 1 million rows in the D_xxx tables to prune,
the next 1 million rows are pruned when the cron job runs the next day or when you manually run the
Datamart daily cron job. The following procedure describes how you can configure the maximum number of
rows that are deleted per pruning job if you want to adjust the number of rows that are pruned at
one time. Also, for the initial pruning of a large Datamart database, the steps in the procedure
describe how to configure the pruning job to prune one large D_xxx table at
a time.
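For example, the maximum is a single property in the delete_unreferenced_rows.cfg file, so a configuration that deletes at most 500,000 rows per run would look like this (the 500000 value is illustrative; the default is 1000000, as described in the What to do next section):
_MAX_NUM_ROWS_TO_DELETE=500000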
When the enable_dmart_pruning.sh
script indicates there is a large amount of
transaction tracking data in the Datamart database, the initial pruning of the D_xxx
tables might also require you to increase the Datamart database transaction log size or
increase the Db2 LOCKTIMEOUT value for the initial pruning of the
D_xxx tables. The steps in the procedure indicate what Db2 error messages are
displayed for those scenarios and how to increase the Db2 transaction log size or
LOCKTIMEOUT value.
Complete the following steps as the root user on the Cloud
APM server:
Procedure
- Change to the /opt/ibm/dtl/datalayer/cron directory and run the
following command:
./enable_dmart_pruning.sh
Note:
/opt/ibm/ is the default Cloud
APM server install directory. If you installed Cloud APM
into a different directory, use your install directory instead of
/opt/ibm/.
- If your DATAMART database does not contain a large amount of data, the script
automatically enables advanced pruning of the D_xxx tables and displays
messages similar to the following ones. No further action is required. The pruning will run daily as
part of the existing Datamart daily cron job.
The DATAMART database has low usage for transaction tracking data.
Editing /opt/ibm/dtl/datalayer/cron/delete_unreferenced_rows.cfg to enable datamart pruning.
Changing _CHECK_FOR_UNREFERENCED_ROWS_ENABLED to Yes.
DATAMART database pruning is already enabled for D_REQUEST.
DATAMART database pruning is already enabled for D_TRANSACTION.
DATAMART database pruning is already enabled for D_URL_DETAILS.
DATAMART database pruning is already enabled for D_SOURCE_IP_ADDRESS.
DATAMART database pruning is already enabled for D_USER.
DB20000I The TERMINATE command completed successfully.
.... Done
- If your DATAMART database contains a large amount of transaction tracking data, the
script does not enable advanced pruning and displays messages similar to the following
ones:
"The DATAMART database has high usage for transaction tracking data.
The D_xxx tables contain 12345678 rows for database DATAMART.
The F_xxx tables contain 87654321 rows for database DATAMART."
"We recommend that you cleanup the unreferenced rows in the DATAMART D_xxx tables.
Edit the /opt/ibm/dtl/datalayer/cron/delete_unreferenced_rows.cfg file.
For detailed instructions refer to the topic "Enable advanced pruning of the Datamart database" in the Knowledge Center at URL:
https://www.ibm.com/support/knowledgecenter/SSHLNR_8.1.4/com.ibm.pm.doc/welcome.html.
If you see these messages, plan a maintenance window to perform the initial pruning of the
D_xxx tables.
- During the maintenance window, complete these steps to perform the pruning:
- Confirm that you configured the recommended Datamart buffer pool size for a large
Cloud APM server environment. You can display the
current buffer pool size by running these commands on the Cloud
APM server if a local Db2 server is being used, or on the
remote Db2 server:
- Log on as the Db2 instance user:
su - db2apm
Where
db2apm is the default instance user.
-
db2 connect to DATAMART
Note: You must connect as the instance ID. The ITMUSER ID does not have
the authority to alter buffer pools.
-
db2 "select * from syscat.bufferpools where BPNAME ='IBMDEFAULTBP' "
Note: If this command fails when you copy and paste it, try typing it manually.
If the buffer pool size (the NPAGES column) is less than
300000 pages, set it to the
recommended value by running these commands:
db2 alter bufferpool IBMDEFAULTBP immediate size 300000
db2 connect reset; db2 terminate
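If you prefer a query that returns only the relevant columns instead of all of them, a sketch that uses the standard SYSCAT.BUFFERPOOLS catalog columns is:
db2 "select BPNAME, NPAGES, PAGESIZE from syscat.bufferpools where BPNAME = 'IBMDEFAULTBP'"
The NPAGES column is the buffer pool size in pages.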
- On the Cloud APM server, log on as root,
change to the /opt/ibm/dtl/datalayer/cron directory, and run the following
command to get the latest count of the number of rows in the D_xxx
tables:
./enable_dmart_pruning.sh
The script displays the number of rows in each D_xxx table that can be
pruned.
- Prune only one D_xxx table with over 1 million rows at a time during the maintenance
window to minimize the impact on Db2 resources. The
/opt/ibm/dtl/datalayer/cron/delete_unreferenced_rows.cfg file has a line for
each D_xxx table that can be pruned.
- If the number of rows for a D_xxx table is less than 1 million, set the
value for that table to YES in the
delete_unreferenced_rows.cfg file.
- Choose one D_xxx table with more than 1 million rows to prune and set
the value for the other large D_xxx tables to NO. We
recommend waiting to enable pruning for the largest D_xxx table until the
other D_xxx tables have been pruned.
For example, if the
enable_dmart_pruning.sh script displays these messages for the
number of rows in the D_xxx tables:
1200000 rows exist in table D_REQUEST for DB DATAMART
1500000 rows exist in table D_TRANSACTION for DB DATAMART
1400000 rows exist in table D_URL_DETAILS for DB DATAMART
10000 rows exist in table D_SOURCE_IP_ADDRESS for DB DATAMART
5000 rows exist in table D_USER for DB DATAMART
Then edit
/opt/ibm/dtl/datalayer/cron/delete_unreferenced_rows.cfg.
Ensure that
_DATAMART_clean_D_SOURCE_IP_ADDRESS and
_DATAMART_clean_D_USER are set to
YES because those tables have less than 1 million rows.
Set _DATAMART_clean_D_REQUEST to
YES as the first large
D_xxx table to prune because it is not the table with the
most rows. Then set
_DATAMART_clean_D_TRANSACTION and
_DATAMART_clean_D_URL_DETAILS to
NO, as follows:
_DATAMART_clean_D_REQUEST=YES
_DATAMART_clean_D_TRANSACTION=NO
_DATAMART_clean_D_URL_DETAILS=NO
_DATAMART_clean_D_SOURCE_IP_ADDRESS=YES
_DATAMART_clean_D_USER=YES
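After you edit the file, you can verify your settings before you run the cron job, for example:
grep -E "_DATAMART_clean_D_|_MAX_NUM_ROWS_TO_DELETE" /opt/ibm/dtl/datalayer/cron/delete_unreferenced_rows.cfg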
Also, if you want to change the maximum number of rows to prune at a time, change the value of the
_MAX_NUM_ROWS_TO_DELETE configuration property in the
delete_unreferenced_rows.cfg file.
- Run the following command to run the Datamart daily cron job to perform the
pruning:
/etc/cron.daily/datamart.daily
The cron job prunes up to 1 million unreferenced rows from the D_xxx
tables that are enabled for pruning. The cron job might take 30 to 60 minutes to perform the pruning
when there are more than 1 million unreferenced rows to delete.
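Because the pruning can run for 30 to 60 minutes, you might want to record how long the run takes, for example:
time /etc/cron.daily/datamart.daily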
- In a separate console window, run the following command to identify the latest log file
for the cron job:
ls -ltr /opt/ibm/dtl/datalayer/logs/partition-maintenance.sh-*.log | tail -1
- Locate the following line and check for any error messages that occur after that
line:
********************* Running Delete Unreferenced Rows ******************
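For example, one way to print everything from that marker line to the end of the newest log and filter for Db2 SQL messages (a sketch; the pattern matches message IDs such as the SQL0911N and SQL0964C errors shown in the next steps) is:
LOG=$(ls -t /opt/ibm/dtl/datalayer/logs/partition-maintenance.sh-*.log | head -1)
sed -n '/Running Delete Unreferenced Rows/,$p' "$LOG" | grep -E "SQL[0-9]+[NCW]"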
- If you see the following message, it means Db2 timed out before the pruning could be
completed.
DB21034E The command was processed as an SQL statement because it was not a valid
Command Line Processor command.
During SQL processing it returned:
SQL0911N The current transaction has been rolled back because of a deadlock or timeout.
Reason code "2". SQLSTATE=40001
You
should increase the value for the Db2 LOCKTIMEOUT setting by performing these steps:
- If the Db2 server is local, enter the following command:
su - db2apm
- If the Db2 server is remote, log on to the remote Db2 server as the Db2 instance
user.
db2 connect to DATAMART
db2 get db cfg for DATAMART | grep LOCKTIMEOUT
- Enter this command to update the LOCKTIMEOUT value so that Db2 does not time out
lock waits:
db2 update db cfg for DATAMART using LOCKTIMEOUT -1
db2 connect reset; db2 terminate
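A LOCKTIMEOUT value of -1 causes applications to wait indefinitely for locks. If you prefer to return to your previous setting after the initial pruning is complete, you can restore the value that the get db cfg command displayed, for example (30 here is a placeholder for your recorded value):
db2 update db cfg for DATAMART using LOCKTIMEOUT 30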
- If you see the following message, the Db2 transaction log size should be increased.
DB21034E The command was processed as an SQL statement because it was not a valid Command Line Processor command.
During SQL processing it returned:
SQL0964C The transaction log for the database is full. SQLSTATE=57011
You
should increase the values for the Db2 LOGPRIMARY and
LOGSECOND configuration values. Db2 does not provide an algorithm for
determining the value that should be set. You should double the LOGSECOND value
(which is for temporary space). If you still run out of transaction log space, double the value for
the LOGPRIMARY configuration value. Perform the following steps to change the Db2
transaction log size:
- If the Db2 server is local, enter the following command:
su - db2apm
- If the Db2 server is remote, log on to the remote Db2 server as the Db2 instance user and record
the current values.
db2 connect to DATAMART
db2 get db cfg for DATAMART | egrep "LOGFILSIZ|LOGPRIMARY|LOGSECOND"
- To update the LOGSECOND configuration value, enter this command:
db2 update db cfg for DATAMART using LOGSECOND 50
Where you should replace 50 with 2 times the LOGSECOND
value displayed by the db2 get db cfg command.
- To update the LOGPRIMARY configuration value, enter this command:
db2 update db cfg for DATAMART using LOGPRIMARY 20
Where you should replace 20 with 2 times the LOGPRIMARY
value displayed by the db2 get db cfg command.
- The LOGFILSIZ configuration value is normally between 4K and 25K. There is no
requirement to change it unless it is less than 4K. To update
LOGFILSIZ, enter this command:
db2 update db cfg for DATAMART using LOGFILSIZ 16000
Where 16000 is larger than the current value.
- After you have completed the Db2 configuration updates, enter this
command:
db2 connect reset; db2 terminate
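To confirm that the new values are in effect, you can repeat the earlier query, for example:
db2 connect to DATAMART
db2 get db cfg for DATAMART | egrep "LOGFILSIZ|LOGPRIMARY|LOGSECOND"
db2 connect reset; db2 terminate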
- Ignore the following Db2 messages in the
/opt/ibm/dtl/datalayer/logs/partition-maintenance.sh-*.log file:
- sqldEscalateLocks: Db2 locks several hundred thousand rows at a time. When it
reaches a high number of rows being locked for a given table, it escalates to the point where it
locks the entire table. This is normal.
- SQL0598W: Existing index ITMUSER.D_SRCIP_SRCIPADDID_IDX
is used as the index for the primary key or a unique key. SQLSTATE=01550. This
is normal.
- SQL0598W: Existing index ITMUSER.D_USER_USERID_IDX is
used as the index for the primary key or a unique key. SQLSTATE=01550. This is
normal.
- SQL fails when attempting to add the primary key or foreign key constraint. This happens because
the tables are large and there is not enough Db2 memory to add the constraint. The constraints are
dropped to improve performance when deleting rows. It takes very little memory to disable a
constraint, but it takes a lot of memory to enable it. When the tables become smaller, the
constraints are restored.
- If you saw error messages in the log file and performed Db2 configuration updates,
perform steps 4d through 4f again to rerun the Datamart daily cron job as root on the Cloud APM
server.
- If the log file shows that pruning is successfully deleting rows, the Datamart
daily cron job should automatically reduce the number of unreferenced rows by up to 1 million rows
each night until there are no more unreferenced rows for the tables that you enabled for pruning in
step 4c. You can also manually run "/etc/cron.daily/datamart.daily" multiple times per day to
reduce the number of rows.
- If you have additional D_xxx tables to enable for pruning, repeat steps 4b through 4h
to enable the next large D_xxx table for pruning. You can either enable the next D_xxx table for
pruning in the current maintenance window or schedule another maintenance window to enable pruning
for the other large D_xxx tables. The Datamart daily cron job continues to prune the tables
that are already enabled for pruning in the
/opt/ibm/dtl/datalayer/cron/delete_unreferenced_rows.cfg file.
- If you have issues enabling the advanced pruning, contact IBM Cloud APM support
for assistance.
What to do next
There are two additional settings in the
delete_unreferenced_rows.cfg file
that you can tune:
- _MAX_NUM_ROWS_TO_DELETE=1000000: This setting controls the time the
Datamart database tables are locked. It takes about 30 minutes for Db2 to delete 1 million rows.
Check the timestamps in the latest
/opt/ibm/dtl/datalayer/logs/partition-maintenance.sh-*.log file to determine
when the deletes start and when they finish. If the nightly deletes are finishing before your
work day begins, you can increase this setting. A run time of 30 to 60 minutes is normal for large
environments until the pruning stabilizes.
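For example, to check when the delete phase started and when the run finished in the newest log (a sketch that reuses the marker line shown in the procedure):
LOG=$(ls -t /opt/ibm/dtl/datalayer/logs/partition-maintenance.sh-*.log | head -1)
sed -n '/Running Delete Unreferenced Rows/,$p' "$LOG" | head -5     # start of the delete phase
tail -5 "$LOG"                                                      # end of the run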
- _SKIP_SMALL_TABLES=500000: This setting prevents the scan for
unreferenced rows in the D_xxx tables from running every night. After the
unreferenced rows in the D_xxx tables have been cleaned up, you might not want to run the Datamart
daily cron job to scan for new unreferenced rows until you have a large number of rows to delete
again. When there are no more unreferenced rows to delete, the latest
/opt/ibm/dtl/datalayer/logs/partition-maintenance.sh-*.log file indicates that a low
number (less than 1 million) of rows were deleted. At this point, you can set the value of
this configuration property to 2 million higher than the number of rows in your largest D_xxx table. The
Datamart daily cron job then skips the step to scan for unreferenced rows until there are at least 2
million new rows to be scanned.
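For example, if D_TRANSACTION is your largest table, a sketch for sizing this property (assuming the ITMUSER table schema shown in the log messages earlier in this topic) is:
su - db2apm
db2 connect to DATAMART
db2 "select count(*) from ITMUSER.D_TRANSACTION"
db2 connect reset; db2 terminate
If the count is, for example, 5000000 rows, set _SKIP_SMALL_TABLES=7000000, which is the count plus 2 million.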