APAR status
Closed as program error.
Error description
During an event storm, tens of thousands of events were deleted at the same time, which caused the object server to lock up for several hours. The aggregate object server log file contained tens of thousands of "full table scan" messages for the table alerts.itm_cleared_event_cache. Triggers that update this table are shipped by the Situation Update Forwarder (SUF) in the itm_event_cache.sql file.

Log Files:
In the aggregate object server log files, the following message appeared tens of thousands of times:
2022-06-12T07:33:54: Debug: D-STO-105-029: Using full table scan on table alerts.itm_cleared_event_cache
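To check whether an object server log shows the same symptom, the full table scan messages for the cache table can be counted with grep. The following is a minimal sketch: the function name count_full_scans and the LOG_FILE variable are hypothetical, and the message text is copied from the log excerpt in this description.

```shell
# Count "full table scan" debug messages for one table in a log file.
# $1: path to the object server log file
# $2: fully qualified table name, e.g. alerts.itm_cleared_event_cache
count_full_scans() {
  # -F matches the text literally; -c prints the number of matching lines
  grep -c -F "Using full table scan on table $2" "$1"
}

# Example call (LOG_FILE is an assumed variable holding the log path):
# count_full_scans "$LOG_FILE" alerts.itm_cleared_event_cache
```

A count in the tens of thousands for alerts.itm_cleared_event_cache would be consistent with the behavior described in this APAR.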
Local fix
Clear out the table alerts.itm_cleared_event_cache
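One way to clear the table is to send a delete statement through nco_sql, the same command this APAR uses to load itm_event_cache.sql. The following is a sketch only, assuming standard ObjectServer SQL delete syntax and the "go" terminator that nco_sql expects. The function name clear_itm_cache and the NCO_SQL override variable are hypothetical, and the user, password, and server arguments are placeholders.

```shell
# Delete all rows from the SUF cleared-event cache table.
# $1: ObjectServer user, $2: password, $3: ObjectServer name
# NCO_SQL (hypothetical override) defaults to $OMNIHOME/bin/nco_sql.
clear_itm_cache() {
  "${NCO_SQL:-$OMNIHOME/bin/nco_sql}" -user "$1" -password "$2" -server "$3" <<'EOF'
delete from alerts.itm_cleared_event_cache;
go
EOF
}

# Example call with placeholder values:
# clear_itm_cache username password server_name
```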
Problem summary
The OMNIbus aggregate object server locked up due to excessive table scans after a large number of events were deleted. During an event storm in OMNIbus, tens of thousands of ITM events were deleted at the same time, which caused the object server to lock up for several hours. The aggregate object server log file contained tens of thousands of "full table scan" messages for the table alerts.itm_cleared_event_cache. Triggers that update this table are shipped by the Situation Update Forwarder (SUF) in the itm_event_cache.sql file. See the Install Actions section in the Problem conclusion for the additional installation steps required.
Problem conclusion
The problem was recreated by deleting a large number of events at one time (e.g. 30,000), which causes the events to be written to the table alerts.itm_cleared_event_cache. While the events are in the cache table (up to 2 hours), THRUNODE_CHANGED events arrive indicating that the remote monitoring server (RTEMS) the agent is connected to has changed. Each of these events triggers an inefficient update of the records in the cache table to set ITMThruNode. The problem was in the trigger itm_cleared_event_restore_update, which is shipped in the itm_event_cache.sql file. It issued a "for each" loop that was not needed, resulting in a table scan on each iteration.

Install Actions:
---------------
The file updated for this fix is itm_event_cache.sql. The Situation Update Forwarder is installed using ESynch3000xxx.bin or upgraded using ESUpgrade30xxx.bin (where xxx is the operating system). Once installed or upgraded, the updated itm_event_cache.sql file is placed in the Situation Update Forwarder omnibus directory. From there, the file should be copied and loaded into the Object Server database. See "Updating the OMNIbus database schema on single-tier or aggregation tier ObjectServers" (https://www.ibm.com/support/knowledgecenter/en/SSTFXA_6.3.0.2/com.ibm.itm.doc_6.3fp2/install/config_omni2_dbschema.htm) for more details.

Below is an example of loading the file using the nco_sql command:

    $OMNIHOME/bin/nco_sql -user username -password password -server server_name < path_to_file/itm_event_cache.sql

The fix for this APAR is contained in the following maintenance packages:
| service pack | 6.3.0.7-TIV-ITM-SP0014
Temporary fix
Clear the alerts.itm_cleared_event_cache table.
Comments
APAR Information
APAR number
IJ42304
Reported component name
TEMS
Reported component ID
5724C04MS
Reported release
630
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2022-09-06
Closed date
2023-04-17
Last modified date
2023-04-17
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
TEMS
Fixed component ID
5724C04MS
Applicable component levels
Business Unit: Software (BU029)
Product: IBM Tivoli Monitoring V6 (SSZ8F3)
Platform: Platform Independent (PF025)
Version: 630
Document Information
Modified date:
18 April 2023