IBM Support

IJ42304: OMNIbus aggregate object server locked up due to excessive table scans after large number of events deleted.

APAR status

  • Closed as program error.

Error description

  • During an event storm, tens of thousands of events were deleted
    at the same time, which resulted in the object server locking
    up for several hours.
    
    The aggregate object server log file contained tens of
    thousands of "full table scan" messages for the table
    alerts.itm_cleared_event_cache. Triggers that update this table
    are shipped by the Situation Update Forwarder (SUF) in the
    itm_event_cache.sql file.
    
    Log files:
    The following message appeared tens of thousands of times in
    the aggregate object server log files:
    
    2022-06-12T07:33:54: Debug: D-STO-105-029: Using full table scan on table alerts.itm_cleared_event_cache
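
To check whether a log shows the same symptom, the messages can be counted per table. The following is a minimal sketch, not part of the APAR: the helper name count_full_scans is hypothetical, the log file path is site-specific, and it assumes each message is written on a single line in the actual log file, in the form quoted above.

```shell
# Hypothetical helper: count D-STO-105-029 "full table scan" messages per table.
# $1 = path to an ObjectServer log file (location varies by installation).
count_full_scans() {
    awk '/D-STO-105-029/ && match($0, /on table [^ ]+/) {
             # Skip the 9 characters of "on table " to keep only the table name.
             t = substr($0, RSTART + 9, RLENGTH - 9)
             n[t]++
         }
         END { for (t in n) print n[t], t }' "$1" | sort -rn
}
```

Each output line is a count followed by a table name, with the most frequently scanned table first. A count in the tens of thousands for alerts.itm_cleared_event_cache would be consistent with the behavior described in this APAR.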
    

Local fix

  • Clear out the table alerts.itm_cleared_event_cache
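
One possible way to clear the table is to pass a delete statement to nco_sql. This is a sketch under assumptions, not a procedure given in the APAR: the helper name clear_cache_sql is hypothetical, and the user name, password, and server name are the same placeholders used in the nco_sql example in the Problem conclusion. The statement removes every row from the cache table, so review it before running it against a production ObjectServer.

```shell
# Hypothetical helper: print the ObjectServer SQL that empties the cache table.
# "go" tells nco_sql to execute the preceding statement.
clear_cache_sql() {
    printf '%s\n' 'delete from alerts.itm_cleared_event_cache;' 'go'
}

# Example use (placeholder credentials and server name):
# clear_cache_sql | "$OMNIHOME/bin/nco_sql" -user username -password password -server server_name
```

Keeping the SQL in a function lets the statement be inspected on its own before it is piped to the ObjectServer.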
    

Problem summary

  • OMNIbus aggregate object server locked up due to excessive table
    scans after large number of events deleted.
    
    During an event storm in OMNIbus, tens of thousands of ITM
    events were deleted at the same time, which resulted in the
    object server locking up for several hours.
    
    The aggregate object server log file contained tens of
    thousands of "full table scan" messages for the table
    alerts.itm_cleared_event_cache. Triggers that update this table
    are shipped by the Situation Update Forwarder (SUF) in the
    itm_event_cache.sql file.
    
    See Install Actions section in the Problem Conclusion for
    additional installation steps required.
    

Problem conclusion

  • The problem was recreated by deleting a large number of events
    at one time (e.g. 30,000), which results in the events being
    written to the table alerts.itm_cleared_event_cache. While the
    events are in the cache table (up to 2 hours), THRUNODE_CHANGED
    events arrive, indicating that the remote monitoring server
    (RTEMS) the agent is connected to has changed. The result is
    an inefficient update of the cached records to set the new
    ITMThruNode.
    
    The problem was in the trigger itm_cleared_event_restore_update,
    which is shipped in the itm_event_cache.sql file. It used a
    "for each" loop where none was needed, resulting in a full table
    scan on each iteration of the loop.
    
    Install Actions:
    ---------------
    The file updated for this fix is itm_event_cache.sql. The
    Situation Update Forwarder is installed using
    ESynch3000xxx.bin or upgraded using ESUpgrade30xxx.bin (where
    xxx is the operating system). Once installed or upgraded, the
    updated itm_event_cache.sql file is placed in the
    Situation Update Forwarder omnibus directory. From there, the
    file should be copied and loaded into the Object Server
    database. For more details, see "Updating the OMNIbus database
    schema on single-tier or aggregation tier ObjectServers":
    https://www.ibm.com/support/knowledgecenter/en/SSTFXA_6.3.0.2/com.ibm.itm.doc_6.3fp2/install/config_omni2_dbschema.htm
    
    Below is an example of loading the file using the nco_sql
    command:
        $OMNIHOME/bin/nco_sql -user username -password password \
            -server server_name < path_to_file/itm_event_cache.sql
    
    
    The fix for this APAR is contained in the following maintenance
    packages:
    
       Service pack: 6.3.0.7-TIV-ITM-SP0014
    

Temporary fix

  • Clear the alerts.itm_cleared_event_cache table.
    

Comments

APAR Information

  • APAR number

    IJ42304

  • Reported component name

    TEMS

  • Reported component ID

    5724C04MS

  • Reported release

    630

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2022-09-06

  • Closed date

    2023-04-17

  • Last modified date

    2023-04-17

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    TEMS

  • Fixed component ID

    5724C04MS

Applicable component levels

  • Business unit

    Software (BU029)

  • Product

    IBM Tivoli Monitoring V6 (SSZ8F3)

  • Platform

    Platform Independent (PF025)

  • Version

    630

Document Information

Modified date:
18 April 2023