IBM Support

Why does the select query on YFS_OBJECT_LOCK from getJobs of an agent get fired without the NOWAIT clause at times?

Troubleshooting


Problem

As part of getJobs, which an agent fires to find the jobs eligible for processing, a SELECT ... FOR UPDATE query is fired on the YFS_OBJECT_LOCK table.
The resulting row-level lock ensures that no other thread runs getJobs for the same agent at the same time. The query fired is:
SELECT /*YANTRA*/ YFS_OBJECT_LOCK.* FROM YFS_OBJECT_LOCK YFS_OBJECT_LOCK WHERE ( ( YFS_OBJECT_LOCK.LOCKED_OBJECT_TYPE = :1 ) AND ( YFS_OBJECT_LOCK.LOCKED_OBJECT = :2 ) AND ( YFS_OBJECT_LOCK.LOCKED_PARAMETERS = :3 ) ) FOR UPDATE
In some cases, the query fired is:
SELECT /*YANTRA*/ YFS_OBJECT_LOCK.* FROM YFS_OBJECT_LOCK YFS_OBJECT_LOCK WHERE ( ( YFS_OBJECT_LOCK.LOCKED_OBJECT_TYPE = :1 ) AND ( YFS_OBJECT_LOCK.LOCKED_OBJECT = :2 ) AND ( YFS_OBJECT_LOCK.LOCKED_PARAMETERS = :3 ) ) FOR UPDATE NOWAIT
Why is the NOWAIT clause present in some cases and absent in others?

Symptom

It is suspected that FOR UPDATE queries fired without the NOWAIT clause may lead to performance issues, because the thread must wait for the row lock to be released.
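The difference between the two clauses can be illustrated with an in-process lock. This is only an analogy, not the database or the product code: `threading.Lock` stands in for the row lock on one YFS_OBJECT_LOCK row, and the names `row_lock`, `lock_nowait`, `lock_wait`, and `demo` are hypothetical.

```python
import threading
import time

row_lock = threading.Lock()  # stands in for the row lock on one YFS_OBJECT_LOCK row


def lock_nowait(lock):
    """Analogue of FOR UPDATE NOWAIT: return at once, True only if the lock was free."""
    return lock.acquire(blocking=False)


def lock_wait(lock):
    """Analogue of plain FOR UPDATE: block until the lock is free; return seconds waited."""
    start = time.monotonic()
    lock.acquire()
    return time.monotonic() - start


def demo():
    row_lock.acquire()                    # another thread is "inside getJobs"
    got_it = lock_nowait(row_lock)        # fails immediately because the lock is held
    releaser = threading.Timer(0.2, row_lock.release)
    releaser.start()                      # the holder releases the lock 0.2 s later
    waited = lock_wait(row_lock)          # blocks until that release happens
    releaser.join()
    row_lock.release()
    return got_it, waited
```

In this sketch the NOWAIT-style call gives up immediately, while the waiting call pauses the caller until the holder releases the lock, which is the wait the symptom refers to.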

Cause

getJobs in an agent works as follows:

•The agent is triggered (manually through triggeragent.sh or automatically)

•The agent trigger posts a getJobs (trigger) message to the JMS queue

•The getJobs() method reads this message

•Within the getJobs() method, the agent tries to acquire a lock on the YFS_OBJECT_LOCK table for the agent Criteria ID

•If the lock is not available, the getJobs() method exits and does nothing

•If the lock is available, the getJobs() method fetches the records that need to be processed

•These records are posted as execute messages to the JMS queue

•After the execute messages, one getJobs message is also posted with the last record key

•The execute method picks up the execute messages one by one and processes them

•After all the execute messages are consumed, only the getJobs message is left in the queue

•The getJobs() method picks up the getJobs message left in the queue
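The message flow in the steps above can be sketched as a single-threaded walk-through. This is a simplified model under stated assumptions, not the product implementation: a `deque` stands in for the JMS queue, lock acquisition is left out because only one thread runs, and `run_agent`, `pending_jobs`, and `buffer_size` are hypothetical names.

```python
from collections import deque


def run_agent(pending_jobs, buffer_size):
    """Walk through the getJobs/execute message cycle on one thread.

    pending_jobs: ordered list of unique job keys waiting to be fetched.
    Returns (job keys in the order they were processed, number of getJobs calls).
    """
    queue = deque([("getJobs", None)])    # trigger message posted by the agent trigger
    processed = []
    get_jobs_calls = 0

    while queue:
        kind, payload = queue.popleft()
        if kind == "execute":
            processed.append(payload)     # execute handles one job at a time
            continue

        # kind == "getJobs": payload is the last record key of the previous batch.
        # (Acquiring the YFS_OBJECT_LOCK row lock is omitted in this sketch.)
        get_jobs_calls += 1
        start = 0 if payload is None else pending_jobs.index(payload) + 1
        batch = pending_jobs[start:start + buffer_size]
        for job in batch:
            queue.append(("execute", job))        # records posted as execute messages
        if batch:
            queue.append(("getJobs", batch[-1]))  # one getJobs message with the last record key

    return processed, get_jobs_calls
```

With 250 pending jobs and a buffer size of 100, this model makes four getJobs calls: three that fetch 100, 100, and 50 records, and a final one that fetches nothing and ends the cycle.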

When a getJobs returns 0 messages, the next getJobs fires the query with NOWAIT, because the agent was idle and needs the row lock on the YFS_OBJECT_LOCK table only to gather a fresh set of records for processing. As long as getJobs retrieves a set of messages, the next FOR UPDATE query is fired without NOWAIT, so that the agent does not go idle.
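The rule in the paragraph above can be written as a small decision function. This is a sketch of the described behavior, not the product code; `lock_clause` and `previous_fetch_count` are hypothetical names, and `None` is used here to represent an agent that has just been started or triggered.

```python
def lock_clause(previous_fetch_count):
    """Return the locking clause that the next getJobs query would use.

    previous_fetch_count: number of records the previous getJobs returned,
    or None when there was no previous getJobs (agent just started or triggered).
    """
    if previous_fetch_count is None or previous_fetch_count == 0:
        # The agent was idle: do not make threads queue up behind the lock.
        return "FOR UPDATE NOWAIT"
    # Work is still flowing: wait for the lock so the next batch is fetched at once.
    return "FOR UPDATE"
```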

Diagnosing The Problem

Consider multiple instances of an agent being run, say two instances with 5 threads each.

(Threads 1-5 in instance 1 and threads 6-10 in instance 2.)

Let's say that the buffer size is 100 and there are 600 jobs to be obtained.

When the agents start, both instances fire getJobs with NOWAIT. One of the threads gets the lock, and the rest of the threads process the messages that it posts. It is imperative that getJobs is fired with NOWAIT here, so that only one thread is engaged in getJobs while the rest of the threads process messages. In the first run, 100 messages are obtained.

 

When one of the threads reaches the last record key, it fires getJobs without NOWAIT.

Because the query waits for the lock instead of failing, getJobs runs as soon as the lock is obtained.

Let's say thread 1 processes the last record key and fires the object lock query without NOWAIT. It gets the lock and immediately starts posting messages.

The rest of the threads process these messages. If the processing rate is fast, thread 2 may encounter the next last record key when thread 1 has almost completed its getJobs but has not yet released the lock. Thread 2 has no other work to do, so it waits for the lock and commences getJobs as soon as it gets it. If the query were fired with NOWAIT, thread 2 would not get the lock, and the agent would have to wait until the next getJobs to obtain messages that are already eligible for processing. This continues until getJobs returns 0 jobs. It is necessary to ensure that the agent does not go idle while it has messages to be processed.

After this, the agent goes idle. The next getJobs commences when the trigger interval is reached or when the agent is auto-triggered. Here the query on the YFS_OBJECT_LOCK table is fired with NOWAIT, so that the other threads do not wait for the lock and can start consuming the messages.
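The scenario above (600 jobs, buffer size 100) can be traced round by round. This is a simplified model that ignores thread timing and assumes no new jobs arrive between triggers; `trace_triggers` and its parameters are hypothetical names, and "WAIT" denotes FOR UPDATE without NOWAIT.

```python
def trace_triggers(total_jobs, buffer_size, triggers=2):
    """List (lock mode, records fetched) for every getJobs across several triggers."""
    remaining = total_jobs
    trace = []
    for _ in range(triggers):
        previous = None                   # the agent was idle before this trigger
        while True:
            # NOWAIT only when there is no previous non-empty fetch in this cycle.
            mode = "NOWAIT" if not previous else "WAIT"
            fetched = min(buffer_size, remaining)
            remaining -= fetched
            trace.append((mode, fetched))
            if fetched == 0:
                break                     # nothing left: the agent goes idle
            previous = fetched
    return trace


if __name__ == "__main__":
    for mode, fetched in trace_triggers(600, 100):
        print(mode, fetched)
```

Under these assumptions the first trigger produces one NOWAIT query that fetches 100 records, five waiting queries that fetch 100 each, and a seventh waiting query that fetches nothing. The second trigger starts again with NOWAIT.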

 

Resolving The Problem

The application is working as designed:
1. The first getJobs after an agent is started or triggered fires the query with NOWAIT to get the lock on YFS_OBJECT_LOCK:
SELECT /*YANTRA*/ YFS_OBJECT_LOCK.* FROM YFS_OBJECT_LOCK YFS_OBJECT_LOCK WHERE ( ( YFS_OBJECT_LOCK.LOCKED_OBJECT_TYPE = :1 ) AND ( YFS_OBJECT_LOCK.LOCKED_OBJECT = :2 ) AND ( YFS_OBJECT_LOCK.LOCKED_PARAMETERS = :3 ) ) FOR UPDATE NOWAIT
2. A getJobs that is fired after the previous getJobs retrieved a set of messages does not have the NOWAIT clause:
SELECT /*YANTRA*/ YFS_OBJECT_LOCK.* FROM YFS_OBJECT_LOCK YFS_OBJECT_LOCK WHERE ( ( YFS_OBJECT_LOCK.LOCKED_OBJECT_TYPE = :1 ) AND ( YFS_OBJECT_LOCK.LOCKED_OBJECT = :2 ) AND ( YFS_OBJECT_LOCK.LOCKED_PARAMETERS = :3 ) ) FOR UPDATE
3. A getJobs that is fired after the previous getJobs returned 0 messages for processing fires the query with NOWAIT:
SELECT /*YANTRA*/ YFS_OBJECT_LOCK.* FROM YFS_OBJECT_LOCK YFS_OBJECT_LOCK WHERE ( ( YFS_OBJECT_LOCK.LOCKED_OBJECT_TYPE = :1 ) AND ( YFS_OBJECT_LOCK.LOCKED_OBJECT = :2 ) AND ( YFS_OBJECT_LOCK.LOCKED_PARAMETERS = :3 ) ) FOR UPDATE NOWAIT
If a specific agent shows a high number of waits on this table, it may indicate that too many threads are being run, so that the processing rate is faster than the rate at which the jobs are obtained.
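As a rough way to reason about this, the two rates can be compared with back-of-the-envelope arithmetic. The model below is an assumption made for illustration, not a product formula: it treats every job as taking the same time and every getJobs as fetching a full buffer, and `compare_rates` and its parameter names are hypothetical.

```python
def compare_rates(threads, seconds_per_job, buffer_size, seconds_per_get_jobs):
    """Compare how fast jobs are consumed with how fast getJobs can supply them.

    Returns (processing rate, fetch rate, True if processing outruns fetching),
    with both rates expressed in jobs per second.
    """
    processing_rate = threads / seconds_per_job      # jobs consumed per second
    fetch_rate = buffer_size / seconds_per_get_jobs  # jobs supplied per second
    return processing_rate, fetch_rate, processing_rate > fetch_rate
```

For example, with illustrative values of 10 threads, 0.5 seconds per job, a buffer size of 100, and 2 seconds per getJobs, the threads consume 20 jobs per second against a supply of 50, so they should rarely wait. Raising the thread count to 40 at 0.25 seconds per job gives 160 jobs per second, which outruns the same supply and would leave threads waiting on the lock.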

Document Location

Worldwide

Product: Sterling Order Management
Platform: AIX, Linux, Red Hat
Version: 9.4, 10.0, OMoC
Business Unit: IBM Software w/o TPS
Line of Business: Sustainability Software

Document Information

Modified date:
01 November 2021

UID

ibm11086789