IBM Support

Performing concurrent purge operations in Liberty Batch can result in deadlock

Troubleshooting


Problem

Running multiple purge operations concurrently in Liberty Batch can result in a database deadlock, which causes one or more of the purge operations to fail.

Symptom

Error Code: 60
Call: DELETE FROM JBATCH.STEPTHREADEXECUTION WHERE (STEPEXECID = ?)
bind => [1 parameter bound]
Query: DeleteObjectQuery(For TopLevelStepExecutionEntity: step Name = generate, step exec id = 852)
at org.eclipse.persistence.internal.jpa.EntityManagerSetupImpl$1.handleException(EntityManagerSetupImpl.java:747)
at org.eclipse.persistence.transaction.AbstractSynchronizationListener.handleException(AbstractSynchronizationListener.java:275)
at org.eclipse.persistence.transaction.AbstractSynchronizationListener.beforeCompletion(AbstractSynchronizationListener.java:170)
at org.eclipse.persistence.transaction.JTASynchronizationListener.beforeCompletion(JTASynchronizationListener.java:68)
at com.ibm.tx.jta.impl.RegisteredSyncs.coreDistributeBefore(RegisteredSyncs.java:291)
at com.ibm.tx.jta.impl.RegisteredSyncs.distributeBefore(RegisteredSyncs.java:192)
at com.ibm.tx.jta.impl.TransactionImpl.prePrepare(TransactionImpl.java:1657)
at com.ibm.tx.jta.impl.TransactionImpl.stage1CommitProcessing(TransactionImpl.java:1050)
at com.ibm.tx.jta.impl.TransactionImpl.processCommit(TransactionImpl.java:1025)
at com.ibm.tx.jta.impl.TransactionImpl.commit(TransactionImpl.java:966)
at com.ibm.tx.jta.impl.TranManagerImpl.commit(TranManagerImpl.java:237)
at com.ibm.tx.jta.impl.TranManagerSet.commit(TranManagerSet.java:191)
at com.ibm.ws.transaction.services.TransactionManagerService.commit(TransactionManagerService.java:297)
at com.ibm.jbatch.container.services.impl.JPAPersistenceManagerImpl$TranRequest.commitIfNewTranWasStarted(JPAPersistenceManagerImpl.java:1751)
at com.ibm.jbatch.container.services.impl.JPAPersistenceManagerImpl$TranRequest.runInNewOrExistingGlobalTran(JPAPersistenceManagerImpl.java:1714)
at com.ibm.jbatch.container.services.impl.JPAPersistenceManagerImpl.purgeJobInstanceAndRelatedData(JPAPersistenceManagerImpl.java:1606)
at com.ibm.jbatch.container.ws.impl.WSJobOperatorImpl.purgeJobInstance(WSJobOperatorImpl.java:240)
at com.ibm.ws.jbatch.rest.internal.resources.JobInstances.purgeJobInstance(JobInstances.java:761)
at com.ibm.ws.jbatch.rest.internal.resources.JobInstances.purgeJobInstances(JobInstances.java:715)
at com.ibm.ws.jbatch.rest.internal.resources.JobInstances$JobInstancesHandler_v2.delete(JobInstances.java:240)

Cause

When multiple threads delete rows from the database at the same time, they can acquire locks in an order that leads to a deadlock on the StepThreadExecution table.

Diagnosing The Problem

The problem can be identified when one or more concurrent purge operations fail and the server log contains the error and stack trace shown under Symptom. The identifying lines are the error code and the failing delete statement:
Error Code: 60
Call: DELETE FROM JBATCH.STEPTHREADEXECUTION WHERE (STEPEXECID = ?)
The stack trace that follows passes through JPAPersistenceManagerImpl.purgeJobInstanceAndRelatedData and WSJobOperatorImpl.purgeJobInstance. On an Oracle database, error code 60 corresponds to ORA-00060 (deadlock detected while waiting for resource).

Resolving The Problem

When a purge operation deletes a job instance, the delete triggers additional deletes on the tables that reference that job instance, such as the StepThreadExecution table. If the foreign key columns are not indexed, the database requests a lock on the entire table instead of only the affected rows, so concurrent purge operations can block one another.

This deadlock can be resolved by manually adding non-unique indexes on the foreign key columns of the JobExecution, StepThreadExecution, and StepThreadInstance tables. The following SQL example is for a database that uses "jbatch" as its schema name; substitute your own schema name if it differs.

CREATE INDEX FK_JOBINSTANCEID_idx ON jbatch.JobExecution (FK_JOBINSTANCEID);
CREATE INDEX FK_JOBEXECID_idx ON jbatch.StepThreadExecution (FK_JOBEXECID);
CREATE INDEX FK_TOPLVL_STEPEXECID_idx ON jbatch.StepThreadExecution (FK_TOPLVL_STEPEXECID);
CREATE INDEX FK_JOBINSTANCEID_Step_idx ON jbatch.StepThreadInstance (FK_JOBINSTANCEID);
CREATE INDEX FK_LATEST_STEPEXECID_idx ON jbatch.StepThreadInstance (FK_LATEST_STEPEXECID);
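
After the indexes are created, it can be useful to confirm that they exist on the expected columns. The following query is a minimal sketch that assumes an Oracle database, where the ALL_IND_COLUMNS catalog view is available and unquoted identifiers such as "jbatch" are stored in uppercase. Neither assumption comes from this document; other databases expose index metadata through different catalog views.

```sql
-- Assumption: Oracle database; the "jbatch" schema is stored as JBATCH.
-- Lists every index defined on the foreign key columns named above.
SELECT table_name, index_name, column_name
FROM all_ind_columns
WHERE table_owner = 'JBATCH'
  AND table_name IN ('JOBEXECUTION', 'STEPTHREADEXECUTION', 'STEPTHREADINSTANCE')
  AND column_name IN ('FK_JOBINSTANCEID', 'FK_JOBEXECID',
                      'FK_TOPLVL_STEPEXECID', 'FK_LATEST_STEPEXECID')
ORDER BY table_name, index_name, column_position;
```

If the result does not list an index for one of the table and column pairs in the CREATE INDEX statements, that foreign key is still unindexed and the corresponding statement should be run again.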

Product: WebSphere Application Server
Component: Batch applications
Edition: Liberty
Version: 8.5.5.7
Platform: AIX, HP-UX, IBM i, Linux, iOS, Solaris, Windows, z/OS, Inspur K-UX
Business Unit: Cloud & Data Platform
Line of Business: Automation

Document Information

Modified date:
15 June 2018

UID

swg21971002