IBM Support

WebSphere Application Server StaleConnectionExceptions

Technical Blog Post


Abstract

WebSphere Application Server StaleConnectionExceptions

Body

A StaleConnectionException is an exception that is generated by the WebSphere Application Server database connection code when a JDBC driver returns a fatal error from a connection request or operation.

At same time, you may find message like below:

J2CA0206W: A connection error occurred. To help determine the problem, enable the Diagnose Connection Usage option on the Connection Factory or Data Source. This is the multithreaded access detection option. Alternatively check that the Database or MessageProvider is available.

J2CA0056I: The Connection Manager received a fatal connection error from the Resource Adapter for resource {1}. The exception is: {0}

If you find J2CA0206W and J2CA0056I, it is best to enable multithreaded access detection, save, synchronize nodes of the data source scope if it is a network deployment environment. Then restart all servers of the JDBC provider scope:

CELL scope: dmgr, all nodeagents, all application servers,

Node scope: the nodeagent, all application servers,

Cluster scope: all cluster members, all nodeagents of them, dmgr,

Server scope: the application server.

When the issue occurs again, please search J2CA0167W or DSRA8720W in SystemOut.log of the server who reports the issue, if you find it, then engage your development team to not share to use same connection with multiple threads which is not supported.

If there is no J2CA0167W or DSRA8720W find in SystemOut.log when the issue reports, configuring a WebSphere Application Server data source includes specifying a data source helper class, which is selected depending on your JDBC driver type, for example, DB2UniversalDataStoreHelper for DB2, Oracle11gDataStoreHelper for Oracle 11g, etc. Each helper class contains a list of driver specific error codes associated with fatal exceptions. When a JDBC driver returns one of these codes, the WebSphere Application server will map the return code to a StaleConnectionException, the intent being to provide one generic error that will allow applications to recover database connections rather than have to code application recovery for each applicable database product or individual JDBC return code.

If the StaleConnectionException is triggered by external system and your application wants to finish the important current transaction, then your application should specifically catching StaleConnectionException and redo the transaction. Else, because applications are already required to catch java.sql.SQLException and StaleConnectionException extends SQLException, StaleConnectionException is automatically caught in the general catch-block with SQLException. However, specifically catching StaleConnectionException makes it possible for an application to recover from bad connections. The most common time for StaleConnectionException to be thrown is the time that a connection is used. Because connections are pooled, a database failure is not detected until the connection is used. And it is only when a failure is detected that the connection is marked stale. When a StaleConnectionException is caught from a connection in auto-commit mode, recovery is a simple matter of closing all of the associated JDBC resources and retrying the operation with a new connection by the application code.

In most cases StaleConnectionExceptions can be avoided by a few configuration changes. By far, the most common cause of StaleConnectionExceptions is due to retrieving connections from the free pool (connection reuse) and finding out that the connection has been timed out or dropped by a database server or firewall.

The configuration options below help minimize the chances of encountering StaleConnectionExceptions by helping to ensure that your WebSphere Application Server maintains control of database connections, specifically, the only one closing them, rather than other components in your network:

  1. The data source connection pool's minimum (MIN) connections should be set to 0. By setting a MIN > 0, you are telling the application server that this number of connections will be valid for the life of the application server JVM. Unless you can guarantee the MIN number of connections will never be timed out or dropped, setting a MIN will cause a StaleConnectionException when the connection becomes invalid.
     
  2. If a firewall exist between your WebSphere Application Server and database server, set the data source connection pool's Unused Timeout and Age Timeout to no greater than 1/2 the value configured for the firewall relative timeout. This is how long you will allow an unused connection to sit in the free pool waiting to be reused and how long at most it can stay in the pool. When a connection exists in pool longer than Age Timeout, if it is in free pool, it will be destroyed, if it is inuse by any thread or transaction, then only after the thread/transaction return it to free pool, it can be destroyed for Age Timeout. 
     
  3. Set the data source connection pool's Reap Timer to a value less than the Unused Timeout. The reap time is how often, in seconds, a pool maintenance thread is run to check if an Unused Timeout has occurred. The smaller the value, the more often the check occurs and the more accurate the Unused Timer becomes. However, the Reap Time interval also affects performance. Smaller intervals mean that the pool maintenance thread runs more often and can degrade performance.
     
  4. Set the data source connection pool's Purge Policy to entire pool. This option determines if you will mark only the offending connection as stale, should a fatal error occur (FailingConnectionOnly), or if all other connections waiting in the free pool should be flushed as a result as well (EntirePool). The best assumption is usually that if one is bad, they all are, and flush the free pool. This way, subsequent connection request will always get a new connection. Note that configuring the purge policy for FailingConnectionOnly will cause the WebSphere connection manager code to go into connection pre-test mode should a fatal error be encountered on a free pool connection.
  5. If you know your network or database system is not stable and any connection can be fatal connection easily at any time, you can also enable validate new and existing connection, then before allocate the connection get from pool to a thread or transaction, WAS will first validate the connection, if it is a valid connection, allocate to the thread or transaction, else, will destroy it and create a new connection to allocate to the thread or transaction.

For more information about StaleConnectionExceptions, including an example of how recovery can be coded into your application, see the IBM Technote Demystifying the WebSphere StaleConnectionException

Also see this related blog entry: WebSphere Application Server - Avoiding Stale Connections

title image (modified) credit: (cc) Some rights reserved by geralt

[{"Type":"MASTER","Line of Business":{"code":"LOB67","label":"IT Automation \u0026 App Modernization"},"Business Unit":{"code":"BU048","label":"IBM Software"},"Product":{"code":"SSEQTP","label":"WebSphere Application Server"},"ARM Category":[{"code":"a8m50000000CdUIAA0","label":"WebSphere Application Server traditional-All Platforms-\u003EJ2C-ConnectionPooling-JDBCDrivers-\u003EConnection Pooling-J2C-DB Connections-\u003EStale Connection problem"}],"Platform":[{"code":"PF025","label":"Platform Independent"}],"Version":"All Versions"}]

UID

ibm11081179