Hung threads in Java Platform, Enterprise Edition applications
WebSphere® Application Server monitors thread activity and performs the following diagnostic actions when a thread appears to be hung:
- Logs a warning in the WebSphere Application Server
log that indicates the name of the thread that is hung and how long
it has already been active. The following message is written to the
log:
WSVR0605W: Thread threadname has been active for hangtime and may be hung. There are totalthreads threads in total in the server that may be hung.
where threadname is the name that appears in a JVM thread dump, hangtime gives an approximation of how long the thread has been active, and totalthreads gives an overall assessment of the system threads.
- Issues a Java™ Management Extensions (JMX)
notification. This notification enables third-party tools to catch
the event and take appropriate action, such as triggering a JVM thread
dump of the server, or issuing an electronic page or email. The following
JMX notification events are defined in the com.ibm.websphere.management.NotificationConstants
class:
- TYPE_THREAD_MONITOR_THREAD_HUNG This event is triggered by the detection of a (potentially) hung thread.
- TYPE_THREAD_MONITOR_THREAD_CLEAR This event is triggered if a thread that was previously reported as hung completes its work. Consult the section on false alarms for more information.
- Triggers changes in the performance monitoring infrastructure (PMI) data counters. These PMI data counters are used by various tools, such as the Tivoli® Performance Viewer, to provide a performance analysis.
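As an illustration of how a third-party tool might react to the JMX notification events described above, the following minimal sketch implements a standard javax.management.NotificationListener that counts hung and clear events. The class name HungThreadListener and the two type strings are placeholders chosen for this example. A real client would use the constants from com.ibm.websphere.management.NotificationConstants and register the listener with the server through an administrative client connection, which is not shown here.

```java
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.Notification;
import javax.management.NotificationFilterSupport;
import javax.management.NotificationListener;

/** Counts hung-thread and clear events that arrive as JMX notifications. */
class HungThreadListener implements NotificationListener {
    // ASSUMPTION: placeholder type strings for this sketch only. A real client
    // would use NotificationConstants.TYPE_THREAD_MONITOR_THREAD_HUNG and
    // NotificationConstants.TYPE_THREAD_MONITOR_THREAD_CLEAR instead.
    static final String TYPE_HUNG = "example.thread.hung";
    static final String TYPE_CLEAR = "example.thread.clear";

    private final AtomicInteger hung = new AtomicInteger();
    private final AtomicInteger cleared = new AtomicInteger();

    /** Builds a filter that passes only the two thread monitor event types. */
    static NotificationFilterSupport filter() {
        NotificationFilterSupport f = new NotificationFilterSupport();
        f.enableType(TYPE_HUNG);
        f.enableType(TYPE_CLEAR);
        return f;
    }

    @Override
    public void handleNotification(Notification n, Object handback) {
        if (TYPE_HUNG.equals(n.getType())) {
            hung.incrementAndGet();
            // A real tool might trigger a JVM thread dump or send a page here.
        } else if (TYPE_CLEAR.equals(n.getType())) {
            cleared.incrementAndGet();
        }
    }

    int hungCount() { return hung.get(); }

    int clearedCount() { return cleared.get(); }

    /** Threads reported as hung that have not yet been reported as clear. */
    int outstanding() { return hung.get() - cleared.get(); }

    public static void main(String[] args) {
        HungThreadListener listener = new HungThreadListener();
        // Simulated events; "server1" is a hypothetical notification source.
        listener.handleNotification(new Notification(TYPE_HUNG, "server1", 1L), null);
        listener.handleNotification(new Notification(TYPE_CLEAR, "server1", 2L), null);
        System.out.println("outstanding = " + listener.outstanding());
    }
}
```

Tracking the difference between hung and clear events gives the tool a rough count of threads that might still be hung, which is one possible basis for deciding when to escalate.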
False Alarms
If a thread that was previously reported as hung completes its work, the earlier report was a false alarm. In that case, the following message is written to the log:
WSVR0606W: Thread threadname was previously reported to be hung but has completed. It was active for approximately hangtime. There are totalthreads threads in total in the server that still may be hung.
where threadname is the name that appears in a JVM thread dump, hangtime gives an approximation of how long the thread has been active, and totalthreads gives an overall assessment of the system threads.
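A log-scanning tool could pair WSVR0605W and WSVR0606W messages to separate false alarms from threads that are still reported as hung. The following sketch assumes that log lines follow the message templates exactly as they are shown in this topic. Actual log output might contain additional fields, so the regular expressions would need to be adapted. The class name FalseAlarmTracker and the sample thread name are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Pairs hung-thread warnings with their completion messages. */
class FalseAlarmTracker {
    // ASSUMPTION: both patterns mirror the message templates in this topic.
    private static final Pattern HUNG = Pattern.compile(
        "WSVR0605W: Thread (.+?) has been active for (.+?) and may be hung\\.");
    private static final Pattern CLEARED = Pattern.compile(
        "WSVR0606W: Thread (.+?) was previously reported to be hung but has completed\\.");

    private final Set<String> stillHung = new LinkedHashSet<>();
    private int falseAlarms;

    /** Processes one log line; lines that match neither pattern are ignored. */
    void accept(String line) {
        Matcher m = HUNG.matcher(line);
        if (m.find()) {
            stillHung.add(m.group(1));
            return;
        }
        m = CLEARED.matcher(line);
        // Count a false alarm only if the thread was reported as hung earlier.
        if (m.find() && stillHung.remove(m.group(1))) {
            falseAlarms++;
        }
    }

    /** Names of threads reported as hung and not yet reported as complete. */
    Set<String> stillHung() { return Collections.unmodifiableSet(stillHung); }

    int falseAlarms() { return falseAlarms; }

    public static void main(String[] args) {
        FalseAlarmTracker tracker = new FalseAlarmTracker();
        // Hypothetical sample lines built from the templates in this topic.
        tracker.accept("WSVR0605W: Thread WebContainer : 0 has been active for "
            + "600000 milliseconds and may be hung. There are 1 threads in total "
            + "in the server that may be hung.");
        tracker.accept("WSVR0606W: Thread WebContainer : 0 was previously reported "
            + "to be hung but has completed. It was active for approximately "
            + "700000 milliseconds. There are 0 threads in total in the server "
            + "that still may be hung.");
        System.out.println("false alarms: " + tracker.falseAlarms());
        System.out.println("still hung: " + tracker.stillHung());
    }
}
```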
Automatic adjustment of the hang time threshold
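If too many thread hangs are falsely reported, WebSphere Application Server can automatically adjust the hang time threshold. When this adjustment occurs, the following message is written to the log:
WSVR0607W: Too many thread hangs have been falsely reported. The hang threshold is now being set to thresholdtime.
where thresholdtime is the time (in seconds) in which a thread can be active before it is considered hung.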
You can prevent WebSphere Application Server from automatically adjusting the hang time threshold. See Configuring the hang detection policy.
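The general idea of raising a threshold after repeated false alarms can be sketched as follows. This example is not the product's implementation. The class name AdaptiveHangThreshold, the false alarm limit, the growth factor, and the decision to reset the counter after each adjustment are all assumptions made for illustration. The values that the server actually uses are governed by the hang detection policy.

```java
/** Raises a hang threshold after a configurable number of false alarms. */
class AdaptiveHangThreshold {
    private long thresholdSeconds;
    private final int falseAlarmLimit;   // ASSUMPTION: illustrative parameter
    private final double growthFactor;   // ASSUMPTION: illustrative parameter
    private int falseAlarms;

    AdaptiveHangThreshold(long initialThresholdSeconds, int falseAlarmLimit,
                          double growthFactor) {
        this.thresholdSeconds = initialThresholdSeconds;
        this.falseAlarmLimit = falseAlarmLimit;
        this.growthFactor = growthFactor;
    }

    /** A thread is considered hung when it is active longer than the threshold. */
    boolean isHung(long activeSeconds) {
        return activeSeconds > thresholdSeconds;
    }

    /**
     * Records one false alarm, that is, a hung report followed by completion.
     * Returns true if this false alarm caused the threshold to be raised.
     */
    boolean recordFalseAlarm() {
        falseAlarms++;
        if (falseAlarms < falseAlarmLimit) {
            return false;
        }
        falseAlarms = 0; // ASSUMPTION: start counting again after adjusting
        thresholdSeconds = Math.round(thresholdSeconds * growthFactor);
        return true;
    }

    long thresholdSeconds() { return thresholdSeconds; }

    public static void main(String[] args) {
        AdaptiveHangThreshold policy = new AdaptiveHangThreshold(600, 3, 1.5);
        for (int i = 0; i < 3; i++) {
            if (policy.recordFalseAlarm()) {
                System.out.println("Threshold raised to "
                    + policy.thresholdSeconds() + " seconds");
            }
        }
    }
}
```

A larger threshold means that long-running but healthy work is less likely to be reported as hung, at the cost of slower detection of threads that really are hung.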
System Alarms
An application server monitors the activity of threads on which system alarms execute. When a system alarm thread has been active longer than the time defined by the alarm thread monitor threshold, the application server logs the following warning in the system log. This message indicates the name of the thread that is not responding, the length of time that the thread has already been active, and the exception stack of the thread, which identifies the system component.
UTLS0008W: The alarm thread threadname has been active for n milliseconds and may be hung. totalthreads threadstack
In this message, threadname is the name that appears in a JVM thread dump, n is approximately how long the thread was active, totalthreads is an overall assessment of the system threads, and threadstack is the exception stack of the thread.
If the alarm work eventually completes, the following message is written to the system log. This message indicates the thread that produced the false alarm.
UTLS0009W: Alarm Thread threadname was previously reported to be hung but has completed. It was active for approximately n milliseconds.
In this message, threadname is the name that appears in a JVM thread dump, and n is approximately how long the thread was active.
Typically, system alarms do not process heavy loads because such activity might slow the processing of later system alarms, which in turn might impact server behavior. The UTLS0008W message is intended to help IBM Support personnel investigate problems potentially caused by system alarm behavior.
All of the system alarms share a common alarm thread pool. The properties that govern the monitoring of this thread pool can be tuned in the administrative console. You can reduce the frequency at which WebSphere Application Server generates alarm hung thread messages by adjusting the alarm thread monitor check interval or threshold. See the topic Configuring the hang detection policy for a description of how to change these settings.
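To show how a check interval and a threshold interact, the following sketch models a simple monitor for alarm threads. It is a simplified illustration, not the server's implementation. The class name AlarmThreadMonitor, its method names, and the sample thread name are hypothetical. Time values are passed in explicitly so that the behavior is deterministic. A real monitor would call check from a task scheduled at the check interval.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Reports alarm threads that stay active longer than a threshold. */
class AlarmThreadMonitor {
    private final long thresholdMillis;
    private final Map<String, Long> startTimes = new LinkedHashMap<>();
    private final Set<String> reported = new HashSet<>();

    AlarmThreadMonitor(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    /** Called when an alarm starts running on the named thread. */
    synchronized void alarmStarted(String threadName, long nowMillis) {
        startTimes.put(threadName, nowMillis);
    }

    /**
     * Called when the alarm work completes. Returns true if the thread had
     * been reported as possibly hung, which makes the report a false alarm.
     */
    synchronized boolean alarmFinished(String threadName) {
        startTimes.remove(threadName);
        return reported.remove(threadName);
    }

    /**
     * Runs one check. Returns the names of threads that exceeded the
     * threshold and were not reported by an earlier check. A longer check
     * interval or a larger threshold therefore produces fewer reports.
     */
    synchronized List<String> check(long nowMillis) {
        List<String> newlyReported = new ArrayList<>();
        for (Map.Entry<String, Long> e : startTimes.entrySet()) {
            boolean overThreshold = nowMillis - e.getValue() > thresholdMillis;
            if (overThreshold && reported.add(e.getKey())) {
                newlyReported.add(e.getKey());
            }
        }
        return newlyReported;
    }

    public static void main(String[] args) {
        AlarmThreadMonitor monitor = new AlarmThreadMonitor(1000);
        monitor.alarmStarted("Alarm : 0", 0);
        System.out.println("check at 500 ms: " + monitor.check(500));
        System.out.println("check at 1500 ms: " + monitor.check(1500));
        System.out.println("false alarm: " + monitor.alarmFinished("Alarm : 0"));
    }
}
```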