Technical Blog Post
Abstract
Common status cache problems in the WebSphere Administrative Console - Part I
Body
This two-part blog covers exceptions commonly associated with the status cache problem. With this problem, the WebSphere Administrative Console shows the status of an application server, node agent, or application as 'red' or 'unknown', even though the actual process is up and running.
In part 1, we look at cases where this issue is caused by one of the following areas:
- Configuration file
- Discovery ports with firewall
- Operating system level
Some of these exceptions can be seen in the SystemOut.log, SystemErr.log, or FFDC log files.
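When many FFDC entries have accumulated, it can help to group them by exception, source, and probe before reading individual stack traces. The following is a minimal sketch, assuming the header layout matches the FFDC excerpts quoted in this post (Exception, SourceId, and ProbeId fields in that order); the function name summarize_ffdc and the sample text are illustrative only.

```python
import re
from collections import Counter

# Header layout as it appears in the FFDC excerpts quoted in this post;
# other FFDC formats may differ (assumption).
FFDC_HEADER = re.compile(
    r"FFDC Exception:(?P<exception>\S+)\s+"
    r"SourceId:(?P<source_id>\S+)\s+"
    r"ProbeId:(?P<probe_id>\S+)"
)


def summarize_ffdc(text):
    """Count FFDC headers by (exception, source id, probe id)."""
    return Counter(
        (m["exception"], m["source_id"], m["probe_id"])
        for m in FFDC_HEADER.finditer(text)
    )


# Sample text assembled from the headers shown in this post.
sample = """FFDC Exception:java.lang.NullPointerException
SourceId:com.ibm.ws.management.component.JMXConnectors.initDiscovery
ProbeId:202 Reporter:com.ibm.ws.management.component.JMXConnectors@74bffc3a
FFDC Exception:java.io.IOException
SourceId:com.ibm.ws.management.discovery.DiscoveryService.sendQuery ProbeId:189
"""

for key, count in summarize_ffdc(sample).items():
    print(count, *key)
```

A SourceId under com.ibm.ws.management.component.JMXConnectors or com.ibm.ws.management.discovery points to the configuration and discovery port checks described in the following sections.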
Configuration file corruption
These first examples are due to corrupted configuration files:
---------------
FFDC Exception:java.lang.NullPointerException
SourceId:com.ibm.ws.management.component.JMXConnectors.initDiscovery
ProbeId:202 Reporter:com.ibm.ws.management.component.JMXConnectors@74bffc3a
java.lang.NullPointerException
at com.ibm.ws.management.component.JMXConnectors$EndPtCollector.<init>(JMXConnectors.java:1864)
at com.ibm.ws.management.component.JMXConnectors.initDiscovery(JMXConnectors.java:392)
at com.ibm.ws.management.component.JMXConnectors.propertyChange(JMXConnectors.java:2117)
at java.beans.PropertyChangeSupport.firePropertyChange(PropertyChangeSupport.java:339)
at java.beans.PropertyChangeSupport.firePropertyChange(PropertyChangeSupport.java:347)
at java.beans.PropertyChangeSupport.firePropertyChange(PropertyChangeSupport.java:276)
at com.ibm.wsspi.runtime.component.WsComponentImpl.setState(WsComponentImpl.java:556)
-----
FFDC Exception:java.lang.NullPointerException
SourceId:com.ibm.ws.management.component.JMXConnectors.doManagedProcessDiscovery
ProbeId:714 Reporter:com.ibm.ws.management.component.JMXConnectors@45d14537
java.lang.NullPointerException: getServerDiscoveryEndPoint returned null value
at com.ibm.ws.management.component.JMXConnectors.doManagedProcessDiscovery(JMXConnectors.java:1566)
at com.ibm.ws.management.component.JMXConnectors.interprocessRegistration(JMXConnectors.java:1272)
at com.ibm.ws.management.component.JMXConnectors.initDiscovery(JMXConnectors.java:462)
at com.ibm.ws.management.component.JMXConnectors.propertyChange(JMXConnectors.java:2161)
----------------------------
Both examples are FFDC errors in which a NullPointerException was thrown while resolving a discovery endpoint (JMXConnectors$EndPtCollector in the first trace, getServerDiscoveryEndPoint in the second). To resolve these types of exceptions, check the following areas:
- Review the serverindex.xml file to see if the node agent has all the required endpoints. See this dW Answers entry for instructions on how to fix the serverindex.xml file: https://developer.ibm.com/answers/questions/175543/how-to-fix-nodeagent-serverindexxml-file-that-has.html#answer-175545
- If the serverindex.xml file looks good, check the server.xml file for an unexpected file size or for malformed or missing entries. Compare the current file with a backed-up version.
- Another configuration file to double-check is the node-metadata.properties file. Make sure it is consistent with the node-metadata.properties files of the other managed nodes under the same deployment manager configuration structure.
- Look at the other configuration files under the WAS_home/profiles/config structure to see if any are missing or corrupted.
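The serverindex.xml review in the first check can be partly automated. The following is a minimal sketch, assuming the file uses serverEntries elements that contain specialEndpoints elements with an endPointName attribute and a nested endPoint element carrying host and port; verify this against your own file. The sample XML (including its namespace URI, host name, and port) and the list of required endpoint names are hypothetical placeholders, not an authoritative list.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}name'."""
    return tag.rsplit("}", 1)[-1]


def list_endpoints(xml_text):
    """Map each server name to {endpoint name: (host, port)}."""
    root = ET.fromstring(xml_text)
    servers = {}
    for entry in root.iter():
        if local_name(entry.tag) != "serverEntries":
            continue
        endpoints = {}
        for special in entry:
            if local_name(special.tag) != "specialEndpoints":
                continue
            name = special.get("endPointName")
            for ep in special:
                if local_name(ep.tag) == "endPoint":
                    endpoints[name] = (ep.get("host"), ep.get("port"))
        servers[entry.get("serverName")] = endpoints
    return servers


def missing_endpoints(servers, server_name, required):
    """Return the required endpoint names that server_name does not define."""
    present = servers.get(server_name, {})
    return [name for name in required if name not in present]


# Simplified, hypothetical sample; a real serverindex.xml has more attributes.
SAMPLE = """<serverindex:ServerIndex xmlns:serverindex="urn:example:serverindex">
  <serverEntries serverName="nodeagent" serverType="NODE_AGENT">
    <specialEndpoints endPointName="NODE_DISCOVERY_ADDRESS">
      <endPoint host="node1.example.com" port="7272"/>
    </specialEndpoints>
  </serverEntries>
</serverindex:ServerIndex>"""

# Illustrative list only; take the real list from a known-good node agent.
REQUIRED = ["NODE_DISCOVERY_ADDRESS", "SOAP_CONNECTOR_ADDRESS"]

servers = list_endpoints(SAMPLE)
print(missing_endpoints(servers, "nodeagent", REQUIRED))
```

Running list_endpoints against the serverindex.xml of a healthy node and of the suspect node, then comparing the two results, is one way to spot an endpoint that was dropped from the file.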
Discovery ports with firewall
The following example is due to the discovery port being blocked, for example by a firewall:
---------------------
FFDC Exception:java.io.IOException
SourceId:com.ibm.ws.management.discovery.DiscoveryService.sendQuery ProbeId:189
Reporter:com.ibm.ws.management.discovery.DiscoveryService@61596159
java.io.IOException: ADMD0004E: The TCP socket: 72,72 cannot be opened. Check if port is opened by remote process.
at com.ibm.ws.management.discovery.transport.TcpMessenger.<init>(TcpMessenger.java:69)
at com.ibm.ws.management.discovery.transport.MessengerFactory.createMessenger(MessengerFactory.java:29)
at com.ibm.ws.management.discovery.transport.TcpTransport.getMessenger(TcpTransport.java:132)
at com.ibm.ws.management.discovery.Endpoint.getMessenger(Endpoint.java:265)
at com.ibm.ws.management.discovery.DiscoveryService.sendQuery(DiscoveryService.java:166)
at com.ibm.ws.management.discovery.DiscoveryAdapter$DiscoveryAlarm.runIt(DiscoveryAdapter.java:397)
at com.ibm.ws.management.discovery.DiscoveryAdapter$DiscoveryAlarm.alarm(DiscoveryAdapter.java:386)
at com.ibm.ejs.util.am._Alarm.run(_Alarm.java:133)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1604)
------------------
A few checkpoints for the previous exception:
- Run a telnet test against the host name and port number to verify the connection. For example:
  - On the node agent host, telnet to dmgr_hostname.com <CELL_DISCOVERY_ADDRESS port> while the deployment manager is up and running.
  - On the deployment manager host, telnet to nodeagent_hostname.com <NODE_DISCOVERY_ADDRESS port> while the node agent is up.
- Verify that the host names can be pinged and resolve to the correct IP address, matching the entries in the /etc/hosts file and the serverindex.xml file.
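Where telnet is not installed, the same connectivity and name resolution checks can be approximated with a short script. The following is a minimal sketch using only the Python standard library; the function names resolve and can_connect are illustrative, and a successful TCP connection only shows that the port is reachable, not that the discovery service itself is healthy.

```python
import socket


def resolve(host):
    """Return the IPv4 address the local resolver gives for host, or None."""
    try:
        return socket.gethostbyname(host)
    except OSError:
        return None


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False
```

For example, on the node agent host, call can_connect with the deployment manager host name and its CELL_DISCOVERY_ADDRESS port, and compare the result of resolve for that host name with the address recorded in the hosts file and in serverindex.xml. Repeat the check in the other direction from the deployment manager host with the NODE_DISCOVERY_ADDRESS port.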
Operating system level
Another commonly seen status cache problem is specific to certain AIX levels.
Note: If you are on an AIX operating system, the status indicator of the Java virtual machines (JVM) or applications is incorrect, and there are no known errors in the SystemOut.log files, then it is possible that you have encountered a known bug.
Verify the operating system level by running: oslevel -s. If the result is either 6100-08-01-1245 (AIX 6.1) or 7100-02-01-1245 (AIX 7.1), take one of these actions:
- Apply AIX OS patches as follows:
  - From AIX 6100-08-01-1245, upgrade to 6100-08-02-1316
  - From AIX 7100-02-01-1245, upgrade to 7100-02-02-1316
- Apply the AIX APAR: IV35893: UDP MULTICAST: SHORT PACKET FOR SOME LISTENERS. APPLIES TO AIX 7100-02.
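The level check above can be scripted when many AIX hosts need to be reviewed. The following is a minimal sketch, assuming oslevel -s prints the level string on a single line; the affected and fixed levels are taken from this post, and the function names are illustrative.

```python
import subprocess

# Affected level -> fixed level, as listed in this post.
FIX_LEVELS = {
    "6100-08-01-1245": "6100-08-02-1316",
    "7100-02-01-1245": "7100-02-02-1316",
}


def recommended_level(oslevel_output):
    """Return the fixed level for an affected `oslevel -s` value, else None."""
    return FIX_LEVELS.get(oslevel_output.strip())


def read_oslevel():
    """Run `oslevel -s` (available on AIX only) and return its trimmed output."""
    result = subprocess.run(
        ["oslevel", "-s"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

On an AIX host, recommended_level(read_oslevel()) returns the target level when the host is at one of the two affected levels and None otherwise.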
See Part Two for more exceptions and suggestions to resolve the status cache error.
UID
ibm11080981