IBM Support

Preventing the nodeagent from restarting all application servers when the primary DNS is down

Troubleshooting


Problem

On 64-bit AIX, the nodeagent restarts the application servers when the first entry in the resolv.conf file is offline or unavailable (simulating a DNS failure with no failover). In this situation the nodeagent appears not to use the second entry (the secondary DNS server) for DNS resolution, which can cause a hang when the servers are restarted. This scenario occurs when localhost and the local IP addresses are not defined in /etc/hosts and /etc/netsvc.conf is set to hosts=local4,bind4.

Cause

The WebSphere Application Server runtime code calls the InetAddress.getLocalHost() method in several places, and this call might perform more slowly when using the IBM® SDK for Java™ version 5 on AIX® than with earlier versions. In some cases, when multiple threads have been spawned, individual threads have taken up to 15 seconds to return. The lookups are processed sequentially, which causes the thread pool to back up. If the backlog does not clear within the default 300-second monitoring policy ping time-out, the nodeagent concludes that the server is hung and restarts it (as designed).
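To check whether a given host shows this slowdown, the lookup can be timed outside of WebSphere. The following is a minimal diagnostic sketch, not part of the product: the class name LocalHostLookupTimer, its method names, and the default of 8 threads are illustrative assumptions. It calls InetAddress.getLocalHost() from several threads at once and reports how long each call took. It is written without lambdas or the diamond operator so that it should also compile on older SDK levels.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical diagnostic class; not part of WebSphere Application Server.
final class LocalHostLookupTimer {

    /** Times a single InetAddress.getLocalHost() call, in milliseconds. */
    static long timeLookupMillis() {
        long start = System.nanoTime();
        try {
            InetAddress.getLocalHost();
        } catch (UnknownHostException e) {
            // A failed lookup still consumes time, so it is measured as well.
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Runs the lookup on several threads at once; returns each elapsed time. */
    static List<Long> timeConcurrentLookups(int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<Callable<Long>>();
            for (int i = 0; i < threads; i++) {
                tasks.add(new Callable<Long>() {
                    public Long call() {
                        return Long.valueOf(timeLookupMillis());
                    }
                });
            }
            List<Long> results = new ArrayList<Long>();
            for (Future<Long> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while timing lookups", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Lookup task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Thread count is an arbitrary default; pass another value as the first argument.
        int threads = args.length > 0 ? Integer.parseInt(args[0]) : 8;
        List<Long> times = timeConcurrentLookups(threads);
        for (int i = 0; i < times.size(); i++) {
            System.out.println("thread " + i + ": " + times.get(i) + " ms");
        }
    }
}
```

Running it once with the primary DNS server reachable and once with it unreachable gives a rough comparison. Times of several seconds per thread would be consistent with the behavior described above, although the sketch does not reproduce the WebSphere thread pool itself.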

Resolving The Problem

In Java, caching of localhost addresses does not occur by default. The JVM custom property com.ibm.cacheLocalHost=true needs to be set for localhost lookups to be stored in the cache. This is not to be confused with the networkaddress.cache.ttl value in java.security, which references the length of time to keep a value in the cache. This value is relevant to cacheLocalHost only when com.ibm.cacheLocalHost is set to true.
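The two settings can be inspected side by side from inside a JVM. The following is a small sketch under stated assumptions: the class name LocalHostCacheSettings and its method names are hypothetical, and the code only reports what has been configured. It does not prove that the running SDK honors com.ibm.cacheLocalHost.

```java
import java.security.Security;

// Hypothetical helper class; not part of WebSphere Application Server.
final class LocalHostCacheSettings {

    /** True when the system property com.ibm.cacheLocalHost is set to "true". */
    static boolean isLocalHostCachingRequested() {
        // Boolean.getBoolean reads a system property and compares it to "true".
        return Boolean.getBoolean("com.ibm.cacheLocalHost");
    }

    /** Returns the networkaddress.cache.ttl security property, or "(not set)". */
    static String cacheTtl() {
        String ttl = Security.getProperty("networkaddress.cache.ttl");
        return ttl == null ? "(not set)" : ttl;
    }

    public static void main(String[] args) {
        System.out.println("com.ibm.cacheLocalHost   = " + isLocalHostCachingRequested());
        System.out.println("networkaddress.cache.ttl = " + cacheTtl());
    }
}
```

The first value is a JVM system property, while the second is a security property read from java.security, which is why the two are checked through different APIs.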

To turn on caching so that a full network DNS lookup is not performed every time getLocalHost() is called, set the following JVM custom property on the application server, the nodeagent, and the deployment manager:

Open the administrative console and navigate to:

Application Servers:
Servers > Application Servers > servername > Java and Process Management > Process Definition > Java Virtual Machine > Custom Properties > New

Name: com.ibm.cacheLocalHost
Value: true

Nodeagent:
System administration > Node agents > nodeagent > Java and Process Management > Process Definition > Java Virtual Machine > Custom Properties > New

Name: com.ibm.cacheLocalHost
Value: true

Deployment Manager:
System administration > Deployment Manager > Java and Process Management > Process Definition > Java Virtual Machine > Custom Properties > New

Name: com.ibm.cacheLocalHost
Value: true


Product: WebSphere Application Server (Network Deployment)
Component: System Management/Repository
Platform: AIX
Version: 7.0, 6.1, 6.0.2

Document Information

Modified date:
23 June 2018

UID

swg21413670