Troubleshooting : WebSphere start/stop common issues

Troubleshooting


Problem

This document contains troubleshooting information for problems with WebSphere® Application Server start and stop operations. The information can help you resolve common issues with this component without calling IBM Support, saving you time.
 

Resolving The Problem


Troubleshooting Topics:

Scenario : Application Server startup fails but only the startServer.log file is created

Possible causes of the problem:

  • Java could be corrupted.
  • Invalid JVM arguments are set on the application server JVM.
  • The startServer launcher could be crashing. Check the native log files.
  • The Java shared class cache is corrupted (Java class sharing).
  • A java.lang.UnsatisfiedLinkError occurred.
  • Non-root user problem, which is a permissions issue.
  • The ulimit value is set too low on the system.
 

Things to check when Java is corrupted

Run the "java -fullversion" command from the WAS_HOME/java/jre/bin directory. If this command fails, then Java likely has a problem. When you start the server, you might see messages such as:

  • “The system cannot find the path specified.” Check the Java classpath.
  • “No public JRE found”
  • java.lang.main() method could not be created

Check the native_stdout.log and native_stderr.log files for entries. Any entries in these log files indicate that Java has a problem initializing. If the previous checks confirm a Java problem, then, as a workaround, you can copy the entire WAS_HOME/java directory from another working system. However, make sure that the Java level and the WebSphere Application Server version on that system match those of the failing system. Back up the failing system before you copy over the directory.
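
The two checks above can be scripted. The following is a minimal sketch, not an IBM-provided tool: the function names are hypothetical, and the caller supplies the path to the java executable (for example under WAS_HOME/java/jre/bin) and the directory that holds the native log files.

```python
import subprocess
from pathlib import Path


def java_fullversion(java_bin):
    """Run `java -fullversion` and return (ok, output).

    The JVM normally prints the version string on stderr, so stderr is
    preferred and stdout is used only as a fallback.
    """
    try:
        result = subprocess.run(
            [str(java_bin), "-fullversion"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Missing or non-executable binary, or a JVM that never returns.
        return False, str(exc)
    return result.returncode == 0, (result.stderr or result.stdout).strip()


def non_empty_logs(log_dir, names=("native_stdout.log", "native_stderr.log")):
    """Return the names of the native log files in log_dir that contain data."""
    found = []
    for name in names:
        path = Path(log_dir) / name
        if path.is_file() and path.stat().st_size > 0:
            found.append(path.name)
    return found
```

If java_fullversion returns False, or non_empty_logs returns a non-empty list, treat that as a hint to inspect the Java installation, not as a definitive diagnosis.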

Invalid JVM arguments are set on the application server JVM
  • You might have specified some Java arguments to the application server JVM due to an application requirement or another stack product requirement.
  • Check the minimum and maximum heap size values that are set.
  • Review the application server server.xml file for invalid JVM arguments. If invalid arguments exist, remove them and then see whether the server starts. If the server starts, then the problem is with one of the JVM properties.
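
To review the JVM settings without opening the file by hand, a short script can pull them out of server.xml. This is a sketch under assumptions: it expects the settings to be attributes named initialHeapSize, maximumHeapSize, and genericJvmArguments on a jvmEntries element, and it treats a heap value of 0 or a missing value as "use the JVM default". Verify both assumptions against your own server.xml. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def jvm_settings(server_xml_text):
    """Extract heap sizes and generic JVM arguments from server.xml content."""
    root = ET.fromstring(server_xml_text)
    settings = []
    for elem in root.iter():
        # Compare the local tag name so that namespace prefixes do not matter.
        if elem.tag.split("}")[-1] == "jvmEntries":
            settings.append({
                "initialHeapSize": elem.get("initialHeapSize"),
                "maximumHeapSize": elem.get("maximumHeapSize"),
                # Naive whitespace split; quoted arguments with spaces
                # would need more careful parsing.
                "genericJvmArguments": (elem.get("genericJvmArguments") or "").split(),
            })
    return settings


def heap_range_is_valid(entry):
    """Return False when the initial heap size exceeds the maximum heap size."""
    initial = int(entry["initialHeapSize"] or 0)
    maximum = int(entry["maximumHeapSize"] or 0)
    # Assumption: 0 (or a missing value) means the JVM default is used.
    return initial == 0 or maximum == 0 or initial <= maximum
```

Listing the arguments one per line makes it easier to remove them one at a time and retest the server start.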
 

When the Java class cache is corrupted

It is possible that the Java class cache is corrupted. A class cache is an area of shared memory of a fixed size that persists beyond the lifetime of any JVM that is using it. No JVM owns the cache; instead, any number of JVMs can read and write to the cache concurrently. A cache is deleted either when it is explicitly destroyed using a JVM utility or when the operating system restarts. A cache cannot persist beyond an operating system restart. Its purpose is to reduce the virtual memory footprint and improve JVM startup time. By default, this option is enabled starting with SDK 1.5 on all IBM platforms.

Run the clearClassCache.bat (or clearClassCache.sh) command from the WAS_HOME/profiles/profile_name/bin directory to clear the Java class cache of this WebSphere Application Server node from the common system-level cache location. Alternatively, you can delete the contents of the following directories:

  • On UNIX-based platforms: /tmp/javasharedresources
  • On Windows: C:\Documents and Settings\<userid>\Local Settings\Application Data\javasharedresources

However, keep in mind that deleting the Java class cache at the system level deletes the cache for every Java instance on the system.

You can disable the Java class cache feature permanently by using the -Xshareclasses:none argument. To set it, complete these steps in the administrative console:


1. Click Servers > Application Servers > server_name > Java and process management > Process definition > Java virtual machine.
2. Under Generic JVM arguments, specify -Xshareclasses:none
3. Save and synchronize the changes with nodes.

When the OSGi cache is corrupted

The Equinox OSGi framework is used to manage class loading and relationships between the server component bundles. In some cases, the cached bundle data, which is maintained on a per profile basis and has a separate cache at the WAS_HOME level for installation-wide processes, can become out-of-sync with the actual binaries on the server. You can use the osgiCfgInit.sh(bat) script to clear and recreate the OSGi cache.

You should run the osgiCfgInit script on the command line from the WAS_HOME/bin or user_install_root/bin directory. The behavior of the script depends on the directory from which you run the script. If you run the script from a profile level bin directory, the script clears the OSGi cache for all servers within that profile. If you run the script from the WAS_HOME/bin directory, the script clears the OSGi cache for all servers within the default profile.

Avoid trouble: Before you run the osgiCfgInit script, stop the server on which the script will be run. If you run this script on a server that is active, the server might have problems trying to read or update the cache after the script is finished.

After you apply a fix or fix pack with the Update Installer or with IBM Installation Manager, servers (deployment manager, node agent, and application servers) might fail to start. No SystemOut.log file is generated to indicate a reason. The startServer.log file shows:

!MESSAGE Error reading configuration: /home/WebSphere/AppServer/profiles/Dmgr01/configuration/org.eclipse.osgi/.manager/.fileTableLock (Permission denied)
!STACK 0u

java.io.FileNotFoundException: /opt/WebSphere/AppServer/profiles/Dmgr01/configuration/org.eclipse.osgi/.manager/.fileTableLock (Permission denied)
    at java.io.FileOutputStream.openAppend(Native Method)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:203)
    at org.eclipse.core.runtime.internal.adaptor.Locker_JavaNio.lock(Locker_JavaNio.java:34)
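
The "Permission denied" message in the log above typically means that files under the profile configuration directory belong to a different user than the one starting the server. The following sketch lists such files on UNIX-based platforms. The function name is hypothetical, and the directory to scan and the expected numeric user ID are supplied by the caller.

```python
import os


def paths_not_owned_by(root_dir, uid):
    """Return paths under root_dir whose owner is not the given numeric user ID."""
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat inspects the link itself, so a dangling symlink does not raise.
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return sorted(mismatched)
```

For example, calling paths_not_owned_by on the profile's configuration directory with the user ID of the account that starts the server would list files such as .fileTableLock if another user, for instance root, created them.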

If similar errors are thrown, you must run the osgiCfgInit.sh(bat) script before you start any server JVM for the first time after you install a fix pack. The following documents describe some known issues with the OSGi cache when both root and non-root users manage the WebSphere Application Server file systems:

  • ADMU3011E Server fail to start after applying fixes when runAs user or group configured in WebSphere Application Server V8.x
  • WebSphere Application Server with "runas" behavior if non-root is used

java.lang.UnsatisfiedLinkError

This error is thrown when the JVM cannot find the native library that is required for WebSphere Application Server to start.

Here are some causes:

  • The user who starts the server does not have the permissions required to load native libraries (.so).
  • Environment settings were changed on the machine in a way that prevents java.library.path from being set correctly.


Scenario : Debugging the startServer Launcher

In some cases, the startServer.sh(bat), startManager.sh(bat), or startNode.sh(bat) script itself has a problem or is corrupted. To isolate the problem between the launcher and the target server JVM, you can bypass the launcher and start the target JVM directly by using the -script parameter.

Create a launch script with the -script option, and then use the launch script to start the server.
  • The -script option creates a launch script for server1 but does not start the server.
  • Example: startServer server1 -script launchServer1.sh
  • You can use the launch script to start the server (JVM).
  • A launch script reduces start time because the configuration files are not parsed.
  • If the JVM settings are changed, create a new launch script.

 
Scenario : Startup issues and a SystemOut.log file is generated

Common issues:

  • The server start process takes longer than expected.
  • The server start process hangs.
  • The server start process fails with errors.
  • The server starts, but errors are logged.
  • Port conflict issues occur during the startup process.

These problems must be addressed case by case, based on the error messages in the server JVM SystemOut.log and SystemErr.log files, because an error message can be thrown by any WebSphere Application Server component or by user application code.
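
As a starting point for that case-by-case review, the sketch below counts error message IDs in log lines and tests whether a TCP port is already taken. It rests on assumptions: error IDs are taken to be four or five uppercase letters, four digits, and a trailing E (the shape of ADMU3011E), and the port check only tries to bind on one local address. The function names are hypothetical.

```python
import re
import socket
from collections import Counter

# Assumed shape of an error message ID, for example ADMU3011E.
ERROR_ID = re.compile(r"\b[A-Z]{4,5}\d{4}E\b")


def count_error_ids(log_lines):
    """Count how often each error message ID appears in the given lines."""
    counts = Counter()
    for line in log_lines:
        counts.update(ERROR_ID.findall(line))
    return counts


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": another process holds the port.
            return False
    return True
```

Run port_is_free while the server is stopped: a False result for one of the server's configured ports suggests that another process is holding it. The most frequent IDs from count_error_ids are usually a reasonable place to begin searching the product documentation.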

You can ignore the following issues:

  • FileNotFound exceptions for variables.xml and virtualHosts.xml in the startServer.log file.
  • Most warning messages, including FFDC messages.

Scenario : Server stops by itself (graceful shutdown)

To get a thread dump or Javacore during a server shutdown:

When you set the -Dcom.ibm.ws.runtime.dumpShutdown=true property, a thread dump is triggered during the server shutdown process.

To set the property in the administrative console, complete these steps:

  • Click Servers > Application Servers > server_name > Server Infrastructure > Java and Process Management > Process Definition > Java virtual machine > Custom Properties > New.
  • Specify com.ibm.ws.runtime.dumpShutdown for the property name and true for the value.

For platforms where an IBM Software Development Kit is used, a Javacore is generated in the working directory of the application server. For all other platforms, a thread dump is written to the native_stdout.log file for the application server. On Solaris and HP-UX, thread dumps, like verbose garbage collection output, are written to the native_stdout.log file. In addition to the thread dump, the stack trace of the current thread that is processing the shutdown is included in the SystemErr.log file for the application server.
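
On IBM SDK platforms, a small helper can locate the most recent Javacore files after a shutdown. This sketch assumes the default file naming, in which Javacore files start with "javacore" and end with ".txt"; a customized dump configuration can change that. The function name is hypothetical, and the working directory is supplied by the caller.

```python
from pathlib import Path


def latest_javacores(working_dir, limit=3):
    """Return the names of the newest Javacore files in working_dir, newest first."""
    files = sorted(
        Path(working_dir).glob("javacore*.txt"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    return [p.name for p in files[:limit]]
```

Comparing the timestamp of the newest Javacore with the shutdown time in SystemErr.log helps confirm that the dump belongs to the shutdown under investigation.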

This information should help you determine the source of the problem that is causing the application server to shut down gracefully.

Document Information

Modified date:
25 January 2022

UID

swg21981959