IBM Support

Secondary server fails to start when replicating hibernated policies from Primary

Troubleshooting


Problem

When a non-primary cluster member starts, it copies artifacts from the Primary server. One such artifact type is the in-memory state of hibernated policies. If a hibernated policy's state contains instances of non-serializable Java classes, Impact cannot synchronize the data for that policy and the secondary server fails to start.

Symptom

The messages.log file shows a java.rmi.UnmarshalException whose nested exception names the non-serializable Java class:
java.rmi.UnmarshalException: error unmarshalling return; nested exception is:
        java.io.WriteAbortedException: writing aborted; java.io.NotSerializableException: <name of Java class>
For example:
java.rmi.UnmarshalException: error unmarshalling return; nested exception is:
        java.io.WriteAbortedException: writing aborted; java.io.NotSerializableException: java.io.FileReader
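
The nested exceptions above can be reproduced with plain Java serialization. The following is a minimal sketch, not Impact code: the class PolicyState and its fields are hypothetical stand-ins for a hibernated policy's in-memory state, and no RMI is involved, so the outer java.rmi.UnmarshalException does not appear. It illustrates that when the sending side fails to serialize a java.io.FileReader, the receiving side reports a java.io.WriteAbortedException that wraps the original java.io.NotSerializableException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.io.WriteAbortedException;

class HibernatedStateSyncDemo {

    // Hypothetical stand-in for a hibernated policy's in-memory state.
    static class PolicyState implements Serializable {
        private static final long serialVersionUID = 1L;
        final String actionKey;
        final Object policyVariable; // any value the policy left in its context

        PolicyState(String actionKey, Object policyVariable) {
            this.actionKey = actionKey;
            this.policyVariable = policyVariable;
        }
    }

    // Sending side: serialize the state and return whatever reached the stream.
    static byte[] writeState(PolicyState state) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        } catch (NotSerializableException e) {
            // Writing was aborted; the stream now records the failure.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Receiving side: returns the error message, or null if the state was read cleanly.
    static String readStateError(byte[] stream) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(stream))) {
            in.readObject();
            return null;
        } catch (WriteAbortedException e) {
            return e.getMessage();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Builds a state that holds an open FileReader and tries to transfer it.
    static String simulateSync() {
        try {
            File tmp = File.createTempFile("hibernation-demo", ".txt");
            tmp.deleteOnExit();
            try (FileReader reader = new FileReader(tmp)) {
                return readStateError(writeState(new PolicyState("ActionKey_1", reader)));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(simulateSync());
    }
}
```

A state that holds only serializable values, such as strings, passes through the same two methods without error.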

Cause

The problem occurs only for hibernations that have already been woken up. In theory, such hibernations should not exist, because the policy should call the RemoveHibernation function to remove its in-memory state after it is woken up. See the RemoveHibernation function documentation for details.

Diagnosing The Problem

To log more detail about artifact synchronization in impactserver.log, add the following lines to etc/impactserver.log4j.properties on all backend Impact servers. Log4j changes are picked up dynamically, so no restart is required.

# Debug logging for a secondary server that fails to sync policy states
logger.secstartupdebug1.name=com.micromuse.response.server.PolicyStateMemoryRepository
logger.secstartupdebug1.level=debug
logger.secstartupdebug2.name=com.micromuse.response.broker.cluster.ClusterMember
logger.secstartupdebug2.level=debug
logger.secstartupdebug3.name=com.micromuse.response.server.ObjectRetriever
logger.secstartupdebug3.level=debug

Resolving The Problem

To resolve the problem, first note the Java class named in the unmarshalling error in messages.log. From Fix Pack 28 onwards, impactserver.log also points to the issue with an error like this:
An error occurred when retrieving in-memory data from cluster member: NCI_1
java.rmi.UnmarshalException: error unmarshalling return; nested exception is:
        java.io.WriteAbortedException: writing aborted; java.io.NotSerializableException: java.io.FileReader
        at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:209) ~[?:1.8.0]
Use the Java class name to identify the policy that causes the problem. Look for policies that use this Java class, use hibernation, and do not call RemoveHibernation.
To prevent the problem in future executions, add a RemoveHibernation function call to the policy so that the in-memory state is removed after the hibernation is woken up.
For policies that have already run, clean up the in-memory state in the UI. On the Data Model tab, navigate to the Statistics data source and expand Hibernations. Right-click and select "View data items", then manually delete the obsolete records.
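
The first step above, reading the offending class name out of the error, can also be scripted when the logs are large. The following is a minimal sketch under stated assumptions: the class NotSerializableClassFinder and its method findClass are hypothetical helpers, not part of Impact, and the pattern assumes the log line has the same shape as the examples in this document.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NotSerializableClassFinder {

    // Captures the fully qualified class name that follows
    // "java.io.NotSerializableException:" in a log line.
    private static final Pattern CLASS_NAME = Pattern.compile(
            "java\\.io\\.NotSerializableException:\\s*"
            + "([A-Za-z_$][\\w$]*(?:\\.[A-Za-z_$][\\w$]*)*)");

    // Returns the class name, or an empty Optional if the line has no such error.
    static Optional<String> findClass(String logLine) {
        Matcher m = CLASS_NAME.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "        java.io.WriteAbortedException: writing aborted; "
                + "java.io.NotSerializableException: java.io.FileReader";
        // Print the class name to search for in policies that use hibernation.
        System.out.println(findClass(line).orElse("no non-serializable class found"));
    }
}
```

The extracted name, for example java.io.FileReader, is then the string to search for in policies that use hibernation.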

Document Location

Worldwide

Product: Netcool/Impact, Version 7.1.0, Platform Independent. Category: Impact > Impact Server > Cluster.

Document Information

Modified date:
14 November 2022

UID

ibm16839129