Technical Blog Post
Abstract
Troubleshooting Invalid session context and other session errors
Body
Recently I have received several PMRs dealing with protocol session-related failures such as:
[2017-05-07 06:01:20.306] ERROR [1494154880306] Invald session ID: ConnectDirectServerAdapternode1_CDSERVER_ADAPTER_node1P01302732805L6952:72843354; the session context is closed (or does not exist)
Or
A session is not lost without a cause; something in the environment must be triggering it. So how does one go about troubleshooting this? The first place to start is the log files covering the same date and time as the error. For the first error, where "the session context is closed (or does not exist)", a review of jetty.log showed the following:
[2017-05-07 06:01:10.553] ALL 000000000000 GLOBAL_SCOPE 586447063 [Incoming-1,Sterling_NodeInfo_group_PROD,pedisfg02-6720] WARN org.jgroups.protocols.TCP - failed to join
This warning, logged about ten seconds before the session error, indicates problems in your cluster. Later the same day, this was also reported:
[2017-05-07 11:35:05.064] ALL 000000000000 GLOBAL_SCOPE 606481574 [Incoming-1,Sterling_NodeInfo_group_PROD,pedisfg02-6720] WARN org.jgroups.protocols.pbcast.NAKACK - use_mcast_xmit should not be used because the transport (TCP) does not support IP multicasting; setting use_mcast_xmit to false
This is a further indication that the cluster configuration should be reviewed.
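The manual correlation described above (take the time of the session error, then look in another log for entries around that time) can be scripted. The following is only a minimal sketch in Python, not a product tool: the function names `parse_ts` and `entries_near` and the five-minute window are my own assumptions, and the only thing taken from the logs is the bracketed timestamp format shown in the excerpts.

```python
import re
from datetime import datetime, timedelta

# Bracketed timestamp as it appears in the excerpts, e.g. [2017-05-07 06:01:20.306]
TS_RE = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\]")


def parse_ts(line):
    """Return the first bracketed timestamp found in a log line, or None."""
    m = TS_RE.search(line)
    if m is None:
        return None
    return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")


def entries_near(lines, error_time, window=timedelta(minutes=5)):
    """Return (timestamp, text) pairs logged within `window` of error_time.

    Lines without a timestamp are treated as continuations of the
    previous entry, since long entries may wrap onto a second line.
    The window size is an arbitrary assumption; widen it as needed.
    """
    entries = []
    for line in lines:
        ts = parse_ts(line)
        if ts is not None:
            entries.append([ts, line.strip()])
        elif entries:
            entries[-1][1] += " " + line.strip()
    return [(ts, text) for ts, text in entries if abs(ts - error_time) <= window]


if __name__ == "__main__":
    # Abbreviated copies of the jetty.log entries quoted in this post.
    jetty_lines = [
        "[2017-05-07 06:01:10.553] ALL 000000000000 GLOBAL_SCOPE 586447063 WARN",
        "org.jgroups.protocols.TCP - failed to join",
        "[2017-05-07 11:35:05.064] ALL 000000000000 GLOBAL_SCOPE 606481574 WARN",
        "org.jgroups.protocols.pbcast.NAKACK - use_mcast_xmit should not be used",
    ]
    error_time = parse_ts("[2017-05-07 06:01:20.306] ERROR [1494154880306]")
    for ts, text in entries_near(jetty_lines, error_time):
        print(ts, text)
```

With the sample lines, only the "failed to join" warning falls inside the window around the session error; the later NAKACK warning would still need to be found by reading the rest of the day's log.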
What about the session map error? A week later the same error was noted at approximately the same time of day. That raised the question: what is happening on Mondays during the 5:20 AM - 5:30 AM time frame? It turned out that a database backup was running at that time, and that it ran for several hours.
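Spotting that an error recurs in the same weekly time slot is easy to miss by eye. As a rough sketch, again in Python, error timestamps can be bucketed by weekday and hour. The function name `recurring_slots`, the `min_count` threshold, and the sample timestamps are all made up for illustration; they are not taken from the PMR.

```python
from collections import Counter
from datetime import datetime

# Fixed English names so the output does not depend on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def recurring_slots(timestamps, min_count=2):
    """Count errors per (weekday, hour) slot and keep slots seen at least
    `min_count` times, which hints at a scheduled job in that slot."""
    counts = Counter((WEEKDAYS[ts.weekday()], ts.hour) for ts in timestamps)
    return {slot: n for slot, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Hypothetical error times: two Mondays a week apart plus one unrelated error.
    errors = [
        datetime(2017, 5, 8, 5, 22),
        datetime(2017, 5, 15, 5, 27),
        datetime(2017, 5, 10, 14, 3),
    ]
    print(recurring_slots(errors))
```

Any slot that shows up repeatedly is a candidate for comparison against scheduled activity such as backups, batch jobs, or maintenance windows.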
So if you start seeing session-related errors, begin by investigating what else is happening in your environment at the same time.
UID
ibm11121139