IBM Support

Why did a Unit of Work (UOW) fail with the "Error encountered in receiveUntil() during main loop processing. Error is: checkSocket Timedout" error?

Question & Answer


Question

I was running a UOW and it failed with the following errors:
(31) Operation failed with failure category [Device Error]
(32) [Caused by: Error getting info from device (OperationFailedException)]
(33) [Caused by: Unable to retrieve physical devices information for Diag information. (CommsDriverException)]
(34) [Caused by: Unable to retrieve physical devices information for Diag information. (PhysicalDeviceException)]
Upon checking the worker.log, I found the following error:
2020.04.23 16:19:54 GMT+00:00 OperationRunner-005-Worker1 ERROR WORKER Worker1 com.intelliden.common.utility.logging.IntellidenLogger :: [com.intelliden.icos.util.socket.0001]  Error encountered in receiveUntil() during main loop processing.  Error is: checkSocket Timedout 
 token=#
 time=1588054794744
 time out value: 900000
 maxResponeTimeout = 150000
 available count = 0
 foundFlag = false
 buf = 

What is wrong and how do I fix it?

Answer

The error shows that the UOW timed out while waiting for a command to complete.
To troubleshoot, scroll up in worker.log from the error and look for the nearest successful "MSG:cmd" string, as in the following example:
2020.04.23 16:17:33 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.handlers.ComHandler.processCommands :: MSG:cmd: diag.01.send
2020.04.23 16:17:33 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.handlers.ComHandler.processCommands :: MSG:Sending: show version 
2020.04.23 16:17:33 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.socket.ComWrapper.send :: MSG:Sending: show version 
2020.04.23 16:17:33 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.handlers.ComHandler.processCommands :: MSG:cmd: diag.02.wait
2020.04.23 16:17:33 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.handlers.ComHandler.processCommands :: MSG:Waiting for: #
2020.04.23 16:17:33 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.socket.ComWrapper.receiveUntil :: MSG:receiving until token:# Timeout:900000
2020.04.23 16:17:33 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.socket.ComWrapper.receiveUntil :: MSG:setting deadline: 1588055543488
 maxDeadline: 1588054793488 timeout: 900000 maxResponseTimeout: 150000
2020.04.23 16:17:33 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.socket.ComWrapper.receiveUntil :: MSG:entering read loop, starting buffer length: 0
2020.04.23 16:17:34 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.socket.ComWrapper.receiveUntil :: MSG:exiting read loop, ending buffer length: 911, receiveBufCount: 911, startIndex: 0, ifStartIndex: 0
2020.04.23 16:17:34 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.socket.ComWrapper.receiveUntil :: MSG:checking for token: '#'  at index: 0, received count: 911
2020.04.23 16:17:34 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.socket.ComWrapper.receiveUntil :: MSG:found token: '#' at index: 910
2020.04.23 16:17:34 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.socket.ComWrapper.checkSocket :: MSG:Received: show version 

Tue Apr 23 16:17:26.127 GMT

Cisco IOS XR Software, Version 5.3.3[Default]
Copyright (c) 2016 by Cisco Systems, Inc.
...
ncm.ibmtest.com#
Based on the example provided, the nearest successful "MSG:cmd" is "diag.02.wait".
The log shows that the command ran successfully and that the configured token "#" was found (in the prompt ncm.ibmtest.com#).
However, the log messages that followed show the problem:
2020.04.23 16:17:34 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.handlers.ComHandler.processCommands :: MSG:cmd: diag.end
2020.04.23 16:17:34 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.socket.ComWrapper.receiveUntil :: MSG:receiving until token:# Timeout:900000
2020.04.23 16:17:34 GMT+00:00 OperationRunner-005-Worker1 FINEST WORKER Worker1 com.intelliden.icos.util.socket.ComWrapper.receiveUntil :: MSG:setting deadline: 1588055544652
 maxDeadline: 1588054794652 timeout: 900000 maxResponseTimeout: 150000
...
2020.04.23 16:19:54 GMT+00:00 OperationRunner-005-Worker1 ERROR WORKER Worker1 com.intelliden.common.utility.logging.IntellidenLogger :: [com.intelliden.icos.util.socket.0001]  Error encountered in receiveUntil() during main loop processing.  Error is: checkSocket Timedout 
 token=#
 time=1588054794744
 time out value: 900000
 maxResponeTimeout = 150000
 available count = 0
 foundFlag = false
 buf = 
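The manual search through worker.log can be sketched in Python. This is a minimal illustration, not an ITNCM tool: `commands_before_timeout` is a hypothetical helper, and it assumes the log line layout shown in the excerpts above (the thread name appears on every primary line, and step names follow "MSG:cmd:").

```python
import re

# Step name that follows "MSG:cmd:" on a worker.log line (layout assumed from the excerpts).
CMD_RE = re.compile(r"MSG:cmd:\s*(\S+)")
ERROR_MARKER = "Error is: checkSocket Timedout"


def commands_before_timeout(lines, thread):
    """Return the MSG:cmd step names logged by `thread` before its first timeout error.

    Returns an empty list when no timeout error is found for that thread.
    """
    steps = []
    for line in lines:
        if thread not in line:
            continue  # other threads, or continuation lines such as " token=#"
        if ERROR_MARKER in line:
            return steps
        match = CMD_RE.search(line)
        if match:
            steps.append(match.group(1))
    return []
```

Called as `commands_before_timeout(open("worker.log"), "OperationRunner-005-Worker1")` on the log in this example, the last entry of the result is the step that was waiting when the timeout fired (diag.end) and the entry before it is the nearest step that completed (diag.02.wait).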
The Resource Access Document (RAD) for the device contains the following:
diag.01.send=show version\r
diag.02.wait=#
diag.end=#

The problem with this section of the RAD is that the "#" prompt returned after the "show version" command runs has already been captured by "diag.02.wait" (see the example of the successful "MSG:cmd").
Hence, the "diag.end=#" step waits for a second "#" prompt that never arrives. Eventually it times out and the UOW is declared failed.
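The numbers in the log excerpts make it possible to check which limit ended the wait. The following Python sketch only redoes the arithmetic on values copied from the log; it assumes (the log does not state this) that "deadline" and "maxDeadline" are both measured from the same start instant.

```python
# Values copied from the log excerpts above, all in milliseconds.
deadline = 1588055544652        # "setting deadline" for the diag.end wait
max_deadline = 1588054794652    # "maxDeadline" on the same log entry
timeout = 900000                # "time out value"
max_response_timeout = 150000   # "maxResponeTimeout" (spelled as in the log)
error_time = 1588054794744      # "time=" reported with the error

# Assumption: both deadlines are offsets from one start instant.
start = max_deadline - max_response_timeout
assert deadline - start == timeout  # consistent with the assumption

elapsed = error_time - start
overshoot = error_time - max_deadline
print(elapsed)    # 150092
print(overshoot)  # 92
```

Under that assumption the error was raised about 92 ms after maxDeadline, which suggests that the 150000 ms maxResponseTimeout, rather than the 900000 ms timeout, is what ended the wait.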
To fix the problem, change the RAD as follows:
[FROM]
diag.01.send=show version\r
diag.02.wait=#
diag.end=#
[TO]
diag.01.send=show version\r
diag.end=#
Although "diag.02.wait" is capable of picking up the "#" prompt, "diag.end" is mandatory, so "diag.02.wait" is the step to remove.
Once you make the change, the problem is resolved.
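The same mistake can be looked for in other script sections with a small check. The Python sketch below is a hypothetical helper, not part of ITNCM. It assumes the steps are supplied as key=value lines in execution order, and it classifies steps only by the ".send", ".wait" and ".end" suffixes seen in this example; real RADs may use other step types that it does not handle.

```python
def find_redundant_waits(rad_lines):
    """Flag wait/end steps that repeat the previous wait token with no send in between.

    Returns a list of (earlier_step, later_step) pairs.
    """
    findings = []
    pending_key = None    # most recent wait/end step since the last send
    pending_token = None  # token that step was waiting for
    for raw in rad_lines:
        line = raw.strip()
        if "=" not in line:
            continue  # blank lines or anything that is not key=value
        key, value = line.split("=", 1)
        if key.endswith(".send"):
            # A new command produces fresh output, so earlier waits no longer matter.
            pending_key = None
            pending_token = None
        elif key.endswith(".wait") or key.endswith(".end"):
            if pending_token is not None and value == pending_token:
                findings.append((pending_key, key))
            pending_key = key
            pending_token = value
    return findings
```

For the [FROM] section above this returns one pair, ("diag.02.wait", "diag.end"); for the [TO] section it returns an empty list.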

Product: Tivoli Netcool Configuration Manager
Version: 6.4.2
Platform: AIX, Linux
Category: ITNCM->ITNCM-Base->Resources->RADs and Device Scripts

Document Information

Modified date:
01 May 2020

UID

ibm16204516