IBM Support

After a cluster failover we see a performance problem and the SOFTWARE ERROR CODE "Command is respawning too rapidly" related to the Guardium S-TAP. What should we do to avoid this?

Question & Answer


Question

The DBAs have told us that when the cluster fails over to another node there is a performance problem, and we see the SOFTWARE ERROR CODE "Command is respawning too rapidly" related to the Guardium S-TAP. What should we do to avoid this?

We have Guardium S-TAP version 10 installed on an AIX server with Oracle Database, in an active/passive cluster. We see this error in the OS log:

root@bagheera01: / # errpt -a -j 4A20258F
---------------------------------------------------------------------------
LABEL:           INIT_RAPID
IDENTIFIER:      4A20258F

Date/Time:       Fri Jun 10 16:25:47 AST 2016
Sequence Number: 5243
Machine Id:      00D189056D00
Node Id:         bagheera01
Class:           S
Type:            TEMP
WPAR:            Global
Resource Name:   init

Description
SOFTWARE PROGRAM ERROR

Probable Causes
SOFTWARE PROGRAM

User Causes
PERFORMANCE DEGRADED

Recommended Actions
REVIEW DETAILED DATA

Detail Data
SOFTWARE ERROR CODE
COMMAND
id: utap "/oradump/guardium/guardium/guard_stap/guard_stap /oradump/guardium/gu
root@bagheera01: / # Command is respawning too rapidly. Check for possible errors.

Answer

When the server starts, the S-TAP tries to find the database server, in this case a running Oracle binary. If the S-TAP fails to find a running Oracle binary, it stops (abends) and is then immediately restarted automatically. It therefore restarts over and over again until it finds a running Oracle binary.
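The restart loop described above can be observed on the affected node. The following is a minimal sketch, assuming the S-TAP is started from an /etc/inittab entry with the identifier utap (the identifier that appears in the errpt output in the question). The function name show_stap_respawn is hypothetical, and the commands are AIX-specific.

```shell
# Sketch for AIX only; the function name is hypothetical.
show_stap_respawn() {
    # Show the inittab entry for the identifier reported by errpt.
    # Assumption: the S-TAP entry uses the identifier "utap".
    lsitab utap
    # List the logged INIT_RAPID entries (error identifier taken from the question).
    errpt -j 4A20258F
}
```

Calling show_stap_respawn as root prints the inittab entry followed by a summary of the matching error log entries. An inittab action of respawn on that entry would be consistent with the behaviour described, because init restarts such a command whenever it exits.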

To avoid this, do the following:
- First, prevent the S-TAP from writing a core dump. When the S-TAP stops in this way, a core dump is not wanted unless you are debugging an S-TAP problem.
- If the S-TAP configuration has already been tested successfully, you can configure the S-TAP to start and wait for the database server to start, instead of stopping. This also avoids the error message above. To configure that, set:
  wait_for_db_exec=1 in guard_tap.ini on both the active and the passive cluster node.
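Because the setting has to match on both cluster nodes, it can help to read the value back from each node's guard_tap.ini. The following is a minimal sketch: the function name get_wait_for_db_exec is hypothetical, and the path to guard_tap.ini is passed in by the caller because its location depends on the installation.

```shell
# Print the value of wait_for_db_exec from a guard_tap.ini-style file.
# Hypothetical helper; the file path is supplied by the caller.
get_wait_for_db_exec() {
    ini_file=$1
    # Match "wait_for_db_exec=<value>", allowing spaces around "=",
    # and print the value from the last matching line.
    sed -n 's/^[[:space:]]*wait_for_db_exec[[:space:]]*=[[:space:]]*\([-0-9]*\).*$/\1/p' "$ini_file" | tail -n 1
}
```

Run it on the active and the passive node with the path to that node's guard_tap.ini. With the setting recommended above, both should print 1.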

Short explanation of the wait_for_db_exec configuration parameter:
wait_for_db_exec=-1 <--- With this setting the S-TAP starts and then stops if the database server (Oracle) is not running. I recommend using this only while you are initially testing the inspection engine settings.
wait_for_db_exec=1 <--- With this setting the S-TAP starts and waits for the database server to start. The value 1 tells the S-TAP to check every second whether the database server has started. If you set 2, it checks every 2 seconds, and so on. If the S-TAP stops (abends) for some other reason, it is still restarted automatically.
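The difference between the two settings can be pictured with a small model. This is an illustrative sketch only, not S-TAP code: the function start_model and the check command passed to it are hypothetical, and only the two cases described above (-1 and a positive number of seconds) are modelled.

```shell
# Illustrative model of the two wait_for_db_exec behaviours; not S-TAP code.
# $1      = wait_for_db_exec value (-1 or a positive number of seconds)
# $2...   = hypothetical command that succeeds when the database is running
start_model() {
    interval=$1
    shift
    if [ "$interval" -lt 0 ]; then
        # -1: check once and stop if the database is not running.
        "$@" && return 0
        return 1
    fi
    # Positive value: re-check every $interval seconds until it is running.
    until "$@"; do
        sleep "$interval"
    done
    return 0
}
```

In this model, a return code of 1 stands for the S-TAP stopping, which is the point at which it would be restarted and the respawn message could appear. With a positive value the function does not return until the check succeeds, so there is no stop-and-restart cycle while the database is down.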

[{"Product":{"code":"SSMPHH","label":"IBM Security Guardium"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Component":"Guardium S-TAP","Platform":[{"code":"PF002","label":"AIX"}],"Version":"10.0;10.0.1;10.1;9.0;9.1;9.5","Edition":"All Editions","Line of Business":{"code":"LOB24","label":"Security Software"}}]

Document Information

Modified date:
16 June 2018

UID

swg21993412