A fix is available
APAR status
Closed as program error.
Error description
The Tivoli Monitoring agent for Linux OS (LZ) creates false alerts for MISSING processes that are shown as running in "ps -ef" output. The problem appears after applying a level of code that includes the changes for APAR IV47790 / IV51033, which were introduced in 6.3.0-TIV-ITM-FP0002.

Affected Levels:
This issue affects the Linux OS agent (LZ) only. It does NOT impact the UNIX OS agent (UX).

L2 Diagnostics:
Enable the following tracing in lz.ini:

  KBB_RAS1=ERROR (UNIT:klz ALL) (UNIT:proc ALL) (UNIT:kra ALL)

Review the RAS1 trace and look for a klz07 TakeSample that exits without numerous lockTable / unlockTable trace entries for the individual process numbers. Example:

  klz07agt.cpp,462,"TakeSample") Entry
  klz07agt.cpp,481,"TakeSample") TakeSample coming from situation missing_proc
  klz07agt.cpp,484,"TakeSample") missing_proc is a MISSING situation
  proccmdinfo.cpp,71,"Instance") Entry
  proccmdinfo.cpp,77,"Instance") Exit: 0x973E168
  proccmdinfo.cpp,291,"getAllProcCmdInfo") Entry
  proccmdinfo.cpp,293,"getAllProcCmdInfo") Take lock psCmdLock
  proccmdinfo.cpp,298,"getAllProcCmdInfo") Release lock psCmdLock
  proccmdinfo.cpp,301,"getAllProcCmdInfo") Exit: 0x0
  proccmdinfo.cpp,305,"lockTable") Entry
  proccmdinfo.cpp,307,"lockTable") Take lock psCmdLock
  proccmdinfo.cpp,310,"lockTable") Exit
  proccmdinfo.cpp,314,"unlockTable") Entry
  proccmdinfo.cpp,316,"unlockTable") Release lock psCmdLock
  proccmdinfo.cpp,319,"unlockTable") Exit
  klz07agt.cpp,662,"TakeSample") Exit

In a correctly functioning example, there would be numerous lockTable and unlockTable trace entries for each process before the TakeSample exits:

  klz07agt.cpp,505,"TakeSample") Processing process 150 of 172
  klz07agt.cpp,507,"TakeSample") 150: Reading info about process 15

Initial impact: High. Situations that were working before the upgrade result in false alerts.

Additional Keywords: IV48002 KUX_PROCESS_CMD_SAMPLE_SECS 06.30.02.00 KLZPROC KLZ_Process
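The trace review described in the diagnostics can be partly automated. The following Python sketch is not part of the APAR. It assumes the trace lines contain the file,line,"function") form shown in the example (anything before the file name is ignored) and that the entries for one TakeSample call are not interleaved with another TakeSample call. The function name find_suspect_samples is hypothetical. It reports each MISSING situation whose TakeSample exited without a single "Processing process" entry.

```python
import re

# Matches the trace form shown in the example: file,line,"function") message
TRACE_RE = re.compile(r'([\w.]+),(\d+),"(\w+)"\)\s*(.*)')
SITUATION_PREFIX = "TakeSample coming from situation "


def find_suspect_samples(lines):
    """Return situation names whose MISSING TakeSample logged no per-process entries."""
    suspects = []
    in_sample = False
    situation = None
    is_missing = False
    processed = 0
    for raw in lines:
        match = TRACE_RE.search(raw)
        if not match:
            continue
        source_file, _line_no, func, msg = match.groups()
        # Only the TakeSample entries from klz07agt.cpp drive the state machine.
        if source_file != "klz07agt.cpp" or func != "TakeSample":
            continue
        msg = msg.strip()
        if msg == "Entry":
            in_sample, situation, is_missing, processed = True, None, False, 0
        elif not in_sample:
            continue
        elif msg.startswith(SITUATION_PREFIX):
            situation = msg[len(SITUATION_PREFIX):].strip()
        elif msg.endswith("is a MISSING situation"):
            is_missing = True
        elif msg.startswith("Processing process"):
            processed += 1
        elif msg.startswith("Exit"):
            if is_missing and processed == 0:
                suspects.append(situation)
            in_sample = False
    return suspects
```

To use it, pass the lines of an LZ agent RAS1 log, for example find_suspect_samples(open(path)), where path is the location of the log file on your system. An empty result means no TakeSample matching the failing pattern was found in that log. It does not prove the agent is unaffected.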
Local fix
Disable the caching mechanism for Process attribute data when evaluating MISSING situations. In lz.ini, set the following and recycle the LZ agent:

  KLZ_PROCESS_CMD_SAMPLE_SECS=0
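Where the setting must be applied to many agents, the edit can be scripted. The Python sketch below is an illustration only and is not part of the APAR. It assumes lz.ini is a plain KEY=VALUE text file, and the helper name set_ini_key is hypothetical. It works on the file contents as a string, so reading and writing the file, and its location on your system, are left to the caller.

```python
def set_ini_key(text, key, value):
    """Return ini text with key=value set, replacing existing assignments or appending one."""
    out = []
    replaced = False
    for line in text.splitlines():
        name = line.split("=", 1)[0].strip()
        if "=" in line and name == key:
            # Overwrite an existing assignment of this key.
            out.append(f"{key}={value}")
            replaced = True
        else:
            # Keep every other line, including comments, unchanged.
            out.append(line)
    if not replaced:
        # The key was not present, so append it at the end.
        out.append(f"{key}={value}")
    return "\n".join(out) + "\n"
```

For example, set_ini_key(contents, "KLZ_PROCESS_CMD_SAMPLE_SECS", "0") returns the updated text. The LZ agent still has to be recycled before the change takes effect.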
Problem summary
False alerts sent while monitoring missing Linux processes. When the "MISSING" clause is used in the definition of a situation on the Linux Process group or the Linux Process (Superseded) group, an event may be fired even if the process is actually running, as also reported by queries in the portal client workspaces.
Problem conclusion
False alerts no longer occur under these conditions.

The fix for this APAR will be contained in the following maintenance package:

  | FixPack    | 6.3.0-TIV-ITM-FP0003         |
  | InterimFix | 6.3.0.2-TIV-ITM_LINUX-IF0001 |
Temporary fix
Set KLZ_PROCESS_CMD_SAMPLE_SECS=0 in the lz.ini file.
Comments
APAR Information
APAR number
IV51064
Reported component name
ITM AGENT LINUX
Reported component ID
5724C04LN
Reported release
630
Status
CLOSED PER
PE
NoPE
HIPER
YesHIPER
Special Attention
NoSpecatt
Submitted date
2013-10-18
Closed date
2013-11-12
Last modified date
2014-08-08
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
ITM AGENT LINUX
Fixed component ID
5724C04LN
Applicable component levels
R630 PSY
UP
R610 PSN
UP
R620 PSN
UP
R621 PSN
UP
R622 PSN
UP
R623 PSN
UP
Document Information
Modified date:
08 August 2014