IBM Support

Structure of DSNX881I Messages in Db2 Analytics Accelerator Version 7


The following text describes the structure of messages that are returned by an accelerator if the state of the hardware changes or if problems are encountered. These messages start with the prefix or qualifier DSNX881I.
List of MESSAGE-IDs, SEVERITY and MESSAGE-TEXT
General information

When an accelerator has been connected successfully to a Db2 subsystem, and the accelerator has been started by the -START ACCEL command or the corresponding function in IBM Db2 Analytics Accelerator Studio, a heartbeat connection is established between the accelerator and that particular Db2 subsystem. Status information about the accelerator is sent to the Db2 subsystem every 30 seconds.

You can view most of this information by using the -DIS ACCEL Db2 command. Other information cannot be viewed in this way, but is written to the z/OS system log (SYSLOG).

Accelerator support model

IBM Db2 Analytics Accelerator is a solution that consists of various hardware and software components. Each of these components might issue a DSNX881I message.
If the message indicates a hardware or software problem, open a support case for Db2 Analytics Accelerator for z/OS V7.5.

Make sure to provide an IBM Db2 Analytics Accelerator trace file (including the appliance trace) that was obtained by using the Save Trace function in IBM Db2 Analytics Accelerator Studio.

Such a trace file contains not only software trace messages, but also a complete set of diagnostic hardware information.

DSNX881I message structure

Each DSNX881I message is made up of the following parts, which appear in the order shown in the following line:

DSNX881I  -<SSID> <MESSAGE-ID> <SEVERITY> <ACCELERATOR_MESSAGE_COUNTER> (<ACCELERATOR-TIMESTAMP>) ACCELERATOR-NAME(ACCELERATOR-IP) <MESSAGE-TEXT>

The placeholders have the following meaning:

SSID

  • Is the Db2 subsystem ID (SSID)
MESSAGE-ID
  • A numeric ID for the specific error message. This ID can be used for system monitoring.
SEVERITY
  • I
    • Information message
    W
    • Warning message
    E
    • Error message

ACCELERATOR_MESSAGE_COUNTER

  • An internal counter that increases with every additional error on the accelerator.
    If the text after the DSNX881I qualifier is longer than 255 characters, another DSNX881I message is issued.
    All messages that belong together have the same <ACCELERATOR_MESSAGE_COUNTER> value.
    The <MESSAGE-TEXT> block of each subsequent message continues the information from the previous message.
ACCELERATOR-TIMESTAMP
  • The time when the error occurred on the accelerator. The internal clock of the accelerator is synchronized with the first Db2 subsystem that was connected to the accelerator.
ACCELERATOR-NAME
  • The name of the accelerator where the error occurred.
ACCELERATOR-IP
  • The IP address of the accelerator where the error occurred.
    The field can be empty if no IP address can be determined. However, the parentheses still appear.
MESSAGE-TEXT
  • A textual description of the error.
The length of a DSNX881I message does not exceed 255 characters. If more characters are needed, additional DSNX881I messages are written to the SYSLOG.
If an LPAR contains multiple Db2 subsystems that are connected to the same physical accelerator, error messages are issued for every subsystem. That is, you see the same messages multiple times in the log, each time with a different subsystem ID (SSID).
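
The message structure and the shared-counter rule for long messages can be sketched as a small SYSLOG parser. This is a minimal sketch, not an IBM-provided tool: the regular expression is derived only from the structure line above, the names `parse_dsnx881i` and `join_continuations` are hypothetical, and joining continuation texts with a single space is an assumption about how the text is split.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Derived from the documented layout:
# DSNX881I -<SSID> <MESSAGE-ID> <SEVERITY> <COUNTER> (<TIMESTAMP>) NAME(IP) <MESSAGE-TEXT>
# The token before MESSAGE-ID is kept verbatim, including its prefix character.
DSNX881I_RE = re.compile(
    r"DSNX881I\s+(?P<ssid>\S+)\s+(?P<message_id>\d+)\s+(?P<severity>[IWE])\s+"
    r"(?P<counter>\d+)\s+\((?P<timestamp>[^)]*)\)\s+"
    r"(?P<accelerator>[^\s(]+)\((?P<ip>[^)]*)\)\s*(?P<text>.*)$"
)


class Dsnx881i(NamedTuple):
    ssid: str
    message_id: int
    severity: str
    counter: int
    timestamp: str
    accelerator: str
    ip: str          # may be empty; the parentheses still appear
    text: str


def parse_dsnx881i(line: str) -> Optional[Dsnx881i]:
    """Parse one DSNX881I message; return None if the line does not match."""
    match = DSNX881I_RE.search(" ".join(line.split()))  # collapse wrapped lines
    if match is None:
        return None
    f = match.groupdict()
    return Dsnx881i(f["ssid"], int(f["message_id"]), f["severity"],
                    int(f["counter"]), f["timestamp"], f["accelerator"],
                    f["ip"], f["text"])


def join_continuations(lines: Iterable[str]) -> List[Dsnx881i]:
    """Merge messages that share SSID, accelerator name, and counter value."""
    merged = {}
    for line in lines:
        msg = parse_dsnx881i(line)
        if msg is None:
            continue
        key = (msg.ssid, msg.accelerator, msg.counter)
        if key in merged:
            first = merged[key]
            # Assumption: continuation text is appended with a single space.
            merged[key] = first._replace(text=f"{first.text} {msg.text}")
        else:
            merged[key] = msg
    return list(merged.values())
```

Because the SSID is part of the grouping key, the same accelerator message reported through several subsystems stays separate, which matches the repeated entries described above.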

If an accelerator is paired with a data sharing group (DSG), all members of the group can write messages to the group's system logs (SYSLOGs), provided that the -START ACCEL command has been issued for all members.
In this case, make sure that applications are in place to monitor the SYSLOGs. If all members of the DSG are located in the same logical partition (LPAR), there is only one SYSLOG to monitor.
However, if the members are located in different LPARs, you need to monitor the SYSLOGs of all LPARs involved.

Note: It might look as if only one member writes messages to the SYSLOG, but this is actually a synchronization issue.
If one member is always the first to issue a heartbeat request, this member receives all the messages and writes them to the SYSLOG. After that, the messages are deleted from the accelerator queue.
The other members, which send their heartbeat requests later, do not receive these messages because the queue is empty.


You might also see that only a few members write messages to the SYSLOG. This just means that the first member to send a heartbeat request is (always) found among this subset of members. The underlying mechanism is the same.
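
The queue behavior described in the note can be illustrated with a toy model. This is only a sketch of the described effect, not the accelerator's implementation; the function name `simulate_heartbeats` and the member names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Sequence


def simulate_heartbeats(pending: Sequence[str],
                        heartbeat_order: Sequence[str]) -> Dict[str, List[str]]:
    """Model one heartbeat cycle: each member drains whatever is still queued."""
    queue = deque(pending)
    syslog: Dict[str, List[str]] = {member: [] for member in heartbeat_order}
    for member in heartbeat_order:
        # The member that asks first receives every queued message;
        # delivered messages are removed from the accelerator queue.
        while queue:
            syslog[member].append(queue.popleft())
    return syslog


# Example: DB2A always sends its heartbeat first, so DB2B never sees a message.
result = simulate_heartbeats(["msg 1", "msg 2"], ["DB2A", "DB2B"])
```

In this model, `result["DB2B"]` stays empty as long as DB2A keeps winning the race, which is why every SYSLOG that a group member can write to needs to be monitored.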
An error can occur although the accelerator is in the Stopped state. In this case, the -STOP ACCEL command was issued before an error message could be stored on the accelerator.
As soon as the accelerator becomes available again in Db2, the stored error messages are sent to the Db2 subsystem, provided that -START ACCEL has been issued for the subsystem, or, in case of a data sharing group, for at least one member of the group.
It might happen that a DSNX881I message reports a past problem that has already been fixed.


The following numbers might be displayed in a DSNX881I message as values of the MESSAGE-ID, SEVERITY, and MESSAGE-TEXT parts:

Appliance messages:

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
1 | I | HostStateChange | sysStateChanged | 2


Expected MESSAGE-TEXT

System <HOST> went from <previousState> to <currentState> at <eventTimestamp> <eventSource>. <notifyMsg> Event: <eventDetail>

Impact

The target database changed its state (detected by MonitoringDaemon).

This affects the availability of the accelerator for query processing. Any state other than Online prevents the accelerator from answering queries.
Note: In contrast to a restart of the database engine on the accelerator, a restart of IBM Db2 Analytics Accelerator itself does not produce a DSNX881I message. However, to find indicators for accelerator restarts in the SYSLOG, look for "TCP/IP Connection loss" messages.

Action

If <currentState> shows a value other than Online, run the following functions on the IBM Db2 Analytics Accelerator Console:

  1. Function 1: Run Accelerator Functions, followed by
  2. Function 4: Restart accelerator process for Db2 Analytics Accelerator 7.1.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
4 | I | Disk8090PercentFull | N/A | 2


Expected MESSAGE-TEXT

URGENT: System <HOST> - <hwType> <hwId> <partition> partition is <value> % full at <eventTimestamp>. <notifyMsg> SPA ID: <spaId> SPA Slot: <spaSlot> Threshold: <threshold> Value: <value>

Impact

This warning occurs if a hard disk is at least 90 percent, but no more than 95 percent full. If the disk space usage remains within this range, the message is not sent again. If you receive this message from one or two disks, your data might be unevenly distributed across the processing nodes (data skew). A full disk might prevent operations (detected by SystemMaintenanceDaemon).

Action

Reclaim space or remove redundant tables from the accelerator. To be notified again, the disk space usage needs to drop below 85 percent. Consider changing the distribution of data by defining distribution keys in IBM Db2 Analytics Accelerator Studio.

Contact IBM support if you cannot reduce disk space by removing tables.
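
The notification rule for this message (warn at 90 percent, notify again only after usage has dropped below 85 percent) is a hysteresis. The following sketch models that rule for a single disk. The class name `DiskFullNotifier` is hypothetical, and the upper bound of 95 percent is deliberately left out because the text does not say what happens above it.

```python
class DiskFullNotifier:
    """Toy model of the 90 % trigger / 85 % re-arm behavior described above."""

    def __init__(self, trigger: float = 90.0, rearm: float = 85.0) -> None:
        self.trigger = trigger
        self.rearm = rearm
        self.armed = True  # a notification can be sent

    def observe(self, percent_full: float) -> bool:
        """Return True if this reading would produce a new notification."""
        if self.armed and percent_full >= self.trigger:
            self.armed = False        # stay silent while usage remains high
            return True
        if not self.armed and percent_full < self.rearm:
            self.armed = True         # usage dropped far enough to re-arm
        return False


notifier = DiskFullNotifier()
readings = [88, 91, 93, 87, 92, 84, 90]
notifications = [notifier.observe(r) for r in readings]
```

In this run, the drop to 87 percent does not re-arm the notifier, so the later reading of 92 percent stays silent. Only after the reading of 84 percent does the next reading of 90 percent notify again.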

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
20 | I | ReplicationEvent | N/A | N/A


Expected MESSAGE-TEXT

Various replication-related messages are reported with an ID of 20. The structure of these different messages depends on the replication technology that is used. 
 

The message structure is:

Id: >>eventID<< Subscription: >>status<<
Message: >>Message<< Originator: >>Originator<<

The eventID returned by IBM InfoSphere Change Data Capture (CDC) is determined by the CDC product itself.

For IBM Integrated Synchronization, the eventID is either 1 for warning messages, or 2 for error messages.

In the following example E indicates an error and 1001 is the error ID:

DSNX881I #DBxx 20 E 2615 (yyyy-mm-dd hh:mm:ss UTC)
IDAAP(xxIPxx) Id: 2 Subscription:
ACCEL_DWA_LOCDBxx_yyyy-mm-ddThh:mm Message: /E1001/ Row is not in BRF format.Originator: TerminatingUncaughtExceptionHandler

See DSNX881I messages (ID 20) returned by IBM Integrated Synchronization
for a complete list of all messages that might be issued by this component.
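
The ID 20 message text can be split into its labeled parts with a regular expression. This is a sketch under assumptions: it relies only on the labels `Id:`, `Subscription:`, `Message:`, and `Originator:` shown above, treats a leading `/E1001/`-style token as an optional code, and uses the hypothetical function name `parse_replication_text`.

```python
import re
from typing import Dict, Optional

ID20_RE = re.compile(
    r"Id:\s*(?P<event_id>\S+)\s+Subscription:\s*(?P<subscription>.*?)\s*"
    r"Message:\s*(?P<message>.*?)\s*Originator:\s*(?P<originator>.*)$"
)
# A leading token such as /E1001/: one letter for the kind, digits for the ID.
CODE_RE = re.compile(r"^/(?P<kind>[A-Z])(?P<code>\d+)/")


def parse_replication_text(text: str) -> Optional[Dict[str, object]]:
    """Split the MESSAGE-TEXT of an ID 20 message into its labeled fields."""
    match = ID20_RE.search(" ".join(text.split()))  # undo SYSLOG line wrapping
    if match is None:
        return None
    fields: Dict[str, object] = dict(match.groupdict())
    code = CODE_RE.match(match.group("message"))
    fields["error_kind"] = code.group("kind") if code else None
    fields["error_code"] = int(code.group("code")) if code else None
    return fields


sample = (
    "Id: 2 Subscription:\n"
    "ACCEL_DWA_LOCDBxx_yyyy-mm-ddThh:mm Message: /E1001/ Row is not in BRF "
    "format.Originator: TerminatingUncaughtExceptionHandler"
)
parsed = parse_replication_text(sample)
```

The event ID is kept as a string because the values returned by CDC are defined by that product and are not necessarily numeric.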

Impact

Checks the status of the replication infrastructure.

Action

Solve the problem by following the guidance in the message. Otherwise, contact IBM support.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
24 | E | FileSystemTooFullEvent | N/A | N/A


Expected MESSAGE-TEXT

File system mounted at >>mountPoint<< has only >>freeSpacePercentage<< % free space.

Impact

The capacity of the disk storage has been exceeded. The system monitors the storage resources by scanning all mounted file systems and by checking the amount of free space. If disk space becomes scarce in one of these systems, an event is generated and propagated to all client database management systems.

Action

Contact IBM support.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
2000 | I, E | MissingReferenceTimes | N/A | N/A


Expected MESSAGE-TEXT

Current reference times are not available and system time cannot by synchronized.

Impact

Reference times are missing, so the TimeSyncDaemon cannot synchronize the system clock.

Action

Start the accelerator associated with the Db2 subsystem that you use for time synchronization. Alternatively, use a different time reference system. If the warnings continue, contact IBM support.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
2001 | I, E, W | LongRunningSQLStatement | N/A | N/A


Expected MESSAGE-TEXT

SQL statement with task ID >>TaskID<< is running for more than >>Seconds<< seconds.

Impact

The execution of a single SQL statement takes a very long time. The SQL statement might hang, or the result set cannot be received by the Db2 client application.

Action

Identify the running Db2 applications and cancel these together with the SQL statement. Submit the statement once more. If it hangs again, try to simplify the statement and isolate the section that causes the issue. Contact IBM support with the collected information.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case opened
2002 | W | LongRunningTransaction | N/A | N/A


Expected MESSAGE-TEXT

SQL transaction with task ID >>TaskID<< is running for more than >>Seconds<< seconds.

Impact

This message is issued if you started a transaction and that transaction has been running without completion for more than 8 hours.

Action
  1. Identify the long-running Db2 application that submitted the SQL statement. Stop this application. This will also cancel the SQL statement.
  2. Resubmit the SQL statement. If the statement hangs again, try to simplify it and isolate the section that causes the issue. Contact IBM support with this information.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
2005 | I, E, W | CertificateExpiration | N/A | N/A


Expected MESSAGE-TEXT

INFORMATION: Certificate  >>certName<< will expire in >>Days<< days.

WARNING: Certificate >>certName<< will expire in >>Days<< days.

ERROR: Certificate >>certName<< is expired.

Impact

A certificate will expire soon or has already expired.

Action

INFORMATION: Replace the certificate before it expires.

WARNING: Replace the certificate before it expires.

ERROR: Replace the certificate now.
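
For monitoring, the three variants of the certificate message can be told apart by their leading keyword. The sketch below assumes the message text arrives exactly in the form shown above. The function name `parse_certificate_text` is hypothetical, and the action strings are copied from the Action list.

```python
import re
from typing import Dict, Optional

CERT_RE = re.compile(
    r"^(?P<level>INFORMATION|WARNING|ERROR):\s+Certificate\s+(?P<name>.+?)\s+"
    r"(?:will expire in (?P<days>\d+) days|is expired)\.$"
)

# Actions as listed in this document for each variant.
ACTIONS = {
    "INFORMATION": "Replace the certificate before it expires.",
    "WARNING": "Replace the certificate before it expires.",
    "ERROR": "Replace the certificate now.",
}


def parse_certificate_text(text: str) -> Optional[Dict[str, object]]:
    """Extract level, certificate name, and remaining days (None if expired)."""
    match = CERT_RE.match(" ".join(text.split()))  # tolerate doubled spaces
    if match is None:
        return None
    days = match.group("days")
    return {
        "level": match.group("level"),
        "certificate": match.group("name"),
        "days_left": int(days) if days is not None else None,
        "action": ACTIONS[match.group("level")],
    }


warning = parse_certificate_text("WARNING: Certificate  accel-cert will expire in 14 days.")
expired = parse_certificate_text("ERROR: Certificate accel-cert is expired.")
```

The certificate name `accel-cert` is made up for the example.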

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
2006 | W | Long Running SQL No Rows Fetched Statement | N/A | N/A


Expected MESSAGE-TEXT

SQL statement with task ID >>id<<, client application >>client application name<<, and client user ID >>uid<< is running, but has been fetching no rows for more than >>Seconds<< seconds.

Impact

The execution of a single SQL statement takes a very long time. The SQL statement might hang, or the result set cannot be received by the Db2 client application.

Action
  1. Identify the running Db2 applications and cancel these together with the SQL statement.
  2. Submit the statement once more.
  3. If the statement hangs again, try to simplify the statement and isolate the section that causes the issue. Contact IBM support with the collected information.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
2007 | W | Long Running SQL Prepare Time Statement | N/A | N/A


Expected MESSAGE-TEXT

The SQL statement with task ID >>id<< is running, but will be cancelled because the preparation phase could not be completed within 900 seconds.

Impact

The preparation of the SQL statement takes a very long time. Rows have not been fetched up to this point.

Action
  1. Identify the running Db2 applications and cancel these together with the SQL statement.
  2. Contact IBM support to investigate the problem.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
2008 | W | N/A | N/A | N/A


Expected MESSAGE-TEXT

SYSCATSPACE used pages has reached percentage % of the maximum 512 GB of catalog space with number of pages used.

Impact

The number of used pages in the system catalog table space (SYSCATSPACE) has reached a defined threshold percentage. If used pages take up the entire SYSCATSPACE, nearly all accelerator operations will slow down or fail.

By default, this message is issued for the first time when 75% of the pages in the SYSCATSPACE are in use. After that, the message is re-issued every 30 minutes until the percentage drops below 75%. The page consumption depends on the workload. It might grow considerably, especially when you load many tables, but it can also shrink. A consumption of 75% is not critical. However, it is advisable to take action if you notice message re-issues every 30 minutes.

Action

Contact IBM support and ask for a manual REORG job to reclaim parts of the SYSCATSPACE for used pages. Note also that the 30-minute interval and the threshold percentage are configurable. You might want to ask IBM support to change the values for you.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
2009 | W | Long Running SQL Stalled Statement | N/A | N/A


Expected MESSAGE-TEXT

SQL statement with task ID >>id<<, client application >>client application name<<, and client user ID >>uid<< is running, total application fetch stall time of >>Seconds<< seconds, has fetched >>number of rows<< rows.

Impact

Result fetching stalls after a certain time. The number of fetched rows up to the stall point is shown in the message.

Action
  1. Identify the running Db2 applications and cancel these together with the SQL statement.
  2. Submit the statement once more.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case opened
2010 | W | LongRunningTransaction | N/A | N/A


Expected MESSAGE-TEXT

A transaction has been running longer than expected. This prevented the automatic termination of the transaction and the client process. Contact IBM support.

Impact

This message is issued if a background transaction started by Db2 Analytics Accelerator or your Db2 target database has been running without completion for more than 24 hours, and if attempts to end the transaction automatically have failed. The message indicates a potentially severe situation because the Db2 transaction log might be filled to capacity, in which case major accelerator functions cease to function.

Action

Contact IBM support immediately.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
3000 | W | ReplicationLatency | N/A | N/A


Expected MESSAGE-TEXT

WARNING: The current replication latency of >>LatencyInSeconds<< s  on DB2 location >>LocationName<< has exceeded the threshold of >>Seconds<< s.

Impact

The replication latency threshold has been reached.

Action

Check the replication latency. If the latency value remains high for a longer time, check for factors that might contribute to the increased latency. Such factors are the size and the number of committed and uncommitted database transactions, delays when writing changes to the log, and the utilization of the accelerator.
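
To track whether the latency stays high, the numbers can be pulled out of the warning text. This is a minimal sketch that assumes the wording shown under Expected MESSAGE-TEXT. The function name `parse_latency_warning` and the location name in the example are hypothetical.

```python
import re
from typing import Dict, Optional

LATENCY_RE = re.compile(
    r"replication latency of\s+(?P<latency>\d+)\s*s\s+on DB2 location\s+"
    r"(?P<location>\S+)\s+has exceeded the threshold of\s+(?P<threshold>\d+)\s*s"
)


def parse_latency_warning(text: str) -> Optional[Dict[str, object]]:
    """Return location, latency, threshold, and the excess in seconds."""
    match = LATENCY_RE.search(text)
    if match is None:
        return None
    latency = int(match.group("latency"))
    threshold = int(match.group("threshold"))
    return {
        "location": match.group("location"),
        "latency_s": latency,
        "threshold_s": threshold,
        "excess_s": latency - threshold,
    }


info = parse_latency_warning(
    "WARNING: The current replication latency of 120 s  on DB2 location LOCDB1 "
    "has exceeded the threshold of 60 s."
)
```

Collecting `excess_s` over successive messages gives a simple way to see whether the latency is recovering or growing.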

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
3003 | W, E | ReplicationStatusMissing | N/A | N/A


Expected MESSAGE-TEXT

WARNING: The target database is offline. Replication is stopped.

ERROR: The replication status for DB2 location >>LocationName<< >>SubscriptionName<< is missing.

Impact

The replication status of the subscription is missing (for example, because components are unavailable).

Action

WARNING: Check the target system.

ERROR: Check that the replication capture agent is running, has valid credentials, is attached to Db2 and is reachable under >>ReplicationSubscription<< from the accelerator network.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
3004 | W, E | ReplicationTargetDown | N/A | N/A


Expected MESSAGE-TEXT

1) WARNING: The replication status for DB2 location >>LocationName<< is STARTED again. Replication was restarted successfully.

2) ERROR: The target database is offline. Replication is stopped.

Impact

The replication target database is offline.

Action

1) Nothing to do.

2) Check that the accelerator and the underlying database system are up and running. If not, contact IBM support.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
3005 | I, E | ReplicationTargetUp | N/A | N/A


Expected MESSAGE-TEXT

The target database is offline. Replication is stopped. Check the target system.

Impact

The target database is online again after an outage.

Action

Nothing to do if the subscription reaches the state STARTED again. If one replication-enabled subsystem is missing, the source datastore is down or a network error occurred.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
3006 | I, E, W | ReplicationRestartFailed | N/A | N/A


Expected MESSAGE-TEXT

The replication status for DB2 location >>LocationName<< is >>state<< and replication could not be restarted. Unsuccessful restart attempts: >>subscriptionID<<. Check the incremental update components (Access Server, Replication Engine). Consider a restart from the IBM DB2 Analytics Accelerator Console.

Impact

Attempts at restarting a replication subscription fail.

Action

Contact IBM support if the problem persists.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
3007 | I, E | ReplicationRestartRecovered | N/A | N/A


Expected MESSAGE-TEXT

The subscription with ID '>>subscriptionID<<' recovered. Generating event now.

Impact

A replication subscription recovered after unsuccessful restart attempts.

Action

Nothing to do.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
3008 | E | ReplicationRestartSuspended | N/A | N/A


Expected MESSAGE-TEXT

The automatic restart of the replication component for the Db2 location <location-name> has been suspended. This indicates a serious problem that requires attention and investigation.

Impact

Automatic restarts have been temporarily disabled. The replication component for Db2 location <location-name> has been stopped and will not be restarted automatically.

Action
  1. To analyze the problem, start the event viewer for incremental updates from your administration client (IBM Db2 Analytics Accelerator Studio or IBM Data Server Manager).
  2. Try to restart the replication component from the IBM Db2 Analytics Accelerator Console. If the problem persists, contact IBM support.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
3009 | I | ReplicationInsyncMonitoringEvent | N/A | N/A


Expected MESSAGE-TEXT

Integrated Synchronization status:

  • Latency: x seconds.
  • Latest commit RBA/LRSN: hex value.
  • Number of open transactions: x.
  • Earliest open RBA/LRSN: hex value.
  • Parsed source operations: x insert, x update, x delete.
  • Applied target operations: x insert, x delete.
  • Tenured heap usage: x%.

The message varies according to the operation mode. In regular operation mode, the message looks as shown above. When issued after a restart, the message looks as follows:

Integrated Synchronization status after restart at timestamp:

  • Latency: x seconds.
  • Latest commit RBA/LRSN: hex value.
  • Number of open transactions: x.
  • Earliest open RBA/LRSN: hex value.
  • Tenured heap usage: x%.

For more information, see Status information for error analyses.
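
The status fields of message 3009 can be extracted by their labels. The sketch below assumes that the fields appear in the SYSLOG as `Label: value.` pairs in one text block, as listed above. The actual line layout may differ, and the name `parse_sync_status` is hypothetical.

```python
import re
from typing import Dict

# Label patterns taken from the field list above; value formats are assumptions.
FIELDS = {
    "latency_seconds": (r"Latency:\s*(\d+)\s*seconds", int),
    "latest_commit": (r"Latest commit RBA/LRSN:\s*([0-9A-Fa-fxX]+)", str),
    "open_transactions": (r"Number of open transactions:\s*(\d+)", int),
    "earliest_open": (r"Earliest open RBA/LRSN:\s*([0-9A-Fa-fxX]+)", str),
    "tenured_heap_percent": (r"Tenured heap usage:\s*(\d+(?:\.\d+)?)\s*%", float),
}


def parse_sync_status(text: str) -> Dict[str, object]:
    """Collect the status fields that are present; missing ones are None."""
    result: Dict[str, object] = {"after_restart": "status after restart" in text}
    for name, (pattern, convert) in FIELDS.items():
        match = re.search(pattern, text)
        result[name] = convert(match.group(1)) if match else None
    return result


status = parse_sync_status(
    "Integrated Synchronization status: Latency: 3 seconds. "
    "Latest commit RBA/LRSN: 00D1A2B3C4. Number of open transactions: 2. "
    "Earliest open RBA/LRSN: 00D1A2B3C0. Tenured heap usage: 41%."
)
```

The operation counters (parsed source and applied target operations) are omitted here because they are absent from the after-restart variant.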

Impact

None. This is an informational message.

DSNX881I-ID | Severity | Accelerator Event Category | Event Category | Call Home Case severity
4000 | I/W | Db2FODCDirectoryCreated | N/A | N/A


Expected MESSAGE-TEXT

Informational message: A new first-occurrence data capture (FODC) directory >>FODC directory path<< was created by the target database. You can ignore this unless you receive other messages that indicate a problem related to the target database.

Warning message: A new first-occurrence data capture (FODC) directory >>FODC directory path<< was created by the target database.

Impact

An error occurred in the target database. This error led to the creation of a first occurrence data capture (FODC) directory. This directory contains a set of diagnostic information.

Action

Contact IBM support if the FODC directory causes problems, or if you need an analysis of the issue.
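
Because the MESSAGE-ID is intended for system monitoring, the appliance message IDs listed above can be kept in a lookup table. The IDs and event categories in this sketch are copied from this document. The function name `describe` and the idea of flagging IDs whose only listed action is to contact IBM support are illustrative assumptions.

```python
from typing import Dict

# MESSAGE-ID -> Accelerator Event Category, as listed in the tables above.
APPLIANCE_MESSAGES: Dict[int, str] = {
    1: "HostStateChange",
    4: "Disk8090PercentFull",
    20: "ReplicationEvent",
    24: "FileSystemTooFullEvent",
    2000: "MissingReferenceTimes",
    2001: "LongRunningSQLStatement",
    2002: "LongRunningTransaction",
    2005: "CertificateExpiration",
    2006: "Long Running SQL No Rows Fetched Statement",
    2007: "Long Running SQL Prepare Time Statement",
    2008: "N/A",  # SYSCATSPACE usage warning; no category is listed
    2009: "Long Running SQL Stalled Statement",
    2010: "LongRunningTransaction",
    3000: "ReplicationLatency",
    3003: "ReplicationStatusMissing",
    3004: "ReplicationTargetDown",
    3005: "ReplicationTargetUp",
    3006: "ReplicationRestartFailed",
    3007: "ReplicationRestartRecovered",
    3008: "ReplicationRestartSuspended",
    3009: "ReplicationInsyncMonitoringEvent",
    4000: "Db2FODCDirectoryCreated",
}

# IDs whose Action section consists only of contacting IBM support.
CONTACT_SUPPORT_DIRECTLY = {24, 2010}


def describe(message_id: int) -> str:
    """Return a short label for a MESSAGE-ID, or mark it as unknown."""
    category = APPLIANCE_MESSAGES.get(message_id, "unknown MESSAGE-ID")
    suffix = " (contact IBM support)" if message_id in CONTACT_SUPPORT_DIRECTLY else ""
    return f"{message_id}: {category}{suffix}"
```

A monitoring rule could then call `describe(2010)` to label an incoming message before forwarding it to an operator.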

 

Hardware alerts:

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
101 | MAJOR | NodeRecovery | General, node | Yes | Yes


Expected MESSAGE-TEXT

Server is unreachable and cannot be recovered.

When sent

Sent when a server is unreachable and when it was impossible to recover it. Such servers are marked as 'disabled' and will not be used to run appliance applications.

When closed

Closed when the resource manager reports the server status 'OK'.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
102 | MAJOR | NodeFailedDisablePolicy | General, node | Yes | Yes


Expected MESSAGE-TEXT

Server failed and was disabled

When sent

Sent when the node resource manager reports the node status 'FAILED'. Such nodes are marked as 'disabled' and are not used to run appliance applications.

When closed

Closed when the resource manager reports the node status 'OK'.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
103 | MAJOR | HwStatusAlerter | General, resmgr for component | Yes | Yes


Expected MESSAGE-TEXT

Major component is unreachable

When sent

Sent when the status of a major component other than a server (node) was reported as 'UNREACHABLE' by the resource manager that is responsible for monitoring the server. Major components are all the components located directly in the rack (hw://rackX.typeY).

When closed

Closed when the resource manager reports the component status 'OK'.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
104 | MAJOR | HwStatusAlerter | General, resmgr for component | Yes | Yes


Expected MESSAGE-TEXT

Major component failed

When sent

Sent when the status of a major component other than a server (node) was reported as 'FAILED' by the resource manager that is responsible for monitoring it. Major components are all the components located directly in the rack (hw://rackX.typeY).

When closed

Closed when the resource manager reports the component status 'OK'.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
105 | MAJOR | HwStatusAlerter | General, resmgr that reported issue | Yes | Yes


Expected MESSAGE-TEXT

Subcomponent failed

When sent

Sent when the status of a component's subcomponent is 'FAILED', 'ERROR' or 'FAILING'.

When closed

Closed when the resource manager reports the subcomponent status 'OK'.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
106 | MAJOR | NodeEventAlerting | General, node | No | Yes


Expected MESSAGE-TEXT

FSP unrecoverable events detected

When sent

Sent when an FSP event is reported by dev_node.py.

When closed

N/A

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
108 | MAJOR | HwStatusAlerter | General, resmgr that reported issue | Yes | Yes


Expected MESSAGE-TEXT

Subcomponent is unreachable

When sent

Sent when the status of a component's subcomponent is 'UNREACHABLE'.

When closed

Closed when the resource manager reports the subcomponent status 'OK'.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
109 | MAJOR | FcPortRetrain | General, node | Yes | Yes


Expected MESSAGE-TEXT

Sub-optimal speed of FC port

When sent

Sent when the FC port speed is not optimal and cannot be improved.

When closed

Closed when the FC port speed is optimal.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
110 | MAJOR | NodeMgmtNet | General, other logs for all nodes | Yes | Yes


Expected MESSAGE-TEXT

Server is unreachable in management network

When sent

Sent when a node is not reachable in the management network.

When closed

Closed when the node is reachable in the management network.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
111 | MAJOR | CPUClockRateMonitoring | General, node | Yes | Yes


Expected MESSAGE-TEXT

System cannot be tuned, CPU clock not optimal

When sent

Sent when node monitoring reports a non-optimal CPU frequency and when this cannot be fixed automatically.

When closed

Closed when the CPU frequency is optimal.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
112 | MAJOR | NodeEventAlerting | General, node | Yes | No


Expected MESSAGE-TEXT

HW_SERVICE_REQUESTED | 112: FSP unrecoverable events detected please check ap_issues_c.out dataset and get in contact with customer

When sent

Sent when an FSP event is reported by dev_node.py.

When closed

Closed when the closure of the event in FSP has been reported by dev_node resmgr.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
113 | MAJOR | CPUClockRateMonitoring | General | Yes | Yes


Expected MESSAGE-TEXT

Unable to fix CPU configuration

When sent

Sent when the SMT configuration of the CPU cannot be set.

When closed

Closed when the SMT configuration has been set.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
114 | MAJOR | RPCMonitoring | General | Yes | Yes


Expected MESSAGE-TEXT

Communication with RPC management ports lost

When sent

Sent when RPC management ports cannot be connected to and when resetting these does not fix the problem.

When closed

Closed when the RPC management ports can be reached again.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
151 | MAJOR | CPUClockRateMonitoring | General | Yes | Yes


Expected MESSAGE-TEXT

Cannot activate tuned.service

When sent

Sent when the tuned service cannot be started on a node.

When closed

Closed when the tuned service has been started.

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
153 | WARNING | KernelPanicReporting | os (dmesg from crash) | Yes | Yes


Expected MESSAGE-TEXT

Kernel panic(s) occurred

When sent

Sent when one or more kernel panics were detected on a node since the last event (or since 06/01/2018 when there were no previous events of this type).

When closed

N/A

Reason code | Severity | Policy | Collected logs | Call Home Case opened | Open CASE
154 | MAJOR | NodeRecovery | General | Yes | Yes


Expected MESSAGE-TEXT

Soft power off action for node failed, could not recover node

When sent

Sent when a server is unreachable because the soft power-off action failed and the hard power-off action is disabled during a recovery (it is enabled by default). Such servers are marked as 'disabled' and will not be used to run appliance applications.

When closed

Closed when the resource manager reports the server status 'OK'.

Reason code | Severity | Policy | Collected logs | Open CASE
201 | WARNING, MINOR | HwStatusAlerter | General, resmgr for component | Yes


Expected MESSAGE-TEXT

For example: Issues 1:  sys_hw_config reporting "Flash Modules Incorrect number 3"

fsn4.interface_card1 | WARNING | 1234 | 2020-01-01 04:05:58 | HW_NEEDS_ATTENTION | 201: Unhealthy component detected | hw://hadomain3.

When sent

Sent for a hardware component when its reported status is none of OK, FAILED, UNREACHABLE, or NOT_PRESENT (other alerts cover these statuses). The severity depends on the status. If the status is 'WARNING', the severity is WARNING. In all other cases, the severity is MINOR. An alert is not sent if an alert has already been opened 10 times for the same component.
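
The sending rule for reason code 201 can be written as a small decision function. This is a sketch of the rule as described, not the alerting code of the appliance. The function name `alert_201_severity`, the parameter `times_opened`, and the status value DEGRADED in the example are hypothetical.

```python
from typing import Optional

# Statuses that are handled by other alerts (or need no alert at all).
NOT_COVERED_BY_201 = {"OK", "FAILED", "UNREACHABLE", "NOT_PRESENT"}
MAX_ALERTS_PER_COMPONENT = 10


def alert_201_severity(status: str, times_opened: int) -> Optional[str]:
    """Return the severity of a 201 alert, or None if no alert is sent."""
    if status in NOT_COVERED_BY_201:
        return None
    if times_opened >= MAX_ALERTS_PER_COMPONENT:
        return None  # already opened 10 times for this component
    return "WARNING" if status == "WARNING" else "MINOR"


examples = [
    alert_201_severity("WARNING", 0),
    alert_201_severity("DEGRADED", 3),
    alert_201_severity("FAILED", 0),
    alert_201_severity("WARNING", 10),
]
```

Here a status of WARNING yields a WARNING alert, any other unlisted status yields MINOR, and both a FAILED status and a component with 10 earlier alerts yield no 201 alert.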

When closed

Closed when the resource manager reports the component status 'OK'.

Reason code | Severity | Policy | Collected logs | Open CASE
202 | WARNING | BatteryReconditioning | General, fsn | No


Expected MESSAGE-TEXT

FSN battery needs reconditioning

Reconditioning of hw://hadomainX.fsnX.batteryX failed - FAILED

When sent

Sent when FSN battery reconditioning is needed.

When closed

Closed shortly after the start of the reconditioning process. You'll see message 203 at that time.

Reason code | Severity | Policy | Collected logs | Open CASE
203 | WARNING | BatteryReconditioning | General, fsn | No


Expected MESSAGE-TEXT

FSN battery reconditioning in-progress

When sent

Sent when an FSN battery reconditioning was requested and is in progress.

When closed

Closed when battery reconditioning is complete (as reported by the resource manager).

Reason code | Severity | Policy | Collected logs | Open CASE
204 | MINOR | HwStatusAlerter | General | No


Expected MESSAGE-TEXT

Component is missing

When sent

Sent when the resource manager reports the component status 'NOT_PRESENT'.

When closed

Closed when the resource manager reports the component status 'OK'.

Reason code | Severity | Policy | Collected logs | Open CASE
205 | MAJOR | Multipath | General, multipath | No


Expected MESSAGE-TEXT

Low fibre channel path count

When sent

Sent when fewer paths than required are reported as healthy in a multipath environment. Paths related to broken FC links are not counted (there is another alert for FC links).

When closed

Closed when the required number of healthy paths has been reached.

Software alerts:

Reason code | Severity | Policy | Collected logs | Open CASE
301 | WARNING | Gpfs | General, gpfs | No


Expected MESSAGE-TEXT

Action to restore a GPFS component failed.

When sent

Sent when a GPFS-related recovery action failed.

When closed

N/A

Reason code | Severity | Policy | Collected logs | Open CASE
302 | MAJOR | AppStartup | General | No


Expected MESSAGE-TEXT

Container start-up action failed

When sent

Sent when the start of a container failed.

When closed

N/A

Reason code | Severity | Policy | Collected logs | Open CASE
303 | MINOR | AppShutdown | General | No


Expected MESSAGE-TEXT

Container stop action failed

When sent

Sent when a container shutdown failed.

When closed

N/A

Reason code | Severity | Policy | Collected logs | Open CASE
304 | WARNING | NTPD | General | No


Expected MESSAGE-TEXT

Action to restore NTP synchronization failed

When sent

Sent when an NTPD recovery failed.

When closed

N/A

Reason code | Severity | Policy | Collected logs | Open CASE
305 | WARNING | NTPD | General | N/A


Expected MESSAGE-TEXT

Failed to enable a node

When sent

Sent when node enabling fails while it is in progress.

When closed

N/A

Reason code | Severity | Policy | Collected logs | Open CASE
307 | MAJOR | MonitoredAppDisable | General | No


Expected MESSAGE-TEXT

Application disabling failed.

When sent

Sent when it is not possible for the user to disable an application by calling the 'ap apps disable' command.

When closed

N/A

Reason code | Severity | Policy | Collected logs | Open CASE
308 | MAJOR | MonitoredAppEnable | General | No


Expected MESSAGE-TEXT

Application enabling failed

When sent

Sent when it is not possible for the user to enable an application by calling the 'ap apps enable' command.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
309 MAJOR ConsoleKeeper General No


Expected MESSAGE-TEXT

WebConsole container stop action failed

When sent

Sent when the web console container fails to stop.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
314 WARNING NodeRecovery General, node No


Expected MESSAGE-TEXT

Soft power off failed for node

When sent

Sent when a server could not be reached and the soft power-off action failed during the recovery phase, so that the hard power-off fallback takes effect (the fallback is enabled by default).

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
351 .... DsmAlertsRelay General, apidag for db2 No


Expected MESSAGE-TEXT

Database availability issue

When sent

Sent when IBM Data Server Manager reports the unavailability of a database.

When closed

Closed when IBM Data Server Manager has closed the issue.

Reason
code
Severity Policy Collected
logs
Open
CASE
352 .... DsmAlertsRelay General, apidag for db2 No


Expected MESSAGE-TEXT

Physical memory usage threshold exceeded

When sent

Sent when a limit on the use of physical memory has been exceeded. This is a message passed from IBM Data Server Manager.

When closed

Closed when IBM Data Server Manager has closed the issue.

Reason
code
Severity Policy Collected
logs
Open
CASE
353 .... DsmAlertsRelay General, apidag for db2 No


Expected MESSAGE-TEXT

Virtual memory usage threshold exceeded

When sent

Sent when a limit on the use of virtual memory has been exceeded. This is a message passed from IBM Data Server Manager.

When closed

Closed when IBM Data Server Manager has closed the issue.

Reason
code
Severity Policy Collected
logs
Open
CASE
354 .... DsmAlertsRelay General, apidag for db2 No


Expected MESSAGE-TEXT

File system utilization threshold exceeded

When sent

Sent when a limit on the use of the file system has been exceeded. This is a message passed from IBM Data Server Manager.

When closed

Closed when IBM Data Server Manager closes the issue.

Reason
code
Severity Policy Collected
logs
Open
CASE
355 .... DsmAlertsRelay General, apidag for db2 No


Expected MESSAGE-TEXT

Maximum log space exceeded

When sent

Sent when the maximum log space has been exceeded. This is a message passed from IBM Data Server Manager.

When closed

Closed when IBM Data Server Manager closes the issue.

Reason
code
Severity Policy Collected
logs
Open
CASE
356 .... DsmAlertsRelay General, apidag for db2 No


Expected MESSAGE-TEXT

Table space container utilization threshold exceeded

When sent

Sent when a limit on the use of a table space container has been exceeded. This is a message passed from IBM Data Server Manager.

When closed

Closed when IBM Data Server Manager closes the issue.

Reason
code
Severity Policy Collected
logs
Open
CASE
399 .... DsmAlertsRelay General, apidag for db2 No


Expected MESSAGE-TEXT

Other database issue

When sent

Sent when IBM Data Server Manager reports an unspecified database issue.

When closed

Closed when IBM Data Server Manager closes the issue.

Reason
code
Severity Policy Collected
logs
Open
CASE
401 MAJOR Gpfs General, gpfs No


Expected MESSAGE-TEXT

GPFS node failed to start.

When sent

Sent when it is impossible to start a GPFS node. The node will be disabled automatically.

When closed

Closed when the GPFS resource manager reports 'OK' as the state of the node.

Reason
code
Severity Policy Collected
logs
Open
CASE
402 MINOR GPFS General, gpfs No


Expected MESSAGE-TEXT

GPFS nsd failed to start

When sent

Sent when it is impossible to start the GPFS network-shared disk (NSD). The message is sent only for nodes that have been enabled.

When closed

Closed when the GPFS resource manager reports 'OK' as the state of the NSD.

Reason
code
Severity Policy Collected
logs
Open
CASE
403 MAJOR AppStartOnEnabledNode, NodeEnable None No


Expected MESSAGE-TEXT

Application container cannot be started on a node

When sent

Sent when the application container on an enabled node cannot be started.

When closed

Closed when the Docker resource manager (resmgr) reports that the container has been started.

Reason
code
Severity Policy Collected
logs
Open
CASE
404 MAJOR, CRITICAL Gpfs General, gpfs Yes


Expected MESSAGE-TEXT

GPFS local partition failed to be mounted

When sent

Sent when the GPFS file system cannot be mounted on the local GPFS partition of an enabled node. If the file system cannot be mounted on any of the partitions, the severity is set to CRITICAL and the system shuts down. Otherwise, the severity is set to MAJOR and the partition is marked as 'disabled'.

When closed

Closed when the GPFS resource manager reports that the file system has been mounted on all local partitions of an enabled node.

Reason
code
Severity Policy Collected
logs
Open
CASE
405 MAJOR, CRITICAL Gpfs General, gpfs Yes


Expected MESSAGE-TEXT

GPFS filesystem failed to be mounted

When sent

Sent when the GPFS file system cannot be mounted on an enabled node. If the file system cannot be mounted on any of the nodes, the severity is set to CRITICAL and the system shuts down. Otherwise, the severity is set to MAJOR and the node is marked as 'disabled'.

When closed

Closed when the GPFS resource manager reports that the file system has been mounted on all nodes.
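
The severity rule described for reason codes 404 and 405 can be sketched as follows. This is only an illustration for monitoring scripts, not the accelerator's own logic. It assumes that "cannot be mounted on any of the nodes" means that the mount failed on every enabled node. The function name and the node names are hypothetical.

```python
def mount_failure_severity(failed_nodes, enabled_nodes):
    """Classify a GPFS mount failure (assumed reading of codes 404/405).

    Returns 'CRITICAL' if the mount failed on every enabled node,
    'MAJOR' if it failed on only some of them, and None if it did
    not fail on any enabled node.
    """
    enabled = set(enabled_nodes)
    failed = set(failed_nodes) & enabled  # only enabled nodes are considered
    if not failed:
        return None
    if failed == enabled:
        return "CRITICAL"  # no node left with a mounted file system
    return "MAJOR"         # the affected node or partition is marked 'disabled'
```

For example, a failure on one of two enabled nodes is classified as MAJOR, whereas a failure on both is classified as CRITICAL.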

Reason
code
Severity Policy Collected
logs
Open
CASE
406 WARNING Ntpd General No


Expected MESSAGE-TEXT

Time on node is not synchronized

When sent

Sent when it is not possible to synchronize the system time of a node.

When closed

Closed when the NTPD resource manager reports that the system time of the node could be synchronized.


 

Reason
code
Severity Policy Collected
logs
Open
CASE
408 WARNING NTPD None No


Expected MESSAGE-TEXT

The NTP daemon is down

When sent

Sent when the NTP daemon cannot be started.

When closed

Closed when the NTP daemon can be started.

Reason
code
Severity Policy Collected
logs
Open
CASE
409 MAJOR CallHomeDaemonKeeper None No


Expected MESSAGE-TEXT

Unable to start Call Home Daemon

When sent

Sent when it is not possible to start the Call Home Daemon container on a node.

When closed

Closed when the Call Home Daemon container can be started (as reported by Docker).

Reason
code
Severity Policy Collected
logs
Open
CASE
410 MINOR CallHomeDaemonKeeper None No


Expected MESSAGE-TEXT

Unable to stop Call Home Daemon

When sent

Sent when it is not possible to stop the Call Home Daemon container on a node.

When closed

Closed when the Call Home Daemon container can be stopped (as reported by Docker).

Reason
code
Severity Policy Collected
logs
Open
CASE
411 CRITICAL SwapWatch General No


Expected MESSAGE-TEXT

Heavy swap usage

When sent

Sent when the swap utilization on a node is above 95 percent of the swap space.

When closed

This kind of issue can be closed automatically by the system. Closed when the swap utilization on the node drops below 90 percent.
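
The two thresholds for reason code 411 form a hysteresis: the alert opens above 95 percent and closes only when utilization drops below 90 percent, so values in between do not change its state. A minimal sketch of that behavior, with a hypothetical class name and made-up sample values:

```python
class SwapAlert:
    """Tracks the open/closed state of a swap-usage alert with hysteresis."""

    OPEN_ABOVE = 95.0   # percent; the alert opens above this value
    CLOSE_BELOW = 90.0  # percent; the alert closes below this value

    def __init__(self):
        self.is_open = False

    def update(self, swap_percent):
        """Feed one utilization sample and return the resulting alert state."""
        if not self.is_open and swap_percent > self.OPEN_ABOVE:
            self.is_open = True
        elif self.is_open and swap_percent < self.CLOSE_BELOW:
            self.is_open = False
        return self.is_open


alert = SwapAlert()
states = [alert.update(p) for p in (80, 96, 92, 89, 94)]
```

In this run, the sample of 92 percent leaves the alert open, and the sample of 94 percent does not reopen it after it has been closed.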

Reason
code
Severity Policy Collected
logs
Open
CASE
413 MAJOR LdapWatch General, other logs No


Expected MESSAGE-TEXT

Directory service cannot be started

When sent

Sent when the apslapd service cannot be started on a node where it is enabled.

When closed

Closed when apslapd can be started on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
414 MAJOR LdapWatch General, other logs No


Expected MESSAGE-TEXT

Security daemon cannot be started

When sent

Sent when the security daemon sssd cannot be started on a node where it is enabled.

When closed

Closed when sssd can be started on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
415 MAJOR DockerWatch General No


Expected MESSAGE-TEXT

Docker service failed

When sent

Sent when the Docker service cannot be started on a node.

When closed

Closed when the Docker service can be started on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
416 MAJOR ConsoleKeeper General No


Expected MESSAGE-TEXT

Unable to start console container

When sent

Sent when the console container cannot be started on a node.

When closed

Closed when the console container can be started on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
417 MAJOR ConsoleKeeper General No


Expected MESSAGE-TEXT

Unable to stop console container

When sent

Sent when the console container cannot be stopped on a node.

When closed

Closed when the console container can be stopped on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
418 MAJOR ConsoleKeeper General No


Expected MESSAGE-TEXT

Console is down

When sent

Sent when the console container works, but the console itself fails and cannot be recovered.

When closed

Closed when the console in the container works again.

Reason
code
Severity Policy Collected
logs
Open
CASE
419 CRITICAL GodMonitoring General Yes


Expected MESSAGE-TEXT

Grow on demand limit not satisfied

When sent

Sent when the growth-on-demand limit has been exceeded on a system (through illegal reconfiguration or a use of storage above the limit).

When closed

Closed when the use is again below the limit.

Reason
code
Severity Policy Collected
logs
Open
CASE
420 MINOR GodMonitoring General No


Expected MESSAGE-TEXT

Detection of change GoD

When sent

Sent when growth-on-demand limits have changed.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
421 MAJOR LiftKeeper General No


Expected MESSAGE-TEXT

Unable to start Lift container

When sent

Sent when the Lift container cannot be started on a node.

When closed

Closed when the Lift container can be started or stopped.

Reason
code
Severity Policy Collected
logs
Open
CASE
422 MAJOR LiftKeeper General No


Expected MESSAGE-TEXT

Unable to stop Lift container

When sent

Sent when the Lift container cannot be stopped on a node.

When closed

Closed when the Lift container can be started or stopped on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
423 MAJOR LiftKeeper General No


Expected MESSAGE-TEXT

Lift down

When sent

Sent when the Lift container can be started, but Lift itself is not working.

When closed

Closed when Lift reports a state of 'healthy'.

Reason
code
Severity Policy Collected
logs
Open
CASE
424 MAJOR SystemdServiceWatch General, other logs No


Expected MESSAGE-TEXT

Token and auth service cannot be started

When sent

Sent when the token service cannot be started on a node.

When closed

Closed when the token service can be started, or when the token service is no longer needed on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
425 MAJOR SystemdServiceWatch General, other logs No


Expected MESSAGE-TEXT

DR management service cannot be started

When sent

Sent when the DR management service cannot be started on a node.

When closed

Closed when the DR management service can be started, or when the service is no longer needed on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
426 MAJOR SystemdServiceWatch General, other logs No


Expected MESSAGE-TEXT

Firewall with iptables cannot be started

When sent

Sent when iptables cannot be started on a node.

When closed

Closed when iptables can be started on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
427 MAJOR GatewayKeeper General No


Expected MESSAGE-TEXT

Unable to start IDAA Gateway container

When sent

Sent when the DRDA Gateway container cannot be started on an accelerator node.

When closed

Closed when that DRDA Gateway container can be started on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
428 MAJOR GatewayKeeper General No


Expected MESSAGE-TEXT

Unable to stop IDAA Gateway container

When sent

Sent when the DRDA Gateway container cannot be stopped on an accelerator node.

When closed

Closed when the DRDA Gateway container can be stopped on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
429 MAJOR GatewayKeeper General No


Expected MESSAGE-TEXT

IDAA Gateway down

When sent

Sent when the DRDA Gateway container works, but the gateway itself is not running.

When closed

Closed when the DRDA Gateway container has been started and the gateway is running.

Reason
code
Severity Policy Collected
logs
Open
CASE
433 MAJOR SystemdServiceWatch General, other logs No


Expected MESSAGE-TEXT

Primary SKLM proxy service cannot be started

When sent

Sent when the primary Security Key Lifecycle Manager (SKLM) proxy cannot be started on a node.

When closed

Closed when the primary SKLM proxy can be started, or when it is no longer needed on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
434 MAJOR SystemdServiceWatch General, other logs No


Expected MESSAGE-TEXT

Secondary SKLM proxy service cannot be started

When sent

Sent when the secondary SKLM proxy cannot be started on a node.

When closed

Closed when the secondary SKLM proxy can be started, or when it is no longer needed on that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
435 WARNING Network General No


Expected MESSAGE-TEXT

Gateway not in routing table

When sent

Sent when interfaces of the default gateway are down, or when the default gateway is not in the routing table and the issue cannot be recovered.

When closed

Closed when all interfaces of the default gateway are up again, and the default gateway is in the routing table.

Reason
code
Severity Policy Collected
logs
Open
CASE
436 MAJOR ResMgrFail General, other logs No


Expected MESSAGE-TEXT

Failed to collect status from resource manager

When sent

Sent when the resource manager of a given component failed 10 times in a row at collecting status information.

When closed

Closed when the resource manager of a given component can retrieve status information successfully.
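
The "10 times in a row" condition for reason code 436 amounts to a counter of consecutive failures that is reset by any successful collection. The following sketch shows one way to model this in a monitoring script. The class name is hypothetical, and the code does not claim to reproduce the appliance's implementation.

```python
class ResMgrFailTracker:
    """Opens an alert after a fixed number of consecutive collection failures."""

    THRESHOLD = 10  # consecutive failures, as stated for reason code 436

    def __init__(self):
        self.consecutive_failures = 0
        self.alert_open = False

    def record(self, success):
        """Record one status-collection attempt and return the alert state."""
        if success:
            # A single successful collection resets the streak and closes the alert.
            self.consecutive_failures = 0
            self.alert_open = False
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.THRESHOLD:
                self.alert_open = True
        return self.alert_open
```

Nine failures followed by one success never open the alert, because the streak starts again from zero.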

Reason
code
Severity Policy Collected
logs
Open
CASE
437 MAJOR DockerWatch General No


Expected MESSAGE-TEXT

Duplicate containers running

When sent

Sent when more than one container was started from the same (monitored) image.

When closed

Closed when no more than one container is started from each (monitored) image.
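
The condition for reason code 437 can be checked by counting running containers per image. The sketch below works on plain (container name, image name) pairs and does not call Docker. The function name, the input format, and the names in the example are assumptions made for illustration.

```python
from collections import Counter


def duplicated_images(containers, monitored_images):
    """Return the monitored images from which more than one container runs.

    containers: iterable of (container_name, image_name) pairs.
    monitored_images: collection of image names that are being watched.
    """
    counts = Counter(
        image for _name, image in containers if image in monitored_images
    )
    return sorted(image for image, n in counts.items() if n > 1)


running = [("c1", "console"), ("c2", "console"), ("c3", "lift"), ("c4", "other")]
duplicates = duplicated_images(running, {"console", "lift"})
```

Here, only the image "console" is reported, because two containers were started from it.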

Reason
code
Severity Policy Collected
logs
Open
CASE
442 WARNING Ntpd General No


Expected MESSAGE-TEXT

Timezones mismatch between nodes

When sent

Sent when the nodes of the appliance are set to different timezones.

When closed

Closed when all nodes of the appliance are set to the same timezone.

Reason
code
Severity Policy Collected
logs
Open
CASE
501 CRITICAL AppStartup General, other logs No


Expected MESSAGE-TEXT

Start-up failed due to container start error

When sent

Sent when the appliance cannot start because too many containers failed.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
502 CRITICAL AppStartup General, other logs No


Expected MESSAGE-TEXT

Application start-up timeout

When sent

Sent when the appliance failed to start within the timeout period.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
503 CRITICAL AppStartup General, other logs No


Expected MESSAGE-TEXT

Start-up timeout on waiting for healthy nodes

When sent

Sent when the appliance failed to start because the number of healthy nodes was too low.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
504 CRITICAL AppStartup General, other logs No


Expected MESSAGE-TEXT

Start-up failed (dashDB HA failed)

When sent

Sent when the appliance failed because dashDB could not be started and a status of 'FAILED' was reported.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
601 MAJOR FloatingIpStarter General No


Expected MESSAGE-TEXT

Unable to bring up floating IP

When sent

Sent when the floating IP address (or virtual IP address) cannot be assigned to the head node.

When closed

Closed when the floating IP address (or virtual IP address) can be assigned to the head node.

Reason
code
Severity Policy Collected
logs
Open
CASE
602 MAJOR FloatingIpStarter General No


Expected MESSAGE-TEXT

Unable to bring-down floating IP

When sent

Sent when the floating IP address cannot be removed from a node that is no longer the head node.

When closed

Closed when the floating IP address can be removed from that node.

Reason
code
Severity Policy Collected
logs
Open
CASE
603 MAJOR FloatingIpStarter General No


Expected MESSAGE-TEXT

Unable to bring-up floating IP – cannot connect to server

When sent

Sent when the floating IP address cannot be assigned to the head node because that node cannot be contacted over the network.

When closed

Closed when the floating IP address is down on a worker node.


 

Reason
code
Severity Policy Collected
logs
Open
CASE
701 CRITICAL NodeRecovery General, other logs Yes


Expected MESSAGE-TEXT

Appliance application went down due to disabled node.

For example: issues 1: sys_hw_config reporting "Flash Modules Incorrect number 3"

When sent

Sent when a broken node cannot be disabled because that would not leave enough nodes to run the appliance.

When closed

Closed when the appliance reports a state of 'READY'.

Reason
code
Severity Policy Collected
logs
Open
CASE
703 CRITICAL AppStartup General Yes


Expected MESSAGE-TEXT

Appliance application can't start. nodeXYZ is unable to start docker and it is disabled.

When sent

Sent when the appliance startup fails (an event with a reason code in the range 501-505 is also sent).

When closed

Closed when the appliance reports a state of 'READY'.

Reason
code
Severity Policy Collected
logs
Open
CASE
704 CRITICAL SwStatusAlerting General, other logs from all nodes Yes


Expected MESSAGE-TEXT

Appliance application went down (db2 HA)

When sent

Sent when the appliance reports a state of 'FAILED' (after a grace period).

When closed

Closed when the appliance reports a state of 'READY'.

Reason
code
Severity Policy Collected
logs
Open
CASE
801 INFORMATION NodeDisable None No


Expected MESSAGE-TEXT

Node disabled by user

When sent

Sent when a node was disabled on request by a user.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
802 INFORMATION NodeDisable None No


Expected MESSAGE-TEXT

Node disabled by system

When sent

..

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
803 INFORMATION NodeEnable None No


Expected MESSAGE-TEXT

Node enabled by user

When sent

Sent when a node was enabled on request by a user.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
804 INFORMATION NodeEnable None No


Expected MESSAGE-TEXT

Node enabled by system

When sent

Sent when a node was enabled by the system.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
805 INFORMATION REST node_handler None No


Expected MESSAGE-TEXT

Node rebalance requested

When sent

Sent when a user requested a rebalancing of the data on the nodes. Data rebalancing includes decompressing older data, and moving data that was on the original storage device to evenly distribute it across all connected devices. For more information, see:

Data rebalancing after a data node is added

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
806 INFORMATION REST node_handler None No


Expected MESSAGE-TEXT

Node init requested

When sent

Sent when a user requested a node initialization.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
807 INFORMATION AppStartup None No


Expected MESSAGE-TEXT

Application start requested

When sent

Sent when a restart of the appliance was requested.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
808 INFORMATION AppShutdown None No


Expected MESSAGE-TEXT

Application stop requested

When sent

Sent when a stop of the appliance was requested.

When closed

..

Reason
code
Severity Policy Collected
logs
Open
CASE
809 INFORMATION NodeRecovery None No


Expected MESSAGE-TEXT

Unreachable node restart requested

When sent

Sent when a disconnection from and a reconnection to the power supply (power cycle) is requested.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
810 INFORMATION DockerWatch None No


Expected MESSAGE-TEXT

Docker service restarted

When sent

Sent when a restart of the Docker service was requested.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
811 INFORMATION NTPD None No


Expected MESSAGE-TEXT

NTPD service recovered

When sent

Sent when a restart of the NTPD service is requested.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
812 INFORMATION GPFS None No


Expected MESSAGE-TEXT

GPFS issue recovered

When sent

Sent when a GPFS issue was recovered.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
813 INFORMATION AppStartOnEnabledNode None No


Expected MESSAGE-TEXT

Application container restarted

When sent

Sent when a dashDB container must be restarted separately, that is, not as part of a regular application startup.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
814 INFORMATION AppStartup None No


Expected MESSAGE-TEXT

Application recovered by dashDB HA

When sent

Sent when dashDB could be recovered through a high-availability setup. The message is not sent during regular application starts.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
815 INFORMATION FcPortRetrain None No


Expected MESSAGE-TEXT

FC port retrained

When sent

Sent when a fibre channel (FC) port was retrained.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
816 INFORMATION LdapWatch None No


Expected MESSAGE-TEXT

Directory service restarted

When sent

Sent when the directory service apslapd was restarted.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
817 INFORMATION LdapWatch None No


Expected MESSAGE-TEXT

Security daemon restarted

When sent

Sent when the security service sssd was restarted.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
818 INFORMATION NTPD None No


Expected MESSAGE-TEXT

Node time synchronized

When sent

Sent when the system times of the nodes had to be resynchronized.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
819 INFORMATION ConsoleKeeper None No


Expected MESSAGE-TEXT

Console container restarted

When sent

Sent when the console container had to be restarted to recover the console.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
820 INFORMATION MonitoredAppDisable None No


Expected MESSAGE-TEXT

Application disabled by user

When sent

Sent when the application was disabled by a user (ap apps disable).

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
821 INFORMATION MonitoredAppEnable None No


Expected MESSAGE-TEXT

Application enabled by user

When sent

Sent when the application was enabled by a user.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
822 INFORMATION CPUClockRateMonitoring None No


Expected MESSAGE-TEXT

CPU clock tuned successfully

When sent

Sent when the optimal frequency setting of the CPU clock was restored.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
823 INFORMATION CPUClockRateMonitoring None No


Expected MESSAGE-TEXT

Successfully activated tuned.service

When sent

Sent when the tuned service was restarted.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
824 INFORMATION LiftKeeper None No


Expected MESSAGE-TEXT

Lift container restarted

When sent

Sent when the Lift container was restarted because it was stopped or not in an operable state.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
825 INFORMATION SystemdServiceWatch None No


Expected MESSAGE-TEXT

Firewall service restarted

When sent

Sent when the firewall service iptables had to be restarted.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
826 INFORMATION AppStartOnEnabledNode None No


Expected MESSAGE-TEXT

Application container(s) restart requested by db2 HA

When sent

Sent when a restart of dashDB containers was requested through the high-availability setup of Db2.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
827 INFORMATION Rebalance None No


Expected MESSAGE-TEXT

Node suspended

When sent

Sent when a node is suspended for as long as it needs to recover.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
828 INFORMATION NodeEnable None No


Expected MESSAGE-TEXT

Node resumed

When sent

Sent when a suspended node resumed operation (either automatically or manually).

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
829 INFORMATION Rebalance None No


Expected MESSAGE-TEXT

Node is ready to be resumed

When sent

Sent when the recovery of a suspended node succeeded and the node is ready to resume operation.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
830 INFORMATION CPUClockRateMonitoring None No


Expected MESSAGE-TEXT

CPU configuration updated

When sent

Sent after updating the simultaneous multithreading (SMT) configuration of the CPU. For more information, see:

How do you spell “SMT” on z Systems?

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
831 INFORMATION AppStartup None No


Expected MESSAGE-TEXT

Db2 crash recovery in progress

When sent

Sent when the timeout period for a regular start of the application was exceeded because a Db2 crash recovery is still in progress.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
833 INFORMATION Maintenance None No


Expected MESSAGE-TEXT

Maintenance mode disabled

When sent

Sent when the maintenance mode was disabled.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
834 INFORMATION Maintenance None No


Expected MESSAGE-TEXT

Node restart requested due to docker issues

When sent

Sent when a node is in the process of being restarted due to Docker issues.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
836 INFORMATION Maintenance None No


Expected MESSAGE-TEXT

Application container(s) restart requested by user

When sent

Sent when database containers are restarted on request by a user.

When closed

N/A

Reason
code
Severity Policy Collected
logs
Open
CASE
901 CRITICAL StorageUtilizationCheck General Yes

Expected MESSAGE-TEXT

The z/OS syslog displays the following message: Storage utilization above threshold

The 'ap issues' command, entered on the IBM Integrated Analytics System (IIAS), results in the following message:

ID: 8414 / Date: yyyy-mm-dd hh:mm:ss / Closed date: yyyy-mm-dd hh:mm:ss / Type: STORAGE_UTILIZATION / Reason Code and Title: 901 Storage utilization above threshold / Target: sw://fs.data/hadomain1 / Severity: Warning

When sent

This message is sent when the storage utilization has reached 80% of its limit. If the utilization reaches 90%, older log records are replaced with newer records, and the data sets containing the older records will be deleted. 

When closed

A closure depends on the actions taken in the aftermath. Check the "Target:" information in the IIAS message.

If the information reads sw://fs.sda8/hadomain1.node2, there is no problem. The message can be ignored. 

If it reads sw://fs.data/hadomain1, there is a problem that should be analyzed. Call IBM support.
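
The check of the "Target:" information can be automated with a few lines of script. The sketch below parses a single-line issue record in the format shown above and classifies a 901 issue by its target. It assumes that the fields are separated by " / ", and it generalizes the two example targets to the prefixes sw://fs.sda8/ and sw://fs.data/. Both points are assumptions, as are the function names and return values.

```python
def parse_issue(line):
    """Split an 'ap issues' style record into a dictionary of fields."""
    fields = {}
    for part in line.split(" / "):
        key, sep, value = part.partition(": ")
        if sep:  # skip fragments that are not 'key: value' pairs
            fields[key.strip()] = value.strip()
    return fields


def triage_901(fields):
    """Classify a 901 issue by its target (assumed prefix matching)."""
    if not fields.get("Reason Code and Title", "").startswith("901 "):
        return "not-901"
    target = fields.get("Target", "")
    if target.startswith("sw://fs.data/"):
        return "call-support"  # problem that should be analyzed
    if target.startswith("sw://fs.sda8/"):
        return "ignore"        # message can be ignored
    return "unknown"


sample = (
    "ID: 8414 / Date: yyyy-mm-dd hh:mm:ss / Closed date: yyyy-mm-dd hh:mm:ss / "
    "Type: STORAGE_UTILIZATION / Reason Code and Title: 901 Storage utilization "
    "above threshold / Target: sw://fs.data/hadomain1 / Severity: Warning"
)
result = triage_901(parse_issue(sample))
```

Targets that match neither prefix are reported as "unknown" rather than guessed, because the text above describes only these two cases.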

Document Information

Modified date:
23 January 2024

UID

ibm15694807