IBM Support

ReplMissedMetadataHeartbeat Events

Question & Answer


Question

What is the purpose of the ReplMissedMetadataHeartbeat event, what causes it to trigger, and can we prevent the alerts from showing so often?

Cause

Metadata heartbeat messages are sent between replication nodes at 30-second intervals, and each node keeps track of when the last one was received. The ReplMissedMetadataHeartbeat event is generated under one of two conditions:

• A metadata heartbeat from a replication node is missed for more than 5 intervals. The event is generated for each node.

• The PTS latency for files exceeds 10 seconds.
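
The two conditions above can be read as simple threshold checks: 5 intervals of 30 seconds give a 150-second limit on the heartbeat gap, and the PTS latency limit is 10 seconds. The following is only a rough sketch of that arithmetic, not how the replication software implements it; the function and variable names are hypothetical, and timestamps are assumed to be whole epoch seconds.

```shell
#!/bin/sh
# Illustrative only: models the two trigger conditions described in this note.
# All names here are hypothetical and are not part of any nz command.

HEARTBEAT_INTERVAL_SECS=30   # heartbeat interval stated in this note
MAX_MISSED_INTERVALS=5       # missed-interval limit stated in this note
PTS_LATENCY_LIMIT_SECS=10    # PTS latency limit stated in this note

# Prints "yes" when the gap since the last heartbeat is more than
# 5 intervals (150 seconds), otherwise "no".
heartbeat_missed() {
    last_seen=$1
    now=$2
    limit=$((HEARTBEAT_INTERVAL_SECS * MAX_MISSED_INTERVALS))
    if [ $((now - last_seen)) -gt "$limit" ]; then
        echo yes
    else
        echo no
    fi
}

# Prints "yes" when a latency given in whole seconds exceeds the limit.
pts_latency_exceeded() {
    if [ "$1" -gt "$PTS_LATENCY_LIMIT_SECS" ]; then
        echo yes
    else
        echo no
    fi
}

heartbeat_missed 1000 1150   # gap of exactly 150 s: prints no
heartbeat_missed 1000 1151   # gap of 151 s: prints yes
pts_latency_exceeded 11      # prints yes
```

The second call shows why a heartbeat that arrives only a second late can still raise an event.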

The alerts appear so often mostly because of how the event system is designed: there is no time threshold after which older events "age" out. With no aggregation in use, an event is generated for a missed heartbeat every time one occurs. Often these messages are seen for an isolated occurrence in which the heartbeat delivery was delayed only slightly past the event threshold.

Using aggregation avoids this; however, customers can then get what seem like random serious events, because the occurrences accrue over time and are then all reported at once. It is important to check the timestamps of these errors to verify whether they occurred during a specific window or over a long duration.
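
One way to make that timestamp check concrete is to compute the span between the earliest and latest occurrence in an aggregated report. The sketch below assumes the timestamps have already been converted to epoch seconds, one per line; the sample values and the span_seconds helper are hypothetical and do not reflect the actual notification format.

```shell
#!/bin/sh
# Illustrative only: reads one epoch-second timestamp per line on stdin
# and prints the number of seconds between the earliest and the latest.
span_seconds() {
    awk 'NR == 1 { min = $1; max = $1 }
         $1 < min { min = $1 }
         $1 > max { max = $1 }
         END { if (NR > 0) print max - min }'
}

# Hypothetical sample: three occurrences within 90 seconds of each other.
printf '%s\n' 1571300000 1571300030 1571300090 | span_seconds   # prints 90
```

A small span points to a single incident window; a span of days or weeks suggests isolated late heartbeats that accumulated until the aggregation count was reached.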

Answer

You can get more detailed missed heartbeat information by using the command:
nzreplstate -heartbeat

You can also check which events are currently enabled, to see what is set for the replMissedMetadataHeartbeat event, with:
nzevent show

If there are a large number of events, you can grep for the specific one you are looking for, for example:
nzevent show -syntax | grep -i ReplMissedMetadataHeartbeat

In the example below, the ReplMissedMetadataHeartbeat notification is enabled with no aggregation (-eventAggrCount 0), meaning it sends an email for each alert. Most customers would probably want this set to a higher number so as not to receive the false alerts that often occur.

-name 'replMissedMetadataHeartbeat' -on yes -eventType replMissedMetadataHeartbeat -eventArgsExpr '' -notifyType email -dst 'michael@ibm.com' -ccDst '' -msg 'replMissedMetadataHeartbeat' -bodyText '' -callHome no -eventAggrCount 0
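
To check the aggregation setting without reading the whole rule, the value after -eventAggrCount can be pulled out of a line like the one above. This is a minimal sketch using standard sed; the aggr_count helper is hypothetical, and the rule line is copied from this note rather than captured from a live system.

```shell
#!/bin/sh
# Illustrative only: extracts the number that follows -eventAggrCount
# from a single rule line. Prints nothing if the option is absent.
aggr_count() {
    printf '%s\n' "$1" | sed -n 's/.*-eventAggrCount \([0-9][0-9]*\).*/\1/p'
}

# Rule line as shown in this note.
rule_line="-name 'replMissedMetadataHeartbeat' -on yes -eventType replMissedMetadataHeartbeat -eventArgsExpr '' -notifyType email -dst 'michael@ibm.com' -ccDst '' -msg 'replMissedMetadataHeartbeat' -bodyText '' -callHome no -eventAggrCount 0"

aggr_count "$rule_line"   # prints 0
```

A result of 0 means no aggregation is in effect for that rule.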

Here is another example, which sets the aggregation count to 10 on the replMissedMetadataHeartbeat event when run as the admin user:

nzevent modify -u admin -pw password -name replMissedMetadataHeartbeat -on yes -dst email@ibm.com -eventAggrCount 10


Document Information

Modified date:
17 October 2019

UID

swg21699318