View alerts
Watson Studio Local automatically notifies you when a node or pod goes down or when you're at risk of overloading a resource.
By default, Watson Studio Local issues alerts when:
- CPU usage or reserved CPU usage on a node goes above 90%
- Memory usage or reserved memory usage on a node goes above 90%
- Disk usage on a node goes above 90%
- A node in the cluster goes down
- A pod is not running or is in an unknown state for more than 5 minutes
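The default conditions above can be sketched as a simple check over node metrics. This is an illustrative model only; the metric names and data structure are assumptions for the sketch, not the actual Watson Studio Local implementation.

```python
# Illustrative sketch of the default alert conditions.
# Metric names and the node dict shape are hypothetical.

DEFAULT_THRESHOLD = 90  # percent, for CPU, memory, and disk
POD_GRACE_MINUTES = 5   # pod must be unhealthy longer than this

def default_alerts(node):
    """Return the default alerts that a node's current state would trigger."""
    alerts = []
    for metric in ("cpu", "reserved_cpu", "memory", "reserved_memory", "disk"):
        if node.get(metric, 0) > DEFAULT_THRESHOLD:
            alerts.append(f"{metric} usage above {DEFAULT_THRESHOLD}%")
    if not node.get("up", True):
        alerts.append("node is down")
    # problem_pods: pod name -> minutes spent not running / in unknown state
    for pod, minutes in node.get("problem_pods", {}).items():
        if minutes > POD_GRACE_MINUTES:
            alerts.append(f"pod {pod} unhealthy for more than {POD_GRACE_MINUTES} minutes")
    return alerts
```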
If you want to change the thresholds at which alerts are issued, you can configure them on the Settings page. For more information, see Set up Watson Studio Local.
Watson Studio Local also issues a follow-up alert once the problem is resolved.
Tip: If Watson Studio Local is configured to connect to your SMTP server, each Watson Studio Local admin receives alerts through email.
When you have alerts, the alert icon displays the number of unread alerts in your queue.
You can access alerts in either of the following ways:
- If you want a quick peek at your alerts, click the alert icon in the menu bar.
- If you want to manage your alerts, open the Alerts page from the menu icon.
From the Alerts page you can:
- Filter alerts by type
- Filter alerts based on whether they were read or not
- Filter alerts by status
- Mark alerts as read or unread
- Delete alerts
Tip: When you delete an alert, you can't access it again. Make sure that you don't need it before you delete it.
Click Refresh and alert settings to adjust when alerts are generated and how frequently the metrics on the dashboard are refreshed:
- Log retention (days)
- The number of days to retain log files.
- CPU alert threshold (%)
- The CPU usage threshold at which an alert is triggered. When the CPU usage reaches this threshold, the node color immediately changes to red. The alert is generated only if the usage stays above the threshold longer than the time that is specified for the Alert length threshold setting.
- Memory alert threshold (%)
- The memory usage threshold at which an alert is triggered. When the memory usage reaches this threshold, the node color immediately changes to red. The alert is generated only if the usage stays above the threshold longer than the time that is specified for the Alert length threshold setting.
- Disk alert threshold (%)
- The disk usage threshold at which an alert is triggered. When the disk usage reaches this threshold, the node color immediately changes to red. The alert is generated only if the usage stays above the threshold longer than the time that is specified for the Alert length threshold setting.
- Alert length threshold (minutes)
- The time, in minutes, that an error condition is given to resolve itself before an alert is issued.
- Metric retention (days)
- The number of days to retain metrics.
- CPU alert warning threshold (%)
- The CPU usage threshold at which a warning is triggered and the node color changes to yellow.
- Memory alert warning threshold (%)
- The memory usage threshold at which a warning is triggered and the node color changes to yellow.
- Disk alert warning threshold (%)
- The disk usage threshold at which a warning is triggered and the node color changes to yellow.
- Dashboard refresh (seconds)
- The number of seconds between dashboard refreshes.
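How the warning threshold, the alert threshold, and the alert length threshold interact can be sketched as follows. The threshold values and the per-minute evaluation are assumptions made for illustration based on the setting descriptions above, not the product's actual implementation.

```python
# Illustrative model of the dashboard alert settings.
# WARNING_THRESHOLD is a hypothetical value; 90 and 5 match the defaults
# described in this topic.

ALERT_THRESHOLD = 90    # %: node color changes to red immediately
WARNING_THRESHOLD = 75  # %: node color changes to yellow (hypothetical)
ALERT_LENGTH = 5        # minutes usage must stay high before an alert

def node_color(usage):
    """Color shown on the dashboard for the current usage reading."""
    if usage >= ALERT_THRESHOLD:
        return "red"
    if usage >= WARNING_THRESHOLD:
        return "yellow"
    return "green"

def alert_issued(usage_history):
    """usage_history: one reading per minute, newest last.

    The node turns red as soon as a reading crosses the alert threshold,
    but an alert is issued only when usage has stayed at or above that
    threshold for longer than the alert length threshold."""
    recent = usage_history[-(ALERT_LENGTH + 1):]
    return (len(recent) > ALERT_LENGTH
            and all(u >= ALERT_THRESHOLD for u in recent))
```

A brief spike turns the node red on the dashboard but resolves itself before the alert length threshold elapses, so no alert is generated.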