Self Monitoring

The Guardium solution monitors itself to minimize disruptions and correct problems automatically whenever possible.

Guardium uses a three-pronged approach to ensure that it is available, is functioning properly, and has not been tampered with, and to alert users to problems:

Reports - Whether textual or graphical, reports are at the core of the Guardium® solution. By using Guardium’s Query Builder and Report Builder, a user can report on any of the self-monitoring data collected through the associated domains and entities. Many of the predefined reports can be customized to provide finer granularity. A specific query builder (VA Test Tracking) has been created to report on the tests that are available for security assessments.
Alerts - In addition to building reports, a user can define an alert against those reports by setting thresholds that indicate an exception or policy rule violation. Alerts can be triggered in real time or determined through historical analysis, and can notify users through SMTP, SNMP, syslog, or a custom Java™ class.
Self-Monitoring Utility - Guardium runs an internal self-monitoring daemon (an always-running service) on collectors and aggregators. It wakes up every 5 minutes and performs a system scan, checking components for optimal configuration and operational effectiveness and making repairs when necessary. For example, if the utility finds the Web Server down, it first validates a complete shutdown of the service, restarts the service, and then alerts an administrative user.
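
The repair sequence described for the Self-Monitoring Utility (confirm shutdown, restart, alert) can be sketched as follows. This is only an illustrative outline and not Guardium's implementation; `Component`, `scan_once`, and the callables passed to them are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Component:
    """Hypothetical wrapper around one monitored service (not a Guardium API)."""
    name: str
    is_healthy: Callable[[], bool]
    stop: Callable[[], None]   # forces a complete shutdown
    start: Callable[[], None]  # starts the service again


def scan_once(components: List[Component], alert: Callable[[str], None]) -> List[str]:
    """Run one scan and return the names of the components that were repaired."""
    repaired = []
    for component in components:
        if component.is_healthy():
            continue
        component.stop()   # make sure the service is fully down first
        component.start()  # then restart it
        alert(f"{component.name} was down and has been restarted")
        repaired.append(component.name)
    return repaired
```

A scheduler would call `scan_once` at a fixed interval, for example every 300 seconds to match the 5-minute cycle described above.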

Components Monitored

Table 1. Components monitored
Components	How to access
System disk space (% full)	Manage > System View > System Monitor. Alert: You can use the Queries and Correlation Alerts, using the Sniffer Buffer domain and Sniffer Buffer Usage entity, to create alerts.
CPU Load, Uptime and Reboots, Memory Usage, Monitoring Engine (sniffer) - Status: up/down/stuck/overloaded, CPU Usage, Memory Usage, Overload and delays (queues)	Reports > Guardium Operational Reports > Buffer Usage Monitor. Alert: You can use the Queries and Correlation Alerts, using the Sniffer Buffer domain and Sniffer Buffer Usage entity, to create alerts.
Failed Logins	Manage > System View > System Monitor. Alert: You can use the Queries and Correlation Alerts, using the Guardium Login domain and Guardium Users Login entity, to create alerts.
Lost requests	Manage > Reports > Activity Monitoring > Dropped Requests. Alert: You can use the Queries and Correlation Alerts, using the Exceptions domain and Exceptions entity, to create alerts.
Change in data patterns	Reports > Real-time Operational Reports > Values Changed. Alert: See Viewing an Audit Process Definition for the alert Data Source Changes, which alerts on any data source changes.
Packet rates, Request rates, Ignored data	Reports > Guardium Operational Reports > Buffer Usage Monitor. Alert: You can use the Queries and Correlation Alerts, using the Sniffer Buffer domain and Sniffer Buffer Usage entity, to create alerts.
Scheduled Jobs Exceptions	Reports > Guardium Operational Reports > Scheduled Job Exceptions, or see Predefined admin Reports. Alert: You can use the Queries and Correlation Alerts, using the Exceptions domain and Exception Type entity, to create alerts.
Audit processes status	Reports > Guardium Operational Reports > Number of Active Audit Processes, or see Predefined admin Reports. Alert: You can use the Queries and Correlation Alerts, using the Audit Process domain and Audit Process entity, to create alerts.
Inspection Engine Changes	Reports > Activity Monitoring > S-TAP Configuration Change History. Alert: See Viewing an Audit Process Definition for the alert Inspection Engines and S-TAP, which alerts on any activity related to inspection engine and S-TAP configuration.
Guardium Users Activity - Login/logout	Reports > Guardium Operational Reports > Logins to Guardium, or see Predefined admin Reports. Alert: You can use the Queries and Correlation Alerts, using the Guardium Login domain and SQL Guard Login entity, to create alerts.
Failed Logins	Reports > Guardium Operational Reports > Logins to Guardium, or see Predefined admin Reports. Alert: See Viewing an Audit Process Definition for the alert Failed Logins To Guardium, which alerts if there are more than 5 failed logins in the last 11 minutes, or select Tools > Report Building and choose Guardium Logins from the Report Title drop-down. See Reports for additional information.
User Activity Audit Trail	Reports > Guardium Operational Reports > User Activity Audit Trail, or see Predefined admin Reports. Alert: You can use the Queries and Correlation Alerts, using the Guardium Activity domain and SQL Guard User Activity Audit entity, to create alerts. Note: User activity includes those instances where a user changes to the root shell, which provides a log of their root activity.
Creation/Deletion of Users/Roles	Reports > Guardium Operational Reports > User Activity Audit Trail, or see Predefined admin Reports. Alert: See Viewing an Audit Process Definition for the alert Guardium - Add/Remove Users, which alerts on any addition or removal of a Guardium user.
Permissions monitoring	Reports > Guardium Operational Reports > Guardium Users, Guardium Roles, or Guardium Applications. Alert: You can use the Queries and Correlation Alerts, using the Application domain and Application Data entity, to create alerts.
S-TAP® Info (Central Manager)	Report: See S-TAP Reports. On a Central Manager, an additional report, S-TAP Info, is available. This report monitors S-TAPs of the entire environment. Upload this data using the Custom Table Builder. This report is the result of uploading data using remote sources on a Central Manager and using that data to see a consolidated view of S-TAPs. S-TAP info is a predefined custom domain which contains the S-TAP Info entity and is not modifiable like the entitlement domain.
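
The Failed Logins To Guardium alert in the table fires on more than 5 failed logins in the last 11 minutes. The logic of such a threshold over a time window can be sketched as below. This is an assumption-laden illustration of the general technique only; Guardium evaluates alerts through its own query and correlation mechanism, and the `FailedLoginAlert` class is a hypothetical name, not part of the product.

```python
from collections import deque
from datetime import datetime, timedelta


class FailedLoginAlert:
    """Hypothetical sliding-window counter; defaults mirror the table's 5 / 11 minutes."""

    def __init__(self, threshold: int = 5, window: timedelta = timedelta(minutes=11)):
        self.threshold = threshold
        self.window = window
        self._events = deque()  # timestamps of recent failed logins, oldest first

    def record(self, when: datetime) -> bool:
        """Record one failed login; return True if the alert condition is met."""
        self._events.append(when)
        cutoff = when - self.window
        # Drop failures that have fallen out of the window.
        while self._events and self._events[0] < cutoff:
            self._events.popleft()
        # "More than" the threshold, so exactly 5 failures does not alert.
        return len(self._events) > self.threshold
```

With the defaults, the sixth failure inside any 11-minute span is the first one for which `record` returns `True`.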

Guardium nanny process

The Guardium nanny is an internal process that monitors the system's critical resources and alerts when potential problems emerge. Nanny alerts go to syslog and can be forwarded or sent as emails to the administrator; in some cases the nanny also takes remedial action.

The nanny watches key components and critical resources within the Guardium system to help ensure their availability and reliability. These resources and components include:

Web service monitoring - service port (default 8443) not responding or Tomcat service is not up
- syslog message
- mail admin
- restarts the web service
Inspection Engine activity - snif overloaded, not responding, or failed
- syslog message
- mail admin
- mail guardium support (optional)
- tries to fix the problem by restarting the snif under certain conditions
- tries to respawn the snif if the process dies
Disk space utilization - alerts when > 75% on the critical partitions
- syslog message
- alert admin
- performs preventive action by cleaning temporary files when over 95%
Failed login (ssh) to the appliance - checks the ssh daemon's messages and alerts on failed ssh login attempts
- mail admin (it's already in syslog)
Monitor internal database (TURBINE) - verifies that the service is up and monitors its status and capacity utilization
- syslog message
- mail admin
- restart service
File System utilization - every five minutes, Nanny.pl checks the file system at /var; it raises a warning alert when /var is > 75% full, and raises a critical alert and stops services when /var is > 90% full
- syslog message
- alert admin
- admin clean-up required, using the CLI commands show filesystem usage, clear filesystem dir, and restart stopped_services
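
The /var thresholds above (warning above 75%, critical above 90%) can be illustrated with a short sketch. It is not the Nanny.pl logic itself, which is not shown in this topic; `classify_usage` and `percent_used` are hypothetical helpers, and the percentage from `shutil.disk_usage` may differ slightly from what the appliance reports.

```python
import shutil

WARNING_PCT = 75   # warning alert above this level
CRITICAL_PCT = 90  # critical alert (and stopped services) above this level


def classify_usage(percent: float) -> str:
    """Map a usage percentage to 'ok', 'warning', or 'critical'."""
    if percent > CRITICAL_PCT:
        return "critical"
    if percent > WARNING_PCT:
        return "warning"
    return "ok"


def percent_used(path: str = "/var") -> float:
    """Return the percentage of the file system holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total
```

For example, `classify_usage(percent_used("/var"))` returns the severity level for the current state of /var on the machine where it runs.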