Aggregation

Collect and merge information from multiple Guardium® units into a single Guardium Aggregation appliance to facilitate an enterprise view of database usage.

Aggregation Process

  • Data is exported on a daily basis from the source appliances to the aggregator (the daily export files are copied to the aggregator).
  • The aggregator then processes the uploaded files, extracting each file and merging its contents into the aggregator's internal repository.
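
The following is a minimal Python sketch, not Guardium code, of the merge step described above: several collectors contribute daily export files and the aggregator folds them into one consolidated repository. The collector names and record fields are hypothetical, chosen only for illustration.

    # Illustrative sketch only (not Guardium code): merging daily export files
    # from several collectors into a single repository on the aggregator.
    from collections import defaultdict

    def merge_daily_exports(export_files):
        """export_files: list of (collector_name, day, records) tuples."""
        repository = defaultdict(list)          # day -> consolidated records
        for collector, day, records in export_files:
            # Tag each record with its source so enterprise-level reports can
            # still distinguish which collector captured the activity.
            repository[day].extend({**r, "source": collector} for r in records)
        return repository

    exports = [
        ("collector-us", "2024-05-01", [{"db_user": "joe", "verb": "SELECT"}]),
        ("collector-eu", "2024-05-01", [{"db_user": "amy", "verb": "UPDATE"}]),
    ]
    print(merge_daily_exports(exports)["2024-05-01"])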

For example, if you are running Guardium in an enterprise deployment, you may have multiple Guardium servers monitoring different environments (for example, different geographic locations or business units). It may be useful to collect all data in a central location to facilitate an enterprise view of database usage. You can accomplish this by exporting data from a number of servers to another server that has been configured (during the initial installation procedures) as an aggregation appliance. In such a deployment, you typically run all reports, assessments, audit processes, and so forth, on the aggregation appliance to achieve a wider (though not necessarily enterprise-wide) view. Note: The aggregator does not collect data itself; it is used to present the data gathered by the collectors.

Predefined aggregation reports are available on the Guardium Monitor tab (Enterprise Buffer Usage Monitor) and the Daily Monitor tab (Logging Collectors).

Appliance Types

Collector
Used to collect database activity, analyze it in real time, and log it in the internal repository for further analysis and/or real-time reaction (alerting, blocking, and so on). Use this unit for the real-time capture and analysis of database activity.
Aggregator (see notes 1, 2)
Used to collect and merge information from multiple appliances (collectors and other aggregators) to produce a holistic view of the entire environment and to generate enterprise-level reports. The aggregator does not collect data itself; it only aggregates data from multiple sources.
Central Manager (see notes 1, 3, 4)
Use this appliance to manage and control multiple Guardium appliances. With the Central Manager (CM), you manage the entire Guardium deployment (all collectors and aggregators) from a single console (the CM console). This includes patch installation, software updates, and the management and configuration of queries, reports, groups, users, policies, and so on.
Note:

1. In many environments, the Central Manager is also the aggregator; the Central Manager and aggregator functions can be installed on the same appliance.

2. A Guardium appliance must be configured as an aggregator at install time in order to be promotable to a Central Manager.

3. There is one Central Manager per federated environment.

Central Manager/Aggregator enforcement
Starting with v9.5 (v9.0 patch 500), the application enforces that a Central Manager must be an aggregator-type appliance. This means that, starting with v9.5, only aggregator-type appliances can be promoted to Central Manager. Pre-existing CM appliances from releases before v9.5 are not subject to this change.

Solution for unit showing as down after upgrade

Issue: When an aggregator whose search mode is CM_only or Local_only is upgraded, the unit shows as down in search after the upgrade. In addition, if the user changes the search mode to all_machines after the upgrade, search is not available from the aggregator.

Solution: After the aggregator unit has been upgraded, if you do not want the aggregator to show as down in the search tooltip, run the following two commands:

  1. grdapi enable_quick_search schedule_interval=2 schedule_units=MINUTE

  2. restart network

Note: If the environment was and remains in cm_only or local_only mode, this step does not enable search from the aggregator; it only prevents the aggregator from showing as down.

Terminology

Table 1. Terminology
Guardium Appliance: The physical or virtual Guardium box; can be either a “collector” or an “aggregator” (with or without central management).
Guardium Unit: See Guardium Appliance.
Manager Unit: An appliance configured as the Central Manager.
Managed Unit: An appliance managed by the Central Manager.
Standalone Unit: An appliance that is not part of a Central Manager environment.
Purge: Delete data that is no longer needed from the internal database, to free disk space and maintain performance.
Archive: Compress the data of a single day into an encrypted file and send it to external storage (for example, an SCP/FTP host or a configured storage system).

Hierarchical Aggregation

Guardium also supports hierarchical aggregation, where multiple aggregation appliances merge upwards to a higher-level, central aggregation appliance. This is useful for multi-level views. For example, you may need to deploy one aggregation appliance for North America aggregating multiple units, another aggregation appliance for Asia aggregating multiple units, and a central, global aggregation appliance merging the contents of the North America and Asia aggregation appliances into a single corporate view. To consolidate data, all aggregated Guardium servers export data to the aggregation appliance on a scheduled basis. The aggregation appliance imports that data into a single database on the aggregation appliance, so that reports run on the aggregation appliance are based on the data consolidated from all of the aggregated Guardium servers.

About the System Shared Secret

The Guardium administrator defines the System Shared Secret on the System Configuration panel, which is described in the following section. The system shared secret is used for archive/restore operations, and for Central Management and Aggregation operations. When used, its value must be the same for all units that will communicate. This value is null at installation time, and can change over time.

The system shared secret is used:

  • When secure connections are being established between a Central Manager and a managed unit.
  • When an aggregated unit signs and encrypts data for export to the aggregator.
  • When any unit signs and encrypts data for archiving.
  • When an aggregator imports data from an aggregated unit.
  • When any unit restores archived data.

Depending on your company’s security practices, you may be required to change the system shared secret from time to time. Because the shared secret can change, each system maintains a shared secret keys file, containing an historical record of all shared secrets defined on that system. This allows an exported (or archived) file from a system with an older shared secret to be imported (or restored) by a system on which that same shared secret has been replaced with a newer one. Shared secrets (current and historic ones) can be exported from one appliance and imported to another through the CLI.

For aggregation to work, the shared secret must be set and be the same for aggregator and all aggregated collectors.
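
The sketch below is a minimal Python illustration, not Guardium code, of why each appliance keeps a historical keys file: an import first tries the current secret and then older ones, so a file exported under a since-rotated secret can still be imported. The secret names are hypothetical and the decryption is a stub.

    # Illustrative sketch only: trying the current shared secret, then the
    # historical ones, when importing (or restoring) an encrypted file.
    def try_decrypt(export_file, secret):
        # Stand-in for real decryption: succeeds only if the secrets match.
        return export_file["encrypted_with"] == secret

    def import_file(export_file, secret_history):
        for secret in reversed(secret_history):     # newest secret first
            if try_decrypt(export_file, secret):
                return f"imported (decrypted with secret '{secret}')"
        return "import failed: no matching shared secret"

    history = ["secret-2022", "secret-2023", "secret-2024"]  # oldest to newest
    print(import_file({"encrypted_with": "secret-2023"}, history))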

Aggregating, Archiving, and Purging Operations

Scheduled export operations send data from Guardium collector units to a Guardium aggregation appliance. On its own schedule, the aggregation appliance executes an import operation to complete the aggregation process. On either or both units, archive and purge operations are scheduled to back up and purge data on a regular basis (both to free up space and to speed up access operations on the internal database). The export, archive, and purge functions can work on the same data, but not the same date ranges. For example, you may want to export and archive all information older than one day and purge all information older than one month, thereby always leaving one month of data on the sending unit.
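
The following is a minimal Python sketch, not Guardium code, of the retention arithmetic in the example above: export/archive data older than one day, purge data older than one month. The dates are hypothetical and the inclusive day boundaries are an assumption made for illustration.

    # Illustrative sketch only: day 0 is today, ages are whole calendar days.
    from datetime import date, timedelta

    today = date(2024, 4, 24)                 # "day zero"
    export_older_than_days = 1                # yesterday's data is exported
    purge_older_than_days = 30                # roughly one month is kept

    newest_exported_day = today - timedelta(days=export_older_than_days)
    oldest_retained_day = today - timedelta(days=purge_older_than_days - 1)

    print("newest day exported/archived:", newest_exported_day)  # 2024-04-23
    print("oldest day still on the unit:", oldest_retained_day)  # 2024-03-26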

Note:

When setting the schedule of import on an aggregator, it should be planned to run after export is completed on all collectors.

CAS data is also aggregated and archived.

Note: The alert for no traffic is inactive for aggregator servers.

Managing Data on an Aggregator

  • Exporting Data
    • Stopping Export
  • Importing Data
    • Stopping Import
  • Archiving and Purging
  • Stopping Archiving and Purging
  • Verify Archiving and Purging Process
  • Reporting on Aggregation and Archiving Activity
  • Restoring

Exporting Data

Table 2. Exporting Data

Function: Compress the data of a single day (midnight to midnight; typically yesterday) into an encrypted file and send it to the aggregator (or, for Data Archive, to an external repository).

Schedule: Executed on a daily basis. Starts immediately after midnight (00:10) to include a full day’s data. Assumed to take up to 2 hours to complete (on average; dependent on the amount of data).

High Level Process:
  1. Create a temporary database.
  2. Load the relevant data (the last day’s activity) into the temporary database.
  3. Update auto-increment IDs in the temporary database to ensure uniqueness.
  4. Create an encrypted, compressed export file of the temporary database.
  5. Copy the export file to the aggregator (or, for Data Archive, to an external repository).

To export data to an aggregation appliance, follow the procedure. You can define a single export configuration for each Guardium unit.

  1. Click Manage > Data Management > Data Export to open Data Export.
  2. Check the Export box; this opens additional options for exporting data.
  3. In the boxes following Export data older than, specify a starting day for the export operation as a number of days, weeks, or months prior to the current day, which is day zero. These are calendar measurements, so if today is April 24, all data captured on April 23 is one day old, regardless of the time when the operation is performed. To archive data starting with yesterday’s data, enter the value 1.
  4. Optionally, use the boxes following Ignore data older than to control how many days of data are exported. Any value specified here must be greater than the Export data older than value, so you always export at least two days of data. If you leave Ignore data older than blank, you export data for all days older than the value specified in the Export data older than row. It is recommended to always set the Ignore data older than value; otherwise you export the exact same days over and over again, overloading the network and the aggregator with redundant data (which is then ignored). See the sketch after this procedure for how these two settings interact.
  5. The Export Values box is checked by default. In some cases, for example where the collector resides in a country that prohibits the export of data values and the aggregation appliance resides in another country, you may want to clear the Export Values checkbox, which masks all fields containing database values.
  6. In the Host box, enter the IP address or DNS host name of the aggregation appliance to which this system’s encrypted data files will be sent. There is also an option to enable secondary aggregation, so that export data can go to more than one aggregator: two Host boxes are available, the first of which is required, while the Secondary Host is optional. This unit and the aggregation appliance to which it sends data must have the same System Shared Secret. If they do not, the export operation still runs, but the aggregation appliance that receives the data cannot decrypt the exported file and the import fails. In addition, because the shared secret must be identical on the exporting and receiving systems, saving a configuration with mismatched secrets fails with a message that a test data file could not be sent to the receiving system. See System Shared Secret in System Configuration for more information.
  7. Use the Scheduling section to define a schedule for running this operation on a regular basis.
  8. Click Save to save the export and purge configuration for this unit. When you click Save, the system attempts to verify that the specified aggregator host will accept data from this unit. If the verification fails, the following message is displayed and the configuration is not saved: A test data file could not be sent to this host. Please confirm the hostname or IP address is entered correctly and the host is online.
  9. Click Run Once Now to run the operation one time.
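
The sketch below is a minimal Python illustration, not Guardium code, of how the Export data older than and Ignore data older than settings in steps 3 and 4 interact; the exact day-boundary semantics are an assumption made for illustration.

    # Illustrative sketch only: which calendar days a daily export run picks up
    # for different "older than" / "ignore older than" settings.
    from datetime import date, timedelta

    def exported_days(today, older_than, ignore_older_than=None):
        # With no "ignore" value, every day older than the threshold qualifies;
        # cap the list here so the open-ended case can still be printed.
        oldest_age = (ignore_older_than - 1) if ignore_older_than else older_than + 5
        return [today - timedelta(days=age)
                for age in range(older_than, oldest_age + 1)]

    today = date(2024, 4, 24)
    print(exported_days(today, 1, 3))   # a bounded window of recent days
    print(exported_days(today, 1))      # no "ignore" value: the same old days
                                        # are re-sent on every run (list capped)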

Stopping Export

To stop the export of data to an aggregation appliance:

  1. Click Manage > Data Management > Data Export to open Data Export.
  2. Clear the Export checkbox.
  3. Click Save.
Note: An export cannot be stopped once the Run Once Now button has been clicked.

Importing Data

The Guardium collector units export encrypted data files to another Guardium appliance configured as an aggregation appliance. The encrypted data files reside in a special location on the aggregation appliance until the aggregation appliance executes an import operation to decrypt and merge all data to its own internal database.

Note: To avoid the possibility of importing files that have not completely arrived, the aggregation appliance will not import files that have changed in the last two minutes.
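
Below is a minimal Python sketch, not Guardium code, of the "quiet time" check described in the note above: files modified within the last two minutes are skipped. The staging directory name /tmp/agg_staging is hypothetical.

    # Illustrative sketch only: skip export files that are still arriving by
    # requiring two minutes since the last modification.
    import os, time

    def ready_for_import(path, quiet_seconds=120):
        age_since_last_change = time.time() - os.path.getmtime(path)
        return age_since_last_change >= quiet_seconds

    staging_dir = "/tmp/agg_staging"    # hypothetical staging location
    if os.path.isdir(staging_dir):
        for name in os.listdir(staging_dir):
            path = os.path.join(staging_dir, name)
            status = "import" if ready_for_import(path) else "skip (still arriving)"
            print(name, "->", status)
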
Table 3. Importing Data

Function: Import the uploaded export files and merge their data into the internal database of the aggregator.

Schedule: Executed on a daily basis; do not run more than once a day. Starts at 02:00 (or after export has ended on the sending units). Assumed to take up to 3 hours to complete.

High Level Process: Go over the uploaded export files; decrypt and extract each file, then merge its data into the aggregator’s internal database.

Follow the procedure to define the Data Import operation on an aggregation appliance. You can define only a single Data Import configuration on each unit.

  1. Click Manage > Data Management > Import to open Import.
  2. Check the Import checkbox. An additional, non-modifiable field appears, indicating the location of the data files to be imported.
  3. Click Apply to save the configuration. The Apply button becomes available only when you toggle the Import data from checkbox on or off.
  4. Click Run Once Now to run the operation once.
  5. Click Modify Schedule to open the general-purpose task scheduler and schedule the operation to run on a regular basis. This aggregation appliance and all units exporting data to it must have the same System Shared Secret. If not, the export operations still work, but the aggregation appliance cannot decrypt the files of exported data.

Stopping Import

To stop importing data sent from other Guardium units:

  1. Click Manage > Data Management > Import to open Import.
  2. Clear the Import data box.
  3. Click Apply to save the configuration. Stopping importing does not stop other Guardium units from exporting data to this system. To stop that, you must stop the Export operation on each sending unit.
Note: An import cannot be stopped once the Run Once Now button has been clicked.

Archiving and Purging

Archiving and purging data on a regular basis is essential for the health of your Guardium system. For the best performance, we strongly recommend that you archive and purge all data that is not needed; purging is important because it frees disk space. For example, if you only need three months of data on the Guardium appliance, archive and purge all data that is older than 90 days.

The archive and purge process frees space and preserves information for future use. You should periodically archive and purge data from standalone units and from aggregation units. The Guardium archive function creates signed, encrypted files that cannot be tampered with. Archive files are transferred to and stored on external systems such as file servers or storage systems.

Note:

If both Archive and Purge are scheduled, Purge will run after Archive.

Data that was archived on a collector can be restored on either another collector or an aggregator server. Restoring data that was archived on an aggregator to a collector machine is not supported.

By default, all static tables on an aggregator are archived daily. Adding the static tables to the normal purge process eliminates orphaned data, freeing up disk space and improving report performance.

Archive and export of static tables on an aggregator includes full static data only on the first day of the month (archive) or when the export configuration changes (export). Use the CLI commands store archive_table_by_date [enable | disable] and show archive_table_by_date. Other relevant CLI commands are store aggregator clean orphans and show aggregator clean orphans.

Scheduling Data Management tasks: Default schedule times are supplied when the unit is built, and these can be amended as needed. Schedule the Data Management tasks at less busy times, for example, overnight. Space them out so that they do not overlap (for example, one task should finish before the next one starts).

Aggregator Data Archive: take care when an Aggregator/Central Manager performs both Data Imports and Data Archives. A default or common setting is to have Data Archive archive data older than one day, ignoring data older than two days. If the Data Archive is scheduled to run BEFORE the Data Imports from the other collector(s)/aggregator(s), the archive will NOT contain the imports meant for that day’s archive. Imagine the following schedule: Data Archive (data older than 1 day, ignoring data older than 2 days) runs at 00:30, and the Data Imports run at 6:00 AM. When the archive runs, it does not archive any relevant data for yesterday, because no imports of that day’s data have yet occurred. In this example, the Data Archive should be rescheduled to occur AFTER the Data Import(s) have finished, so that the archive correctly contains yesterday’s data.
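
The following is a minimal Python sketch, not Guardium code, that flags the scheduling pitfall just described; the times are taken from the example above and the function is purely illustrative.

    # Illustrative sketch only: warn when the aggregator's Data Archive starts
    # before that day's Data Imports are scheduled to finish.
    from datetime import time

    def archive_misses_imports(archive_start, imports_finish):
        # True when the archive would run before the imports have landed.
        return archive_start < imports_finish

    archive_start = time(0, 30)     # 00:30, as in the example
    imports_finish = time(6, 0)     # imports scheduled for 06:00

    if archive_misses_imports(archive_start, imports_finish):
        print("Reschedule Data Archive to run after the Data Imports complete.")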

Table 4. Archiving and Purging Data

Purge Function: Delete old records from the appliance (typically, records older than 60 days) to free up space and speed up access to the internal database. Purging is based on dates (whole days’ worth of data are deleted), but records that are still “in use” (for example, open sessions) are not deleted.

Schedule: The default purge activity is scheduled every day at 5:00 AM. On collectors, purge runs after the export/archive; on an aggregator, after the import. Assumed to take up to 2 hours to complete.

High Level Process (for each purged day): Construct the delete command for each purged table (the tables and the purge conditions are defined in AGG_TABLES), then execute the delete commands for each of the tables.

Purge Configuration: The purge configuration is used by both Data Archive and Data Export. Use the Purge data older than field to specify a starting day for the purge operation as a number of days, weeks, or months prior to the current day, which is day zero.

Default Purging: The default purge value is 60 days, and the default purge activity is scheduled every day at 5:00 AM. For a new install, a default purge schedule based on these defaults is applied. When a unit type is changed (to manager or managed, or back to standalone), the default purge schedule is applied. The purge schedule is not affected during an upgrade.

It may be necessary to run reports or investigations on this data at some point. For example, some regulatory environments may require that you keep this information for three, five, or even seven years in a form that can be queried within 24 hours. This functionality is supported by the Guardium restore capability, which allows you to restore archived data to the unit.

The following sections describe how to define and schedule archiving and how to restore from an archive.

Note: The archive and restore operations depend on the file names generated during the archiving process. DO NOT change the names of archived files.

Archive data files can be sent to an SCP or FTP host on the network, or to an EMC Centera or TSM storage system (if configured). You can define a single archiving configuration for each unit. To archive data to another host on the network, and optionally purge data from the unit, follow this procedure.

  1. Click Manage > Data Management > Data Archive to open Data Archive.
  2. Check the Archive checkbox to expose additional fields for the archive process.
  3. In the boxes following Archive data older than, specify a starting day for the archive operation as a number of days, weeks, or months prior to the current day, which is day zero. These are calendar measurements, so if today is April 24, all data captured on April 23 is one day old, regardless of the time when the operation is performed. To archive data starting with yesterday’s data, enter the value 1.
  4. Optionally, use the boxes following Ignore data older than to control how many days of data will be archived. Any value specified here must be greater than the value in the Archive data older than field. If you leave the Ignore data older than row blank, you archive data for all days older than the value specified in the Archive data older than row. This means that if you archive daily and purge data older than 30 days, you archive each day of data 30 times (before it is purged on the 31st day). Depending on the archive options configured for your system (using the store storage-system CLI command), you may have EMC Centera or TSM options on your panel. If you select one of those archive destinations, see the appropriate topic.
    1. EMC Centera Archive and Backup
    2. TSM Archive and Backup
  5. Enter the IP address or DNS host name of the host to receive the archived data.
  6. In the Directory box, identify the directory in which the data is to be stored. How you specify this depends on whether the file transfer method used is FTP or SCP. For FTP, specify the directory relative to the FTP account home directory. For SCP, specify the directory as an absolute path.
  7. In the Username box, enter the user name to use for logging onto the host machine. This user must have write/execute permissions for the directory specified in the Directory box.
  8. In the Password box, enter the password for the user, then enter it again in the Re-enter Password box.
  9. Data Purge
  10. Check the Purge checkbox to purge data, whether or not it is archived. When this box is marked, the Purge data older than fields are displayed. Note that the purge configuration is used by both Data Archive and Data Export; changes made here apply to any executions of Data Export, and vice versa. If purging is activated and both Data Export and Data Archive run on the same day, the first operation that runs will likely purge any old data before the second operation executes. For this reason, whenever Data Export and Data Archive are both configured, the purge age must be greater than both the export age and the archive age (see the sketch following this procedure).
  11. If purging data, use the Purge data older than fields to specify a starting day for the purge operation as a number of days, weeks, or months prior to the current day, which is day zero. All data from the specified day and all older days will be purged, except as noted otherwise. Any value specified for the starting purge date must be greater than the value specified for the Archive data older than value. In addition, if data exporting is active (see Exporting Data to an aggregation appliance), the starting purge date specified here must be greater than the Export data older than value. There is no warning when you purge data that has not been archived or exported by a previous operation. The purge operation does not purge restored data whose age is within the do not purge restored data timeframe specified on a restore operation. For more information, see Restoring Archived Data.
  12. Use the Scheduling section to define a schedule for running this operation on a regular basis.
  13. Click Save to verify and save the configuration changes. When you click the Save button, the system attempts to verify the specified Host, Directory, Username, and Password by sending a test data file to that location.
  14. Click Run Once Now to run the operation once.
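
Below is a minimal Python sketch, not Guardium code, of the rule from steps 10 and 11: when export, archive, and purge are all configured, the purge age must exceed both other ages so that data is never purged before it has been exported and archived. The function name and values are illustrative.

    # Illustrative sketch only: validating the relationship between purge,
    # export, and archive ages (all expressed in days).
    def purge_age_is_safe(purge_older_than, export_older_than=None,
                          archive_older_than=None):
        other_ages = [a for a in (export_older_than, archive_older_than)
                      if a is not None]
        return all(purge_older_than > age for age in other_ages)

    print(purge_age_is_safe(purge_older_than=30, export_older_than=1,
                            archive_older_than=1))   # True: safe configuration
    print(purge_age_is_safe(purge_older_than=1, archive_older_than=1))  # False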

Orphan cleanup on aggregators

When the aggregator includes restored data, orphan cleanup for the restored data is scheduled according to the expiration date that was set when the data was first restored.

If the expiration date is later changed through GuardAPI commands, the change does not affect the date on which the restored data becomes available for orphan cleanup.

For example: the user restores data and wants to keep it for 7 days. The expiration date of this data is therefore 7 days from today, and the data becomes available for orphan cleanup after 7 days.

If the expiration date is then changed (to keep the data for a shorter or longer period), this does not affect the date on which the restored data becomes available for orphan cleanup; the rest of the data on the machine remains available for orphan cleanup as originally scheduled. Pay particular attention to this when extending the expiration period, so that data is not lost.

EMC Centera Archive and Backup

To use EMC Centera:

  1. Click Manage > Data Management > Data Archive to open Data Archive.
  2. Click Data Archive or System Backup in the Data Management section. Initially, the Network radio button is selected by default and the network backup parameters are displayed.
  3. Select the EMC Centera radio button. The EMC Centera parameters will be displayed on the panel.
  4. In the Retention box, enter the number of days to retain the data. The maximum is 24855 days (68 years). If you want to save the data for longer, you can restore it later and save it again.
  5. In the Centera Pool Address box, enter the Centera Pool Connection String; for example: 10.2.3.4,10.6.7.8/var/centera/profile1_rwe.pea
  6. Click Upload PEA to upload a Centera PEA file to be used for the connection string.
  7. Click Save to save the configuration. The system will attempt to verify the Centera address by opening a pool using the connection string specified. If the operation fails, you will be informed and the configuration will not be saved.

TSM Archive and Backup

When you select TSM as an archive or backup destination, the TSM portion of the archive or backup configuration panel expands. Before setting TSM as an archive or backup destination, the Guardium system must be registered with the TSM server as a client node. A TSM client system options file (dsm.sys) must be created (on your PC, for example) and uploaded to Guardium. Depending on how that file is defined, you may also need to upload a dsm.opt file. For help creating a dsm.sys file for use by Guardium, consult with your company’s TSM administrator. To upload a TSM configuration file, use the CLI command, import tsm config.

The TSM (or Spectrum Protect client) lifecycle is defined by the Spectrum Protect product terms.

To use TSM:

  1. Click Manage > Data Management > Data Archive to open Data Archive.
  2. Select the TSM radio button. The TSM parameters will be displayed on the panel.
  3. In the Password box, enter the TSM password that this Guardium unit uses to request TSM services, and re-enter it in the Re-enter Password box.
  4. Optionally enter a Server name matching a servername entry in your dsm.sys file.
  5. Optionally enter an As Host name.
  6. Click Save to save the configuration. When you click Save, the system attempts to verify the TSM destination by sending a test file to the server using the dsmc archive command. If the operation fails, you are informed and the configuration is not saved.

Stopping Archiving and Purging

  1. Click Manage > Data Management > Data Archive to open Data Archive.
  2. Clear the Archive or Purge box.
  3. Click Save.

Verify Archiving and Purging Process

  1. Click Reports > Guardium Operational Reports > Aggregation/Archive Log to open the Aggregation/Archive Log.
  2. Check to ensure that each Archive/Purge operation has a status of Succeeded.

Reporting on Aggregation and Archiving Activity

  1. Navigate to Manage > Reports > Data Management > Aggregation/Archive Log to open the Aggregation/Archive Log.
  2. Define a query and build a report.

Restoring

As described previously, archives are written to an SCP or FTP host, or to a Centera or TSM storage system. To restore archives, you must copy the appropriate file(s) back to the Guardium system on which the data is to be restored. There is a separate file for each day of data. Depending on how your archive/purge operation is configured, you may have multiple copies of data archived for the same day. Archive and export data files have the same name format: <daysequence>-<hostname.domain>-w<run_datestamp>-d<data_date>.dbdump/TAR file. To restore archived data (as opposed to a system backup), use the GUI screen called Catalog Archive. The archive and restore operations depend on the file names generated during the archiving process. DO NOT change the names of archived files. If a generated file name is changed, the restore operation will not work.

For example: 732423-g1.guardium.com-w20050425.040042-d2009-04-22.dbdump/TAR file.
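
The following is a minimal Python sketch, not Guardium code, that pulls the individual fields out of a file name with the format shown above. The field names and the exact pattern are assumptions inferred from the example file name.

    # Illustrative sketch only: parsing an archive/export file name.
    import re

    ARCHIVE_NAME = re.compile(
        r"^(?P<day_sequence>\d+)-(?P<host>.+)"
        r"-w(?P<run_datestamp>[\d.]+)-d(?P<data_date>\d{4}-\d{2}-\d{2})\.(?P<ext>.+)$"
    )

    name = "732423-g1.guardium.com-w20050425.040042-d2009-04-22.dbdump"
    parts = ARCHIVE_NAME.match(name).groupdict()
    print(parts["host"], parts["data_date"])   # g1.guardium.com 2009-04-22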

Unless you are restoring data from the first archive created during the month, you need to restore multiple days of data. That is because, when restoring data, Guardium needs to have all of the information that it had when the data being restored was archived. After the archive was created, some of that information may have been purged due to lack of use. All information needed for a restore operation is archived automatically the first time that data is archived each month. So, when restoring data, you can either restore the first day of the month and all following days up to the desired day, or restore the desired day and then the first day of the following month.

For example, to restore June 28th, either restore June 1st through June 28th, or restore June 28th and July 1st.
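
Below is a minimal Python sketch, not Guardium code, of the two restore options just described and how many daily archive files each one requires; the helper function is purely illustrative.

    # Illustrative sketch only: the two ways to restore a given day.
    from datetime import date, timedelta

    def restore_candidates(target):
        first_of_month = target.replace(day=1)
        # Option A: first of the month through the desired day.
        option_a = [first_of_month + timedelta(days=i)
                    for i in range((target - first_of_month).days + 1)]
        # Option B: the desired day plus the first day of the following month.
        next_month = (target.replace(day=28) + timedelta(days=4)).replace(day=1)
        option_b = [target, next_month]
        return option_a, option_b

    a, b = restore_candidates(date(2024, 6, 28))
    print(len(a), "days: June 1st through June 28th")
    print(len(b), "days: June 28th plus July 1st")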


  1. Click Manage > Data Management > Data Restore to open Data Restore.
  2. Enter a date in the From box, to specify the earliest date for which you want data.
  3. Enter a date in the To box, to specify the latest date for which you want data.
  4. In the Host Name box, optionally enter the name of the Guardium appliance from which the archive originated.
  5. Click Search.
  6. In the Search Results panel, mark the Select box for each archive you want to restore.
  7. In the Don't purge restored data for at least box, enter the number of days that you want to retain the restored data on the appliance.
  8. Click Restore.
  9. Click Done when you are finished.

Troubleshooting

On an escalation to technical support, please supply a detailed log from the time when the problem occurred. Navigate to Manage > Reports > Data Management > Aggregation/Archive Log and define a report for the time period in question.

Calculating maximum number of Collectors per Aggregator

When a Guardium system is built from an .ISO, a default value of 10 for the maximum number of collectors per aggregator is set.

When a customer upgrades the Guardium system, the system calculates the maximum number of collectors using the following logic:

  1. Get the number of collectors that export to this aggregator from an internal Guardium table.

  2. If the result of step 1 is 0 (no collectors are found), the system uses the default value of 10.

  3. The system then adds 20 percent to the number determined in steps 1 and 2.

For example, if step 1 does not find any collectors, step 2 sets the value to 10 and step 3 adds 20 percent, making it 12. As another example, if step 1 finds five collectors exporting to the aggregator, the value is set to 5 (step 2 does not apply because the result was not 0), and step 3 adds 20 percent, setting the value to 6.
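
The following is a minimal Python sketch, not Guardium code, of the calculation described above; the function name is hypothetical and the rounding of the 20 percent increase is an assumption consistent with the two examples.

    # Illustrative sketch only: the upgrade-time calculation of the maximum
    # number of collectors per aggregator.
    def max_collectors_per_aggregator(found_collectors):
        # Steps 1-2: fall back to the default of 10 when no collectors are found.
        base = found_collectors if found_collectors > 0 else 10
        # Step 3: add 20 percent headroom to the base number.
        return round(base * 1.2)

    print(max_collectors_per_aggregator(0))   # 12  (default 10 + 20%)
    print(max_collectors_per_aggregator(5))   # 6   (5 + 20%)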