IBM Support

QRadar: About event retention buckets

Question & Answer


Question

What are retention buckets and retention policies, and what do administrators who manage data storage in QRadar need to know about them?

Answer

How QRadar Data is Stored on Disk

QRadar® stores event and flow data in the Ariel database, a custom time-series database organized in one-minute intervals. When QRadar processes an event or flow and needs to store it on disk, the data is written to a series of flat files in chronological order. Ariel records (the normalized records) are stored in /store/ariel/<db>/records/YYYY/MM/dd/hh, where <db> is the Ariel database (events, flows, simarc (QRM), or Global Views (gv)), YYYY is the year, MM is the month, dd is the day, and hh is the hour.
 
The raw payloads are stored separately in a similar directory structure: /store/ariel/<db>/payloads/YYYY/MM/dd/hh. Once per minute, Ariel accumulates the data it received and writes its records and indexes to disk in units referred to as intervals. The timestamp on each file matches the time at which the data was received, and every event received during that minute is written into that Ariel interval file.
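The directory layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a QRadar tool: the function name ariel_hour_dir is hypothetical, and the sketch assumes that month, day, and hour are not zero-padded, as in the example path /store/ariel/events/records/2016/1/1/9 shown in this article. The article does not state which time zone the directories follow, so the datetime is used as given.

```python
from datetime import datetime
from pathlib import PurePosixPath

ARIEL_ROOT = PurePosixPath("/store/ariel")


def ariel_hour_dir(db: str, kind: str, when: datetime) -> PurePosixPath:
    """Build the hourly directory for a database and timestamp.

    kind is "records" (normalized records and indexes) or "payloads" (raw payloads).
    Assumption: path components are not zero-padded, matching the article's example.
    """
    if kind not in ("records", "payloads"):
        raise ValueError(f"kind must be 'records' or 'payloads', got {kind!r}")
    return (
        ARIEL_ROOT / db / kind
        / str(when.year) / str(when.month) / str(when.day) / str(when.hour)
    )


if __name__ == "__main__":
    received = datetime(2016, 1, 1, 9, 21)
    print(ariel_hour_dir("events", "records", received))
    print(ariel_hour_dir("events", "payloads", received))
```

Keeping the records and payloads paths behind one function makes it clear that the two trees differ only in a single path component.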
 
To verify this, SSH into a QRadar appliance. From /store/ariel/events/records/, change into one of the hourly records directories and enter the command ls to list all the files. There are many different files there, and they are named in the format events~mm~<UUID>~<retention_bucket>, where mm is the minute, UUID is a unique identifier for that minute, and retention_bucket is the numbered retention bucket (~0 is the default retention bucket).
An example of a file in the Events database:
/store/ariel/events/records/2016/1/1/9/events~21_0~ab62f8ecd3b4408~9301b7073617ac26~0

The example indicates that it is an event record file (normalized data - all event records for this minute) that was captured on January 1st, 2016, at 9:21.  If the file is deleted, you no longer retain events for that minute.
Other files in that directory are indexes and look like:
/store/ariel/events/records/2016/1/1/9/SourceIP~21_0~ab62f8ecd3b4408~9301b7073617ac26~0
Example of the SourceIP index for that minute.
If the SourceIP example were the records and indexes for retention bucket 5, the filename would end in ~5 instead. The last item in this directory is the Lucene directory, which supports the Quick Search feature in QRadar.
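The naming convention can be illustrated with a small parser. This is a sketch under stated assumptions rather than an official format definition: the names ArielFileName and parse_ariel_filename are hypothetical. In the example file names, the identifier spans two ~-separated groups and the minute field carries a _0 suffix whose meaning this article does not explain, so the sketch keeps everything between the minute field and the final field as the identifier and reads only the digits before the underscore as the minute.

```python
from typing import NamedTuple


class ArielFileName(NamedTuple):
    name: str              # "events" for the record file, or an index name such as "SourceIP"
    minute: int            # minute of the hour that the interval covers
    uuid: str              # identifier for the interval, kept verbatim
    retention_bucket: int  # 0 is the default retention bucket


def parse_ariel_filename(filename: str) -> ArielFileName:
    """Split a file name such as events~21_0~<UUID>~0 into its parts."""
    parts = filename.split("~")
    if len(parts) < 4:
        raise ValueError(f"unexpected Ariel file name: {filename!r}")
    # Assumption: the minute is the number before the underscore in the second field.
    minute = int(parts[1].split("_")[0])
    return ArielFileName(parts[0], minute, "~".join(parts[2:-1]), int(parts[-1]))


if __name__ == "__main__":
    print(parse_ariel_filename("events~21_0~ab62f8ecd3b4408~9301b7073617ac26~0"))
    print(parse_ariel_filename("SourceIP~21_0~ab62f8ecd3b4408~9301b7073617ac26~0"))
```

Entries that do not follow the pattern, such as the Lucene directory, raise ValueError and can be skipped by the caller.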

Retention policies and how they are applied


Note: The user interface might display a Compression column. This is a known issue that is tracked in APAR IJ20880.

Retention

  • Keep data placed in this bucket for your retention period: Data is never deleted while it is still within the retention period. If the system reaches 95% disk usage, all services are shut down before that data is deleted.
  • Delete data in this bucket: This setting defines how strictly QRadar adheres to your retention policy. If you must keep three months of data but would like to keep six, set this to When storage space is required. If you set this to Immediately after retention period expires, QRadar deletes expired data hourly to clean up disks and preserve space.

    Legacy user interface
    Older QRadar versions can include the field Allow data in this bucket to be compressed. This option no longer exists because all data in QRadar is compressed by default.

Retention buckets and how they work

Retention buckets work as a waterfall, in priority order. Events and flows run through the filters of each retention bucket in order (1-10), and the first bucket whose filter they match is the bucket the data falls into. QRadar never writes an event into two different retention buckets. As described in the section on how data is stored on disk, retention buckets end up as different files on disk so that their retention policies can be handled separately.
 
As an illustration, set up two retention buckets, 1 and 2, with the same criteria (for example, the source IP is in 10.0.0.0/8). Wait for data to accumulate (ideally with source IP addresses in the 10 address space), then look at the files in the directories described in the section on how data is stored on disk. You see files created with retention bucket 1, but none for retention bucket 2. This is because of the ordering: all matching events fall into the first bucket, so none remain to match the second.
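The first-match behavior in this illustration can be sketched as follows. It is a simplified model, not QRadar's implementation: the function assign_bucket and the representation of a filter as a single CIDR string are assumptions made for the example, and real retention bucket filters can test many other properties.

```python
from ipaddress import ip_address, ip_network

DEFAULT_BUCKET = 0  # data that matches no filter falls into the default bucket


def assign_bucket(source_ip: str, bucket_filters: dict[int, str]) -> int:
    """Return the first bucket, in priority order, whose CIDR filter matches."""
    addr = ip_address(source_ip)
    for bucket_id in sorted(bucket_filters):
        if addr in ip_network(bucket_filters[bucket_id]):
            return bucket_id  # first match wins; later buckets never see this event
    return DEFAULT_BUCKET


if __name__ == "__main__":
    # Buckets 1 and 2 use the same criteria, as in the illustration above.
    filters = {1: "10.0.0.0/8", 2: "10.0.0.0/8"}
    for ip in ("10.1.2.3", "10.200.0.7", "192.168.1.5"):
        print(ip, "-> bucket", assign_bucket(ip, filters))
```

Because the loop returns on the first match, bucket 2 never receives any data while bucket 1 has the same filter, which is why no files with a ~2 suffix appear on disk.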
Keep in mind that a change to the filters for a retention bucket applies only to data received after the change is made. It is not retroactive. If you let bucket 1 collect all the 10/8 data for 24 hours and then switch it to only 192.168 data, QRadar does not go back and reassign past data. The result is that retention bucket 1 holds 24 hours of 10/8 data, and everything from the filter change onward is 192.168 data. Contrast this with the retention policy set on that bucket. If you retain all data in bucket 1 for 7 days, all data is retained for 7 days regardless of how many filter changes were made or what is in the bucket. If you change the policy to a shorter time frame, that change applies to everything in the bucket. In short, filters are applied going forward only, whereas retention policies are applied to the bucket as a whole regardless of filters. Retention policies are meant to be a long-term configuration and should not change daily or regularly.

How retention policies are enforced

Your system is up and running, and data is accumulating in Ariel. When QRadar needs to evaluate data for removal, this is what occurs:
 
  • Deletion: If the bucket is set to Delete data in this bucket: Immediately after retention period expires, no disk space checks are done and the deletion task removes any data that is past the retention period. Deletion tasks run every hour, and a task is submitted at that interval for each retention bucket.
  • Disk space checks: Otherwise, deletion tasks run only when free disk space drops to 15% or less, and they remove the oldest events on disk based on the checks in the retention bucket. Deletion tasks run until free disk space reaches 18% or no more data is eligible for deletion under the policy. Refer to the IBM® Knowledge Center article on Data Retention for more information.
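The two enforcement paths can be modeled with a short simulation. This is a sketch for reasoning about the thresholds, not QRadar's deletion code: the Interval class, the select_for_deletion function, and the idea of measuring each interval as a percentage of the disk are all assumptions made for the example. It also assumes that only data past the retention period is ever eligible for deletion.

```python
from dataclasses import dataclass

START_FREE_PCT = 15.0  # disk space cleanup starts at or below this free percentage
STOP_FREE_PCT = 18.0   # and stops once this much space is free again


@dataclass
class Interval:
    age_days: float  # age of the stored interval
    size_pct: float  # share of the disk it occupies, in percent (hypothetical unit)


def select_for_deletion(intervals, retention_days, delete_immediately, free_pct):
    """Return the intervals a deletion task would remove, oldest first."""
    # Data still within the retention period is never a candidate.
    expired = sorted(
        (i for i in intervals if i.age_days > retention_days),
        key=lambda i: i.age_days,
        reverse=True,
    )
    if delete_immediately:
        return expired  # no disk space checks in this mode
    if free_pct > START_FREE_PCT:
        return []       # enough free space, so expired data is kept for now
    removed = []
    for interval in expired:
        if free_pct >= STOP_FREE_PCT:
            break       # target reached; remaining expired data stays on disk
        removed.append(interval)
        free_pct += interval.size_pct
    return removed


if __name__ == "__main__":
    data = [Interval(100, 2), Interval(95, 2), Interval(10, 2), Interval(91, 2)]
    print(select_for_deletion(data, 90, delete_immediately=True, free_pct=50))
    print(select_for_deletion(data, 90, delete_immediately=False, free_pct=14))
```

The gap between the 15% trigger and the 18% target means that, once cleanup starts, it frees a margin of space rather than stopping as soon as the trigger condition clears.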
     
 


Document Information

Modified date:
06 July 2022

UID

ibm16379748