Removing data from the Ariel database
The ACP (Ariel Copy) tool reads through an Ariel database, applies criteria, and then rewrites the filtered data to another location. This tool is useful for GDPR (General Data Protection Regulation) compliance. For example, you might need to remove all data that is associated with a given user name or source IP address.
Usage
Provide the tool with the database name (events or flows), the start and end times to filter, the AQL query (-q) or the key creator and value (-k/-p) combination to apply, the user to run it as (-u), and the destination directory (-d).
The ACP tool reads through the Ariel database for the specified times, applies the filter, and places all the records that match the filter in the specified destination directory.
[root@m5arch06 templates]# /opt/qradar/bin/runjava.sh com.q1labs.ariel.io.ACP
data base name is required
Configured data bases:
flows
hc
simarc
events
simevent
usage: ACP -n events -d /store/new_ariel -b "2013/11/21 1:00:00" -e
"2013/11/21 4:00:00"
ACP -n events -d /store/new_ariel -b "2013/11/21 1:00:00" -e "2013/11/21
4:00:00" -q "username is not null and sourceip = '127.0.0.1'" -u admin
Options:
-n,--name data base name, for example -n events
-b,--begintime begin time, for example -t "2018/08/28 09:04",
optional, by default current system time
-e,--endtime end time, for example -t "2018/08/28 10:04",
optional, by default current system time
-d,--destination Destination directory i.e. /store/filtered_ariel/
-h,--help print this message
-k,--keycreator class for IKeyCreator
-p,--param optional parameter for IKeyCreator
-q,--query AQL 'where clause' to copy records
-u,--user username (default=admin)
-v,--testedvalue value to compare createKey with
[root@m5arch06 templates]#
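For example, to address the source IP case mentioned earlier, an invocation might keep only the events that did not come from a particular address. The following is only a sketch based on the usage output above; the IP address, time window, and destination directory are placeholders, and the <> inequality operator in the AQL where clause is an assumption that you should verify against your AQL version:
/opt/qradar/bin/runjava.sh com.q1labs.ariel.io.ACP -n events -b "2018/08/28 09:00:00" -e "2018/08/28 10:00:00" -q "sourceip <> '10.10.10.10'" -u admin -d /store/new_ariel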
Primer on the Ariel file structure
Before we go any further, let's review how Ariel data is laid out on disk. Ariel is a time-series-based database. Ariel does not know what data is on disk until it searches it. You can copy Ariel data between systems without any consequences and without the Ariel database knowing about it. Ariel data is stored in minute-by-minute records in a format similar to the following example:
/store/ariel/[events | flows ]/[records | payloads | md]/YYYY/MM/dd/hh/
- events or flows is the name of the Ariel database
- records is the normalized event record
- payloads is the raw payload associated with the records
- md are the hash signatures or message digests (optionally HMAC encrypted) of the Ariel data
In general, the Ariel data files have the following structure:
identifier~Minute#_File#~UUID~UUID~RetentionBucketNumber
- events~5_0~35c5c6bddcb24578~ad5711e8c1337858~0
This record represents the normalized event record for the 5th minute and the 0th retention bucket (default bucket).
- SourceIP~5_0~35c5c6bddcb24578~ad5711e8c1337858~0
The sourceIP index for the 5th minute and the 0th retention bucket (default bucket).
In the payloads directory, you see files like the following:
- payload_events~5_0~35c5c6bddcb24578~ad5711e8c1337858~0
The raw payloads for the 5th minute and the 0th retention bucket (default bucket).
In the md directory, you see files like the following:
- events~5_0~35c5c6bddcb24578~ad5711e8c1337858~0.HMACSHA512
The HMAC-SHA512 signature for the normalized event records for the 5th minute and 0th retention bucket.
- payload_events~5_0~35c5c6bddcb24578~ad5711e8c1337858~0.HMACSHA512
The HMAC-SHA512 signature for the raw payloads for the 5th minute and the 0th retention bucket.
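To see this layout on a live system, you can list one hour's directories directly. These commands are only a sketch; the date path must match data that actually exists on your deployment:
ls /store/ariel/events/records/2018/8/28/9/
ls /store/ariel/events/payloads/2018/8/28/9/
ls /store/ariel/events/md/2018/8/28/9/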
Note: The ACP tool creates a directory called hashes instead of md. If you have file hashing or integrity hashing enabled, account for this difference when you copy the filtered data back into place (as the example later in this article does).
Example - Removing events that have root as the user name
In this example, we want to remove all events that have the user name of root from the Ariel events database for a time period between 09:00 and 10:00.
- First, we search and show what we want to remove. The following screen capture shows all events:
- Next, run the ACP tool:
[root@m5arch06 ~]# /opt/qradar/bin/runjava.sh com.q1labs.ariel.io.ACP -n events -b "2018/08/28 09:00:00" -e "2018/08/28 10:00:00" -q "username not like 'root'" -u admin -d /store/new_ariel
AQL criteria: [username not like 'root']
User: admin
Copying: [events] to /store/new_ariel
Timeline from Tue Aug 28 09:00:00 NDT 2018 to Tue Aug 28 09:59:00 NDT 2018
Trying to copy dir: /store/ariel/events/records/2018/8/28/9[18-08-28,09:00:00]
Reader started ...
Processing interval /store/ariel/events/records/2018/8/28/9/events~59_0~13e6848042143d0~92f58da705b4b452~0[18-08-28,09:59:00]
Processing interval /store/ariel/events/records/2018/8/28/9/events~58_0~23b0a632f1df4d10~bbe50149b34453d8~0[18-08-28,09:58:00]
Processing interval /store/ariel/events/records/2018/8/28/9/events~57_0~745684fd0cca424c~a9857807a2952671~0[18-08-28,09:57:00]
Processing interval /store/ariel/events/records/2018/8/28/9/events~56_0~89c02507d81940c3~a54edd654c2ce796~0[18-08-28,09:56:00]
...
Processing interval /store/ariel/events/records/2018/8/28/9/events~4_0~2c48501471734615~b8ca3cb590a16e05~0[18-08-28,09:04:00]
Processing interval /store/ariel/events/records/2018/8/28/9/events~3_0~3e369b679aa64306~a97b95e8528a1a9b~0[18-08-28,09:03:00]
Processing interval /store/ariel/events/records/2018/8/28/9/events~2_0~4e370db759d44bd2~81f695b0164b5a66~0[18-08-28,09:02:00]
Processing interval /store/ariel/events/records/2018/8/28/9/events~1_0~cdbe9e25b9ee4dbd~982ef0da1ab7e85d~0[18-08-28,09:01:00]
Processing interval /store/ariel/events/records/2018/8/28/9/events~0_0~ecd2e4bf6bbb425a~b2acfe04fda202a6~0[18-08-28,09:00:00]
Reader stopped. Written 9507759 records, Skipped 2654 records
Completed copying dir: /store/ariel/events/records/2018/8/28/9[18-08-28,09:00:00]
[root@m5arch06 ~]#
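If you need a record of the run for compliance or audit purposes, you can capture the tool's output to a file when you run it. This is only a sketch; the log file path is an example:
/opt/qradar/bin/runjava.sh com.q1labs.ariel.io.ACP -n events -b "2018/08/28 09:00:00" -e "2018/08/28 10:00:00" -q "username not like 'root'" -u admin -d /store/new_ariel 2>&1 | tee /root/acp_remove_root.log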
The following screen capture shows all events without the root user name. As explained previously, we removed all events from 09:00 to 10:00 that matched the user name root, so with the ACP tool we specified in our criteria that the user name was NOT root. We also placed the filtered records in the /store/new_ariel directory.
- Replace the old Ariel data with the new Ariel data.
Looking in the /store/new_ariel directory (the destination we chose in this example), we can see that we processed one hour's worth of data:
[root@m5arch06 new_ariel]# find /store/new_ariel -maxdepth 6
/store/new_ariel
/store/new_ariel/events
/store/new_ariel/events/hashes
/store/new_ariel/events/hashes/2018
/store/new_ariel/events/hashes/2018/8
/store/new_ariel/events/hashes/2018/8/28
/store/new_ariel/events/hashes/2018/8/28/9
/store/new_ariel/events/records
/store/new_ariel/events/records/2018
/store/new_ariel/events/records/2018/8
/store/new_ariel/events/records/2018/8/28
/store/new_ariel/events/records/2018/8/28/9
/store/new_ariel/events/payloads
/store/new_ariel/events/payloads/2018
/store/new_ariel/events/payloads/2018/8
/store/new_ariel/events/payloads/2018/8/28
/store/new_ariel/events/payloads/2018/8/28/9
[root@m5arch06 new_ariel]#
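As a quick sanity check before swapping directories, you can compare the number of normalized record files (one per processed minute) in the original hour with the filtered copy. This is only a rough check, under the assumption that the ACP tool wrote a record file for every interval it processed; if the counts differ, investigate before continuing:
ls /store/ariel/events/records/2018/8/28/9/events~* | wc -l
ls /store/new_ariel/events/records/2018/8/28/9/events~* | wc -l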
In this example, the time range was chosen to line up on hour boundaries to make the archiving and copying easier (whole hour directories can be moved with the mv command). If your data does not line up on hour boundaries, you need to migrate the data file by file, as sketched below.
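The following is a minimal sketch of a file-by-file migration, assuming the filtered data is in /store/new_ariel, that only minutes 0 through 29 of the hour are being replaced, and that the original files for those minutes have already been archived out of the way. The minute range and paths are placeholders; note that the ACP tool writes to hashes/ while the live database uses md/.
# Copy the filtered files for minutes 0-29 back into the live Ariel directories.
# Each glob is expected to match only the files for that minute.
for minute in $(seq 0 29); do
    cp -a /store/new_ariel/events/records/2018/8/28/9/*~${minute}_* /store/ariel/events/records/2018/8/28/9/
    cp -a /store/new_ariel/events/payloads/2018/8/28/9/*~${minute}_* /store/ariel/events/payloads/2018/8/28/9/
    cp -a /store/new_ariel/events/hashes/2018/8/28/9/*~${minute}_* /store/ariel/events/md/2018/8/28/9/
done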
- Archive the old data from the ariel directory to the ariel_archive directory by typing the following commands:
[root@m5arch06 store]# mv /store/ariel/events/records/2018/8/28/9 /store/ariel_archive/events/records/2018/8/28/
[root@m5arch06 store]# mv /store/ariel/events/payloads/2018/8/28/9 /store/ariel_archive/events/payloads/2018/8/28/
[root@m5arch06 store]# mv /store/ariel/events/md/2018/8/28/9 /store/ariel_archive/events/hashes/2018/8/28/
[root@m5arch06 store]#
At this point, that hour of data is not searchable. The Ariel database cannot find it because the files are no longer in the Ariel directory structure.
- Copy (or move) the new data from the new_ariel directory to the ariel directory by typing the following commands:
[root@m5arch06 store]# cp -a /store/new_ariel/events/records/2018/8/28/9 /store/ariel/events/records/2018/8/28
[root@m5arch06 store]# cp -a /store/new_ariel/events/payloads/2018/8/28/9 /store/ariel/events/payloads/2018/8/28
[root@m5arch06 store]# cp -a /store/new_ariel/events/hashes/2018/8/28/9 /store/ariel/events/md/2018/8/28
[root@m5arch06 store]#
Because we ran searches against this data before we began, remove those searches so that we don't get cached results (the cursor cache) when we run identical searches. Remove the searches from Manage Search Results on the Log Activity tab.
- Run the searches for all data from 09:00 to 10:00.
Note: The total results are the same as the results of the "not root" user name search from before we ran the ACP tool.
- Search for the root user name.
Note: Some of the indexes were not regenerated. The regular minute-by-minute property-based indexes are created, but the super indexes and the free text search (Lucene) indexes are never generated (known issue). Luckily for us, we have the /opt/qradar/bin/ariel_offline_indexer.sh tool!
Usage for the offline indexer script
Use the /opt/qradar/bin/ariel_offline_indexer.sh tool to generate the super indexes.
Usage is:
[root@m5arch06 store]# /opt/qradar/bin/ariel_offline_indexer.sh --help
usage: options
-R,--repair re-build corrupted super indices
-d,--duration time duration to look files for in minutes, for
example -d 5
-n,--name ariel data base name, for example -n events
-t,--endtime end time, for example -t "2018/08/28 11:00",
optional, by default current system time
-F,--renamefrom rename from (internal use)
-L,--light load minimal QRadar frameworks
-T,--renameto rename to (internal use)
-V,--validate validate super indices
-a,--auto backfill all active indexes
-b,--batchmode run in batch mode with options in a file
-f,--fts create free text search indices
-h,--help print this message
-k,--key property java class name
-l,--list list all enabled indices from the configuration
-p,--param optional paramiter for property (key creator
construction)
-r,--remove remove indices for a property
-s,--superindices create super indices from the minute indices
-v,--verbose verbose (optional, default = false)
-w,--threads maximum number of threads to produce minute indices
if requested, default is 48
[root@m5arch06 store]#
[root@m5arch06 store]# /opt/qradar/bin/ariel_offline_indexer.sh -n events -t "2018/08/28 10:00" -d 60 -f -s
2018/08/28 11:04 Running command [-n, events, -t, 2018/08/28 10:00, -d, 60, -f, -s] ...
Trying to index dir: /store/ariel/events/records/2018/8/28/9/lucene[18-08-28,09:00:00]
Completed indexing dir: /store/ariel/events/records/2018/8/28/9/lucene[18-08-28,09:00:00]
All done in 0:02:39.840
[root@m5arch06 store]#
Now our free text index and super indexes are populated.
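If you want to confirm the rebuilt super indexes, the script also exposes a validate option (-V) in its usage output. The following is a sketch only; the combination of options (the same database, end time, and duration as the indexing run) is an assumption based on that usage text:
/opt/qradar/bin/ariel_offline_indexer.sh -n events -t "2018/08/28 10:00" -d 60 -V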