Connect the Amazon Athena data source
to IBM Security QRadar® Suite Software to enable your applications and
dashboards to collect and analyze Amazon Athena
security data. Universal Data Insights connectors enable
federated search across your security products.
Before you begin
Collaborate with an AWS administrator to
obtain a user account with access to query the Amazon Athena data source.
Configure VPC Flow Logs in Amazon Athena
- Enable VPC flow logs in the Amazon Console.
- Configure the VPC flow log service to save logs in an Amazon S3 bucket. For more information, see Publishing flow logs to Amazon
S3 (https://docs.aws.amazon.com/vpc/latest/userguide/flow-logs-s3.html).
- Create an Amazon VPC table for VPC flow logs in the
Amazon Athena service. For more information, see Querying Amazon VPC Flow
Logs (https://docs.aws.amazon.com/athena/latest/ug/vpc-flow-logs.html).
Configure Amazon GuardDuty in Amazon Athena
- Enable the GuardDuty features in the Amazon Console.
- Configure the GuardDuty feature to
export findings to an Amazon S3 bucket. For more
information, see Export Findings
(https://docs.aws.amazon.com/guardduty/latest/ug/guardduty_exportfindings.html).
- Create a table for GuardDuty findings
in Amazon Athena. For more information, see Querying Amazon GuardDuty
Findings (https://docs.aws.amazon.com/athena/latest/ug/querying-guardduty.html).
Configure Amazon Security Lake in Amazon Athena
- Enable and start Amazon Security Lake in the
Amazon Console. For more information, see Getting Started with Amazon Security Lake
(https://docs.aws.amazon.com/security-lake/latest/userguide/getting-started.html).
- Ensure that Amazon Security Lake stores logs in the
Open Cybersecurity Schema Framework (OCSF) format. For more information, see Open Cybersecurity Schema Framework
(OCSF) format (https://schema.ocsf.io/).
- The client program must have query access to the AWS
Lake Formation tables as a subscriber. The following list contains the
minimum IAM permissions for the Amazon Athena connector:
  - "athena:GetQueryExecution"
  - "athena:GetQueryResults"
  - "athena:ListWorkGroups"
  - "athena:StartQueryExecution"
  - "athena:StopQueryExecution"
  - "glue:GetDatabases"
  - "glue:GetTable"
  - "s3:AbortMultipartUpload"
  - "s3:DeleteObject"
  - "s3:GetBucketLocation"
  - "s3:GetObject"
  - "s3:ListBucket"
  - "s3:ListBucketMultipartUploads"
  - "s3:ListMultipartUploadParts"
  - "s3:PutObject"
  - "sts:AssumeRole"
For more information, see
Subscriber management in Amazon Security Lake
(https://docs.aws.amazon.com/security-lake/latest/userguide/subscriber-management.html).
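The permissions that are listed above can be collected into an IAM policy document. The following Python sketch is illustrative only: the function name `build_policy` and the placeholder resource `"*"` are assumptions, and an AWS administrator would normally restrict the `Resource` element to specific ARNs.

```python
import json

# Minimum IAM actions listed above for the Amazon Athena connector.
ATHENA_CONNECTOR_ACTIONS = [
    "athena:GetQueryExecution",
    "athena:GetQueryResults",
    "athena:ListWorkGroups",
    "athena:StartQueryExecution",
    "athena:StopQueryExecution",
    "glue:GetDatabases",
    "glue:GetTable",
    "s3:AbortMultipartUpload",
    "s3:DeleteObject",
    "s3:GetBucketLocation",
    "s3:GetObject",
    "s3:ListBucket",
    "s3:ListBucketMultipartUploads",
    "s3:ListMultipartUploadParts",
    "s3:PutObject",
    "sts:AssumeRole",
]


def build_policy(resource="*"):
    """Return an IAM policy document (as a dict) with one Allow statement.

    Assumption: `resource="*"` is a placeholder for illustration; narrow it
    to the relevant bucket, workgroup, and role ARNs in a real policy.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(ATHENA_CONNECTOR_ACTIONS),
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so that it can be reviewed before use.
    print(json.dumps(build_policy(), indent=2))
```

The printed JSON can be reviewed with the AWS administrator before it is attached to the user account or role.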
If you have a firewall between your cluster and the data source target, use the
IBM® Security Edge Gateway to host the containers. The Edge Gateway must be V1.6 or later. For more information,
see Edge Gateway.
About this task
Amazon Athena uses standard SQL to analyze
data in Amazon S3. QRadar Suite Software currently supports data source
connections for Amazon GuardDuty logs and VPC Flow Logs.
Structured Threat Information eXpression (STIX) is a language and
serialization format that organizations use to exchange cyberthreat intelligence. The connector uses STIX patterning to
query Amazon Athena data and returns results as
STIX objects. For more information about how the Amazon Athena data schema maps to STIX, see Amazon Athena
stix-shifter repository
(https://github.com/opencybersecurityalliance/stix-shifter/tree/develop/stix_shifter_modules/aws_athena).
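To illustrate what a STIX pattern looks like, the following Python sketch builds a single-comparison pattern with an optional time window. It is not part of the connector: the helper names (`stix_string`, `stix_equals`, `with_time_window`) and the example IP address are hypothetical, and the `START`/`STOP` qualifier follows the STIX 2.1 patterning syntax.

```python
from datetime import datetime, timedelta, timezone


def stix_string(value: str) -> str:
    """Quote a value as a STIX string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def stix_equals(object_path: str, value: str) -> str:
    """Build a single-comparison pattern such as [ipv4-addr:value = '...']."""
    return f"[{object_path} = {stix_string(value)}]"


def with_time_window(pattern: str, stop: datetime, minutes: int) -> str:
    """Append a START/STOP qualifier that covers the `minutes` before `stop`."""
    start = stop - timedelta(minutes=minutes)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return f"{pattern} START t'{start.strftime(fmt)}' STOP t'{stop.strftime(fmt)}'"


if __name__ == "__main__":
    stop = datetime(2024, 1, 1, 0, 5, tzinfo=timezone.utc)
    pattern = stix_equals("ipv4-addr:value", "198.51.100.7")
    print(with_time_window(pattern, stop, 5))
```

A pattern of this shape is what a federated search submits; the connector is responsible for translating it into an Amazon Athena SQL query.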
Procedure
- Log in to IBM Security QRadar Suite Software.
- From the menu, click
.
- On the Integration data sources page, on the
Amazon Athena tile, click Set up a
connection.
- On the Connection services page, select the
Federated searches service tile, and then click
Enable.
The available connection services include:
- Connected assets & risk
- Federated searches
Note: If there are multiple data sources to connect, select the connector from the Sources
list, and then click Enable.
- Click Next.
- On the Connection
details page, configure the following parameters.
- Configure the connection to the data source.
Table 1. Connection parameters
| Parameter | Description |
| Data source name | Enter a unique name to identify the data source connection. You can create multiple connections to a data source, so it is useful to clearly set them apart by name. Only alphanumeric characters and the following special characters are allowed: - . _ |
| Data source description | Enter a description to indicate the purpose of the data source connection. You can create multiple connections to a data source, so it is useful to clearly indicate the purpose of each connection by description. Only alphanumeric characters and the following special characters are allowed: - . _ |
| Edge gateway | If you have a firewall between your cluster and the data source target, use the Edge Gateway to host the containers. In the Edge gateway field, specify an Edge Gateway to host the connector. It can take up to five minutes for the status of newly deployed data source connections on the Edge Gateway to show as being connected. |
| Region | Enter the Amazon Athena region for the data source. Select your region code from the Region column of the Service Endpoints table in Amazon Athena endpoints and quotas (https://docs.aws.amazon.com/general/latest/gr/athena.html). |
| Amazon S3 Bucket Location | Enter the location of the S3 bucket where query results are stored. |
| VPC Flow Logs database name (optional) | If you are using Amazon Athena with VPC flow logs, specify the name of the database that contains the VPC flow logs in the VPC Flow Logs database name (optional) field. |
| VPC Flow Logs table name (optional) | If you are using Amazon Athena with VPC flow logs, specify the name of the table that contains the VPC flow logs in the VPC Flow Logs table name (optional) field. |
| Amazon GuardDuty database name (optional) | If you are using Amazon Athena with Amazon GuardDuty, specify the name of the database that contains the Amazon GuardDuty logs in the Amazon GuardDuty database name (optional) field. |
| Amazon GuardDuty table name (optional) | If you are using Amazon Athena with Amazon GuardDuty, specify the name of the table that contains the Amazon GuardDuty logs in the Amazon GuardDuty table name (optional) field. |
| OCSF logs database name (optional) | If you are using Amazon Athena to query Amazon Security Lake logs, specify the name of the AWS Lake Formation database that contains the security logs in the OCSF logs database name (optional) field. |
| OCSF logs table name (optional) | If you are using Amazon Athena to query Amazon Security Lake logs, specify the name of the AWS Lake Formation table that contains the security logs in the OCSF logs table name (optional) field. |
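The naming rule in Table 1 (alphanumeric characters plus the special characters - . _) can be checked before a value is entered. The following Python sketch is one possible check; the function name `is_valid_name` is hypothetical, and the sketch assumes that "alphanumeric" means ASCII letters and digits and that empty values and spaces are not allowed, which the table does not state explicitly.

```python
import re

# Assumption: ASCII letters and digits, plus hyphen, period, and underscore.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_name(name: str) -> bool:
    """Return True if `name` uses only the characters that Table 1 allows."""
    return _NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("athena-prod_01.us", "athena prod", ""):
        print(repr(candidate), is_valid_name(candidate))
```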
- Set the query parameters to control the behavior of the
federated search query on the data source.
Table 2. Query parameters
| Query parameter | Description |
| Concurrent search limit | Enter the number of simultaneous connections that can be made to the data source. The default limit for the number of connections is 4. The value must not be less than 1 and must not be greater than 100. |
| Query search timeout limit | Enter the time limit in minutes for how long the query is run on the data source. The default time limit is 30. When the value is set to zero, no timeout occurs. The value must not be less than 1 and must not be greater than 120. |
| Result size limit | Enter the maximum number of entries or objects that are returned by a search query. The default result size limit is 10,000. The value must not be less than 1 and must not be greater than 500,000. |
| Query time range | Enter the time range in minutes for the search, represented as the last X minutes. The default is 5 minutes. The value must not be less than 1 and must not be greater than 10,000. |
| Custom mapping (Optional) | If you need to customize the STIX attributes mapping, click Customize attribute mapping and edit the JSON blob to map new or existing properties to their associated target data source fields. |
Important: If you increase the Concurrent search limit and the
Result size limit, a greater amount of data can be sent to the data source,
which increases the strain on the data source. Increasing the query time range also increases the
amount of data.
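The defaults and ranges in Table 2 can be summarized as a small validation routine. The following Python sketch is illustrative only: the class and field names are hypothetical and are not product identifiers. Table 2 also states that a timeout of zero disables the timeout; this sketch enforces only the stated range of 1 to 120 and does not model that case.

```python
from dataclasses import dataclass

# Inclusive (minimum, maximum) ranges taken from Table 2.
LIMITS = {
    "concurrent_search_limit": (1, 100),
    "query_search_timeout_minutes": (1, 120),
    "result_size_limit": (1, 500_000),
    "query_time_range_minutes": (1, 10_000),
}


@dataclass
class QueryParameters:
    """Query parameters with the default values that Table 2 lists."""

    concurrent_search_limit: int = 4
    query_search_timeout_minutes: int = 30
    result_size_limit: int = 10_000
    query_time_range_minutes: int = 5

    def validate(self) -> list:
        """Return one message per out-of-range field (empty list if all pass)."""
        errors = []
        for name, (low, high) in LIMITS.items():
            value = getattr(self, name)
            if not low <= value <= high:
                errors.append(f"{name}={value} is outside {low}..{high}")
        return errors


if __name__ == "__main__":
    print(QueryParameters().validate())
    print(QueryParameters(concurrent_search_limit=0, result_size_limit=600_000).validate())
```

Keeping the ranges in one table makes it easy to see which settings increase the load on the data source when they are raised.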
- Click Next.
- On the Connection
configurations page, configure identity and access.
- Click Add a configuration.
- In the Configuration details window,
configure the following parameters.
Table 3. Configuration parameters
| Parameter | Description |
| Configuration Name | Enter a unique name to describe the access configuration and distinguish it from the other access configurations for this data source connection that you might set up. Only alphanumeric characters and the following special characters are allowed: - . _ |
| Configuration Description | Enter a unique description to describe the access configuration and distinguish it from the other access configurations for this data source connection that you might set up. Only alphanumeric characters and the following special characters are allowed: - . _ |
| AWS Access key id, AWS secret access key, AWS IAM Role | Establish AWS authentication to enable access to the AWS search API. To establish AWS key-based authentication, enter values for the AWS Access key id and AWS secret access key parameters. To establish AWS role-based authentication, enter values for the AWS Access key id, AWS secret access key, and AWS IAM Role parameters. |
| External ID for AWS Assume Role | To grant access to your AWS resources and establish an Assume Role authentication, enter a value for the External ID for AWS Assume Role parameter. For more information, see Using an external ID for third-party access (https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create_for-user_externalid.html). |
For more information about AWS authentication,
see Configuring AWS authentication.
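The combinations of credentials in Table 3 determine which authentication method is used. The following Python sketch shows one way to express that decision. The function name `auth_mode` and its argument names are hypothetical, and the rule that an external ID requires an IAM role is an assumption based on how AWS Assume Role works, not a statement from the product documentation.

```python
from typing import Optional


def auth_mode(
    access_key_id: str,
    secret_access_key: str,
    iam_role: Optional[str] = None,
    external_id: Optional[str] = None,
) -> str:
    """Classify a configuration as key-based or role-based authentication."""
    # Both methods in Table 3 need the access key ID and the secret access key.
    if not access_key_id or not secret_access_key:
        raise ValueError("AWS Access key id and AWS secret access key are required")
    # Assumption: an external ID is only meaningful when a role is assumed.
    if external_id and not iam_role:
        raise ValueError("External ID for AWS Assume Role requires AWS IAM Role")
    return "role-based" if iam_role else "key-based"


if __name__ == "__main__":
    # Placeholder values for illustration; these are not real credentials.
    print(auth_mode("EXAMPLEKEYID", "example-secret"))
    print(auth_mode("EXAMPLEKEYID", "example-secret", iam_role="example-role-arn"))
```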
- To save your configuration and establish the connection, click
Save.
- Click Next.
- To assign user access, on the User access
control page, select one or more data source configurations from the
Access list, and then click Finish.
- To manage your active connections, complete the following
steps:
- On the Integration data sources page, on the tile of the relevant
data source, click Manage <x> of <x> active
connections.
- On the Connection status page, on the tile of the relevant data
source, you can edit, refresh, or delete your data source connection.
Results
After you connect a data source, it might take up to 30 seconds
to retrieve the data. Before the full data set is returned, the data source might display as
unavailable. After the data is returned, the data source shows as being connected, and a polling
mechanism validates the connection status. The connection status is valid for 60 seconds
after every poll.
You can add other connection configurations for this data source that have
different users and different data access permissions.
What to do next
Test the connection by searching for an IP address in IBM Security Data Explorer that matches an asset data source. In
Data Explorer, click an IP address to view
its associated assets and risk.
To use Data Explorer,
you must have data sources that are connected so that the application can run queries and retrieve
results across a unified set of data sources. The search results vary depending on the data that is
contained in your configured data sources. For more information about how to build a query in
Data Explorer, see Build a query.