Planning data loading, checking, and training

You can load and validate data for AI training and for operational purposes.

Loading AI training data

When you load AI training data, for example to correlate events from various sources or to find anomalies in logs, use the following end-to-end guidance for best results.

1. For each AI type, load all relevant historical data. For example, if you are training the log anomaly AI type, first load all relevant historical log data by using one or more of the following data integrations. At the end of the page where you define the data integration, set the integration mode to Historical data for initial AI training.
   - Custom
   - ELK
   - Falcon LogScale
   - Kafka
   - Mezmo
   - Splunk
2. Before you run the training, you can optionally run a data quality precheck on the loaded data to confirm that it is sufficient to train the selected algorithm. If you skip this step and go straight to model training, the data quality precheck still runs as a prerequisite to the training.
3. Train the AI type, as described in Configuring AI training, and use all of the data that you loaded in Step 1. For example, if you are training the log anomaly AI type, follow the instructions to create a log anomaly training definition.
4. Deploy the model for the AI type that you trained in Step 3. Deployment can be automatic or manual.
5. For log anomaly training only, after the model deploys, go back to each data integration that you defined in Step 1 and change the integration mode to Live data for continuous AI training and anomaly detection.
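
The order of the steps above can be summarized in a small planning sketch. This is an illustration only and not a product API: the function `plan_steps` and its parameters are hypothetical names introduced here, and the two mode labels are copied from the steps above. It shows that the data quality precheck is optional as a separate step, and that only log anomaly training ends with a switch to live data.

```python
# Mode labels as they are named in the procedure above.
HISTORICAL = "Historical data for initial AI training"
LIVE = "Live data for continuous AI training and anomaly detection"


def plan_steps(ai_type, integrations, manual_precheck=False):
    """Return the ordered steps for one AI type (hypothetical helper)."""
    sources = ", ".join(integrations)
    steps = [f"Load historical data from {sources} (mode: {HISTORICAL})"]
    if manual_precheck:
        # Optional: run the precheck yourself before training.
        steps.append("Run the data quality precheck")
    # The precheck also runs as a prerequisite to training if it was skipped.
    steps.append(f"Train the {ai_type} AI type on the loaded data")
    steps.append("Deploy the trained model (automatic or manual)")
    if ai_type == "log anomaly":
        # Only log anomaly training switches the integrations to live data.
        steps.append(f"Switch each integration to mode: {LIVE}")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(
        plan_steps("log anomaly", ["Kafka", "Splunk"], manual_precheck=True), 1
    ):
        print(f"{number}. {step}")
```

Calling `plan_steps` for an AI type other than log anomaly returns the same plan without the final mode switch.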

Loading operational data

When you are loading operational data, such as for real-time analysis, use the guidance in the following documentation:

Incoming integrations

Incoming integrations enable IBM Cloud Pak for AIOps to collect log, metric, and event data from various sources.

Connecting probes and gateways

If you want to load event data, you can also configure probes and gateways from various sources.