Building an AutoAI model (Watson Machine Learning)

AutoAI automatically prepares data, applies algorithms, and builds model pipelines that are best suited for your data and use case. Learn how to generate the model pipelines that you can save as machine learning models.

Follow these steps to upload data and have AutoAI create the best model for your data and use case.

  1. Collect your input data
  2. Open the AutoAI tool
  3. Specify details of your experiment
  4. View the results

Collect your input data

Collect and prepare your training data. For details on allowable data sources, see AutoAI overview.
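Before you upload a file, a quick structural check can catch problems early. The following is a minimal sketch, assuming the training data is a CSV file with a header row. The function name `check_training_csv` and the checks it performs are illustrative assumptions, not AutoAI requirements; the allowable data sources and formats are defined in AutoAI overview.

```python
import csv
import io


def check_training_csv(text, target_column):
    """Return a list of problems found in CSV text; an empty list means none were found."""
    reader = csv.reader(io.StringIO(text))
    try:
        header = next(reader)
    except StopIteration:
        return ["file is empty"]

    problems = []
    if target_column not in header:
        problems.append(f"target column {target_column!r} not found in header")
    if len(set(header)) != len(header):
        problems.append("duplicate column names in header")

    row_count = 0
    # Rows are numbered from 2 because row 1 is the header.
    for row_number, row in enumerate(reader, start=2):
        row_count += 1
        if len(row) != len(header):
            problems.append(
                f"row {row_number}: expected {len(header)} fields, found {len(row)}"
            )
    if row_count == 0:
        problems.append("no data rows")
    return problems
```

To check a file on disk, read it first, for example `check_training_csv(open(path, newline="").read(), "churn")`, where `path` and `"churn"` are placeholders for your own file and prediction column.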

Open the AutoAI tool

For convenience, AutoAI uses the default storage that is associated with your project to store your data and to save model results.

  1. Open your project.

  2. Click the Assets tab.

  3. Click New asset > AutoAI.

Note: After you create an AutoAI asset, it is displayed on the Assets page for your project in the **AutoAI experiments** section, so you can return to it.

Specify details of your experiment

  1. Specify a name and description for your experiment.

  2. Select a compute configuration and click Create. The compute configuration specifies the computing resources to allocate to running the experiment. Larger sizes improve training speed and might be required for larger data sources, but cost more than smaller configurations.

  3. Choose data from your project, or upload it from your file system or from the asset browser, then click Continue. Click the preview icon to review your data. (Optional) Add a second file as holdout data for testing the trained pipelines.

  4. For Column to predict, choose the column that contains the values you want the experiment to predict.

    • Based on an analysis of a subset of the data set, AutoAI selects a default model type: binary classification, multiclass classification, or regression. Binary classification is selected if the target column has two possible values, multiclass classification if it has a discrete set of three or more values, and regression if it contains a continuous numeric variable. You can override this selection.

      Note: The limit on values to classify is 200. Creating a classification experiment with many unique values in the prediction column is resource-intensive and affects the experiment's performance and training time.

    • AutoAI chooses a default metric for optimizing. For example, the default metric for a binary classification model is *Accuracy*.

    • By default, 10% of the training data is held out to test the performance of the model.

  5. (Optional) Click Experiment settings to view or customize options for your AutoAI run. For details on experiment settings, see Configuring a classification or regression experiment.

  6. Click Run Experiment to begin model pipeline creation.

An infographic shows the pipelines being created for your data. The duration of this phase depends on the size of your data set. A notification message informs you whether the processing time will be brief or will require more time. You can work in other parts of the product while the pipelines build.
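The defaults described in step 4 (the model type chosen from the prediction column and the 10% holdout) can be illustrated with a short sketch. This is a simplified approximation and not AutoAI's actual logic: the function names are hypothetical, treating any numeric column with more than two distinct values as continuous is an assumption, and the random shuffle is an assumption because this page does not say how AutoAI selects the holdout rows.

```python
import random

MAX_CLASSES = 200  # limit on values to classify, per the note in step 4


def infer_prediction_type(values):
    """Guess a model type from the prediction-column values (simplified heuristic)."""
    distinct = set(values)
    if len(distinct) < 2:
        raise ValueError("prediction column needs at least two distinct values")
    if len(distinct) == 2:
        return "binary"
    # Assumption: a numeric column with more than two distinct values is continuous.
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in distinct):
        return "regression"
    if len(distinct) > MAX_CLASSES:
        raise ValueError(f"{len(distinct)} classes exceeds the limit of {MAX_CLASSES}")
    return "multiclass"


def holdout_split(rows, holdout_fraction=0.1, seed=0):
    """Shuffle the rows and return (training_rows, holdout_rows)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    holdout_size = round(len(shuffled) * holdout_fraction)
    return shuffled[holdout_size:], shuffled[:holdout_size]
```

For example, `infer_prediction_type(["yes", "no", "yes"])` returns `"binary"`, and `holdout_split(range(100))` returns 90 training rows and 10 holdout rows. In the tool itself, you can override the selected model type, and the experiment settings let you customize the run.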

View the results

When the pipeline generation process completes, you can view the ranked model candidates and evaluate them before you save a pipeline as a model.

