AutoAI tutorial: Build a Binary Classification Model

This tutorial guides you through training a model to predict whether or not a customer is likely to subscribe to a bank promotion. In this tutorial, you will create an AutoAI experiment that analyzes your data and selects the best model type and algorithms to produce, train, and optimize pipelines, which are model candidates. After reviewing the pipelines, you will save one as a model, deploy it, then test it to get a prediction.

Overview of the data sets

If you preview the sample data, you can see it is structured demographic data in rows and columns, and saved in a .csv file.

Preview of training data

The data set comes from the direct marketing campaigns (phone calls) of a Portuguese banking institution. The classification goal is to train a model that can predict whether a new client will subscribe (yes/no) to a term deposit (variable y).
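Before you upload the file, it can be useful to check how the yes and no values are balanced in the y column. The following is a minimal sketch that uses only the Python standard library. The helper name count_labels is hypothetical, the sample rows are made up for illustration and are not taken from the data set, and the delimiter is a parameter because the file may not be comma-separated.

```python
import csv
import io
from collections import Counter


def count_labels(csv_text, target="y", delimiter=","):
    """Count how often each value appears in the target column."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    return Counter(row[target] for row in reader)


# Made-up rows for illustration only; the real file has more columns and rows.
sample = """age,job,marital,education,y
37,management,married,secondary,no
45,technician,single,tertiary,yes
29,services,single,secondary,no
"""

counts = count_labels(sample)
print(counts)
```

To run the same check on the downloaded file, read its contents into a string and pass it to count_labels, adjusting the delimiter if the columns do not split correctly.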

Tasks overview

This tutorial presents the basic steps for building and training a machine learning model using AutoAI:

  1. Create a project
  2. Create an AutoAI experiment
  3. Training the experiment
  4. Deploy the trained model
  5. Test the deployed model
  6. Creating a batch job to score the model

Task 1: Create a project

  1. Download the sample training data file: click the Download or Raw button, then right-click and choose Save as to save the file to your local computer as a .csv file.

  2. On the Projects page, select New Project to create a new project.
    a. Select Create an empty project.
    b. Enter a name for your project.
    c. Click Create.

Task 2: Create an AutoAI experiment

In this section, you will define and run the experiment on the banking data to generate pipelines, or model candidates.

  1. In the project page, select Add to Project and choose AutoAI experiment.
  2. Specify a name and optional description for your new experiment.
  3. To add a data source, you can choose one of the following:
    a. If you downloaded your file locally, upload the training data file, bank-full.csv, from your local computer by dragging the file onto the data panel or by clicking browse and then following the prompts.
    b. If you already uploaded your file to your project, click select from project, then select the data asset tab and choose bank-full.csv.

Task 3: Training the experiment

After adding the data, you choose a prediction column, which represents the problem you are trying to solve with the experiment. For this experiment, we want to know if a new bank customer will subscribe to a bank promotion, represented by the column labeled y.

  1. In Configuration details, select No for the option to create a Time Series Forecast.
  2. Select y as the column to predict. You can see that when you choose a column to predict, AutoAI selects a model type that matches the data. AutoAI analyzes your data and determines that the y column contains Yes/No information, making this data suitable for a binary classification model.
  3. Click Run experiment. As the model trains, you will see an infographic that shows the process of building the pipelines.
    Pipeline creation infographic

    For a list of algorithms, or estimators, available with each machine learning technique in AutoAI, see: AutoAI implementation detail.

  4. Once all the pipelines are created, you can compare their accuracy on the Pipeline leaderboard.

    Pipeline leaderboard

  5. You can also click the Pipeline comparison tab to view differences between pipelines. When you are done reviewing the pipelines, choose one to save as a model.

Metric chart of pipeline comparison

  6. Select the pipeline with Rank 1 and click Save as to create your model. Then select Create. This saves the pipeline under the Saved models section in the Assets tab.
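
To make the accuracy comparison on the leaderboard concrete, the sketch below scores two made-up sets of predictions against holdout labels and sorts them from best to worst. It only illustrates the idea of ranking candidates by accuracy; it is not how AutoAI computes its leaderboard, and the function names, candidate names, and data are all hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rank_by_accuracy(y_true, candidates):
    """Return (name, accuracy) pairs sorted from best to worst."""
    scores = {name: accuracy(y_true, preds) for name, preds in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical holdout labels and predictions from two made-up candidates.
holdout = ["no", "no", "yes", "no"]
candidates = {
    "candidate_a": ["no", "no", "no", "no"],
    "candidate_b": ["no", "no", "yes", "no"],
}

ranking = rank_by_accuracy(holdout, candidates)
print(ranking)
```

Note that candidate_a predicts no for every row and still scores well on this small sample, which is one reason to look at more than one metric when the yes and no classes are unbalanced.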

Task 4: Deploy the trained model

  1. You can deploy the model from the model details page. You can access the model details page in one of these ways:
    • Click the model’s name in the notification that is displayed when you save the model.
    • Open the Assets tab for the project, select the Saved models section, and select the model’s name.
  2. Click Promote to Deployment Space, then select or create the space where the model will be deployed.
    • To create a deployment space:
      • Enter a name.
      • Select Create.
  3. Once you have created your deployment space or selected an existing one, select Promote.
  4. Click the deployment space link from the notification.
  5. From the Assets tab of the deployment space:
    • Hover over the model’s name and click the deploy icon.
    • In the page that opens, fill in the fields:
      • Select Online as the Deployment type.
      • Specify a name for the deployment.
      • Click Create.

After the deployment is complete, click the Deployments tab and select the deployment name to view the details page.

Task 5: Test the deployed model

You can test the deployed model from the deployment details page.

  1. On the Test tab of the deployment details page, fill out the form with the following test values:

     Data        Input
     age         37
     job         management
     marital     married
     education   secondary

Test the deployment

  2. Click Predict. The resulting prediction indicates that a customer with the attributes entered has a low probability of signing up for the bank promotion.
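
If you prefer to supply the test values as JSON rather than through the form, the same record can be expressed as a payload. The sketch below is an assumption-laden illustration: it assumes the deployment accepts the input_data layout with fields and values lists used by Watson Machine Learning online scoring, the helper name build_scoring_payload is hypothetical, and only the four values listed above are included, whereas a real request would likely need every column the model was trained on.

```python
import json


def build_scoring_payload(record):
    """Wrap one record in the assumed input_data/fields/values layout."""
    fields = list(record.keys())
    values = [[record[name] for name in fields]]
    return {"input_data": [{"fields": fields, "values": values}]}


# The four test values from the table above; other columns are omitted here.
record = {
    "age": 37,
    "job": "management",
    "marital": "married",
    "education": "secondary",
}

payload = build_scoring_payload(record)
print(json.dumps(payload, indent=2))
```

The printed JSON shows one row of values lined up with its field names, which makes it easy to confirm that each value sits under the intended column before you submit it.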

Task 6: Creating a batch job to score the model

For a batch deployment, you provide input data, also known as the model payload, in a CSV file. The data must be structured like the training data, with the same column headers. The batch job will process each row of data and create a corresponding prediction.

In a real scenario, you would submit new data to the model to get a score, but this tutorial will use the training data bank-payload.csv that you downloaded as part of the tutorial setup to learn how to create and run a batch deployment. When you deploy a model, you can add the payload data to a project, upload it directly to a space, or link to the data in a storage repository such as a Cloud Object Storage bucket. In this case, you will upload the file directly to the deployment space.
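
Because the payload must use the same column headers as the training data, a quick header comparison before uploading can catch mismatches early. The following is a minimal sketch using the Python standard library. The function names are hypothetical, the header rows are made up, and it assumes the prediction column y may be absent from the payload, so that column is ignored in the comparison.

```python
import csv
import io


def read_header(csv_text, delimiter=","):
    """Return the first row of a CSV document as a list of column names."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    return next(reader, [])


def header_problems(training_header, payload_header, target="y"):
    """List columns missing from, or unexpected in, the payload header."""
    # Assumption: the target column is not required in the payload.
    expected = [c for c in training_header if c != target]
    actual = [c for c in payload_header if c != target]
    missing = [c for c in expected if c not in actual]
    unexpected = [c for c in actual if c not in expected]
    return missing, unexpected


# Made-up header rows for illustration only.
training_text = "age,job,marital,education,y\n"
payload_text = "age,occupation,marital,education\n"

missing, unexpected = header_problems(
    read_header(training_text), read_header(payload_text)
)
print("missing:", missing)
print("unexpected:", unexpected)
```

Two empty lists mean the headers line up; anything else points to a column that should be renamed, added, or removed before you create the batch job.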

Step 1: Add data to space

From the Assets page of the deployment space:

  1. Click Add to space then choose Data.
  2. Upload the bank-payload.csv file that you saved locally.

Step 2: Create the batch deployment

Now you can define the batch deployment.

  1. Click the deployment icon next to the model’s name.
  2. Enter a name for the deployment.
    1. Select Batch as the Deployment type.
    2. Choose the smallest hardware specification.
    3. Click Create.

Step 3: Create the batch job

The batch job executes the deployment. To create the job, you must specify the input data and the name for the output file. You can set up a job to run on a schedule or run immediately.

  1. Click New job.
  2. Specify a name for the job.
  3. Select the smallest hardware specification.
  4. Optional: Set a schedule and choose to receive notifications.
  5. Upload the input file: bank-payload.csv.
  6. Name the output file: bank-tutorial-output.csv.
  7. Review and click Create to run the job.

Step 4: View the output

When the deployment status changes to Deployed, return to the Assets page for the deployment space. You will see that the file bank-tutorial-output.csv was created and added to your assets list.

Click the download icon next to the output file and open the file in an editor. You can review the prediction results for the customer information submitted for batch processing.

View the batch predictions

For each case, the prediction returned indicates the confidence score of whether a customer will subscribe to the bank promotion.

Learn more

AutoAI overview
