Tutorial: Analyzing with SPSS Statistics

This tutorial describes how to use IBM® SPSS® Statistics to analyze data that is stored in a Db2® database.

You can view a video version of this tutorial here.

This tutorial shows you how to carry out the following tasks:

  • Connect your database to SPSS Statistics.
  • Prepare an analysis.
  • Analyze data.

Time required

5 minutes

Scenario

You are a data scientist working for a sporting goods manufacturer, and you want to know where the stores that sell the manufacturer's products are located. This scenario uses sample data that is already loaded into your Db2 database.

Difficulty

Beginner

Audience

Data scientists

Prepare the analysis

About this task

After you successfully establish the database connection, you can start to prepare the analysis by identifying the correct data and structuring it. Follow these steps to prepare the data for the analysis:

Procedure

  1. In the Database Wizard, click Next to display a list of the tables in the database.
  2. Follow these steps to get a separate entry for each retailer site:
    1. From the Available Tables list, select GOSALESRT.RETAILER_SITE.
    2. Expand GOSALESRT.RETAILER_SITE.
    3. Select RETAILER_SITE_CODE.
    4. Move RETAILER_SITE_CODE to the Retrieve Fields in this Order list by clicking the button with the right-pointing arrow.
  3. Follow these steps to find out in which countries the retailers are located:
    1. From the Available Tables list, select GOSALES.COUNTRY.
    2. Expand GOSALES.COUNTRY.
    3. Select COUNTRY_EN.
    4. Move COUNTRY_EN to the Retrieve Fields in this Order list by clicking the button with the right-pointing arrow.
  4. Join GOSALESRT.RETAILER_SITE and GOSALES.COUNTRY on the columns that they have in common by following these steps:
    1. Click Next to open the Specify Relationships page in the Database Wizard.
    2. Select COUNTRY_CODE from the GOSALES.COUNTRY list.
    3. Select RTL_COUNTRY_CODE from the GOSALESRT.RETAILER_SITE list.
    4. Select Inner from the Join Type list.
    5. Click Join.
  5. Click Finish to create a table that contains a separate entry for each retailer site and the country in which the retailer site is located.
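
The selections made in the Database Wizard correspond to an inner join in SQL. The following is a minimal sketch of that join, not the query that SPSS Statistics actually generates. It uses Python's built-in sqlite3 module as a stand-in for Db2, the rows are made up for illustration, and the join columns are assumed to be spelled COUNTRY_CODE and RTL_COUNTRY_CODE.

```python
import sqlite3

# In-memory stand-in for the Db2 database; ATTACH recreates the two schema names.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS GOSALES")
conn.execute("ATTACH DATABASE ':memory:' AS GOSALESRT")

conn.execute(
    "CREATE TABLE GOSALES.COUNTRY ("
    "COUNTRY_CODE INTEGER PRIMARY KEY, COUNTRY_EN TEXT)"
)
conn.execute(
    "CREATE TABLE GOSALESRT.RETAILER_SITE ("
    "RETAILER_SITE_CODE INTEGER PRIMARY KEY, RTL_COUNTRY_CODE INTEGER)"
)

# Hypothetical rows for illustration only; the real sample data differs.
conn.executemany(
    "INSERT INTO GOSALES.COUNTRY VALUES (?, ?)",
    [(1, "Canada"), (2, "Japan"), (3, "Spain")],
)
conn.executemany(
    "INSERT INTO GOSALESRT.RETAILER_SITE VALUES (?, ?)",
    [(101, 1), (102, 1), (103, 2), (104, 99)],
)

# One row per retailer site, paired with the English name of its country.
rows = conn.execute(
    """
    SELECT rs.RETAILER_SITE_CODE, c.COUNTRY_EN
    FROM GOSALESRT.RETAILER_SITE AS rs
    INNER JOIN GOSALES.COUNTRY AS c
        ON rs.RTL_COUNTRY_CODE = c.COUNTRY_CODE
    ORDER BY rs.RETAILER_SITE_CODE
    """
).fetchall()

for site_code, country in rows:
    print(site_code, country)
```

Because the join type is Inner, a retailer site whose country code has no match in the country table (site 104 in this sketch) is left out of the result, as is a country that has no retailer sites.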

Analyze the data

Finally, let's analyze the data and visualize the results of the analysis. Follow these steps:

Procedure

  1. In the IBM SPSS Statistics Data Editor, click Analyze > Descriptive Statistics > Frequencies to open the Frequencies window.
  2. In the Frequencies window, select COUNTRY_EN to count the number of retailer sites in each country.
  3. Move COUNTRY_EN to the Variable(s) list by clicking the button with the right-pointing arrow.
  4. Click Charts to open the Frequencies: Charts window.
  5. In the Chart Type section of the Frequencies: Charts window, click Bar charts to visualize the result with a bar chart.
  6. Click Continue to confirm your selection.
  7. In the Frequencies window, click OK to analyze the data.

    The result of the analysis is displayed in the IBM SPSS Statistics Viewer window. The window contains a table that shows how the retailer sites are distributed across countries. For each country, the table lists the number of retailer sites (Frequency) and the corresponding percentages.

    Table 1. The output table
           (Country)       Frequency   Percent   Valid Percent   Cumulative Percent
    Valid  Australia              20       2.4             2.4                  2.4
           Austria                25       3.0             3.0                  5.3
           Belgium                20       2.4             2.4                  7.7
           Brazil                 16       1.9             1.9                  9.6
           Canada                 56       6.6             6.6                 16.2
           China                  35       4.1             4.1                 20.3
           Denmark                17       2.0             2.0                 22.3
           Finland                19       2.2             2.2                 24.6
           France                 64       7.6             7.6                 32.1
           Germany                61       7.2             7.2                 39.3
           Italy                  30       3.5             3.5                 42.9
           Japan                  66       7.8             7.8                 50.6
           Korea                  20       2.4             2.4                 53.0
           Mexico                 17       2.0             2.0                 55.0
           Netherlands            33       3.9             3.9                 58.9
           Singapore              28       3.3             3.3                 62.2
           Spain                  21       2.5             2.5                 64.7
           Sweden                 26       3.1             3.1                 67.8
           Switzerland            35       4.1             4.1                 71.9
           United Kingdom         57       6.7             6.7                 78.6
           United States         181      21.4            21.4                  100
           Total                 847       100             100
    Note: The layout of the actual SPSS output table differs slightly from what is shown here. The output shown here was reformatted to make it easier to read.
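
To see how the Percent and Cumulative Percent columns relate to the Frequency column, the following sketch recomputes them in Python from the frequencies in Table 1. It is an illustration of the arithmetic only, not of how SPSS Statistics works internally. The function name frequency_table is hypothetical. The sketch assumes that there are no missing values, in which case Valid Percent equals Percent.

```python
from itertools import accumulate

# Frequencies copied from Table 1.
counts = {
    "Australia": 20, "Austria": 25, "Belgium": 20, "Brazil": 16,
    "Canada": 56, "China": 35, "Denmark": 17, "Finland": 19,
    "France": 64, "Germany": 61, "Italy": 30, "Japan": 66,
    "Korea": 20, "Mexico": 17, "Netherlands": 33, "Singapore": 28,
    "Spain": 21, "Sweden": 26, "Switzerland": 35,
    "United Kingdom": 57, "United States": 181,
}


def frequency_table(counts):
    """Return (value, frequency, percent, cumulative percent) rows, sorted by value."""
    total = sum(counts.values())
    names = sorted(counts)
    # Running sum of the frequencies, in the same order as the rows.
    running = accumulate(counts[name] for name in names)
    return [
        (
            name,
            counts[name],
            round(100 * counts[name] / total, 1),
            round(100 * cumulative / total, 1),
        )
        for name, cumulative in zip(names, running)
    ]


table = frequency_table(counts)
for name, freq, pct, cum_pct in table:
    print(f"{name:<15}{freq:>5}{pct:>7.1f}{cum_pct:>7.1f}")
print(f"{'Total':<15}{sum(counts.values()):>5}{100:>7.1f}")
```

The cumulative percentage is computed from the running sum of the frequencies rather than by adding up the rounded percentages, which avoids accumulating rounding errors.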

    The IBM SPSS Statistics Viewer window also contains a bar chart, which visualizes the result. This bar chart shows that, by far, most retailer sites are located in the United States.

    The resulting bar chart
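
As a rough text-only approximation of such a bar chart, the following Python sketch scales each bar to the largest frequency. It uses only the five largest frequencies from Table 1 and does not reproduce the SPSS chart itself. The function name text_bar_chart and the bar width of 40 characters are arbitrary choices for this illustration.

```python
def text_bar_chart(counts, width=40):
    """Return one line per category, with bar length scaled to the largest count."""
    largest = max(counts.values())
    label_width = max(len(name) for name in counts)
    lines = []
    for name, count in counts.items():
        bar = "#" * round(width * count / largest)
        lines.append(f"{name:<{label_width}} | {bar} {count}")
    return lines


# The five largest frequencies from Table 1.
top_counts = {
    "United States": 181,
    "Japan": 66,
    "France": 64,
    "Germany": 61,
    "United Kingdom": 57,
}

chart = text_bar_chart(top_counts)
print("\n".join(chart))
```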

Tutorial summary

Connect your database to SPSS Statistics by adding an ODBC connection in the Database Wizard. The Settings for an ODBC data source name (DSN) section on the Db2 Connection Information page contains the information that you need to establish the ODBC connection. To prepare an analysis, select the tables and columns that contain the needed information from the Available Tables list in the Database Wizard. Use the Specify Relationships page to create a table that is the basis of your analysis. For example, you can join tables on their common columns. To count the number of items that match a certain criterion and visualize the result, use the Frequencies window.