Chi-square test for goodness of fit

The Chi-square test for goodness of fit procedure tabulates a variable into categories and computes a chi-square statistic. The test compares the observed and expected frequencies in each category to test either that all categories contain the same proportion of values or that each category contains a user-specified proportion of values.

Examples
The chi-square test can be used to determine whether a bag of jelly beans contains equal proportions of blue, brown, green, orange, red, and yellow candies. You could also test whether a bag of jelly beans contains 5% blue, 30% brown, 10% green, 20% orange, 15% red, and 20% yellow candies.
Statistics
Mean, standard deviation, minimum, maximum, and quartiles. The number and the percentage of non-missing and missing cases; the number of cases observed and expected for each category; residuals; and the chi-square statistic.
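
The relationship between the observed counts, expected counts, residuals, and the chi-square statistic can be sketched in a few lines of Python. This is a minimal illustration, not the procedure's actual implementation: the function name goodness_of_fit, its parameters, and the sample data are hypothetical, and the significance level is not computed.

```python
from collections import Counter

def goodness_of_fit(values, expected_values=None):
    """Tabulate values into categories and compute the chi-square statistic.

    expected_values (hypothetical parameter) lists one positive weight per
    category, in ascending order of the category values. When omitted, all
    categories get equal expected frequencies.
    """
    counts = Counter(values)
    categories = sorted(counts)          # ascending category values
    n = sum(counts.values())             # number of non-missing cases
    if expected_values is None:
        expected_values = [1] * len(categories)
    total = sum(expected_values)

    rows = []
    chi_square = 0.0
    for category, weight in zip(categories, expected_values):
        observed = counts[category]
        expected = n * weight / total
        residual = observed - expected
        chi_square += residual ** 2 / expected
        rows.append((category, observed, expected, residual))

    degrees_of_freedom = len(categories) - 1
    return rows, chi_square, degrees_of_freedom

# Made-up data: 60 jelly beans, colors coded 1 through 6.
beans = [1] * 8 + [2] * 12 + [3] * 10 + [4] * 9 + [5] * 11 + [6] * 10
rows, chi_square, df = goodness_of_fit(beans)
for category, observed, expected, residual in rows:
    print(category, observed, expected, residual)
print("chi-square:", chi_square, "df:", df)
```

With equal expected frequencies, each of the six categories has an expected count of 10, so the residual for each category is simply the observed count minus 10.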

Data considerations

Data
Use ordered or unordered numeric categorical variables (ordinal or nominal levels of measurement). To convert string variables to numeric variables, use the Automatic Recode procedure, which is available on the Transform menu.
Assumptions
Nonparametric tests do not require assumptions about the shape of the underlying distribution. The data are assumed to be a random sample. The expected frequencies for each category should be at least 1. No more than 20% of the categories should have expected frequencies of less than 5.
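
The two rules about expected frequencies can be checked before the test is run. The following Python sketch assumes that you have already calculated the expected count for each category; the function name check_expected_frequencies and the sample counts are hypothetical.

```python
def check_expected_frequencies(expected):
    """Check the two expected-frequency rules for a list of expected counts.

    Returns a pair of booleans:
      - every expected frequency is at least 1
      - no more than 20% of the categories have an expected frequency below 5
    """
    all_at_least_one = all(e >= 1 for e in expected)
    below_five = sum(1 for e in expected if e < 5)
    share_below_five_ok = below_five / len(expected) <= 0.20
    return all_at_least_one, share_below_five_ok

# Made-up expected counts for two different analyses.
print(check_expected_frequencies([7.5, 10.0, 12.5, 10.0]))
print(check_expected_frequencies([0.8, 4.0, 10.0, 10.0, 15.2]))
```

The first list satisfies both rules. The second list fails both: one expected frequency is below 1, and two of the five categories (40%) have expected frequencies below 5.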

Obtaining a Chi-square test for goodness of fit

This feature requires Statistics Base Edition.

  1. From the menus choose:

    Analyze > Group comparison - nonparametric > Chi-square test for goodness of fit

  2. Click Select variables under the Test variables section and select one or more test variables. A separate test is computed for each variable. Click OK after selecting the variables.
  3. Optionally, select an Expected values setting. By default, all categories have equal expected values. To give categories user-specified expected proportions, select Values and enter a value that is greater than 0 for each category of the test variable. Click Add value to include additional values.

    Each time you add a value, it appears at the bottom of the value list. The order of the values is important; it corresponds to the ascending order of the category values of the test variable. The first value of the list corresponds to the lowest category value of the test variable, and the last value corresponds to the highest category value. Elements of the value list are summed, and then each value is divided by this sum to calculate the proportion of cases expected in the corresponding category. For example, a value list of 3, 4, 5, 4 specifies expected proportions of 3/16, 4/16, 5/16, and 4/16.

  4. Optionally, select an Expected range setting. By default, each distinct variable value is defined as a category (Get from data). To establish categories within a specific range, select Use specified values and enter integer values for lower and upper bounds.

    Categories are established for each integer value within the inclusive range, and cases with values outside of the bounds are excluded. For example, if you specify a value of 1 for Lower and a value of 4 for Upper, only the integer values of 1 through 4 are used for the chi-square test.

  5. Optionally, you can select the following options from the Additional settings menu:
    • Click Method to specify additional methods for calculating significance levels for the statistics.
    • Click Statistics to select which statistics to include (descriptive statistics and quartiles).
    • Click Options to control the treatment of missing data.
  6. Click Run analysis.
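
The Expected values and Expected range settings in the preceding steps can be illustrated with a short Python sketch. It is only an approximation of the documented behavior, under the assumption that the test variable is coded with integer values; the function names expected_proportions and restrict_to_range and the sample data are hypothetical.

```python
from fractions import Fraction

def expected_proportions(value_list):
    """Divide each entry of the value list by the sum of all entries."""
    total = sum(value_list)
    return [Fraction(value, total) for value in value_list]

def restrict_to_range(values, lower, upper):
    """Keep only cases whose value lies within the inclusive range."""
    return [v for v in values if lower <= v <= upper]

# The value list 3, 4, 5, 4 sums to 16.
proportions = expected_proportions([3, 4, 5, 4])
print(proportions)  # Fraction reduces 4/16 to 1/4

# Made-up data: cases coded 0 and 5 fall outside the range 1 through 4.
cases = [0, 1, 2, 3, 4, 5, 4]
kept = restrict_to_range(cases, 1, 4)

# Pair each category, in ascending order, with its expected proportion.
categories = sorted(set(kept))
for category, proportion in zip(categories, proportions):
    print(category, proportion)
```

Because the value list is matched to categories in ascending order, the first entry (3) applies to category 1 and the last entry (4) applies to category 4.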

This procedure pastes NPAR TESTS command syntax.