Correlation Settings

IBM® SPSS® Modeler can characterize correlations with descriptive labels to help highlight important relationships. A correlation coefficient measures the strength of the linear relationship between two continuous (numeric range) fields. It takes values between –1.0 and 1.0. Values close to +1.0 indicate a strong positive association, so that high values on one field are associated with high values on the other, and low values with low values. Values close to –1.0 indicate a strong negative association, so that high values for one field are associated with low values for the other, and vice versa. Values close to 0.0 indicate a weak association, so that values for the two fields are more or less independent.
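
As a rough illustration of how such a coefficient behaves, the following Python sketch computes a Pearson correlation by hand. It is not how SPSS Modeler computes the value internally; the function name pearson_r and the sample data are assumptions made up for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two numeric sequences.

    Hypothetical helper for illustration only; not part of SPSS Modeler.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each field.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a field is constant")
    return cross / sqrt(ss_x * ss_y)


# Made-up sample data: one field rises with x, the other falls with x.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

print(pearson_r(x, rising))   # close to +1.0: strong positive association
print(pearson_r(x, falling))  # close to -1.0: strong negative association
```

Real data rarely produces values at the extremes; most coefficients fall somewhere in between, which is why descriptive labels can help.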

Using the Correlation Settings dialog box, you can control the display of correlation labels, change the thresholds that define the categories, and change the labels used for each range. Because the way you characterize correlation values depends greatly on the problem domain, you may want to customize the ranges and labels to fit your specific situation.

Show correlation strength labels in output. This option is selected by default. Deselect this option to omit the descriptive labels from the output.

Correlation Strength. There are two options for defining and labeling the strength of correlations:

  • Define correlation strength by importance (1-p). Labels correlations based on importance, defined as 1 minus the significance, or 1 minus the probability that the observed correlation could be explained by chance alone. The closer this value comes to 1, the stronger the evidence that the two fields are not independent; in other words, that some relationship exists between them. Labeling correlations based on importance is generally recommended over absolute value because it accounts for variability in the data. For example, a coefficient of 0.6 may be highly significant in one dataset and not significant at all in another. By default, importance values between 0.0 and 0.9 are labeled as Weak, those between 0.9 and 0.95 are labeled as Medium, and those between 0.95 and 1.0 are labeled as Strong.
  • Define correlation strength by absolute value. Labels correlations based on the absolute value of the Pearson correlation coefficient, which ranges between –1 and 1, as described above. The closer the absolute value of this measure comes to 1, the stronger the correlation. By default, correlations between 0.0 and 0.3333 (in absolute value) are labeled as Weak, those between 0.3333 and 0.6666 are labeled as Medium, and those between 0.6666 and 1.0 are labeled as Strong. Note, however, that the significance of any given value is difficult to generalize from one dataset to another; for this reason, defining correlation strength by importance (1-p) rather than by absolute value is recommended in most cases.
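
The two labeling schemes and their default thresholds can be sketched in Python as follows. This is only an illustration of the rules described above, not SPSS Modeler code: the function names are hypothetical, and because the text does not say which label applies to a value that falls exactly on a threshold, the sketch assumes the boundary value belongs to the higher category.

```python
def label_by_thresholds(value, medium_from, strong_from,
                        labels=("Weak", "Medium", "Strong")):
    """Map a value in [0, 1] to one of three labels.

    Assumption: a value exactly on a threshold gets the higher label.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie between 0.0 and 1.0")
    if value >= strong_from:
        return labels[2]
    if value >= medium_from:
        return labels[1]
    return labels[0]


def label_by_importance(p_value, medium_from=0.9, strong_from=0.95):
    """Label using importance, defined as 1 minus the significance (p)."""
    return label_by_thresholds(1.0 - p_value, medium_from, strong_from)


def label_by_absolute_value(r, medium_from=0.3333, strong_from=0.6666):
    """Label using the absolute value of the correlation coefficient."""
    return label_by_thresholds(abs(r), medium_from, strong_from)


# The same coefficient can receive different labels under the two schemes.
# The p-value here is made up to illustrate a small, noisy dataset.
r, p = 0.6, 0.2
print(label_by_absolute_value(r))  # Medium (0.6 lies between 0.3333 and 0.6666)
print(label_by_importance(p))      # Weak (importance 0.8 is below 0.9)
```

Customizing the ranges or labels in the dialog box corresponds, in this sketch, to passing different medium_from, strong_from, or labels values.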