Case summaries: Statistics

The Statistics dialog provides options for selecting which statistics to include in the current procedure. You can choose one or more subgroup statistics for the variables within each category of each grouping variable. Summary statistics are displayed for each variable across all categories.

Summary statistics
The following case summary statistics are available.
Number of cases (N)
The number of cases (observations or records).
First
Displays the first data value encountered in the data file.
Last
Displays the last data value encountered in the data file.
Sum
The sum or total of the values, across all cases with nonmissing values.
Percent of total N
Percentage of the total number of cases in each category.
Percent of total sum
Percentage of the total sum in each category.
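As a rough illustration of the summary statistics above, the following Python sketch computes them per category. The data, the `summarize` helper, and the choice to count only nonmissing values in N are assumptions made for this example; they are not taken from the procedure itself.

```python
from collections import defaultdict

# Hypothetical data: (category, value) pairs; None marks a missing value.
cases = [("a", 10.0), ("a", 30.0), ("b", 20.0), ("b", None), ("b", 40.0)]


def summarize(cases):
    """Return per-category N, first, last, sum, and percentages of the totals."""
    groups = defaultdict(list)
    for category, value in cases:
        if value is not None:  # assumption: missing values are skipped entirely
            groups[category].append(value)

    total_n = sum(len(values) for values in groups.values())
    total_sum = sum(sum(values) for values in groups.values())

    return {
        category: {
            "n": len(values),
            "first": values[0],   # first value encountered in file order
            "last": values[-1],   # last value encountered in file order
            "sum": sum(values),
            "pct_total_n": 100.0 * len(values) / total_n,
            "pct_total_sum": 100.0 * sum(values) / total_sum,
        }
        for category, values in groups.items()
    }


summary = summarize(cases)
```

Because First and Last depend on file order, sorting the data before summarizing would change them, while N, Sum, and the percentages would stay the same.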
Central tendency
The following statistics that describe the central location of the distribution are available.
Mean (arithmetic)
A measure of central tendency. The arithmetic average: the sum of the values divided by the number of cases.
Geometric mean
The nth root of the product of the data values, where n represents the number of cases.
Grouped median
Median that is calculated for data that is coded into groups. For example, with age data, if each value in the 30s is coded 35, each value in the 40s is coded 45, and so on, the grouped median is the median calculated from the coded data.
Harmonic mean
Used to estimate an average group size when the sample sizes in the groups are not equal. The harmonic mean is the number of groups (samples) divided by the sum of the reciprocals of the group sample sizes.
Median
The value above and below which half of the cases fall; the 50th percentile. If there is an even number of cases, the median is the average of the two middle cases when they are sorted in ascending or descending order. The median is a measure of central tendency that is not sensitive to outlying values (unlike the mean, which can be affected by a few extremely high or low values).
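The central tendency definitions above can be sketched with Python's standard `statistics` module. All data values here are hypothetical. The grouped median uses Python's `median_grouped`, which interpolates within the median interval; that it matches the procedure's grouped median exactly is an assumption.

```python
import statistics

# Hypothetical values, chosen only for illustration.
values = [2.0, 4.0, 8.0]
arithmetic_mean = statistics.fmean(values)          # sum divided by number of cases
geometric_mean = statistics.geometric_mean(values)  # nth root of the product

# Harmonic mean of unequal group sizes: number of groups / sum of reciprocals.
group_sizes = [10, 20, 40]
harmonic_mean = len(group_sizes) / sum(1 / size for size in group_sizes)

# Median with an even number of cases: average of the two middle values.
median = statistics.median([7, 1, 5, 3])

# Grouped median for ages coded to interval midpoints (30s -> 35, 40s -> 45, ...).
# interval=10 is the assumed width of each age group.
coded_ages = [35, 35, 45, 45, 45, 55]
grouped_median = statistics.median_grouped(coded_ages, interval=10)
```

Note that the harmonic mean of the group sizes (about 17.1) is smaller than their arithmetic mean (about 23.3), because small groups weigh more heavily in the reciprocal sum.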
Dispersion
The following statistics that measure the amount of variation or spread in the data are available.
Minimum
The smallest value of a numeric variable.
Maximum
The largest value of a numeric variable.
Range
The difference between the largest and smallest values of a numeric variable, the maximum minus the minimum.
Standard deviation
A measure of dispersion around the mean, equal to the square root of the variance. In a normal distribution, about 68% of cases fall within one standard deviation of the mean and about 95% of cases fall within two standard deviations. For example, if the mean age is 45, with a standard deviation of 10, about 95% of the cases would be between 25 and 65 in a normal distribution.
Variance
A measure of dispersion around the mean, equal to the sum of squared deviations from the mean divided by one less than the number of cases. The variance is measured in units that are the square of those of the variable itself.
Standard error of mean
A measure of how much the value of the mean may vary from sample to sample taken from the same distribution. It can be used to roughly compare the observed mean to a hypothesized value (that is, you can conclude the two values are different if the ratio of the difference to the standard error is less than -2 or greater than +2).
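The dispersion statistics above can be worked through by hand in a few lines of Python. The ages and the hypothesized mean are invented for this example, and the standard error is computed with the usual formula (standard deviation divided by the square root of the number of cases), which this page does not state explicitly.

```python
import math
import statistics

# Hypothetical ages, chosen only for illustration.
ages = [35.0, 40.0, 45.0, 50.0, 55.0]
n = len(ages)
mean = statistics.fmean(ages)

value_range = max(ages) - min(ages)  # maximum minus minimum

# Variance: sum of squared deviations divided by one less than the number of cases.
variance = sum((x - mean) ** 2 for x in ages) / (n - 1)
std_dev = math.sqrt(variance)

# Standard error of the mean (assumed formula: s / sqrt(n)).
std_error = std_dev / math.sqrt(n)

# Rough comparison of the observed mean to a hypothesized value.
hypothesized_mean = 36.0
ratio = (mean - hypothesized_mean) / std_error
differs = ratio < -2 or ratio > 2
```

With these numbers the ratio is roughly 2.5, so the rule of thumb would treat the observed mean as different from the hypothesized value of 36.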
Distribution
The following statistics describe the shape and symmetry of the distribution.
Skewness
A measure of the asymmetry of a distribution. The normal distribution is symmetric and has a skewness value of 0. A distribution with a significant positive skewness has a long right tail. A distribution with a significant negative skewness has a long left tail. As a guideline, a skewness value more than twice its standard error is taken to indicate a departure from symmetry.
Kurtosis
A measure of the extent to which there are outliers. For a normal distribution, the value of the kurtosis statistic is zero. Positive kurtosis indicates that the data exhibit more extreme outliers than a normal distribution. Negative kurtosis indicates that the data exhibit fewer extreme outliers than a normal distribution.
Standard error of kurtosis
The ratio of kurtosis to its standard error can be used as a test of normality (that is, you can reject normality if the ratio is less than -2 or greater than +2). A large positive value for kurtosis indicates that the tails of the distribution are longer than those of a normal distribution; a negative value for kurtosis indicates shorter tails (becoming like those of a box-shaped uniform distribution).
Standard error of skewness
The ratio of skewness to its standard error can be used as a test of normality (that is, you can reject normality if the ratio is less than -2 or greater than +2). A large positive value for skewness indicates a long right tail; an extreme negative value indicates a long left tail.
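The ratio checks described above can be sketched as follows. This page does not give the estimators, so the code assumes the common bias-adjusted sample skewness and excess kurtosis together with their usual standard error formulas; the `shape_statistics` helper and the data are hypothetical.

```python
import math


def shape_statistics(values):
    """Return (skewness, se_skewness, kurtosis, se_kurtosis) for n >= 4 values."""
    n = len(values)
    if n < 4:
        raise ValueError("at least 4 values are required")
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("values must not all be equal")

    z3 = sum(((x - mean) / s) ** 3 for x in values)
    z4 = sum(((x - mean) / s) ** 4 for x in values)

    # Bias-adjusted sample skewness and excess kurtosis (assumed estimators).
    skew = n / ((n - 1) * (n - 2)) * z3
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

    # Standard errors depend only on the number of cases.
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = math.sqrt(4 * (n * n - 1) * se_skew ** 2 / ((n - 3) * (n + 5)))
    return skew, se_skew, kurt, se_kurt


skew, se_skew, kurt, se_kurt = shape_statistics([1.0, 2.0, 3.0, 4.0, 5.0])

# Rule of thumb from the text: a ratio beyond +/-2 suggests non-normality.
skew_flag = abs(skew / se_skew) > 2
kurt_flag = abs(kurt / se_kurt) > 2
```

For this small, evenly spaced sample neither ratio exceeds 2 in absolute value, so the rule of thumb gives no grounds to reject normality. With so few cases the standard errors are large, which makes the check weak.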

Specifying statistics for Case summaries

This feature requires Statistics Base Edition.

  1. From the menus choose:

    Analyze > Reports > Case summaries

  2. In the Case summaries dialog, expand the Additional settings menu and click Statistics.