Group statistics: Statistics

The Statistics dialog provides options for choosing one or more subgroup statistics for the variables within each category of each grouping variable. Summary statistics are also displayed for each variable across all categories.

Descriptive statistics
Provides options for selecting which statistics to use for the dependent variables. Statistics are applied to each grouping variable block. The order in which the statistics appear in the Descriptive statistics list is the order in which they display in the output.
Click Change statistics to display the Select subgroup statistics dialog and select statistics for the dependent variables within each category of each grouping variable. Click OK after selecting statistics. The following statistics are available.
First
Displays the first data value encountered in the data file.
Geometric Mean
The nth root of the product of the data values, where n represents the number of cases.
Grouped Median
Median that is calculated for data that is coded into groups. For example, with age data, if each value in the 30s is coded 35, each value in the 40s is coded 45, and so on, the grouped median is the median calculated from the coded data.
Harmonic Mean
Used to estimate an average group size when the sample sizes in the groups are not equal. The harmonic mean is the number of samples (that is, the number of groups) divided by the sum of the reciprocals of the sample sizes.
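As an illustration of the Geometric Mean and Harmonic Mean entries, the following is a minimal Python sketch of the two formulas as they are worded here. The function names and the sample values are hypothetical, and the sketch is not taken from the product itself.

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive values.

    Computed through logarithms so that long lists do not overflow.
    """
    n = len(values)
    return math.exp(sum(math.log(v) for v in values) / n)

def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    return len(values) / sum(1.0 / v for v in values)

# Hypothetical data: two data values, and two unequal group sizes.
gm = geometric_mean([2, 8])      # square root of 2 * 8
hm = harmonic_mean([10, 40])     # 2 / (1/10 + 1/40)
print(round(gm, 6), round(hm, 6))
```

With group sizes of 10 and 40, the harmonic mean (16) is noticeably smaller than the arithmetic mean (25), because it is pulled toward the smaller group.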
Kurtosis
A measure of the extent to which there are outliers. For a normal distribution, the value of the kurtosis statistic is zero. Positive kurtosis indicates that the data exhibit more extreme outliers than a normal distribution. Negative kurtosis indicates that the data exhibit less extreme outliers than a normal distribution.
Last
Displays the last data value encountered in the data file.
Maximum
The largest value of a numeric variable.
Mean
A measure of central tendency. The arithmetic average: the sum of the values divided by the number of cases.
Median
The value above and below which half of the cases fall, the 50th percentile. If there is an even number of cases, the median is the average of the two middle cases when they are sorted in ascending or descending order. The median is a measure of central tendency not sensitive to outlying values (unlike the mean, which can be affected by a few extremely high or low values).
Minimum
The smallest value of a numeric variable.
N
The number of cases (observations or records).
Percent of total N
Percentage of the total number of cases in each category.
Percent of total sum
Percentage of the total sum in each category.
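The Percent of total N and Percent of total sum entries can be illustrated with a short Python sketch. The category labels and values below are hypothetical, and the sketch assumes that each percentage is taken relative to the total across all categories.

```python
# Hypothetical values of one dependent variable, split by category.
groups = {"A": [10, 20], "B": [30], "C": [15, 25]}

total_n = sum(len(vals) for vals in groups.values())
total_sum = sum(sum(vals) for vals in groups.values())

# Share of all cases that fall in each category.
pct_of_total_n = {k: 100 * len(v) / total_n for k, v in groups.items()}

# Share of the overall sum contributed by each category.
pct_of_total_sum = {k: 100 * sum(v) / total_sum for k, v in groups.items()}

print(pct_of_total_n)
print(pct_of_total_sum)
```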
Range
The difference between the largest and smallest values of a numeric variable, the maximum minus the minimum.
Skewness
A measure of the asymmetry of a distribution. The normal distribution is symmetric and has a skewness value of 0. A distribution with a significant positive skewness has a long right tail. A distribution with a significant negative skewness has a long left tail. As a guideline, a skewness value more than twice its standard error is taken to indicate a departure from symmetry.
Standard Deviation
A measure of dispersion around the mean. In a normal distribution, about 68% of cases fall within one standard deviation of the mean and about 95% of cases fall within two standard deviations. For example, if the mean age is 45, with a standard deviation of 10, about 95% of the cases would be between 25 and 65 in a normal distribution.
Standard Error of Kurtosis
The ratio of kurtosis to its standard error can be used as a test of normality (that is, you can reject normality if the ratio is less than -2 or greater than +2). A large positive value for kurtosis indicates that the tails of the distribution are longer than those of a normal distribution; a negative value for kurtosis indicates shorter tails (becoming like those of a box-shaped uniform distribution).
Standard Error of Mean
A measure of how much the value of the mean may vary from sample to sample taken from the same distribution. It can be used to roughly compare the observed mean to a hypothesized value (that is, you can conclude the two values are different if the ratio of the difference to the standard error is less than -2 or greater than +2).
Standard Error of Skewness
The ratio of skewness to its standard error can be used as a test of normality (that is, you can reject normality if the ratio is less than -2 or greater than +2). A large positive value for skewness indicates a long right tail; an extreme negative value indicates a long left tail.
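The ratio tests described for skewness and kurtosis can be sketched in Python as follows. This page does not state the formulas that the product uses, so the sketch assumes the common bias-adjusted sample estimators and their usual standard-error formulas; the function name and the data are hypothetical.

```python
import math

def shape_statistics(x):
    """Return (skewness, SE of skewness, kurtosis, SE of kurtosis).

    Assumption: bias-adjusted sample estimators, with kurtosis reported
    as excess kurtosis (zero for a normal distribution).
    """
    n = len(x)
    if n < 4:
        raise ValueError("at least four cases are needed")
    mean = sum(x) / n
    dev2 = sum((v - mean) ** 2 for v in x)
    dev3 = sum((v - mean) ** 3 for v in x)
    dev4 = sum((v - mean) ** 4 for v in x)
    s = math.sqrt(dev2 / (n - 1))  # sample standard deviation
    if s == 0:
        raise ValueError("all values are identical")

    skew = n * dev3 / ((n - 1) * (n - 2) * s ** 3)
    kurt = (n * (n + 1) * dev4 / ((n - 1) * (n - 2) * (n - 3) * s ** 4)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = math.sqrt(4 * (n * n - 1) * se_skew ** 2 / ((n - 3) * (n + 5)))
    return skew, se_skew, kurt, se_kurt

skew, se_skew, kurt, se_kurt = shape_statistics([1, 2, 3, 4, 5])

# Rule of thumb from the entries above: a ratio outside -2..+2
# suggests a departure from normality.
skew_flag = abs(skew / se_skew) > 2
kurt_flag = abs(kurt / se_kurt) > 2
print(skew, se_skew, kurt, se_kurt, skew_flag, kurt_flag)
```

For these five evenly spaced values, the skewness is zero and the kurtosis is negative, yet neither ratio exceeds 2 in absolute value, which shows how wide the standard errors are for very small samples.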
Sum
The sum or total of the values, across all cases with nonmissing values.
Variance
A measure of dispersion around the mean, equal to the sum of squared deviations from the mean divided by one less than the number of cases. The variance is measured in units that are the square of those of the variable itself.
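The Variance, Standard Deviation, and Standard Error of Mean entries are related, as the following Python sketch shows. It uses the standard library's statistics module; the data and the hypothesized value are hypothetical, and the standard error is assumed to be the standard deviation divided by the square root of the number of cases.

```python
import math
import statistics

# Hypothetical values of one dependent variable within one category.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(values)
mean = statistics.mean(values)

# Sum of squared deviations from the mean divided by (n - 1).
variance = statistics.variance(values)

# The standard deviation is the square root of the variance, so it is
# expressed in the same units as the variable itself.
std_dev = math.sqrt(variance)

# Assumed formula: standard deviation divided by sqrt(n).
se_mean = std_dev / math.sqrt(n)

# Rough comparison of the observed mean with a hypothesized value.
hypothesized = 3.0
ratio = (mean - hypothesized) / se_mean
differs = abs(ratio) > 2
print(mean, variance, std_dev, se_mean, ratio, differs)
```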
ANOVA statistics
Provides settings for a one-way analysis of variance for each independent variable in the first block.
Anova table and eta
Displays a one-way analysis-of-variance table and calculates eta and eta-squared (measures of association) for each independent variable in the first block.
Test for linearity
Calculates the sum of squares, degrees of freedom, and mean square associated with linear and nonlinear components, as well as the F ratio, R, and R-squared. Linearity is not calculated if the independent variable is a short string.
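To make the quantities in the analysis-of-variance table concrete, here is a minimal Python sketch of a one-way ANOVA with eta and eta-squared. It uses the textbook definitions (eta-squared as the between-groups sum of squares divided by the total sum of squares); the function name and the data are hypothetical, and the linearity components are not covered.

```python
import math

def one_way_anova(groups):
    """Return sums of squares, F ratio, eta-squared, and eta.

    groups maps each category of the independent variable to the list
    of dependent-variable values observed in that category.
    """
    all_values = [v for vals in groups.values() for v in vals]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total

    ss_between = 0.0
    ss_within = 0.0
    for vals in groups.values():
        group_mean = sum(vals) / len(vals)
        ss_between += len(vals) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in vals)
    ss_total = ss_between + ss_within

    # Mean squares are sums of squares divided by degrees of freedom.
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    eta_squared = ss_between / ss_total
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "f_ratio": ms_between / ms_within,
        "eta_squared": eta_squared,
        "eta": math.sqrt(eta_squared),
    }

# Hypothetical data: two categories with three cases each.
result = one_way_anova({"A": [1, 2, 3], "B": [4, 5, 6]})
print(result)
```

Eta-squared can be read as the proportion of the variability in the dependent variable that is accounted for by differences between the categories.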

Specifying statistics for Group statistics

This feature requires Statistics Base Edition.

  1. From the menus choose:

    Analyze > Descriptive statistics > Group statistics

  2. In the Group statistics dialog, select Statistics.
  3. Select subgroup statistics for the variables within each category of each grouping variable.