IBM Support

How is the number of bars in a histogram determined in SPSS?

Question & Answer


Question

How are histograms binned in SPSS Base for Windows? I'd like to know the algorithm.

Answer

If you specify either the bar width or the number of bars, that setting determines the interval directly. Otherwise, the number of bars is calculated by an algorithm that uses statistical theory to suggest the optimal number of bars for a data set of the given size, under the assumption of normally distributed values.
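The specific formula SPSS uses is not documented here. As an illustration only of what a normal-theory rule can look like, the sketch below uses Scott's normal-reference rule (bin width = 3.49 * s * n^(-1/3), where s is the sample standard deviation and n the number of cases). The function name `scott_bin_count` is hypothetical, and nothing here should be read as the SPSS algorithm.

```python
import math
import statistics


def scott_bin_count(values):
    """Suggest a number of bars using Scott's normal-reference rule.

    Illustration only: this is a well-known textbook rule, NOT the
    formula used by SPSS, which is not published in this note.
    """
    n = len(values)
    if n < 2:
        return 1
    sd = statistics.stdev(values)          # sample standard deviation
    data_range = max(values) - min(values)
    if sd == 0 or data_range == 0:
        return 1                           # all values identical
    width = 3.49 * sd * n ** (-1 / 3)      # Scott's bin width
    return max(1, math.ceil(data_range / width))


if __name__ == "__main__":
    print(scott_bin_count(list(range(1, 101))))
```

Under a rule of this kind, the suggested number of bars depends only on the sample size and spread, which is why a larger data set generally gets more bars.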

This optimal value may be overridden if the algorithm detects granularity in the data (i.e. values that fall only at discrete, regularly spaced locations). The granularity is used to calculate the interval widths when the number of bins it suggests is not much larger than the number suggested by the normal-theory algorithm.
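The exact detection rule and the meaning of "not much larger" are not published, so the following is only a rough sketch of how such an override could work. The function names, the tolerance `tol`, and the threshold `max_ratio` are all assumptions made for illustration.

```python
def granularity(values, tol=1e-9):
    """Return the grid spacing if all values sit on an evenly spaced
    grid, otherwise None. Hypothetical check, not the SPSS rule."""
    distinct = sorted(set(values))
    if len(distinct) < 2:
        return None
    # Candidate spacing: the smallest gap between neighbouring values.
    step = min(b - a for a, b in zip(distinct, distinct[1:]))
    for v in distinct:
        k = (v - distinct[0]) / step
        if abs(k - round(k)) > tol:        # value is off the grid
            return None
    return step


def choose_bin_count(values, suggested_bins, max_ratio=2.0):
    """Prefer one bin per grid point unless that gives far more bins
    than the normal-theory suggestion. max_ratio is an assumed cutoff."""
    step = granularity(values)
    if step is None:
        return suggested_bins
    granular_bins = round((max(values) - min(values)) / step) + 1
    if granular_bins <= max_ratio * suggested_bins:
        return granular_bins
    return suggested_bins


if __name__ == "__main__":
    print(choose_bin_count([1, 2, 2, 3, 4, 4, 5], suggested_bins=4))
```

In this sketch, seven ratings on a 1 to 5 scale get five bars (one per scale point), while 100 distinct integer values with a suggestion of 5 bars keep the suggested count.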

An enhancement request has been filed with SPSS Development, asking for provision of the specific formulas and rules used to determine binning when a specific bin width or a number of bins is not requested.

Note: Earlier releases of SPSS have a problem with measurements that occur at regular intervals, e.g. once a week or every 6 inches. In this situation, the algorithm typically places the measurements at bin boundaries. Because a bin does not include its closing boundary value, except for the last bin, which does, the last bin can show an inflated count when custom bin widths are set. This behavior was corrected in Release 16.0.
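The effect described in the note can be reproduced with a small sketch. It assumes the boundary convention stated above (each bin excludes its closing value, except the last bin); the function `bin_counts` and the sample data are made up for illustration and are not SPSS code.

```python
def bin_counts(values, start, width, n_bins):
    """Count values per bin, where every bin excludes its closing
    boundary except the last bin, which includes it."""
    counts = [0] * n_bins
    end = start + width * n_bins
    for v in values:
        if v < start or v > end:
            continue                       # outside the histogram range
        # A value equal to the final boundary falls into the last bin.
        idx = min(int((v - start) // width), n_bins - 1)
        counts[idx] += 1
    return counts


# One measurement every 6 inches, binned with a custom width of 6.
heights = [0, 6, 12, 18, 24]
print(bin_counts(heights, start=0, width=6, n_bins=4))  # [1, 1, 1, 2]
```

Every measurement lands exactly on a bin boundary, so each one is pushed into the bin that opens at that value, and the last bin collects both 18 and 24.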

Product

IBM SPSS Statistics

Platform

Platform Independent

Version

18.0

Historical Number

49426

Document Information

Modified date:
16 April 2020

UID

swg21480583