Ordinal regression

Ordinal regression provides options for modelling the dependence of a polytomous ordinal response on a set of predictors, which can be factors or covariates. The design of ordinal regression is based on the methodology of McCullagh (1980, 1998), and the procedure is referred to as PLUM in the syntax.

Standard linear regression analysis involves minimizing the sum of squared differences between a response (dependent) variable and a weighted combination of predictor (independent) variables. The estimated coefficients reflect how changes in the predictors affect the response. The response is assumed to be numerical, in the sense that changes in the level of the response are equivalent throughout the range of the response. For example, the difference in height between a person who is 150 cm tall and a person who is 140 cm tall is 10 cm, which has the same meaning as the difference in height between a person who is 210 cm tall and a person who is 200 cm tall. These relationships do not necessarily hold for ordinal variables, in which the choice and number of response categories can be quite arbitrary.

Example
Ordinal regression could be used to study patient reaction to drug dosage. The possible reactions may be classified as none, mild, moderate, or severe. The difference between a mild and moderate reaction is difficult or impossible to quantify and is based on perception. Moreover, the difference between a mild and moderate response may be greater or less than the difference between a moderate and severe response.
Statistics and plots
Observed and expected frequencies and cumulative frequencies, Pearson residuals for frequencies and cumulative frequencies, observed and expected probabilities, observed and expected cumulative probabilities of each response category by covariate pattern, asymptotic correlation and covariance matrices of parameter estimates, Pearson's chi-square and likelihood-ratio chi-square, goodness-of-fit statistics, iteration history, test of parallel lines assumption, parameter estimates, standard errors, confidence intervals, and Cox and Snell's, Nagelkerke's, and McFadden's R² statistics.
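
The three R² statistics named above are pseudo R² measures derived from model log-likelihoods. The following Python sketch uses their common textbook definitions; it is not taken from the procedure itself, and the function name, argument names, and example log-likelihood values are hypothetical. It assumes unweighted cases and that the log-likelihoods of the intercept-only model and the fitted model are already known.

```python
import math

def pseudo_r_squared(ll_null, ll_model, n):
    """Return common pseudo R-squared measures (textbook definitions).

    ll_null  -- log-likelihood of the intercept-only model
    ll_model -- log-likelihood of the fitted model
    n        -- number of cases (assumed unweighted)
    """
    # Cox and Snell: 1 - (L0 / L1)^(2/n), written with log-likelihoods.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Nagelkerke rescales Cox and Snell by its maximum attainable value.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    nagelkerke = cox_snell / max_cox_snell
    # McFadden: 1 - ll_model / ll_null.
    mcfadden = 1.0 - ll_model / ll_null
    return {"cox_snell": cox_snell, "nagelkerke": nagelkerke, "mcfadden": mcfadden}

# Hypothetical log-likelihoods, for illustration only.
r2 = pseudo_r_squared(ll_null=-120.0, ll_model=-100.0, n=100)
```

Because Nagelkerke's measure divides Cox and Snell's by a quantity smaller than 1, it is always at least as large as Cox and Snell's value for the same model.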

Data considerations

Data
The dependent variable is assumed to be ordinal and can be numeric or string. The ordering is determined by sorting the values of the dependent variable in ascending order. The lowest value defines the first category. Factor variables are assumed to be categorical. Covariate variables must be numeric. Note that using more than one continuous covariate can easily result in the creation of a very large cell probabilities table.
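
Because the category order comes from an ascending sort of the values, a string-coded response may not sort into the order you intend. The Python sketch below illustrates the idea with hypothetical labels and hypothetical integer codes; Python's own string sort stands in for the ascending sort, so details such as case handling may differ from the procedure.

```python
# Hypothetical string-coded responses for the drug reaction example.
responses = ["mild", "severe", "none", "moderate", "mild"]

# An ascending sort of the string values is alphabetical,
# which is not the intended severity order.
alphabetical = sorted(set(responses))

# One possible remedy (an assumption, not a documented requirement):
# recode the labels to integers whose ascending order matches severity.
severity_codes = {"none": 0, "mild": 1, "moderate": 2, "severe": 3}
coded = [severity_codes[r] for r in responses]

# Sorting by the numeric code recovers the intended category order.
ordered_labels = sorted(set(responses), key=severity_codes.get)
```

Here the alphabetical order places "mild" first and "none" third, whereas the recoded values place "none" first.
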
Assumptions
Only one response variable is allowed, and it must be specified. Also, for each distinct pattern of values across the independent variables, the responses are assumed to be independent multinomial variables.
Related procedures
Nominal logistic regression uses similar models for nominal dependent variables.

Obtaining an ordinal regression

This feature requires Statistics Base Edition.

  1. From the menus choose:

    Analyze > Association and prediction > Ordinal regression

  2. Click Select variable under the Dependent variable section and select a categorical variable (numeric or string) that has two or more values. Click OK after selecting the variable.
  3. Optionally, select a function from the Link function list that will be applied to estimate the model. The link function is a transformation of the cumulative probabilities that allows estimation of the model. The following link functions are available:
    • Logit. f(x)=log(x/(1−x)). Typically used for evenly distributed categories.
    • Complementary log-log. f(x)=log(−log(1−x)). Typically used when higher categories are more probable.
    • Negative log-log. f(x)=−log(−log(x)). Typically used when lower categories are more probable.
    • Probit. f(x)=Φ⁻¹(x), where Φ is the standard normal cumulative distribution function. Typically used when the latent variable is normally distributed.
    • Cauchit (inverse Cauchy). f(x)=tan(π(x−0.5)). Typically used when the latent variable has many extreme values.
  4. Optionally, click Select variables under the Independent variables section and select factor (categorical) and/or covariate (continuous) variables that may influence the dependent variable. Click OK after selecting the variables.
  5. Optionally, you can select the following options from the Additional settings menu:
    • Click Location model to specify an optional location model for the analysis.
    • Click Scale model to specify a scale component in the model when there is evidence that the location-only model is inadequate for the data.
    • Click Convergence criteria to specify the criteria for algorithm iterations.
    • Click Statistics to select statistics to include in the analysis.
    • Click Options to specify settings for parameter estimates and for model building methods.
    • Click Save to dataset to add casewise post-estimation statistics to the dataset as new variables.
    • Click Bootstrap to derive robust estimates of standard errors and confidence intervals for estimates such as the mean, median, proportion, odds ratio, correlation coefficient, or regression coefficient.
  6. Click Run analysis.

This procedure pastes PLUM command syntax.
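
The link functions listed in step 3 can be written out directly. The Python sketch below implements the five formulas as given, then shows how a link turns thresholds into category probabilities. The model form used here, link(P(Y ≤ j)) = θ_j − η for a location-only model, is an assumption that this topic does not state, and the function names, thresholds, and η value are hypothetical.

```python
import math
from statistics import NormalDist

# Link functions, applied to a cumulative probability x in (0, 1).
def logit(x):
    return math.log(x / (1.0 - x))

def cloglog(x):
    return math.log(-math.log(1.0 - x))

def nloglog(x):
    return -math.log(-math.log(x))

def probit(x):
    # Inverse of the standard normal cumulative distribution function.
    return NormalDist().inv_cdf(x)

def cauchit(x):
    return math.tan(math.pi * (x - 0.5))

def inverse_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

def category_probabilities(thresholds, eta, inverse_link=inverse_logit):
    """Category probabilities, ASSUMING link(P(Y <= j)) = threshold_j - eta.

    thresholds -- increasing cut points, one fewer than the number of categories
    eta        -- linear predictor (location component) for one case
    """
    # Cumulative probabilities; the last category always reaches 1.
    cumulative = [inverse_link(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        # Each category probability is a difference of adjacent cumulatives.
        probs.append(c - previous)
        previous = c
    return probs

# Hypothetical thresholds for four categories (none, mild, moderate, severe).
probs = category_probabilities([-1.0, 0.5, 2.0], eta=0.3)
```

Under this assumed form, the link function changes only how the cumulative probabilities are computed; the category probabilities are always obtained by differencing them.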