Overview (LOGISTIC REGRESSION command)

LOGISTIC REGRESSION regresses a dichotomous dependent variable on a set of independent variables. Categorical independent variables are replaced by sets of contrast variables, each set entering and leaving the model in a single step.

Options

Processing of Independent Variables. You can specify which independent variables are categorical in nature on the CATEGORICAL subcommand. You can control treatment of categorical independent variables by the CONTRAST subcommand. Seven methods are available for entering independent variables into the model. You can specify any one of them on the METHOD subcommand. You can also use the keyword BY between variable names to enter interaction terms.

Selecting Cases. You can use the SELECT subcommand to define subsets of cases to be used in estimating a model.

Regression through the Origin. You can use the ORIGIN subcommand to exclude a constant term from a model.

Specifying Termination and Model-Building Criteria. You can further control computations when building the model by specifying criteria on the CRITERIA subcommand.

Adding New Variables to the Active Dataset. You can save the residuals, predicted values, and diagnostics that are generated by LOGISTIC REGRESSION in the active dataset.

Output. You can use the PRINT subcommand to print optional output, use the CASEWISE subcommand to request analysis of residuals, and use the ID subcommand to specify a variable whose values or value labels identify cases in output. You can request plots of the actual values and predicted values for each case with the CLASSPLOT subcommand.

Basic Specification

  • The minimum specification is the VARIABLES subcommand with one dichotomous dependent variable. You must specify a list of independent variables either following the keyword WITH on the VARIABLES subcommand or on a METHOD subcommand.
  • The default output includes goodness-of-fit tests for the model (–2 log-likelihood, goodness-of-fit statistic, Cox and Snell R 2, and Nagelkerke R 2) and a classification table for the predicted and observed group memberships. The regression coefficient, standard error of the regression coefficient, Wald statistic and its significance level, and a multiple correlation coefficient adjusted for the number of parameters (Atkinson, 1980) are displayed for each variable in the equation.

Subcommand Order

  • Subcommands can be named in any order. If the VARIABLES subcommand is not specified first, a slash (/) must precede it.
  • The ordering of METHOD subcommands determines the order in which models are estimated. Different sequences may result in different models.

Syntax Rules

  • Only one dependent variable can be specified for each LOGISTIC REGRESSION.
  • Any number of independent variables may be listed. The dependent variable may not appear on this list.
  • The independent variable list is required if any of the METHOD subcommands are used without a variable list or if the METHOD subcommand is not used. The keyword TO cannot be used on any variable list.
  • If you specify the keyword WITH on the VARIABLES subcommand, all independent variables must be listed.
  • If the keyword WITH is used on the VARIABLES subcommand, interaction terms do not have to be specified on the variable list, but the individual variables that make up the interactions must be listed.
  • Multiple METHOD subcommands are allowed.
  • The minimum truncation for this command is LOGI REG.

Operations

  • Independent variables that are specified on the CATEGORICAL subcommand are replaced by sets of contrast variables. In stepwise analyses, the set of contrast variables associated with a categorical variable is entered or removed from the model as a single step.
  • Independent variables are screened to detect and eliminate redundancies.
  • If the linearly dependent variable is one of a set of contrast variables, the set will be reduced by the redundant variable or variables. A warning will be issued, and the reduced set will be used.
  • For the forward stepwise method, redundancy checking is done when a variable is to be entered into the model.
  • When backward stepwise or direct-entry methods are requested, all variables for each METHOD subcommand are checked for redundancy before that analysis begins.

Compatibility

Prior to version 14.0, the order of recoded string values was dependent on the order of values in the data file. For example, when recoding the dependent variable, the first string value encountered was recoded to 0, and the second string value encountered was recoded to 1. Beginning with version 14.0, the procedure recodes string variables so that the order of recoded values is the alphanumeric order of the string values. Thus, the procedure may recode string variables differently than in previous versions.

Limitations

  • The dependent variable must be dichotomous for each split-file group. Specifying a dependent variable with more or less than two nonmissing values per split-file group will result in an error.