SPSS Lesson 28 – Logistic Regression | Dataplexa

Logistic Regression

So far, you have learned regression techniques used to predict numerical outcomes.

However, many real-world problems involve categorical outcomes, such as Yes/No, Pass/Fail, Buy/Not Buy.

Logistic regression is used when the dependent variable is categorical. This lesson covers the binary case, where the outcome has exactly two categories.


Why Linear Regression Is Not Suitable

Linear regression assumes the outcome variable can take any numerical value.

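For binary outcomes, this causes two problems:

  • Predictions must stay between 0 and 1 to be read as probabilities, but a straight line can fall below 0 or rise above 1
  • The relationship between a predictor and the probability is not linear; it flattens as the probability approaches 0 or 1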

Logistic regression solves this problem by modeling probabilities instead of raw values.
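The problem can be seen by fitting an ordinary linear regression to a 0/1 outcome. The sketch below is only an illustration: it uses the four-row customer dataset introduced later in this lesson and assumes that no variable named PRE_1 already exists in the active dataset.

```spss
* Enter the four example rows used later in this lesson.
DATA LIST FREE / Customer_ID Income Purchased.
BEGIN DATA
1901 30000 0
1902 45000 1
1903 60000 1
1904 25000 0
END DATA.

* Fit an ordinary linear regression to the 0/1 outcome and save fitted values.
REGRESSION
  /DEPENDENT Purchased
  /METHOD=ENTER Income
  /SAVE PRED.

* PRE_1 holds the fitted values, so check whether any fall outside 0 to 1.
LIST VARIABLES=Customer_ID Income Purchased PRE_1.
```

With these rows, the fitted value for the highest income should come out above 1, which cannot be read as a probability.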


Understanding Probability and Odds

Logistic regression works with probabilities and odds.

Probability ranges from 0 to 1, while odds compare the chance of success to the chance of failure.

Example:

  • Probability = 0.75
  • Odds = 0.75 / 0.25 = 3

This means the event is three times as likely to occur as not to occur.
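The same conversion can be tried in SPSS syntax. This is a small sketch with made-up probability values; the variable names p, odds, and log_odds are chosen here for illustration.

```spss
* Three illustrative probabilities, one case per value.
DATA LIST FREE / p.
BEGIN DATA
0.50 0.75 0.90
END DATA.

* Odds compare the chance of success to the chance of failure.
COMPUTE odds = p / (1 - p).
* The natural log of the odds is the quantity logistic regression models.
COMPUTE log_odds = LN(odds).
FORMATS p odds log_odds (F8.3).
LIST.
```

A probability of 0.50 corresponds to odds of 1 (an even chance), and 0.75 corresponds to odds of 3, as in the example above.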


The Logistic Regression Model

Instead of a straight line, logistic regression uses an S-shaped curve.

It models the logit, the natural logarithm of the odds:

log(p / (1 − p)) = a + bX

Where:

  • p → probability of success
  • a → intercept
  • b → coefficient
  • X → predictor

Because b is expressed on the log-odds scale, coefficients are usually interpreted through exp(b), the odds ratio.
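To see how the logit turns into an S-shaped probability curve, the equation can be solved for p, giving p = 1 / (1 + exp(−(a + bX))). The sketch below applies that formula to a few X values; the intercept 0.5 and coefficient 1 are hypothetical numbers chosen only for illustration, not estimates from any data.

```spss
* A few illustrative predictor values.
DATA LIST FREE / X.
BEGIN DATA
-4 -2 0 2 4
END DATA.

* Hypothetical intercept a = 0.5 and coefficient b = 1, for illustration only.
COMPUTE log_odds = 0.5 + 1 * X.
* Invert the logit to obtain a probability between 0 and 1.
COMPUTE p_hat = 1 / (1 + EXP(-log_odds)).
FORMATS log_odds p_hat (F8.3).
LIST.
```

However large or small X becomes, p_hat stays strictly between 0 and 1, which is what makes the curve suitable for probabilities.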


Example Dataset

Consider customer purchase behavior:

Customer_ID   Income   Purchased
1901          30000    0
1902          45000    1
1903          60000    1
1904          25000    0

Here:

  • Purchased = 1 → customer bought the product
  • Purchased = 0 → customer did not buy
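One way to enter this small dataset directly in a syntax window is sketched below. The value label wording is an assumption made here; the variable names and values come from the table above.

```spss
* Read the four example rows as free-format inline data.
DATA LIST FREE / Customer_ID Income Purchased.
BEGIN DATA
1901 30000 0
1902 45000 1
1903 60000 1
1904 25000 0
END DATA.

* Show whole numbers and label the outcome codes.
FORMATS Customer_ID Income Purchased (F8.0).
VALUE LABELS Purchased 0 'Did not buy' 1 'Bought'.
LIST.
```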

When to Use Logistic Regression

Logistic regression is appropriate when:

  • The dependent variable is binary
  • The independent variables are numeric or categorical
  • You want probability-based predictions

Common use cases include:

  • Customer churn prediction
  • Credit approval decisions
  • Employee attrition analysis

Running Logistic Regression (Menu)

To run logistic regression in SPSS:

  • Go to Analyze → Regression → Binary Logistic
  • Move the binary variable to Dependent
  • Move predictors to Covariates
  • Click OK

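The output includes the Omnibus Tests of Model Coefficients, the Model Summary, the Classification Table, and the Variables in the Equation table, which holds the coefficient estimates.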


Using SPSS Syntax


LOGISTIC REGRESSION VARIABLES Purchased
  /METHOD=ENTER Income.

This model predicts purchase probability based on income.
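The basic command can be extended with a few optional subcommands. The version below is a sketch that assumes the Purchased and Income variables from the example dataset are in the active dataset; the particular options shown are a choice made here, not a required setup.

```spss
* Same model, with confidence intervals and saved predicted probabilities.
LOGISTIC REGRESSION VARIABLES Purchased
  /METHOD=ENTER Income
  /PRINT=CI(95)
  /SAVE=PRED
  /CRITERIA=CUT(0.5).
```

Here /PRINT=CI(95) requests 95% confidence intervals for Exp(B), /SAVE=PRED adds each case's predicted probability to the dataset as a new variable, and CUT(0.5) sets the probability cutoff used in the classification table. Note that the four example rows are perfectly separated by income (every purchaser has a higher income than every non-purchaser), so the estimates from that tiny dataset will not be reliable; a realistic analysis needs many more cases.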


Interpreting the Output

Key elements to interpret:

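  • B – logistic coefficient, the change in the log-odds for a one-unit increase in the predictor
  • Exp(B) – odds ratio
  • Sig. – p-value for the Wald test of the coefficient

Example interpretation:

  • Exp(B) = 1.5 → the odds of the outcome are multiplied by 1.5 (a 50% increase) for each one-unit increase in the predictor
  • Sig. < 0.05 → the predictor is statistically significant at the 5% level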
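The link between B and Exp(B) can be checked by hand. The sketch below uses a hypothetical coefficient of 0.405, chosen because its exponential is roughly 1.5; it is not taken from any real output.

```spss
* A single hypothetical logistic coefficient.
DATA LIST FREE / B.
BEGIN DATA
0.405
END DATA.

* Exponentiating B gives the odds ratio reported as Exp(B).
COMPUTE odds_ratio = EXP(B).
* Express the odds ratio as a percentage change in the odds.
COMPUTE pct_change = (odds_ratio - 1) * 100.
FORMATS B odds_ratio pct_change (F8.3).
LIST.
```

A negative B would give an odds ratio below 1, meaning the odds decrease as the predictor increases.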

Common Mistakes

Typical errors include:

  • Using linear regression for binary outcomes
  • Misinterpreting odds ratios as probabilities
  • Ignoring model fit statistics

Logistic regression requires careful interpretation.
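For the model fit point, one option is to request the Hosmer-Lemeshow goodness-of-fit test alongside the default output. This is a sketch that assumes the Purchased and Income variables from the example dataset, and it is only meaningful with a reasonably large sample.

```spss
* Add the Hosmer-Lemeshow goodness-of-fit test to the output.
LOGISTIC REGRESSION VARIABLES Purchased
  /METHOD=ENTER Income
  /PRINT=GOODFIT.
```

The default Model Summary table also reports the −2 Log likelihood together with the Cox & Snell and Nagelkerke R Square values, which are worth reading before interpreting individual coefficients.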


Quiz 1

When is logistic regression used?

When the dependent variable is binary.


Quiz 2

What does Exp(B) represent?

Odds ratio.


Quiz 3

Why is linear regression unsuitable here?

Its predictions can fall outside the 0–1 range.


Quiz 4

Which SPSS menu runs logistic regression?

Analyze → Regression → Binary Logistic.


Quiz 5

Does logistic regression predict probabilities?

Yes.


Mini Practice

Create a dataset with:

  • Employee_Age
  • Years_Experience
  • Promotion (Yes/No)

Use Binary Logistic regression to predict promotion likelihood, then interpret the odds ratios.
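As a starting point, the model for this exercise might be specified as below. The sketch assumes you have already created the dataset and that Promotion is stored as a numeric variable coded 1 = Yes and 0 = No; that coding is an assumption made here, not a requirement stated in the exercise.

```spss
* Assumes Promotion is numeric, coded 1 = Yes and 0 = No.
VALUE LABELS Promotion 0 'No' 1 'Yes'.

* Predict promotion from age and experience, with intervals for Exp(B).
LOGISTIC REGRESSION VARIABLES Promotion
  /METHOD=ENTER Employee_Age Years_Experience
  /PRINT=CI(95).
```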


What’s Next

In the next lesson, you will learn about Data Transformations, used to improve model assumptions and interpretability.