SPSS Lesson 37 – Model Diagnostics | Dataplexa

Model Diagnostics

Building a statistical model does not end with estimating coefficients.

Model diagnostics are used to check whether a model is reliable, valid, and appropriate for interpretation.

A model that violates its assumptions can lead to incorrect conclusions, even when the results appear statistically significant.


Why Model Diagnostics Are Important

Statistical models are based on assumptions.

Diagnostics help answer questions such as:

  • Are the residuals normally distributed?
  • Is variance constant?
  • Are observations independent?
  • Are there influential outliers?

Ignoring diagnostics is one of the most common mistakes in data analysis.


What Are Residuals?

A residual is the difference between an observed value and the value predicted by the model.

Residual = Observed − Predicted

Residuals capture what the model fails to explain.
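As a minimal sketch of this definition in SPSS syntax, the commands below save the predicted values and raw residuals from a regression and then recompute the residual by hand. Sales and Advertising are the example variables from this lesson's syntax section; pred_sales, res_sales, and res_manual are hypothetical names chosen for illustration and must not already exist in the dataset.

```spss
* Fit the model and save predicted values and raw residuals.
* pred_sales and res_sales are hypothetical variable names.
REGRESSION
  /DEPENDENT Sales
  /METHOD=ENTER Advertising
  /SAVE PRED(pred_sales) RESID(res_sales).

* Recompute the residual manually as observed minus predicted.
COMPUTE res_manual = Sales - pred_sales.
EXECUTE.
```

If the model ran as expected, res_manual and res_sales should agree for every case.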


Key Diagnostic Checks

Most regression diagnostics focus on:

  • Normality of residuals
  • Homoscedasticity (constant variance)
  • Independence of errors
  • Outliers and influential points
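Independence of errors has no dedicated plot in this lesson. One commonly used check in SPSS is the Durbin-Watson statistic, sketched here under the assumption that the cases are stored in a meaningful order (for example, by time):

```spss
* Request the Durbin-Watson statistic for the residuals.
* Assumes the cases are sorted in their natural order, such as by date.
REGRESSION
  /DEPENDENT Sales
  /METHOD=ENTER Advertising
  /RESIDUALS DURBIN.
```

The statistic ranges from 0 to 4, and values near 2 suggest little first-order autocorrelation in the residuals.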

Checking Normality of Residuals

Approximately normal residuals support the validity of the significance tests and confidence intervals for the coefficients, especially in small samples.

In SPSS, normality can be checked using:

  • Histogram of residuals
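  • Normal P–P plot (the Normal probability plot option in the regression Plots dialog) or a Normal Q–Q plot of saved residuals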

A roughly bell-shaped histogram, and points that lie close to the diagonal line in the probability plot, suggest normality.
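The normality checks can also be requested through syntax. The sketch below is one possible approach, not the only one: it asks REGRESSION for a histogram and normal probability plot of the standardized residuals, saves those residuals, and then passes them to EXAMINE to obtain a Normal Q–Q plot. The variable name zres_norm is hypothetical and must not already exist in the dataset.

```spss
* Histogram and normal probability plot of standardized residuals.
* zres_norm is a hypothetical name for the saved residuals.
REGRESSION
  /DEPENDENT Sales
  /METHOD=ENTER Advertising
  /RESIDUALS HISTOGRAM(ZRESID) NORMPROB(ZRESID)
  /SAVE ZRESID(zres_norm).

* Normal Q-Q plot and normality tests for the saved residuals.
EXAMINE VARIABLES=zres_norm
  /PLOT NPPLOT
  /STATISTICS NONE.
```

With larger samples, formal normality tests tend to flag even small departures, so the plots are usually the more informative output.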


Checking Homoscedasticity

Homoscedasticity means residual variance is constant across predicted values.

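When the variance is not constant (heteroscedasticity), standard errors become unreliable, which in turn affects significance tests and confidence intervals.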

In SPSS:

  • Plot residuals vs predicted values

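A random scatter with roughly even spread around zero indicates homoscedasticity; a funnel or fan shape suggests that the variance changes with the predicted value.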


Detecting Outliers and Influential Points

Outliers can distort model estimates.

Common diagnostic measures include:

  • Standardized residuals
  • Cook’s distance
  • Leverage values

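Large values indicate potentially influential cases. Common rules of thumb flag standardized residuals beyond about ±3 and Cook’s distance above about 1, but these cutoffs are guidelines rather than strict rules.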
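These measures can be saved as new variables and used to flag cases for review. The following is a sketch only: the cutoffs of 3 for standardized residuals and 1 for Cook’s distance are rule-of-thumb assumptions, and zres_out, cook_d, lev_h, and flag_case are hypothetical variable names that must not already exist in the dataset.

```spss
* List cases with standardized residuals beyond 3 and save diagnostics.
* zres_out, cook_d, and lev_h are hypothetical variable names.
REGRESSION
  /DEPENDENT Sales
  /METHOD=ENTER Advertising
  /CASEWISE PLOT(ZRESID) OUTLIERS(3)
  /SAVE ZRESID(zres_out) COOK(cook_d) LEVER(lev_h).

* Flag cases that exceed either rule-of-thumb cutoff (1 = flagged).
COMPUTE flag_case = (ABS(zres_out) > 3 OR cook_d > 1).
EXECUTE.

* Count how many cases were flagged.
FREQUENCIES VARIABLES=flag_case.
```

A flagged case is a prompt to inspect the data for that observation, not an automatic reason to delete it.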


Running Diagnostics in SPSS (Menu)

To generate diagnostic plots:

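  • Go to Analyze → Regression → Linear
  • Select the dependent and independent variables
  • Click Plots
  • Move *ZRESID to the Y box and *ZPRED to the X box
  • Optionally select Histogram and Normal probability plot
  • Click Continue, then OK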

SPSS displays diagnostic plots in the output viewer.


SPSS Syntax Example


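* Regress Sales on Advertising.
* Plot standardized residuals against standardized predicted values.
* Save the standardized residuals as a new variable.
REGRESSION
  /DEPENDENT Sales
  /METHOD=ENTER Advertising
  /SCATTERPLOT=(*ZRESID,*ZPRED)
  /SAVE ZRESID.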

Interpreting Diagnostic Results

When diagnostics are acceptable:

  • Residuals are centered around zero
  • No clear pattern in residual plots
  • No extreme influential cases

If diagnostics fail:

  • Consider transforming variables (for example, a log transformation)
  • Investigate outliers, and remove them only with a clear justification
  • Use alternative models
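As one illustration of the transformation option, the sketch below refits the model with a log-transformed dependent variable and redraws the residual plot. It assumes Sales is strictly positive, and ln_sales is a hypothetical variable name; a log transformation is only one of several possible choices and may not suit every dataset.

```spss
* Create a log-transformed dependent variable.
* Assumes Sales is strictly positive; ln_sales is a hypothetical name.
COMPUTE ln_sales = LN(Sales).
EXECUTE.

* Refit the model and re-check the residual plot.
REGRESSION
  /DEPENDENT ln_sales
  /METHOD=ENTER Advertising
  /SCATTERPLOT=(*ZRESID,*ZPRED).
```

Comparing this residual plot with the one from the original model shows whether the transformation reduced the pattern. Coefficients from the refitted model describe changes in the log of Sales, so their interpretation differs from the original model.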

Common Mistakes

Frequent errors include:

  • Ignoring diagnostic plots
  • Trusting p-values blindly
  • Removing outliers without justification

Diagnostics should guide decisions, not be ignored.


Quiz 1

What is a residual?

Observed value minus predicted value.


Quiz 2

Why check residual normality?

To confirm that significance tests and confidence intervals for the coefficients can be trusted.


Quiz 3

What does homoscedasticity mean?

Constant variance of residuals.


Quiz 4

Which plot checks homoscedasticity?

Residuals vs predicted values plot.


Quiz 5

Should diagnostics be skipped if results look good?

No.


Mini Practice

Run a linear regression model using any dataset.

Generate residual plots and evaluate whether model assumptions are satisfied.

Plot the residuals against the predicted values, check that the scatter looks random, and look for outliers.


What’s Next

In the next lesson, you will learn about Advanced Charts, used to communicate results more effectively.