SPSS Lesson 37 – Model Diagnostics | Dataplexa

Model Diagnostics

Building a statistical model does not end with estimating coefficients.

Model diagnostics are used to check whether a model is reliable, valid, and appropriate for interpretation.

A model that violates assumptions can lead to incorrect conclusions, even if results appear significant.

Why Model Diagnostics Are Important

Statistical models are based on assumptions.

Diagnostics help answer questions such as:

Are the residuals normally distributed?
Is variance constant?
Are observations independent?
Are there influential outliers?

Ignoring diagnostics is one of the most common mistakes in data analysis.

What Are Residuals?

A residual is the difference between an observed value and the value predicted by the model.

Residual = Observed − Predicted

Residuals capture what the model fails to explain.
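For example, if a case has observed Sales of 120 and the model predicts 110, its residual is 10. The sketch below shows one way to save predicted values and residuals in SPSS and recompute the residual by hand. It assumes a dataset with the variables Sales and Advertising used in this lesson's syntax example; pred_sales, resid_sales, and resid_check are illustrative names, not required ones.

```spss
* Save unstandardized predicted values and residuals under explicit names.
REGRESSION
  /DEPENDENT Sales
  /METHOD=ENTER Advertising
  /SAVE PRED(pred_sales) RESID(resid_sales).

* Recompute the residual by hand: observed minus predicted.
COMPUTE resid_check = Sales - pred_sales.
EXECUTE.
```

After running this, resid_check should match resid_sales for every case, which confirms the definition above.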

Key Diagnostic Checks

Most regression diagnostics focus on:

Normality of residuals
Homoscedasticity (constant variance)
Independence of errors
Outliers and influential points
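Independence of errors is usually judged from how the data were collected, but when cases have a natural order (for example, monthly observations) the Durbin-Watson statistic offers a quick check. A minimal sketch, assuming the Sales and Advertising variables from this lesson's syntax example and that the cases are sorted in time order:

```spss
* Request the Durbin-Watson statistic for the regression residuals.
REGRESSION
  /DEPENDENT Sales
  /METHOD=ENTER Advertising
  /RESIDUALS DURBIN.
```

The statistic ranges from 0 to 4. Values near 2 suggest little first-order autocorrelation, while values well below or above 2 suggest positive or negative autocorrelation respectively.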

Checking Normality of Residuals

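Approximately normal residuals support the validity of the t-tests, F-tests, and confidence intervals reported for the model, especially in small samples. The coefficient estimates themselves do not require normality.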

In SPSS, normality can be checked using:

Histogram of residuals
Normal Q–Q plot

A roughly bell-shaped histogram, and Q–Q plot points that fall close to the diagonal reference line, suggest approximate normality.
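Both plots can be requested from syntax. The sketch below is one possible approach, assuming the Sales and Advertising variables from this lesson's syntax example; z_resid is an illustrative variable name. Note that the NORMPROB keyword in REGRESSION draws a normal P-P plot, so the Q-Q plot is requested separately through EXAMINE on the saved residuals.

```spss
* Histogram and normal P-P plot of standardized residuals, plus a saved copy.
REGRESSION
  /DEPENDENT Sales
  /METHOD=ENTER Advertising
  /RESIDUALS HISTOGRAM(ZRESID) NORMPROB(ZRESID)
  /SAVE ZRESID(z_resid).

* Normal Q-Q plot and normality tests for the saved residuals.
EXAMINE VARIABLES=z_resid
  /PLOT HISTOGRAM NPPLOT
  /STATISTICS NONE.
```

The NPPLOT keyword also prints Kolmogorov-Smirnov and Shapiro-Wilk tests. With large samples these tests flag even trivial departures from normality, so the plots are usually the more informative check.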

Checking Homoscedasticity

Homoscedasticity means residual variance is constant across predicted values.

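When variance is not constant (heteroscedasticity), the coefficient estimates remain unbiased, but the standard errors, and therefore the p-values and confidence intervals, become unreliable.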

In SPSS:

Plot residuals vs predicted values

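A random scatter with a roughly even vertical spread around zero suggests homoscedasticity. A funnel or fan shape suggests that the variance changes with the predicted value.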
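In syntax, this plot comes from the SCATTERPLOT subcommand of REGRESSION. A minimal sketch, assuming the Sales and Advertising variables from this lesson's syntax example:

```spss
* Plot standardized residuals (Y axis) against standardized predicted values (X axis).
REGRESSION
  /DEPENDENT Sales
  /METHOD=ENTER Advertising
  /SCATTERPLOT=(*ZRESID ,*ZPRED).
```

The asterisk marks ZRESID and ZPRED as temporary variables created by the procedure rather than variables in the dataset.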

Detecting Outliers and Influential Points

Outliers can distort model estimates.

Common diagnostic measures include:

Standardized residuals
Cook’s distance
Leverage values

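Large values indicate potentially influential cases. Common rules of thumb flag standardized residuals beyond about ±3 and Cook’s distance above 1 for closer inspection.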
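These measures can be saved as new variables and used to flag cases. The sketch below assumes the Sales and Advertising variables from this lesson's syntax example; the names z_resid, cook_d, lev, and flag_case are illustrative, and the cutoffs of 3 and 1 are rules of thumb rather than fixed standards.

```spss
* Save diagnostics under explicit names and list cases with large residuals.
REGRESSION
  /DEPENDENT Sales
  /METHOD=ENTER Advertising
  /CASEWISE PLOT(ZRESID) OUTLIERS(3)
  /SAVE ZRESID(z_resid) COOK(cook_d) LEVER(lev).

* Flag cases beyond the rule-of-thumb cutoffs (adjust them for your data).
COMPUTE flag_case = (ABS(z_resid) > 3 OR cook_d > 1).
EXECUTE.

* List the flagged cases for inspection before deciding what to do with them.
TEMPORARY.
SELECT IF (flag_case = 1).
LIST VARIABLES=Sales Advertising z_resid cook_d lev.
```

Because TEMPORARY applies only to the next procedure, the selection affects the LIST output and leaves the dataset unchanged. Rerunning the REGRESSION command in the same dataset requires new names on SAVE, since the saved variables already exist.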

Running Diagnostics in SPSS (Menu)

To generate diagnostic plots:

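Go to Analyze → Regression → Linear
Move the outcome variable to Dependent and the predictor(s) to Independent(s)
Click Plots
Move *ZRESID to the Y box and *ZPRED to the X box, then tick Histogram and Normal probability plot
Click Continue, then click OK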

SPSS displays diagnostic plots in the output viewer.

SPSS Syntax Example


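* Regress Sales on Advertising and request residual diagnostics.
REGRESSION
  /DEPENDENT Sales
  /METHOD=ENTER Advertising
  /SCATTERPLOT=(*ZRESID ,*ZPRED)
  /RESIDUALS HISTOGRAM(ZRESID) NORMPROB(ZRESID)
  /SAVE ZRESID COOK LEVER.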

Interpreting Diagnostic Results

When diagnostics are acceptable:

Residuals are centered around zero
No clear pattern in residual plots
No extreme influential cases

If diagnostics fail:

Consider transformations
Investigate outliers and remove them only when justified (for example, a data-entry error)
Use alternative models
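As an illustration of the first option, a log transformation of the outcome is a common remedy when the residual spread grows with the predicted value. A minimal sketch, assuming the Sales and Advertising variables from this lesson's syntax example and that Sales is strictly positive; ln_sales is an illustrative name.

```spss
* Assumption: Sales is strictly positive, so the natural log is defined.
COMPUTE ln_sales = LN(Sales).
EXECUTE.

* Refit with the transformed outcome and re-check the residual plot.
REGRESSION
  /DEPENDENT ln_sales
  /METHOD=ENTER Advertising
  /SCATTERPLOT=(*ZRESID ,*ZPRED).
```

Whether a transformation helps depends on the data, so the diagnostics should be rerun on the new model. Coefficients are then interpreted on the log scale.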

Common Mistakes

Frequent errors include:

Ignoring diagnostic plots
Trusting p-values blindly
Removing outliers without justification

Diagnostics should guide decisions, not be ignored.

Quiz 1

What is a residual?

Observed value minus predicted value.

Quiz 2

Why check residual normality?

To confirm that the significance tests and confidence intervals for the coefficients can be trusted.

Quiz 3

What does homoscedasticity mean?

Constant variance of residuals.

Quiz 4

Which plot checks homoscedasticity?

Residuals vs predicted values plot.

Quiz 5

Should diagnostics be skipped if results look good?

No. Results can appear significant even when the model's assumptions are violated.

Mini Practice

Run a linear regression model using any dataset.

Generate residual plots and evaluate whether model assumptions are satisfied.

Hint: plot standardized residuals against predicted values, check for a random scatter around zero, and look for outliers.

What’s Next

In the next lesson, you will learn about Advanced Charts, used to communicate results more effectively.
