SPSS Lesson 26 – Linear Regression | Dataplexa

Linear Regression

In earlier lessons, you learned how to compare groups and test whether differences exist.

In many practical situations, however, the goal is not just comparison, but prediction and explanation.

Linear Regression is used to model the relationship between a dependent variable and one or more independent variables.


What Is Linear Regression?

Linear regression examines how a numerical outcome changes as another variable changes.

It answers questions such as:

  • How do sales change with advertising spend?
  • How does salary change with years of experience?
  • How does performance change with training hours?

In simple linear regression, one predictor variable is used.


The Regression Equation

The relationship is expressed as:

Y = a + bX

Where:

  • Y → Dependent variable (outcome)
  • X → Independent variable (predictor)
  • a → Intercept
  • b → Regression coefficient (slope)

The coefficient b indicates how much Y is predicted to change for a one-unit increase in X. The intercept a is the predicted value of Y when X = 0.
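The lesson does not say how a and b are obtained. SPSS's linear regression procedure estimates them by ordinary least squares. As a sketch, writing X̄ and Ȳ for the sample means of X and Y and n for the number of cases (notation introduced here, not in the lesson), the estimates are:

```latex
b = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
a = \bar{Y} - b\,\bar{X}
```

In words, b is the co-variation of X and Y divided by the variation of X, and a is chosen so that the fitted line passes through the point (X̄, Ȳ).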


Example Dataset

Consider the relationship between study hours and exam score:

Student_ID   Study_Hours   Score
1801         2             55
1802         4             65
1803         6             78
1804         8             88

The objective is to predict exam score based on study hours.
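To follow along, the four rows can be typed directly into a syntax window rather than the Data Editor. The sketch below is one possible way to do this. The variable labels are illustrative wording, not taken from the lesson.

```spss
* Read the example dataset inline as free-format, space-separated values.
DATA LIST FREE / Student_ID Study_Hours Score.
BEGIN DATA
1801 2 55
1802 4 65
1803 6 78
1804 8 88
END DATA.

* Attach descriptive labels (illustrative wording) and list the cases.
VARIABLE LABELS
  Study_Hours 'Hours studied'
  Score 'Exam score'.
LIST.
```

Running LIST after reading the data is a quick check that all four cases and three variables were read as intended.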


Key Assumptions

Linear regression relies on several assumptions:

  • Linear relationship between X and Y
  • Normal distribution of residuals
  • Constant variance (homoscedasticity)
  • Independence of observations

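Violating these assumptions can bias the coefficient estimates or make the standard errors and p-values unreliable, which weakens both interpretation and prediction.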
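The assumptions above can be examined with plots and statistics that SPSS produces on request. The following is a minimal sketch using the Study_Hours and Score variables from the example dataset. With only four cases the diagnostics are not informative, so treat it as a template for larger datasets.

```spss
* Scatterplot of the raw data to check for a roughly linear pattern.
GRAPH
  /SCATTERPLOT(BIVAR)=Study_Hours WITH Score.

* Regression with residual diagnostics requested.
* ZRESID against ZPRED helps judge constant variance.
* The histogram and normal probability plot help judge normality of residuals.
* The Durbin-Watson statistic is a check for serial dependence between cases.
REGRESSION
  /DEPENDENT Score
  /METHOD=ENTER Study_Hours
  /SCATTERPLOT=(*ZRESID ,*ZPRED)
  /RESIDUALS DURBIN HISTOGRAM(ZRESID) NORMPROB(ZRESID).
```

A residual plot with no visible funnel shape or curve is consistent with the linearity and constant-variance assumptions, though it does not prove them.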


Running Linear Regression (Menu)

To perform regression using SPSS menus:

  • Go to Analyze → Regression → Linear
  • Move the dependent variable to Dependent
  • Move the predictor to Independent(s)
  • Click OK

SPSS produces the Model Summary, ANOVA, and Coefficients tables.


Using SPSS Syntax


* Simple linear regression with Score as the outcome and Study_Hours as the predictor.
REGRESSION
  /DEPENDENT Score
  /METHOD=ENTER Study_Hours.

This syntax fits a model that predicts exam score (Score) from study hours (Study_Hours).
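If the predicted scores themselves are needed, the same model can be asked to save them to the active dataset. This is a sketch, not the lesson's own procedure. The names Pred_Score and Resid_Score are hypothetical and can be replaced with any valid variable names.

```spss
* Same model, additionally saving predicted values and residuals.
* Pred_Score and Resid_Score are hypothetical names for the new variables.
REGRESSION
  /STATISTICS COEFF R ANOVA
  /DEPENDENT Score
  /METHOD=ENTER Study_Hours
  /SAVE PRED(Pred_Score) RESID(Resid_Score).

* Show observed scores next to the saved predictions and residuals.
LIST VARIABLES=Student_ID Study_Hours Score Pred_Score Resid_Score.
```

Each residual is the observed score minus the predicted score, so small residuals indicate that the fitted line lies close to the data.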


Interpreting the Output

Focus on these key values:

  • R Square – proportion of variance in Y explained by the model
  • Unstandardized coefficient (B) – estimated change in Y per one-unit increase in X
  • Sig. (p-value) – tests whether the coefficient differs from zero

Example interpretation (illustrative values):

  • R² = 0.85 → 85% of the variation in score is explained by the model
  • B = 4.2 → Each extra study hour is associated with a predicted score about 4.2 points higher
  • p < 0.05 → The predictor is statistically significant at the 5% level
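The values in the example interpretation are not computed from the four-row study-hours dataset. For comparison, the coefficients for that dataset can be worked by hand, assuming the standard least squares formulas (which the lesson does not state explicitly):

```latex
\begin{align*}
\bar{X} &= \frac{2 + 4 + 6 + 8}{4} = 5, \qquad
\bar{Y} = \frac{55 + 65 + 78 + 88}{4} = 71.5 \\
\sum (X_i - \bar{X})(Y_i - \bar{Y}) &= (-3)(-16.5) + (-1)(-6.5) + (1)(6.5) + (3)(16.5) = 112 \\
\sum (X_i - \bar{X})^2 &= 9 + 1 + 1 + 9 = 20 \\
b &= \frac{112}{20} = 5.6, \qquad
a = 71.5 - 5.6 \times 5 = 43.5 \\
\hat{Y} &= 43.5 + 5.6\,X \\
R^2 &= 1 - \frac{\sum (Y_i - \hat{Y}_i)^2}{\sum (Y_i - \bar{Y})^2}
     = 1 - \frac{1.8}{629} \approx 0.997
\end{align*}
```

For this dataset, each additional study hour corresponds to a predicted score 5.6 points higher. The near-perfect R² reflects how closely four points happen to fall on a line. It is not strong evidence about students in general.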

Common Mistakes

Frequent errors include:

  • Assuming causation from regression
  • Ignoring assumption diagnostics
  • Overinterpreting R²

Regression describes relationships; it does not establish causality.


Quiz 1

What is the purpose of linear regression?

To predict or explain a dependent variable.


Quiz 2

What does the regression coefficient represent?

Change in Y for a one-unit change in X.


Quiz 3

What does R² indicate?

Proportion of variance explained by the model.


Quiz 4

Which SPSS menu is used for regression?

Analyze → Regression → Linear.


Quiz 5

Does regression prove causation?

No.


Mini Practice

Collect data on:

  • Advertising spend
  • Monthly sales

Run a linear regression (Analyze → Regression → Linear) to predict monthly sales from advertising spend, then interpret R² and the coefficients.
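The same exercise can be run from syntax. The sketch below assumes the two columns have been entered as variables named Ad_Spend and Sales. Both names are hypothetical and should be changed to match the actual dataset.

```spss
* Hypothetical variable names: Ad_Spend is the predictor and Sales is the outcome.
REGRESSION
  /STATISTICS COEFF R ANOVA
  /DEPENDENT Sales
  /METHOD=ENTER Ad_Spend.
```

In the Coefficients table, the B value for Ad_Spend is the predicted change in monthly sales for a one-unit increase in advertising spend, in whatever units the data were recorded.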


What’s Next

In the next lesson, you will learn about Multiple Linear Regression, which uses more than one predictor variable.