Statistics Lesson 32 – Linear Regression | Dataplexa

Simple Linear Regression

In the previous lesson, we learned how correlation measures the strength and direction of a relationship.

Correlation tells us how closely two variables move together, but it does not give us an equation for predicting one variable from the other.

Simple linear regression takes the next step: it builds a mathematical model to describe and predict relationships.


What Is Simple Linear Regression?

Simple linear regression models the relationship between:

  • One independent variable (X)
  • One dependent variable (Y)

The goal is to explain how changes in X are associated with changes in Y.


The Regression Equation

The simple linear regression model is written as:

Y = a + bX

  • a = intercept
  • b = slope
  • X = independent variable
  • Y = predicted dependent variable

Understanding the Intercept (a)

The intercept represents the predicted value of Y when X equals zero.

In some contexts, this value has a real meaning. In others, it is simply a mathematical starting point.


Understanding the Slope (b)

The slope tells us how much the predicted value of Y changes for a one-unit increase in X.

If b is positive, Y increases as X increases. If b is negative, Y decreases as X increases.


Real-World Interpretation

Suppose we model the relationship between:

  • X = hours studied
  • Y = exam score

If the regression equation is:

Y = 40 + 5X

This means:

  • Each additional hour of study is associated with a 5-point increase in the expected score
  • A student who studies 0 hours is predicted to score 40
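
The interpretation above can be checked with a few lines of Python. This is a minimal sketch: the function name `predict_score` and the constant names are illustrative choices, and the coefficients are the ones from the example equation.

```python
# Coefficients from the example equation Y = 40 + 5X.
INTERCEPT = 40  # predicted score at 0 hours of study
SLOPE = 5       # change in predicted score per additional hour


def predict_score(hours):
    """Return the predicted exam score for a given number of study hours."""
    return INTERCEPT + SLOPE * hours


# The intercept is the prediction at X = 0.
print(predict_score(0))  # 40

# The slope is the difference between predictions one hour apart.
print(predict_score(3) - predict_score(2))  # 5
```

Whichever pair of hours is compared, predictions one hour apart always differ by the slope, because the model is a straight line.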

How the Best-Fit Line Is Chosen

The regression line is chosen using the least squares method.

This method minimizes the sum of the squared vertical distances between observed values and predicted values.

In simple terms, it finds the line that best fits the data.
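
Written out, the least squares criterion and its standard closed-form solution look like this. The notation is introduced here for illustration and is not part of the lesson's own: n data points (x_i, y_i), with sample means \bar{x} and \bar{y}.

```latex
\text{Minimize}\quad \mathrm{SSE}(a, b) = \sum_{i=1}^{n} \bigl(y_i - (a + b x_i)\bigr)^2

b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}
```

The second formula shows that the fitted line always passes through the point of means (\bar{x}, \bar{y}).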


Numerical Example

Consider the following data:

Hours Studied (X)    Score (Y)
2                    50
4                    60
6                    72
8                    85

The least squares line for this data is:

Y = 37.5 + 5.85X

Rounding the coefficients gives approximately Y = 38 + 6X.

If a student studies 5 hours:

Predicted score = 37.5 + 5.85(5) = 66.75
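
The coefficients can be computed directly from the four data points using the standard least squares formulas. The sketch below uses plain Python; the helper name `fit_line` is an illustrative choice. The exact fit has a slope of 5.85 and an intercept of 37.5, which round to the 6 and 38 of Y = 38 + 6X.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # Intercept: chosen so the line passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


hours = [2, 4, 6, 8]
scores = [50, 60, 72, 85]

a, b = fit_line(hours, scores)
print(f"intercept = {a:.2f}, slope = {b:.2f}")     # intercept = 37.50, slope = 5.85
print(f"prediction at 5 hours = {a + b * 5:.2f}")  # prediction at 5 hours = 66.75
```

With the rounded coefficients the same prediction comes out as 38 + 6(5) = 68, so rounding before predicting shifts the result by a little over one point here.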


Regression vs Correlation

Aspect          Correlation                          Regression
Purpose         Measure strength and direction       Model and predict
Variable roles  Symmetric (X and Y interchangeable)  Asymmetric (X predicts Y)
Prediction      No                                   Yes
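
One way to see the difference in how the two methods treat X and Y is to swap the variables. The sketch below reuses the study-hours data from the numerical example; the helper names are illustrative, and the Pearson correlation coefficient is assumed to be the correlation measure meant here.

```python
from math import sqrt


def deviation_sums(xs, ys):
    """Return (sxx, syy, sxy): sums of squared and cross deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def correlation(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    sxx, syy, sxy = deviation_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def slope(xs, ys):
    """Least squares slope for predicting ys from xs."""
    sxx, _, sxy = deviation_sums(xs, ys)
    return sxy / sxx


hours = [2, 4, 6, 8]
scores = [50, 60, 72, 85]

# Correlation gives the same value whichever variable comes first.
print(round(correlation(hours, scores), 4), round(correlation(scores, hours), 4))

# The regression slope depends on which variable is being predicted.
print(round(slope(hours, scores), 2), round(slope(scores, hours), 2))  # 5.85 0.17
```

Swapping the variables leaves the correlation unchanged but produces a different regression line, which is why regression requires deciding which variable is the predictor.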

Limitations of Simple Linear Regression

  • Only models linear relationships
  • Sensitive to outliers
  • Does not imply causation
  • Requires careful interpretation, especially when predicting outside the observed range of X
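
The sensitivity to outliers can be illustrated with the study-hours data from the numerical example. In the sketch below, the fifth data point (10 hours, score 40) is a hypothetical outlier invented for this illustration, and `fit_line` is an illustrative helper name.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


hours = [2, 4, 6, 8]
scores = [50, 60, 72, 85]

# Hypothetical outlier, invented for illustration: 10 hours studied, score of 40.
hours_with_outlier = hours + [10]
scores_with_outlier = scores + [40]

_, slope_clean = fit_line(hours, scores)
_, slope_outlier = fit_line(hours_with_outlier, scores_with_outlier)

print(f"slope without outlier: {slope_clean:.2f}")    # slope without outlier: 5.85
print(f"slope with outlier:    {slope_outlier:.2f}")  # slope with outlier:    0.25
```

A single unusual point pulls the slope from 5.85 down to 0.25, because squaring the distances gives large deviations a disproportionate weight in the fit.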

Quick Check

What does the slope represent in a regression model?


Practice Quiz

Question 1:
What is the purpose of simple linear regression?


Question 2:
If b = −3, what does this indicate?


Question 3:
Does regression prove causation?


Mini Practice

A company models advertising spend (X) and sales revenue (Y) using the equation:

Y = 10,000 + 2,000X

  • What does the slope mean?
  • Predict revenue when X = 3

What’s Next

In the next lesson, we will learn how to interpret regression output, including coefficients and R-squared.