Statistics Lesson 20 – Scatterplots | Dataplexa

Scatterplots and Correlation

So far, we have focused on understanding and summarizing single variables. In many real-world problems, however, we want to understand relationships between two variables.

Scatterplots and correlation help us explore how variables move together.


What Is a Scatterplot?

A scatterplot displays the relationship between two numerical variables. Each point on the plot represents one observation.

One variable is plotted on the horizontal axis (X-axis), and the other on the vertical axis (Y-axis).


Simple Example

Suppose we record the number of hours studied and the exam score for each of four students.

Hours Studied    Exam Score
2                55
4                65
6                75
8                85

A scatterplot of this data would show points rising from left to right.
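
To make the rising pattern concrete, here is a minimal sketch that draws the table above as a text-only scatterplot. The helper name `ascii_scatter` and the grid size are assumptions made for illustration. In practice a plotting library would normally be used. The sketch also assumes the x values are not all equal, and likewise the y values.

```python
# Data from the table above: (hours studied, exam score).
hours = [2, 4, 6, 8]
scores = [55, 65, 75, 85]

def ascii_scatter(xs, ys, width=7, height=4):
    """Return text rows with '*' marking each (x, y) point (hypothetical helper)."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top of the plot
    return ["".join(cells) for cells in grid]

rows = ascii_scatter(hours, scores)
for line in rows:
    print("|" + line)      # vertical axis: exam score
print("+" + "-" * 7)       # horizontal axis: hours studied
```

Each student becomes one `*`, and the marks climb from the bottom-left corner to the top-right corner.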


Patterns in Scatterplots

When analyzing a scatterplot, we usually look for:

  • Direction – upward, downward, or no pattern
  • Strength – how closely points follow a pattern
  • Form – linear or curved
  • Outliers – unusual points

Positive and Negative Relationships

If points tend to rise as we move from left to right, the relationship is positive.

If points tend to fall as we move from left to right, the relationship is negative.


Real-World Examples

  • Positive: Hours studied vs exam score
  • Negative: Speed vs travel time for a fixed distance
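
The negative example can be checked with a few lines of arithmetic, since travel time equals distance divided by speed. In this sketch the distance of 120 km and the list of speeds are hypothetical values chosen only for illustration.

```python
distance_km = 120                  # assumed fixed distance (illustrative)
speeds_kmh = [40, 60, 80, 120]     # hypothetical speeds
travel_hours = [distance_km / s for s in speeds_kmh]

for speed, hours in zip(speeds_kmh, travel_hours):
    print(f"{speed} km/h -> {hours} h")

# True when every increase in speed gives a shorter travel time.
is_negative = all(
    later < earlier for earlier, later in zip(travel_hours, travel_hours[1:])
)
print(is_negative)
```

The times fall as speed rises, so the relationship is negative. Equal steps in speed do not remove equal amounts of time, so the form is curved rather than linear.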

What Is Correlation?

Correlation measures the strength and direction of the linear (straight-line) relationship between two numerical variables.

The correlation coefficient is usually denoted by r.

Its value always lies between −1 and +1:

  • −1 → Perfect negative correlation
  • 0 → No linear correlation
  • +1 → Perfect positive correlation
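
The lesson does not say how r is calculated. The sketch below assumes r is the Pearson correlation coefficient, the most common choice, and computes it from deviations around each mean. The function name `pearson_r` is an illustrative assumption. It is applied to the study-hours table from earlier in this lesson, whose points lie exactly on a straight line.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of summed squared deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [2, 4, 6, 8]
scores = [55, 65, 75, 85]

r_up = pearson_r(hours, scores)                    # rising straight line
r_down = pearson_r(hours, list(reversed(scores)))  # same scores, falling line
print(r_up, r_down)
```

Because the points fall exactly on a line, the two results sit at the extremes of the scale: +1 for the rising line and −1 for the falling one. Real data rarely lines up this neatly, so values strictly between −1 and +1 are the norm.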

Interpreting Correlation Values

Correlation (r)    Interpretation
Close to +1        Strong positive relationship
Close to −1        Strong negative relationship
Close to 0         Weak or no linear relationship
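
A value of r close to 0 rules out a straight-line pattern, not every pattern. The sketch below uses made-up points that follow the curve y = x² exactly, and again assumes r is the Pearson coefficient (the helper `pearson_r` is illustrative).

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]       # hypothetical values, symmetric around 0
ys = [x * x for x in xs]     # y depends on x exactly: 4, 1, 0, 1, 4

r_curve = pearson_r(xs, ys)
print(r_curve)
```

Here y is completely determined by x, yet the falling left half and the rising right half cancel out in the calculation. This is one reason to look at the scatterplot and not rely on r alone.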

Numerical Example

If the correlation between study hours and exam scores is 0.85, it indicates a strong positive relationship.

This means that as study hours increase, exam scores tend to increase as well.


Correlation Does Not Imply Causation

A strong correlation does not mean that one variable causes the other.

Classic Example

There may be a strong correlation between ice cream sales and sunglasses sales.

This does not mean that buying ice cream causes people to buy sunglasses. Both are driven by a common underlying factor: sunny weather.
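
A small simulation can show how a shared factor produces this kind of correlation. Everything in this sketch is synthetic: the temperature range, the multipliers, and the noise levels are assumptions chosen for illustration, not real sales figures. Neither sales series is computed from the other, only from temperature.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

random.seed(0)  # fixed seed so the sketch gives the same numbers each run

# Synthetic daily temperatures (the shared underlying factor).
temps = [random.uniform(10, 35) for _ in range(200)]

# Each sales series depends on temperature plus its own random noise.
ice_cream = [5 * t + random.gauss(0, 10) for t in temps]
sunglasses = [3 * t + random.gauss(0, 10) for t in temps]

r_sales = pearson_r(ice_cream, sunglasses)
print(round(r_sales, 2))
```

The two series come out strongly correlated even though, by construction, neither one influences the other.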


When Scatterplots Are Useful

  • Exploring relationships between variables
  • Detecting trends
  • Identifying outliers
  • Checking assumptions before regression
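
Spotting outliers matters because a single unusual point can change a correlation dramatically. The sketch below reuses the study-hours table from earlier in this lesson and adds one hypothetical fifth student (10 hours, score 40). That point is invented for illustration, and r is again assumed to be the Pearson coefficient.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient (illustrative helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [2, 4, 6, 8]
scores = [55, 65, 75, 85]
r_without = pearson_r(hours, scores)

# Hypothetical outlier: a student who studied 10 hours but scored 40.
r_with = pearson_r(hours + [10], scores + [40])

print(round(r_without, 2), round(r_with, 2))
```

With only five points, the one outlier is enough to pull r from a perfect +1 down to slightly below 0. On a scatterplot the same point would stand out immediately, far below the line formed by the others.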

Quick Check

What does a correlation value close to 0 indicate?


Practice Quiz

Question 1:
Which plot is best for showing relationships between two numerical variables?


Question 2:
If r = −0.9, what kind of relationship exists?


Question 3:
Does correlation always indicate causation?


Mini Practice

A researcher studies the relationship between exercise time and resting heart rate.

  • What type of relationship would you expect?
  • Would a scatterplot be useful?

What’s Next

In the next lesson, we will explore Sampling Distributions and the Central Limit Theorem, which explain why sample statistics behave predictably.