SPSS Lesson 14 – Boxplots and Scatterplots | Dataplexa

Boxplots and Scatterplots

While bar charts, pie charts, and histograms help summarize data, some analytical questions require deeper insight into data spread, outliers, and relationships.

Boxplots summarize the distribution of a single numerical variable, and scatterplots show the relationship between two numerical variables. Both are widely used in research, quality control, and business analytics.


Understanding Boxplots

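A boxplot (box-and-whisker plot) summarizes the distribution of a numerical variable using five key values:

  • Minimum (the end of the lower whisker; SPSS draws it at the smallest value that is not an outlier)
  • First Quartile (Q1)
  • Median
  • Third Quartile (Q3)
  • Maximum (the end of the upper whisker; SPSS draws it at the largest value that is not an outlier)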

Boxplots are especially useful for:

  • Detecting outliers
  • Comparing distributions across groups
  • Understanding data spread

Example: Salary Distribution

Consider the following salary data:

Employee_ID   Department   Monthly_Salary
1001          IT           52000
1002          IT           58000
1003          HR           42000
1004          Sales        75000

A boxplot shows at a glance whether salary values are spread symmetrically around the median or skewed, and whether any extreme values exist.
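To try this on the four sample rows, the data can be typed in with syntax instead of the Data Editor. This is a minimal sketch: the formats (F4.0, A5, F8.0) are assumptions chosen to fit the sample values, not something the lesson prescribes.

```spss
* Read the four sample rows shown above, one case per line.
DATA LIST LIST
  /Employee_ID (F4.0) Department (A5) Monthly_Salary (F8.0).
BEGIN DATA
1001 IT 52000
1002 IT 58000
1003 HR 42000
1004 Sales 75000
END DATA.

* Declare measurement levels so chart dialogs treat the variables correctly.
VARIABLE LEVEL Monthly_Salary (SCALE) /Department (NOMINAL).
```

With only four cases the resulting boxplot is just an illustration; a real analysis would use many more rows.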


* Boxplot of Monthly_Salary only; STATISTICS=NONE suppresses the Descriptives table.
EXAMINE VARIABLES=Monthly_Salary
  /PLOT=BOXPLOT
  /STATISTICS=NONE.
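To compare distributions across groups, EXAMINE accepts a factor variable after BY. The sketch below assumes Department is present in the active dataset; NOTOTAL suppresses the extra output for all cases combined.

```spss
* One box per department, drawn on a shared salary axis.
EXAMINE VARIABLES=Monthly_Salary BY Department
  /PLOT=BOXPLOT
  /STATISTICS=NONE
  /NOTOTAL.
```

Each group needs several cases for its box to be informative; a department with a single employee cannot show any spread.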

Interpreting a Boxplot

Key interpretation points:

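  • The box spans Q1 to Q3, the middle 50% of the data (the interquartile range)
  • The line inside the box represents the median
  • Points beyond the whiskers are flagged as outliers; SPSS marks values more than 1.5 box lengths from the edge of the box with a circle and values more than 3 box lengths away with an asterisk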

Outliers should be investigated, not automatically removed. They may represent valid but rare observations.
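One way to investigate flagged values is to have SPSS identify them by case. The sketch below assumes Employee_ID is available as an identifier; the ID subcommand labels outliers in the plot with that variable, and EXTREME(5) lists the five highest and five lowest values (five is the default and can be lowered for small datasets).

```spss
* Label outliers by employee and list the most extreme salaries.
EXAMINE VARIABLES=Monthly_Salary
  /ID=Employee_ID
  /PLOT=BOXPLOT
  /STATISTICS=EXTREME(5).
```

The listed cases can then be checked against the source records before deciding whether to keep, correct, or exclude them.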


Understanding Scatterplots

Scatterplots visualize the relationship between two numerical variables. Each point represents one observation.

Scatterplots help answer questions like:

  • Does salary increase with experience?
  • Is there a relationship between study time and exam score?

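The pattern of points indicates the direction (upward or downward), the form (linear or curved), and the strength (how tightly the points cluster around the trend) of a relationship.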


Example: Experience vs Salary

Experience_Years   Monthly_Salary
1                  35000
3                  42000
5                  55000
8                  70000

* Experience_Years goes on the horizontal axis, Monthly_Salary on the vertical axis.
GRAPH
  /SCATTERPLOT(BIVAR)=Experience_Years WITH Monthly_Salary.

Interpreting Scatterplots

Key patterns to look for:

  • Positive relationship – as one variable increases, the other tends to increase
  • Negative relationship – as one variable increases, the other tends to decrease
  • No relationship – points are scattered with no visible trend

Scatterplots are often used before correlation or regression analysis.
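Once the scatterplot suggests a roughly linear pattern, the follow-up analysis could look like the sketch below. It assumes the same two variables and uses default options only; which statistics to request depends on the analysis and is outside the scope of this lesson.

```spss
* Pearson correlation between experience and salary.
CORRELATIONS
  /VARIABLES=Experience_Years Monthly_Salary.

* Simple linear regression predicting salary from experience.
REGRESSION
  /DEPENDENT Monthly_Salary
  /METHOD=ENTER Experience_Years.
```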


Common Mistakes

Beginners often make these mistakes:

  • Using scatterplots for categorical data
  • Ignoring outliers in boxplots
  • Assuming causation from correlation

Correct interpretation is essential to avoid misleading conclusions.


Quiz 1

What does a boxplot primarily show?

Data distribution and outliers.


Quiz 2

Which variables are suitable for scatterplots?

Two numerical variables.


Quiz 3

What does an outlier represent?

An unusually high or low value.


Quiz 4

What pattern indicates a positive relationship?

Points trending upward from left to right.


Quiz 5

Why are scatterplots used before regression?

To visually assess relationships between variables.


Mini Practice

Create a dataset with:

  • Experience_Years
  • Monthly_Salary

Perform:

  • A boxplot for Monthly_Salary
  • A scatterplot between Experience_Years and Monthly_Salary

Use Analyze → Descriptive Statistics → Explore for boxplots and Graphs → Chart Builder for scatterplots.
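For those who prefer syntax over the menus, the practice can be sketched as below. The values are the four rows from the Experience vs Salary example, used only as placeholder data, and the F formats are assumptions.

```spss
* Enter the practice data, one case per line.
DATA LIST LIST /Experience_Years (F2.0) Monthly_Salary (F8.0).
BEGIN DATA
1 35000
3 42000
5 55000
8 70000
END DATA.

* Boxplot for Monthly_Salary.
EXAMINE VARIABLES=Monthly_Salary
  /PLOT=BOXPLOT
  /STATISTICS=NONE.

* Scatterplot of experience (horizontal axis) against salary (vertical axis).
GRAPH
  /SCATTERPLOT(BIVAR)=Experience_Years WITH Monthly_Salary.
```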


What’s Next

In the next lesson, you will learn how to save, export, and manage SPSS output, which is essential for reporting and documentation.