Histograms and Boxplots
Bar charts and pie charts help us compare categories. But when data is numerical and continuous, we need different tools to understand its distribution.
Histograms and boxplots are two common visualizations for this purpose: a histogram shows the shape of a distribution, and a boxplot summarizes its center and spread.
What Is a Histogram?
A histogram shows the distribution of numerical data by grouping values into intervals called bins.
Each bar represents the frequency of values within a specific range.
Unlike bar charts, histograms:
- Are used for numerical, continuous data rather than categories
- Have bars that touch each other, because adjacent bins cover one continuous range of values
Example (Histogram Data)
Suppose we record exam scores for 30 students. The scores are grouped as follows:
| Score Range | Number of Students |
|---|---|
| 40 – 49 | 3 |
| 50 – 59 | 6 |
| 60 – 69 | 10 |
| 70 – 79 | 7 |
| 80 – 89 | 4 |
A histogram of this table makes it easy to see that most students scored between 60 and 69, with fewer students toward either end of the range.
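As a rough illustration, the frequency table above can be drawn as a text histogram with plain Python. This is only a sketch: the names `score_bins` and `text_histogram` are made up for this example, and the counts are copied directly from the table.

```python
# Frequency table copied from the exam-score example above.
score_bins = {
    "40-49": 3,
    "50-59": 6,
    "60-69": 10,
    "70-79": 7,
    "80-89": 4,
}


def text_histogram(bins):
    """Return one line per bin, with one '#' per student in that bin."""
    return [f"{label} | {'#' * count} ({count})" for label, count in bins.items()]


total_students = sum(score_bins.values())
# The bin with the highest count is where most students scored.
modal_bin = max(score_bins, key=score_bins.get)

for line in text_histogram(score_bins):
    print(line)
print("Total students:", total_students)
print("Most common range:", modal_bin)
```

The longest row of `#` marks corresponds to the tallest bar in a drawn histogram.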
Why Histograms Are Useful
- Show the shape of the data distribution
- Reveal skewness (a longer tail on one side)
- Help identify gaps or clusters
- Help check whether the data looks roughly normal (bell-shaped)
What Is a Boxplot?
A boxplot (or box-and-whisker plot) summarizes data using five key values:
- Minimum
- First Quartile (Q1)
- Median (Q2)
- Third Quartile (Q3)
- Maximum
Boxplots provide a compact visual summary of data spread and central tendency.
Boxplot Components
| Component | Meaning |
|---|---|
| Box | Middle 50% of the data (Q1 to Q3) |
| Line inside box | Median |
| Whiskers | Range of the data outside the box, excluding outliers |
| Outliers | Extreme values beyond the whiskers, usually drawn as individual points |
Numerical Example (Boxplot)
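Consider the dataset (eight values, already sorted):
10, 12, 15, 18, 20, 22, 25, 30
Taking the median of the whole dataset, and then the median of each half, gives:
- Minimum = 10
- Q1 = 13.5 (median of 10, 12, 15, 18)
- Median = 19 (average of 18 and 20)
- Q3 = 23.5 (median of 20, 22, 25, 30)
- Maximum = 30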
A boxplot would visually show:
- Center of the data
- Spread of values
- Any extreme values
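The values in this example can be reproduced with a short sketch. It assumes the median-of-halves rule for quartiles, which matches the numbers above; other quartile rules, including the defaults in some software, can give slightly different Q1 and Q3. The helper name `five_number_summary` is hypothetical, and the 1.5 × IQR cutoff for outliers is a common convention assumed here, not something the example specifies.

```python
from statistics import median


def five_number_summary(values):
    """Five-number summary using the median-of-halves quartile rule.

    Assumes at least two values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above it (middle value skipped when n is odd)
    return {
        "min": data[0],
        "q1": median(lower),
        "median": median(data),
        "q3": median(upper),
        "max": data[-1],
    }


data = [10, 12, 15, 18, 20, 22, 25, 30]
summary = five_number_summary(data)

# Interquartile range: the width of the box.
iqr = summary["q3"] - summary["q1"]

# Assumed convention: points more than 1.5 * IQR beyond the box are outliers.
lower_fence = summary["q1"] - 1.5 * iqr
upper_fence = summary["q3"] + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(summary)
print("IQR:", iqr)
print("Outliers:", outliers)
```

Under that assumed cutoff, no value in this dataset falls outside the fences, so the whiskers would reach all the way to 10 and 30.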
Histogram vs Boxplot
| Aspect | Histogram | Boxplot |
|---|---|---|
| Shows distribution shape | Yes | Only roughly (skew, but not peaks or gaps) |
| Shows median | No | Yes |
| Identifies outliers | Sometimes | Clearly |
| Best for | Understanding frequency | Comparing datasets |
Real-World Example
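In salary analysis:
- A histogram shows how salaries are distributed across pay ranges
- A boxplot highlights median pay, the spread of the middle 50% of salaries, and unusually high or low salaries
Together, they give a more complete picture than either plot alone.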
Common Mistakes
- Using too few bins (which hides the shape) or too many bins (which adds noise) in histograms
- Ignoring outliers in boxplots
- Comparing histograms that use different bin widths
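To see why bin choice matters, the sketch below bins the dataset from the boxplot example (10, 12, ..., 30) with three different widths. The helper `bin_counts` and the widths 2, 10, and 40 are illustrative assumptions, chosen only to show the effect.

```python
from collections import Counter


def bin_counts(values, width, start=0):
    """Count values per bin [start + k*width, start + (k+1)*width).

    Empty bins are omitted from the result.
    """
    counts = Counter((v - start) // width for v in values)
    return {
        (start + k * width, start + (k + 1) * width): counts[k]
        for k in sorted(counts)
    }


data = [10, 12, 15, 18, 20, 22, 25, 30]

narrow = bin_counts(data, width=2)    # too many bins: every value sits alone
medium = bin_counts(data, width=10)   # a few bins: a shape becomes visible
wide = bin_counts(data, width=40)     # too few bins: everything in one bar

for name, result in [("narrow", narrow), ("medium", medium), ("wide", wide)]:
    print(name, result)
```

With a width of 2, each value lands in its own bin, and with a width of 40 all eight values collapse into a single bar, so neither says much about the distribution. The same data can also look quite different at each width, which is why histograms should share bin widths before being compared.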
Quick Check
Which plot is better for identifying outliers?
Boxplot.
Practice Quiz
Question 1:
Which visualization shows the shape of the distribution?
Histogram.
Question 2:
Which plot displays quartiles?
Boxplot.
Mini Practice
You are analyzing test scores for two different classes.
- Which plot would help compare medians?
- Which plot would help see score distribution?
Boxplots help compare medians. Histograms help see distributions.
What’s Next
In the next lesson, we will study Scatterplots and Correlation, which help us understand relationships between variables.