Statistical Functions in NumPy
Statistical analysis is one of the most important uses of NumPy. It allows you to summarize, analyze, and understand large numeric datasets using simple and efficient functions.
In this lesson, you will learn how to calculate common statistical values such as mean, median, variance, standard deviation, minimum, and maximum.
Sample Dataset
We will use a simple numeric dataset throughout this lesson.
import numpy as np
data = np.array([45, 50, 55, 60, 65, 70, 75])
print(data)
This dataset could represent scores, prices, or measurements.
Mean (Average)
The mean is the sum of all values divided by the number of values.
mean_value = np.mean(data)
print(mean_value)
Output:
60.0
This means the average value of the dataset is 60.
Median
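The median is the middle value when the data is sorted. If the dataset has an even number of values, np.median returns the average of the two middle values.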
median_value = np.median(data)
print(median_value)
Output:
60.0
The median is less affected by extreme values than the mean.
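To make the effect of extreme values concrete, here is a small sketch. The value 500 is a hypothetical outlier added only for illustration and is not part of the lesson's dataset.

```python
import numpy as np

data = np.array([45, 50, 55, 60, 65, 70, 75])

# Hypothetical outlier appended for illustration only.
with_outlier = np.append(data, 500)

# Without the outlier, mean and median agree.
print(np.mean(data), np.median(data))                  # 60.0 60.0

# With the outlier, the mean jumps while the median barely moves.
print(np.mean(with_outlier), np.median(with_outlier))  # 115.0 62.5
```

A single extreme value nearly doubles the mean, while the median shifts only from 60 to 62.5.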
Minimum and Maximum
To find the smallest and largest values:
min_value = np.min(data)
max_value = np.max(data)
print(min_value)
print(max_value)
Output:
45
75
These values define the range of the dataset.
Range
The range is the difference between the maximum and minimum values.
range_value = np.max(data) - np.min(data)
print(range_value)
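Output:
30
The range gives a quick sense of how spread out the values are, but it depends only on the two most extreme values, so a single outlier can change it a lot.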
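As an alternative, NumPy also provides np.ptp ("peak to peak"), which computes the same difference in one call. The sketch below assumes the same dataset as above.

```python
import numpy as np

data = np.array([45, 50, 55, 60, 65, 70, 75])

# np.ptp returns max - min in a single call.
range_value = np.ptp(data)
print(range_value)  # 30

# It matches the manual calculation.
print(range_value == np.max(data) - np.min(data))  # True
```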
Variance
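Variance measures how far values are from the mean: it is the average of the squared differences between each value and the mean. By default, np.var computes the population variance (it divides by the number of values); pass ddof=1 to get the sample variance, which divides by one less than the number of values.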
variance = np.var(data)
print(variance)
Output:
100.0
A higher variance means more spread in the data.
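The following sketch computes the variance step by step to show what np.var does, and compares the default population variance with the sample variance (ddof=1). The variable names are chosen here for illustration.

```python
import numpy as np

data = np.array([45, 50, 55, 60, 65, 70, 75])

# Step 1: difference of each value from the mean.
deviations = data - np.mean(data)

# Step 2: average of the squared differences (population variance).
population_var = np.mean(deviations ** 2)
print(population_var)  # 100.0

# np.var with its default ddof=0 gives the same result.
print(np.isclose(population_var, np.var(data)))  # True

# Sample variance divides by n - 1 instead of n (about 116.67 here).
sample_var = np.var(data, ddof=1)
print(sample_var)
```

Use ddof=1 when the data is a sample drawn from a larger population and you want to estimate that population's variance.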
Standard Deviation
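The standard deviation is the square root of the variance. Because it is expressed in the same units as the data, it is usually easier to interpret than the variance. Like np.var, np.std computes the population value by default (ddof=0).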
std_dev = np.std(data)
print(std_dev)
Output:
10.0
This means values typically differ from the mean by about 10 units.
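As a quick check of this interpretation, the sketch below confirms that the standard deviation is the square root of the variance and then lists the values that lie within one standard deviation of the mean. The name within_one_std is introduced here for illustration.

```python
import numpy as np

data = np.array([45, 50, 55, 60, 65, 70, 75])

mean_value = np.mean(data)
std_dev = np.std(data)

# Standard deviation equals the square root of the variance.
print(np.isclose(std_dev, np.sqrt(np.var(data))))  # True

# Boolean mask: values no more than one standard deviation from the mean.
within_one_std = np.abs(data - mean_value) <= std_dev
print(data[within_one_std])  # [50 55 60 65 70]
```

Five of the seven values fall between 50 and 70, which is the mean plus or minus 10.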
Sum and Count
You can also calculate the total of all values and the number of elements.
total = np.sum(data)
count = np.size(data)
print(total)
print(count)
Output:
420
7
These values are often used in reporting and analytics.
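The sum and count are also the building blocks of the mean. A minimal sketch, assuming the same dataset, shows that dividing one by the other reproduces np.mean:

```python
import numpy as np

data = np.array([45, 50, 55, 60, 65, 70, 75])

total = np.sum(data)
count = np.size(data)

# Mean computed by hand from the sum and the count.
manual_mean = total / count
print(manual_mean)                   # 60.0
print(manual_mean == np.mean(data))  # True

# For a one-dimensional array, data.size and len(data) give the same count.
print(data.size, len(data))          # 7 7
```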
Real-World Example
Imagine these values represent daily sales amounts:
- Mean → average daily sales
- Max → best sales day
- Min → lowest sales day
- Standard deviation → consistency of sales
Statistical functions help you summarize data in just a few lines of code.
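The summary described above could be sketched as follows. The array daily_sales simply reuses the lesson's dataset as if it were sales figures, and the dictionary layout is one possible choice, not a required format.

```python
import numpy as np

# The lesson's dataset, interpreted here as daily sales amounts.
daily_sales = np.array([45, 50, 55, 60, 65, 70, 75])

# Collect the summary statistics in a dictionary for easy reporting.
summary = {
    "mean": float(np.mean(daily_sales)),  # average daily sales
    "max": int(np.max(daily_sales)),      # best sales day
    "min": int(np.min(daily_sales)),      # lowest sales day
    "std": float(np.std(daily_sales)),    # consistency of sales
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

A smaller standard deviation relative to the mean suggests more consistent sales from day to day.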
Practice Exercise
Task
- Create a NumPy array with 10 numbers
- Calculate mean, median, variance, and standard deviation
- Print minimum and maximum values
What’s Next?
In the next lesson, you will learn about Sorting and Searching in NumPy, which helps organize and locate values efficiently.