NumPy Lesson 15 – Statistical Functions | Dataplexa

Statistical Functions in NumPy

Statistical analysis is one of the most common uses of NumPy. Its built-in functions let you summarize, analyze, and understand large numeric datasets with short, efficient code.

In this lesson, you will learn how to calculate common statistical values such as mean, median, variance, standard deviation, minimum, and maximum.


Sample Dataset

We will use a simple numeric dataset throughout this lesson.

import numpy as np

data = np.array([45, 50, 55, 60, 65, 70, 75])
print(data)

This dataset could represent scores, prices, or measurements.


Mean (Average)

The mean is the average of all values: the sum of the values divided by the number of values.

mean_value = np.mean(data)
print(mean_value)

Output:

60.0

This means the average value of the dataset is 60.
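
As a small sketch of what np.mean computes, the same result can be reproduced by dividing the sum by the number of elements. The name manual_mean is an illustrative choice and not part of the lesson's code.

```python
import numpy as np

data = np.array([45, 50, 55, 60, 65, 70, 75])

# Mean by hand: total of all values divided by how many values there are
manual_mean = data.sum() / data.size

print(manual_mean)     # 60.0
print(np.mean(data))   # 60.0
```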


Median

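The median is the middle value when the data is sorted. If the dataset has an even number of values, the median is the average of the two middle values.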

median_value = np.median(data)
print(median_value)

Output:

60.0

The median is less affected by extreme values than the mean.
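
To illustrate this, the sketch below appends one extreme value to the sample dataset and compares both measures. The value 500 and the name data_with_outlier are assumptions made up for this example.

```python
import numpy as np

data = np.array([45, 50, 55, 60, 65, 70, 75])

# Add a single extreme value (hypothetical outlier) to the dataset
data_with_outlier = np.append(data, 500)

# The mean is pulled strongly toward the outlier
print(np.mean(data_with_outlier))    # 115.0

# The median moves only slightly (average of the two middle values, 60 and 65)
print(np.median(data_with_outlier))  # 62.5
```

One extreme value nearly doubles the mean, while the median shifts from 60.0 to 62.5.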


Minimum and Maximum

To find the smallest and largest values:

min_value = np.min(data)
max_value = np.max(data)

print(min_value)
print(max_value)

Output:

45
75

These values define the range of the dataset.


Range

The range is the difference between the maximum and minimum values.

range_value = np.max(data) - np.min(data)
print(range_value)

Output:

30

This shows how spread out the values are: the largest and smallest values are 30 units apart.
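
NumPy also provides np.ptp ("peak to peak"), which returns the same maximum-minus-minimum difference in one call. A minimal sketch comparing the two approaches:

```python
import numpy as np

data = np.array([45, 50, 55, 60, 65, 70, 75])

# Range computed manually, as in the lesson
manual_range = np.max(data) - np.min(data)

# Range computed with np.ptp (peak to peak)
ptp_range = np.ptp(data)

print(manual_range)  # 30
print(ptp_range)     # 30
```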


Variance

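Variance measures how far values are from the mean. It is the average of the squared differences between each value and the mean. By default, np.var computes the population variance, dividing by the number of values.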

variance = np.var(data)
print(variance)

Output:

100.0

A higher variance means more spread in the data.
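
By default, np.var divides by the number of values N (population variance). If the data is a sample drawn from a larger population, the sample variance, which divides by N - 1, is often preferred; NumPy exposes this through the ddof parameter. The sketch below shows both, with variable names chosen for illustration.

```python
import numpy as np

data = np.array([45, 50, 55, 60, 65, 70, 75])

# Population variance: sum of squared deviations divided by N (the default, ddof=0)
population_var = np.var(data)

# Sample variance: sum of squared deviations divided by N - 1 (ddof=1)
sample_var = np.var(data, ddof=1)

print(population_var)  # 100.0
print(sample_var)      # about 116.67
```

The same ddof parameter is accepted by np.std.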


Standard Deviation

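Standard deviation is the square root of variance. Because it is expressed in the same units as the data, it is usually easier to interpret than variance and is commonly used in real-world analysis.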

std_dev = np.std(data)
print(std_dev)

Output:

10.0

This means values typically differ from the mean by about 10 units.
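
As a quick check of the relationship between the two measures, the sketch below confirms that np.std matches the square root of np.var, and then selects the values that lie within one standard deviation of the mean. The name within_one_std is illustrative only.

```python
import numpy as np

data = np.array([45, 50, 55, 60, 65, 70, 75])

# Standard deviation is the square root of the variance
std_from_var = np.sqrt(np.var(data))
print(np.isclose(std_from_var, np.std(data)))  # True

# Values whose distance from the mean is at most one standard deviation
within_one_std = data[np.abs(data - np.mean(data)) <= np.std(data)]
print(within_one_std)  # [50 55 60 65 70]
```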


Sum and Count

You can also calculate the total and the number of elements.

total = np.sum(data)
count = np.size(data)

print(total)
print(count)

Output:

420
7

These values are often used in reporting and analytics.


Real-World Example

Imagine these values represent daily sales amounts:

  • Mean → average daily sales
  • Max → best sales day
  • Min → lowest sales day
  • Standard deviation → consistency of sales

Statistical functions help you summarize data in just a few lines of code.
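
The summary described above could be sketched as follows. The daily_sales values simply reuse the lesson's sample dataset as hypothetical sales amounts, and the dictionary labels are illustrative.

```python
import numpy as np

# Hypothetical daily sales amounts (reusing the lesson's sample values)
daily_sales = np.array([45, 50, 55, 60, 65, 70, 75])

# Collect the statistics from the list above in one place
summary = {
    "average daily sales": np.mean(daily_sales),
    "best sales day": np.max(daily_sales),
    "lowest sales day": np.min(daily_sales),
    "standard deviation": np.std(daily_sales),
}

for label, value in summary.items():
    print(f"{label}: {value}")
```

A smaller standard deviation relative to the mean would suggest more consistent sales from day to day.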


Practice Exercise

Task

  • Create a NumPy array with 10 numbers
  • Calculate mean, median, variance, and standard deviation
  • Print minimum and maximum values

What’s Next?

In the next lesson, you will learn about Sorting and Searching in NumPy, which helps organize and locate values efficiently.