Time Series Lesson 8 – Rolling Stats | Dataplexa

Rolling Statistics in Time Series

In the previous lesson, we learned how resampling helps us look at data at different time resolutions.

But sometimes resampling is too aggressive. We don’t want to collapse many observations into a single value per period; we want to smooth the series while keeping every time step.

This is where rolling statistics come in.

Real-World Problem First

Imagine you are monitoring:

Daily website traffic
Daily stock prices
Daily electricity usage

Every day has ups and downs.

If your manager asks:

“Is traffic increasing overall or just fluctuating randomly?”

Looking at raw daily data makes this very hard to answer.

We need a way to see the local trend — not too noisy, not too smooth.

What Are Rolling Statistics?

Rolling statistics compute values over a moving window that slides across time.

At each point, we calculate a statistic using only the most recent N observations (the window).

Common rolling statistics:

Rolling mean (moving average)
Rolling standard deviation
Rolling min / max

They answer the question:

“What has been happening recently?”
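To make the sliding window concrete, here is a minimal sketch on a six-value toy series (the numbers are made up for illustration) using pandas' `Series.rolling`. With a window of 3, each output uses the current value and the two before it, so the first two outputs are NaN because the window is not yet full.

```python
import pandas as pd

# Toy series; the values are invented purely for illustration
s = pd.Series([10, 12, 11, 15, 14, 18])

# Each result uses the current value and the two values before it
roll_mean = s.rolling(window=3).mean()
roll_max = s.rolling(window=3).max()

print(pd.DataFrame({"value": s, "mean_3": roll_mean, "max_3": roll_max}))
```

The third row is the first one with a complete window: its mean is the average of 10, 12 and 11.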

Creating a Realistic Daily Time Series

We will reuse a synthetic daily sales dataset built from an upward trend, a weekly cycle, and random noise. This mimics real business data.

Python: Generate Daily Sales

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(10)

dates = pd.date_range("2023-01-01", periods=365, freq="D")
trend = np.linspace(80, 140, 365)
weekly = 12 * np.sin(2 * np.pi * np.arange(365) / 7)
noise = np.random.normal(0, 6, 365)

sales = trend + weekly + noise
df = pd.DataFrame({"sales": sales}, index=dates)

[Figure: raw daily sales]

Observation:

Lots of short-term noise
Trend exists but is hard to see
Decision-making is difficult
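One way to put numbers on these observations is to compare the typical day-to-day move with the daily increase that comes from the trend alone. The sketch below rebuilds the same synthetic series; the names `typical_daily_move` and `trend_per_day` are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Same synthetic series as in the lesson
np.random.seed(10)
dates = pd.date_range("2023-01-01", periods=365, freq="D")
trend = np.linspace(80, 140, 365)
weekly = 12 * np.sin(2 * np.pi * np.arange(365) / 7)
noise = np.random.normal(0, 6, 365)
df = pd.DataFrame({"sales": trend + weekly + noise}, index=dates)

# Average size of the change from one day to the next in the raw data
typical_daily_move = df["sales"].diff().abs().mean()

# Average daily increase contributed by the trend component alone
trend_per_day = (trend[-1] - trend[0]) / (len(trend) - 1)

print(f"typical day-to-day move: {typical_daily_move:.2f}")
print(f"trend per day:           {trend_per_day:.2f}")
```

The daily moves are many times larger than the daily trend increment, which is why the trend is hard to spot by eye.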

Rolling Mean (Moving Average)

A rolling mean calculates the average of the last N values, including the current one.

For example:

7-day rolling mean → last week
30-day rolling mean → last month

This smooths the data without destroying time structure: the result still has one value per day, on the same index as the original series.

Python: 7-Day Rolling Mean

rolling_7 = df["sales"].rolling(window=7).mean()

[Figure: 7-day rolling mean of daily sales]

What changed?

Noise reduced
Weekly cycle averaged out (each 7-day window spans one full cycle)
Trend easier to see
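Two details of the 7-day rolling mean are worth checking in code. First, pandas returns NaN until a full window is available. Second, because the weekly component in this dataset has a period of exactly 7 days, a 7-day average of that component alone comes out very close to zero. The sketch below rebuilds the lesson's series; `weekly_only` is a helper name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Same synthetic series as in the lesson
np.random.seed(10)
dates = pd.date_range("2023-01-01", periods=365, freq="D")
trend = np.linspace(80, 140, 365)
weekly = 12 * np.sin(2 * np.pi * np.arange(365) / 7)
noise = np.random.normal(0, 6, 365)
df = pd.DataFrame({"sales": trend + weekly + noise}, index=dates)

rolling_7 = df["sales"].rolling(window=7).mean()

# The first window - 1 days have no complete window, so they are NaN
n_missing = rolling_7.isna().sum()
print("leading NaN values:", n_missing)

# Apply the same 7-day mean to the weekly component on its own
weekly_only = pd.Series(weekly, index=dates)
weekly_smoothed = weekly_only.rolling(window=7).mean()
print("largest remaining weekly swing:", weekly_smoothed.abs().max())
```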

Longer Window: 30-Day Rolling Mean

Now let’s smooth the data even more using a 30-day window.

This focuses on medium-term behavior.

Python: 30-Day Rolling Mean

rolling_30 = df["sales"].rolling(window=30).mean()

[Figure: 30-day rolling mean of daily sales]

Notice:

Very smooth curve
Short-term fluctuations largely removed
Good for trend analysis, though a trailing 30-day mean lags the raw data by about two weeks
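"Smoother" can be made measurable. A simple proxy, chosen here as an assumption rather than a standard metric, is the standard deviation of day-to-day changes: the smaller it is, the less the curve jumps around. The sketch below compares the raw series with its 7-day and 30-day rolling means.

```python
import numpy as np
import pandas as pd

# Same synthetic series as in the lesson
np.random.seed(10)
dates = pd.date_range("2023-01-01", periods=365, freq="D")
trend = np.linspace(80, 140, 365)
weekly = 12 * np.sin(2 * np.pi * np.arange(365) / 7)
noise = np.random.normal(0, 6, 365)
df = pd.DataFrame({"sales": trend + weekly + noise}, index=dates)

raw = df["sales"]
rolling_7 = raw.rolling(window=7).mean()
rolling_30 = raw.rolling(window=30).mean()

# Roughness proxy: standard deviation of day-to-day changes (NaN is skipped)
roughness = {
    "raw": raw.diff().std(),
    "7-day": rolling_7.diff().std(),
    "30-day": rolling_30.diff().std(),
}

for name, value in roughness.items():
    print(f"{name:>6}: {value:.2f}")
```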

Rolling Standard Deviation

Rolling mean shows direction.

Rolling standard deviation shows volatility.

It answers:

“How stable is the data recently?”

Python: Rolling Volatility

rolling_std = df["sales"].rolling(window=30).std()

[Figure: 30-day rolling standard deviation of daily sales]

Interpretation:

Higher values → unstable period
Lower values → consistent behavior
Very useful for risk analysis
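The lesson's sales series has the same noise level all year, so it does not show a change in volatility very clearly. As a hedged illustration, the sketch below uses a separate, hypothetical series that is calm for 100 days and then becomes three times as noisy; the rolling standard deviation should rise once the window moves into the noisier stretch.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical series: calm for 100 days, then three times as noisy
calm = rng.normal(0, 1, 100)
turbulent = rng.normal(0, 3, 100)
values = pd.Series(
    np.concatenate([calm, turbulent]),
    index=pd.date_range("2023-01-01", periods=200, freq="D"),
)

rolling_std = values.rolling(window=30).std()

# Last window that lies entirely in the calm stretch vs. the final window
early = rolling_std.iloc[99]
late = rolling_std.iloc[-1]
print(f"volatility at end of calm period: {early:.2f}")
print(f"volatility at end of series:      {late:.2f}")
```

On the sales data itself, keep in mind that a 30-day standard deviation also picks up the weekly cycle, so it should sit above the noise level of 6 used to generate the series.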

Choosing the Right Window Size

Window	Best For
7 days	Short-term patterns
30 days	Monthly trends
90+ days	Long-term stability

There is no “perfect” window. It depends on the business question.
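One cost of a larger window is lag. On a steadily rising series, a trailing mean over N days sits (N - 1) / 2 days behind the current value. The sketch below checks this on the trend component alone (no weekly cycle or noise) for the three window sizes in the table; `gaps` and `slope` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Trend component of the lesson's dataset, without weekly cycle or noise
dates = pd.date_range("2023-01-01", periods=365, freq="D")
trend = pd.Series(np.linspace(80, 140, 365), index=dates)

# Increase per day of the linear trend
slope = (trend.iloc[-1] - trend.iloc[0]) / (len(trend) - 1)

gaps = {}
for window in (7, 30, 90):
    smoothed = trend.rolling(window=window).mean()
    # How far the trailing mean sits below the true value on the last day
    gaps[window] = trend.iloc[-1] - smoothed.iloc[-1]
    days_behind = gaps[window] / slope
    print(f"window={window:>2}  gap={gaps[window]:.2f}  days behind={days_behind:.1f}")
```

This is one reason to pick the smallest window that still answers the business question.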

Common Mistakes

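Using a window so large that real patterns are smoothed away
Comparing raw data directly with smoothed data without accounting for the lag of a trailing window
Ignoring the missing values (NaN) at the start of the series, where the window is not yet full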
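The last two mistakes can be handled with two `rolling` parameters in pandas: `min_periods` lets a window produce a value before it is full, and `center=True` places each result in the middle of its window instead of at the end, which removes the lag but leaves NaN values at both ends. A minimal sketch on the lesson's series:

```python
import numpy as np
import pandas as pd

# Same synthetic series as in the lesson
np.random.seed(10)
dates = pd.date_range("2023-01-01", periods=365, freq="D")
trend = np.linspace(80, 140, 365)
weekly = 12 * np.sin(2 * np.pi * np.arange(365) / 7)
noise = np.random.normal(0, 6, 365)
df = pd.DataFrame({"sales": trend + weekly + noise}, index=dates)

sales = df["sales"]

# Default: trailing window, NaN until 7 values are available
trailing = sales.rolling(window=7).mean()

# Allow partial windows: no NaN, but early values average fewer days
partial = sales.rolling(window=7, min_periods=1).mean()

# Centered window: 3 days before and 3 days after each point
centered = sales.rolling(window=7, center=True).mean()

print("NaN count (trailing):", trailing.isna().sum())
print("NaN count (partial): ", partial.isna().sum())
print("NaN count (centered):", centered.isna().sum())
```

A centered window needs future values, so it suits looking back at history rather than monitoring the latest day.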

Key Takeaways

Rolling statistics smooth data gradually
Rolling mean reveals local trends
Rolling std shows volatility
Window size controls smoothness

Next Lesson

In the next lesson, we will dive into Autocorrelation (ACF) and understand how past values influence future values.
