Time Series Lesson 8 – Rolling Stats | Dataplexa

Rolling Statistics in Time Series

In the previous lesson, we learned how resampling helps us look at data at different time resolutions.

But sometimes resampling is too aggressive. We don’t want to collapse many days into a single value; we want to keep every day and smooth the series gradually.

This is where rolling statistics come in.


Real-World Problem First

Imagine you are monitoring:

  • Daily website traffic
  • Daily stock prices
  • Daily electricity usage

Every day has ups and downs.

If your manager asks:

“Is traffic increasing overall or just fluctuating randomly?”

Looking at raw daily data makes this very hard to answer.

We need a way to see the local trend — not too noisy, not too smooth.


What Are Rolling Statistics?

Rolling statistics compute values over a moving window that slides across time.

At each point, we calculate the statistic using only the most recent N observations (the window), then slide the window forward one step and repeat.

Common rolling statistics:

  • Rolling mean (moving average)
  • Rolling standard deviation
  • Rolling min / max

They answer the question:

“What has been happening recently?”
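To make the sliding window concrete, here is a minimal sketch on a toy series. The six values and the window size of 3 are made up for illustration; the pandas calls (`rolling`, `mean`, `min`, `max`) are the same ones used in the rest of this lesson.

```python
import pandas as pd

# Hypothetical toy data, chosen so the arithmetic is easy to check by hand.
data = [10, 12, 11, 15, 14, 18]
values = pd.Series(data)
window = 3

# Each result uses the current value and the two values before it.
rolling_mean = values.rolling(window=window).mean()
rolling_min = values.rolling(window=window).min()
rolling_max = values.rolling(window=window).max()

# The same rolling mean computed by hand, one window at a time.
manual_mean = [
    sum(data[i - window + 1 : i + 1]) / window
    for i in range(window - 1, len(data))
]

print(rolling_mean)
print(manual_mean)
```

The first two results are NaN because a full 3-value window is not available yet. From the third position on, the hand-computed means match the pandas output.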


Creating a Realistic Daily Time Series

We will reuse a synthetic daily sales dataset that mimics real business data: an upward trend, a weekly cycle, and random noise.

Python: Generate Daily Sales

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

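np.random.seed(10)  # fix the seed so the series is reproducible

# One year of daily timestamps
dates = pd.date_range("2023-01-01", periods=365, freq="D")

# Three components: upward trend, 7-day cycle, and Gaussian noise
trend = np.linspace(80, 140, 365)
weekly = 12 * np.sin(2 * np.pi * np.arange(365) / 7)
noise = np.random.normal(0, 6, 365)

sales = trend + weekly + noise
df = pd.DataFrame({"sales": sales}, index=dates)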

Here is the raw daily data:

Observation:

  • Lots of short-term noise
  • Trend exists but is hard to see
  • Decision-making is difficult

Rolling Mean (Moving Average)

A rolling mean calculates the average of the last N values.

For example:

  • 7-day rolling mean → last week
  • 30-day rolling mean → last month

This smooths the data without destroying the time structure: the result is still a daily series, just with less noise.

Python: 7-Day Rolling Mean

rolling_7 = df["sales"].rolling(window=7).mean()

Here is how the 7-day rolling mean looks:

What changed?

  • Noise reduced
  • Weekly pattern averaged out (a 7-day window spans exactly one weekly cycle)
  • Trend easier to see
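These effects can be checked numerically. The sketch below rebuilds the lesson's synthetic sales series and then measures three things: how many leading values are missing, what a 7-day mean does to the 7-day cycle on its own, and how much the day-to-day changes shrink. The names `weekly_smoothed`, `raw_roughness` and `smooth_roughness` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Same synthetic sales series as in the lesson.
np.random.seed(10)
dates = pd.date_range("2023-01-01", periods=365, freq="D")
trend = np.linspace(80, 140, 365)
weekly = 12 * np.sin(2 * np.pi * np.arange(365) / 7)
noise = np.random.normal(0, 6, 365)
df = pd.DataFrame({"sales": trend + weekly + noise}, index=dates)

rolling_7 = df["sales"].rolling(window=7).mean()

# The first 6 values are NaN: a full 7-day window is not available yet.
print(rolling_7.isna().sum())  # 6

# A 7-day mean applied to the 7-day cycle alone: the cycle cancels out,
# leaving only floating-point rounding error.
weekly_smoothed = pd.Series(weekly, index=dates).rolling(window=7).mean()
print(weekly_smoothed.abs().max())

# Size of typical day-to-day changes, before and after smoothing.
raw_roughness = df["sales"].diff().std()
smooth_roughness = rolling_7.diff().std()
print(raw_roughness, smooth_roughness)
```

Because the window length equals the cycle length, the weekly component is removed rather than merely softened. A window that is not a multiple of 7 would leave a small weekly ripple behind.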

Longer Window: 30-Day Rolling Mean

Now let’s smooth the data even more using a 30-day window.

This focuses on medium-term behavior.

Python: 30-Day Rolling Mean

rolling_30 = df["sales"].rolling(window=30).mean()

Here is the 30-day rolling mean:

Notice:

  • Very smooth curve
  • Short-term fluctuations removed
  • Well suited to trend analysis, at the cost of reacting more slowly to recent changes
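To compare the raw data with both rolling means, one option is to draw all three on the same axes. This is a minimal sketch using matplotlib; the colors, figure size and labels are arbitrary choices, not part of the original lesson.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Same synthetic sales series as in the lesson.
np.random.seed(10)
dates = pd.date_range("2023-01-01", periods=365, freq="D")
trend = np.linspace(80, 140, 365)
weekly = 12 * np.sin(2 * np.pi * np.arange(365) / 7)
noise = np.random.normal(0, 6, 365)
df = pd.DataFrame({"sales": trend + weekly + noise}, index=dates)

rolling_7 = df["sales"].rolling(window=7).mean()
rolling_30 = df["sales"].rolling(window=30).mean()

# Raw data in the background, smoothed lines on top.
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(df.index, df["sales"], color="lightgray", label="Daily sales")
ax.plot(rolling_7.index, rolling_7, label="7-day rolling mean")
ax.plot(rolling_30.index, rolling_30, label="30-day rolling mean")
ax.set_xlabel("Date")
ax.set_ylabel("Sales")
ax.legend()
fig.tight_layout()
# Call plt.show() or fig.savefig("rolling_means.png") to view the chart.
```

The 30-day line starts later than the 7-day line because its first 29 values are NaN, and matplotlib simply skips them.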

Rolling Standard Deviation

Rolling mean shows direction.

Rolling standard deviation shows volatility.

It answers:

“How stable is the data recently?”

Python: Rolling Volatility

rolling_std = df["sales"].rolling(window=30).std()

Here is the rolling volatility:

Interpretation:

  • Higher values → unstable period
  • Lower values → consistent behavior
  • Very useful for risk analysis
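To see rolling standard deviation pick up a change in stability, here is a sketch on a hypothetical series with a calm first half and a volatile second half. The series, the seed and the noise levels (2 and 10) are assumptions made for this illustration. Note that in the lesson's sales data, a 30-day rolling std also absorbs the weekly cycle and the trend inside each window, not only the random noise.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # hypothetical seed, for reproducibility only
dates = pd.date_range("2023-01-01", periods=200, freq="D")

# First 100 days: small fluctuations. Last 100 days: large fluctuations.
calm = rng.normal(100, 2, 100)
volatile = rng.normal(100, 10, 100)
series = pd.Series(np.concatenate([calm, volatile]), index=dates)

rolling_std = series.rolling(window=30).std()

# Average volatility over windows that lie entirely inside each regime.
calm_level = rolling_std.iloc[29:100].mean()
volatile_level = rolling_std.iloc[129:].mean()
print(calm_level, volatile_level)
```

The rolling std stays low while the window covers only calm days, climbs as volatile days enter the window, and settles at a higher level once the window is fully inside the volatile period.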

Choosing the Right Window Size

Window      Best For
7 days      Short-term patterns
30 days     Monthly trends
90+ days    Long-term stability

There is no “perfect” window. It depends on the business question.
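One way to get a feel for the trade-off is to apply several windows to the same series and compare them. The sketch below uses the lesson's synthetic sales data and two simple measures chosen for this illustration: the number of missing values at the start, and the "roughness" (standard deviation of day-to-day changes) of the smoothed series.

```python
import numpy as np
import pandas as pd

# Same synthetic sales series as in the lesson.
np.random.seed(10)
dates = pd.date_range("2023-01-01", periods=365, freq="D")
trend = np.linspace(80, 140, 365)
weekly = 12 * np.sin(2 * np.pi * np.arange(365) / 7)
noise = np.random.normal(0, 6, 365)
df = pd.DataFrame({"sales": trend + weekly + noise}, index=dates)

summary = {}
for window in (7, 30, 90):
    smoothed = df["sales"].rolling(window=window).mean()
    summary[window] = {
        "missing": int(smoothed.isna().sum()),        # window - 1 leading NaN values
        "roughness": float(smoothed.diff().std()),    # smaller means smoother
    }

for window, stats in summary.items():
    print(window, stats)
```

Larger windows give a smoother line but discard more of the start of the series and respond more slowly to real changes.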


Common Mistakes

  • Using very large windows and losing patterns
  • Comparing raw data directly with smoothed data
  • Ignoring the missing values (NaN) at the start of the series, where the window is not yet full
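The last two mistakes are easier to avoid once you see how window alignment works. This sketch uses a hypothetical noise-free ramp (0, 1, ..., 19) so the effects are exact. `min_periods` and `center` are standard parameters of pandas `rolling`; whether they suit your analysis depends on the question being asked.

```python
import numpy as np
import pandas as pd

# Hypothetical ramp: with no noise, the effect of alignment is easy to see.
ramp = pd.Series(
    np.arange(20, dtype=float),
    index=pd.date_range("2023-01-01", periods=20, freq="D"),
)

# Default: the window ends at the current day (trailing window).
trailing = ramp.rolling(window=7).mean()
# Allow shorter windows at the start instead of returning NaN.
partial = ramp.rolling(window=7, min_periods=1).mean()
# Center the window on the current day (3 days before, 3 days after).
centered = ramp.rolling(window=7, center=True).mean()

print(trailing.isna().sum())   # 6 missing values at the start
print(partial.isna().sum())    # 0 missing values
print(centered.isna().sum())   # 6 missing values: 3 at each end

# On day index 10 the raw value is 10, the trailing mean is 7, the centered mean is 10.
print(ramp.iloc[10], trailing.iloc[10], centered.iloc[10])
```

A trailing mean lags behind a rising series, which is why laying it directly over raw data can be misleading. A centered window removes the lag but uses future values, so it fits retrospective analysis better than live monitoring.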

Key Takeaways

  • Rolling statistics smooth data gradually
  • Rolling mean reveals local trends
  • Rolling std shows volatility
  • Window size controls smoothness

Next Lesson

In the next lesson, we will dive into Autocorrelation (ACF) and understand how past values influence future values.