Rolling Window Features
Lag features give the model individual past values. But in real life, we rarely make decisions based on a single past day.
We usually think in terms of recent behavior:
- How has demand behaved over the last few days?
- Is usage consistently rising or falling?
- Is volatility increasing?
Rolling window features capture exactly this kind of thinking.
Real-World Context
We continue with the same example:
Daily electricity consumption of a city.
An energy planner rarely asks:
“What was usage exactly yesterday?”
Instead, they ask:
- What is the recent average?
- Is consumption becoming unstable?
- Are peaks getting stronger?
Rolling windows convert these questions into numerical features.
What Is a Rolling Window?
A rolling window takes a fixed number of past observations and summarizes them.
Common summaries:
- Rolling mean
- Rolling minimum / maximum
- Rolling standard deviation
Each window moves forward one step at a time.
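To make the sliding concrete, here is a minimal sketch on a toy six-value series. The values and the name `toy` are invented for illustration and are not part of the electricity data. It assumes NumPy 1.20 or later, which provides `sliding_window_view`, and lists every window of length 3 together with its mean.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Toy series, invented for illustration only
toy = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 18.0])

# Each row is one window of 3 consecutive values;
# the window advances one step from row to row
windows = sliding_window_view(toy, window_shape=3)
window_means = windows.mean(axis=1)

for w, m in zip(windows, window_means):
    print(w, "-> mean", round(float(m), 2))
```

Six values with a window of 3 give four windows, because only full windows are kept.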
Base Time Series
Here is our electricity usage series again.
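import numpy as np
np.random.seed(7)  # fixed seed so the series is reproducible
days = np.arange(180)
# baseline + upward trend + weekly cycle + random noise
usage = 120 + 0.2*days + 15*np.sin(2*np.pi*days/7) + np.random.normal(0, 4, 180)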
Rolling Mean (Smoothing Recent Behavior)
The rolling mean answers:
“On average, how high has usage been recently?”
This smooths out noise and highlights underlying movement.
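window = 7
# mode='valid' keeps only full windows: rolling_mean[k] is the mean of usage[k:k+window],
# so the result has len(usage) - window + 1 values
rolling_mean = np.convolve(usage, np.ones(window)/window, mode='valid')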
What you should notice:
- Sharp noise is reduced
- The weekly cycle is averaged out, because a 7-day window spans exactly one full cycle
- Trend is easier to see
Rolling mean helps the model focus on signal, not noise.
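When the rolling mean is used as a model input for day i, it is usually safest to build it only from the days before i, so the feature never contains the value being predicted. The following is a minimal sketch under that assumption, based on a cumulative sum. The helper name `past_rolling_mean` is hypothetical, and the series is regenerated so the block runs on its own.

```python
import numpy as np

def past_rolling_mean(values, window):
    """Mean of the `window` values strictly before each position; NaN during warm-up."""
    values = np.asarray(values, dtype=float)
    out = np.full(len(values), np.nan)
    csum = np.concatenate(([0.0], np.cumsum(values)))  # csum[i] = sum(values[:i])
    # Sum of values[i-window:i] equals csum[i] - csum[i-window]
    out[window:] = (csum[window:-1] - csum[:-window - 1]) / window
    return out

np.random.seed(7)
days = np.arange(180)
usage = 120 + 0.2*days + 15*np.sin(2*np.pi*days/7) + np.random.normal(0, 4, 180)

mean_7 = past_rolling_mean(usage, 7)
print(np.isnan(mean_7[:7]).all())                 # checks that the first 7 days have no full window
print(np.isclose(mean_7[7], usage[0:7].mean()))   # checks that day 7 uses days 0..6 only
```

The result has the same length as `usage`, which makes it easy to place next to other features row by row.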
Rolling Standard Deviation (Volatility)
Average alone is not enough.
Sometimes the question is:
“Is usage becoming unstable?”
Rolling standard deviation measures recent variability.
# Standard deviation of the 7 days before day i; NaN until a full window exists
rolling_std = [
    np.std(usage[i-7:i]) if i >= 7 else np.nan
    for i in range(len(usage))
]
Interpretation:
- Higher values → unstable consumption
- Lower values → predictable behavior
This feature gives the model a direct signal of how uncertain recent consumption has been.
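The interpretation above can be checked on a small synthetic example. This is a sketch, not the electricity data: the series, the noise levels (2 and 8), and the seed are assumptions chosen for illustration. It compares the average 7-day standard deviation in a calm stretch with that in a noisy stretch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility

# Synthetic series (illustrative assumption): calm first half, noisy second half
calm = 100 + rng.normal(0, 2, 90)
noisy = 100 + rng.normal(0, 8, 90)
series = np.concatenate([calm, noisy])

window = 7
# Population standard deviation (ddof=0, NumPy's default) of every 7-day window
stds = sliding_window_view(series, window).std(axis=1)

# Windows lying entirely in the calm half vs. entirely in the noisy half
calm_level = stds[: 90 - window + 1].mean()
noisy_level = stds[90:].mean()
print(f"calm: {calm_level:.2f}, noisy: {noisy_level:.2f}")
```

The rolling standard deviation should come out clearly higher in the noisy stretch, which is the signal a model can use to recognize unstable periods.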
Rolling Min & Max (Recent Extremes)
Rolling minimum and maximum show recent boundaries.
They answer:
- How low has usage dropped recently?
- How high has it peaked?
# Lowest and highest usage in the 7 days before day i; NaN until a full window exists
rolling_min = [
    np.min(usage[i-7:i]) if i >= 7 else np.nan
    for i in range(len(usage))
]
rolling_max = [
    np.max(usage[i-7:i]) if i >= 7 else np.nan
    for i in range(len(usage))
]
This gives the model context about recent extremes, not just central tendency.
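For longer series, the same past-only minimum and maximum can be computed without a Python loop. The sketch below assumes NumPy 1.20 or later for `sliding_window_view` and regenerates the series so it runs on its own. The name `past_windows` is introduced here for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

np.random.seed(7)
days = np.arange(180)
usage = 120 + 0.2*days + 15*np.sin(2*np.pi*days/7) + np.random.normal(0, 4, 180)

window = 7
# Row k holds usage[k:k+window] and describes the days before day k + window.
# The last row is dropped because it has no following day in the series.
past_windows = sliding_window_view(usage, window)[:-1]

rolling_min = np.full(len(usage), np.nan)
rolling_max = np.full(len(usage), np.nan)
rolling_min[window:] = past_windows.min(axis=1)
rolling_max[window:] = past_windows.max(axis=1)
```

Both arrays have one entry per day, with NaN for the first 7 days where no full window exists yet.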
Why Rolling Features Are Powerful
- Summarize recent history
- Reduce noise sensitivity
- Improve stability of predictions
Tree-based models and boosting algorithms split on one feature at a time and cannot aggregate across past rows on their own, so precomputed rolling statistics often help them.
Common Window Sizes
- 7 → weekly behavior
- 14 → bi-weekly trends
- 30 → monthly patterns
Window size should match real-world cycles.
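Several window sizes can be combined into one feature table. The sketch below is one possible layout, not a prescribed one: the helper `past_window_stat` and the column names are hypothetical, and every feature uses only the days before the day it describes. Rows without a full 30-day history are dropped.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

np.random.seed(7)
days = np.arange(180)
usage = 120 + 0.2*days + 15*np.sin(2*np.pi*days/7) + np.random.normal(0, 4, 180)

def past_window_stat(values, window, stat):
    """Apply `stat` to the `window` values before each day; NaN until a full window exists."""
    out = np.full(len(values), np.nan)
    out[window:] = stat(sliding_window_view(values, window)[:-1], axis=1)
    return out

features = {}
for w in (7, 14, 30):
    features[f"mean_{w}"] = past_window_stat(usage, w, np.mean)
    features[f"std_{w}"] = past_window_stat(usage, w, np.std)

# Stack into a (days x features) matrix and keep rows where every window is full
X = np.column_stack(list(features.values()))
valid = ~np.isnan(X).any(axis=1)
X, y = X[valid], usage[valid]
print(X.shape, y.shape)
```

The longest window determines how many early rows are lost, which is worth keeping in mind when the series is short.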
Practice Questions
Q1. Why can a rolling mean be more useful than lag-1?
Q2. When is rolling standard deviation important?
Key Takeaways
- Rolling windows summarize recent history
- Different statistics capture different behavior
- They are essential for robust forecasting
Next lesson: we’ll learn how to split time series data correctly for training and testing.