Feature Engineering for Time Series
Forecasting models do not understand time the way humans do. They only understand numbers placed in columns.
Feature engineering is the process of translating time-related patterns into numerical signals that models can learn from.
In this lesson, we will take a real-world time series and carefully transform it into meaningful features — step by step, with clear visuals.
A Real-World Example
Assume you are forecasting daily electricity consumption for a city.
Electricity usage depends on:
- Day of the week
- Recent usage patterns
- Seasonal behavior
Our goal is to convert these ideas into usable features.
Raw Time Series
Below is a simulated daily electricity usage series. It contains:
- A slow upward trend
- Weekly usage patterns
- Random fluctuations
import numpy as np
np.random.seed(10)
days = np.arange(180)                          # 180 days of history
trend = 0.15 * days                            # slow upward trend
seasonal = 12 * np.sin(2 * np.pi * days / 7)   # weekly cycle
noise = np.random.normal(0, 3, size=180)       # random fluctuations
usage = 120 + trend + seasonal + noise
This is what most real-world data looks like — useful information is mixed together.
Why Feature Engineering Is Necessary
A machine learning model does not automatically know:
- What happened yesterday
- What usually happens on weekends
- Whether recent values are increasing or decreasing
We must explicitly provide that information.
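To make this concrete, here is a small check (a sketch that rebuilds the simulated series from above). A model that sees only the day index can fit the trend, but the weekly pattern survives untouched in its residuals:

```python
import numpy as np

np.random.seed(10)
days = np.arange(180)
usage = 120 + 0.15 * days + 12 * np.sin(2 * np.pi * days / 7) \
        + np.random.normal(0, 3, size=180)

# Fit a straight line using only the day index (trend-only model)
slope, intercept = np.polyfit(days, usage, 1)
residual = usage - (slope * days + intercept)

# The weekly pattern remains: residuals seven days apart still correlate
r = np.corrcoef(residual[7:], residual[:-7])[0, 1]
print(r)
```

The trend-only model leaves the weekly signal behind, which is exactly the information the features below will supply.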
Lag Features (Recent Memory)
Lag features represent past values of the series.
They answer questions like:
- What was usage yesterday?
- What was usage two days ago?
# Shift with NaN padding; np.roll would wrap end-of-series values
# around to the start, leaking the future into the past
lag_1 = np.concatenate(([np.nan], usage[:-1]))
lag_7 = np.concatenate(([np.nan] * 7, usage[:-7]))
Plotting these against the original series shows:
- Lag-1 follows the original series closely
- Lag-7 aligns with weekly patterns
These features give the model short-term and seasonal memory.
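We can verify this alignment numerically by correlating each lag with the original series (a self-contained sketch that rebuilds the simulated series; the first seven days are dropped because no lag exists for them):

```python
import numpy as np

np.random.seed(10)
days = np.arange(180)
usage = 120 + 0.15 * days + 12 * np.sin(2 * np.pi * days / 7) \
        + np.random.normal(0, 3, size=180)

target = usage[7:]       # values from day 7 onward
lag_1 = usage[6:-1]      # value one day before each target
lag_7 = usage[:-7]       # value seven days before each target

corr_lag1 = np.corrcoef(target, lag_1)[0, 1]
corr_lag7 = np.corrcoef(target, lag_7)[0, 1]
print(corr_lag1, corr_lag7)
```

Because the series repeats every seven days, lag-7 lands on the same phase of the weekly cycle and correlates more strongly with the target than lag-1 does.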
Rolling Statistics (Local Context)
Rolling features summarize recent behavior instead of single points.
They answer questions like:
- Is usage increasing lately?
- Is recent usage stable or volatile?
window = 7
# Trailing mean of the previous 7 days; a centered window
# (np.convolve with mode='same') would average in future values
rolling_mean = np.full(len(usage), np.nan)
rolling_mean[window - 1:] = np.convolve(usage, np.ones(window) / window, mode='valid')
Rolling averages:
- Smooth short-term noise
- Reveal underlying direction
Models use this to understand local trends.
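The "stable or volatile" question calls for a second rolling statistic. A trailing standard deviation over the same 7-day window captures it (a sketch using a plain loop for clarity; the series is rebuilt so the snippet runs on its own):

```python
import numpy as np

np.random.seed(10)
days = np.arange(180)
usage = 120 + 0.15 * days + 12 * np.sin(2 * np.pi * days / 7) \
        + np.random.normal(0, 3, size=180)

window = 7
# Trailing window: each value summarizes only the previous 7 days,
# so no future observation leaks into the feature
rolling_std = np.full(len(usage), np.nan)
for i in range(window - 1, len(usage)):
    rolling_std[i] = usage[i - window + 1 : i + 1].std()
```

High values flag volatile stretches; low values flag calm ones. The first six entries stay NaN because a full window is not yet available.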
Calendar Features
Some patterns are driven by the calendar, not past values.
Examples:
- Weekdays vs weekends
- Workdays vs holidays
Below is a simple weekend indicator.
day_of_week = days % 7                        # 0-6, repeating weekly
is_weekend = (day_of_week >= 5).astype(int)   # 1 on the last two days of each week
This feature helps the model learn:
- Usage behaves differently on weekends
- Patterns repeat weekly
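With real timestamps instead of a simulated day index, pandas' datetime accessors produce the same features directly (a sketch; the start date is invented for illustration):

```python
import pandas as pd

# Hypothetical date range standing in for the 180 simulated days
idx = pd.date_range("2024-01-01", periods=180, freq="D")
df = pd.DataFrame(index=idx)

df["day_of_week"] = df.index.dayofweek           # Monday=0 ... Sunday=6
df["is_weekend"] = (df["day_of_week"] >= 5).astype(int)
df["month"] = df.index.month                     # coarse seasonal signal
```

Real dates also open the door to holiday flags and month-of-year features that a bare integer index cannot express.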
Why These Features Work Together
Each feature captures a different aspect of time:
- Lag features → recent memory
- Rolling statistics → local behavior
- Calendar features → external structure
Together, they convert time into learnable signals.
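Putting it together, the features above can be stacked into one matrix, with the warm-up rows dropped because lags and rolling statistics are undefined there (a self-contained sketch that rebuilds each feature with NumPy):

```python
import numpy as np

np.random.seed(10)
days = np.arange(180)
usage = 120 + 0.15 * days + 12 * np.sin(2 * np.pi * days / 7) \
        + np.random.normal(0, 3, size=180)

window = 7

# Lags: shift with NaN padding (no wrap-around)
lag_1 = np.concatenate(([np.nan], usage[:-1]))
lag_7 = np.concatenate(([np.nan] * 7, usage[:-7]))

# Trailing 7-day mean
rolling_mean = np.full(len(usage), np.nan)
rolling_mean[window - 1:] = np.convolve(usage, np.ones(window) / window,
                                        mode='valid')

# Calendar feature
is_weekend = ((days % 7) >= 5).astype(float)

# Stack features column-wise; y is the value we want to predict
X = np.column_stack([lag_1, lag_7, rolling_mean, is_weekend])
y = usage

# Drop the first 7 rows, where at least one feature is NaN
mask = ~np.isnan(X).any(axis=1)
X, y = X[mask], y[mask]
print(X.shape)  # → (173, 4)
```

This `X`/`y` pair is what a standard regression model would train on: one row per day, one column per temporal signal.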
Practice Questions
Q1. Why is lag-7 especially useful for daily data?
Q2. Why use rolling features instead of only lag features?
Key Takeaways
- Models do not understand time by default
- Feature engineering bridges that gap
- Good features often matter more than complex models
Next lesson: we will build more powerful lag-based features.