Lag Features in Time Series
Time series data carries memory. What happened in the past often influences what happens next.
Lag features are the simplest and most powerful way to give that memory to a forecasting model.
Real-World Context
Let’s continue with the same real-world example:
Daily electricity consumption of a city.
Think like a human for a moment. If electricity usage was high yesterday, what do you expect today?
- Probably similar
- Not completely random
Lag features formalize this intuition.
What Is a Lag Feature?
A lag feature is simply a past value of the time series used as an input variable.
Examples:
- Lag-1 → yesterday’s value
- Lag-7 → value one week ago
- Lag-30 → value one month ago
Each lag gives the model a different window into history.
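In day-to-day work, lag features are usually built with pandas rather than by hand. A minimal sketch (the series values and column names here are made up for illustration):

```python
import pandas as pd

# A tiny daily series; the numbers are illustrative only
df = pd.DataFrame({"usage": [10.0, 12.0, 11.0, 13.0, 15.0]})

df["lag_1"] = df["usage"].shift(1)  # yesterday's value
df["lag_2"] = df["usage"].shift(2)  # value two days ago

# The earliest rows have no history, so shift fills them with NaN
print(df)
```

`shift` automatically pads the start with NaN, which keeps the direction of time intact.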
Base Time Series
Below is our electricity usage series again.
import numpy as np
np.random.seed(4)
days = np.arange(180)
usage = 120 + 0.2*days + 15*np.sin(2*np.pi*days/7) + np.random.normal(0,4,180)
This series already contains:
- Trend
- Weekly seasonality
- Noise
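To eyeball the trend and the weekly wiggle, a quick matplotlib sketch (the output file name is arbitrary):

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(4)
days = np.arange(180)
usage = 120 + 0.2*days + 15*np.sin(2*np.pi*days/7) + np.random.normal(0, 4, 180)

plt.figure(figsize=(10, 4))
plt.plot(days, usage)
plt.xlabel("Day")
plt.ylabel("Electricity usage")
plt.title("Simulated daily electricity consumption")
plt.savefig("usage.png")
```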
Lag-1 Feature (Short-Term Memory)
Lag-1 represents yesterday’s electricity usage.
This is the strongest signal in many real datasets.
# Pad the start with NaN; np.roll would wrap the last value to the front (future leakage)
lag_1 = np.concatenate(([np.nan], usage[:-1]))
Plotting lag-1 alongside the original series shows:
- Lag-1 closely follows the original series
- Small shifts reflect day-to-day continuity
This tells the model: today usually looks like yesterday.
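How strong is that signal? One way to check is to correlate each day with the previous day. A sketch using the same generated series (the unmatched first day is simply left out of the pairing):

```python
import numpy as np

np.random.seed(4)
days = np.arange(180)
usage = 120 + 0.2*days + 15*np.sin(2*np.pi*days/7) + np.random.normal(0, 4, 180)

# Pair each day with the previous day's value, then correlate
corr_lag1 = np.corrcoef(usage[1:], usage[:-1])[0, 1]
print(round(corr_lag1, 3))
```

The correlation comes out well above zero, confirming the day-to-day continuity described above.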
Lag-7 Feature (Weekly Memory)
Electricity usage strongly depends on the day of the week.
Lag-7 captures that weekly repetition.
# Same idea: the first 7 days have no weekly history, so pad them with NaN
lag_7 = np.concatenate(([np.nan] * 7, usage[:-7]))
Notice:
- Patterns align every 7 days
- Weekend behavior repeats
Lag-7 teaches the model: this day behaves like the same weekday last week.
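The weekly repetition can be checked numerically: among candidate lags from 2 to 10 days, the autocorrelation of this series peaks at lag 7. A sketch reusing the generated series (the linear trend is removed first so it does not swamp the seasonal signal):

```python
import numpy as np

np.random.seed(4)
days = np.arange(180)
usage = 120 + 0.2*days + 15*np.sin(2*np.pi*days/7) + np.random.normal(0, 4, 180)

# Remove the linear trend so the seasonal pattern stands out
detrended = usage - np.polyval(np.polyfit(days, usage, 1), days)

def autocorr(x, k):
    # Correlation between the series and itself shifted by k days
    return np.corrcoef(x[k:], x[:-k])[0, 1]

best_lag = max(range(2, 11), key=lambda k: autocorr(detrended, k))
print(best_lag)  # the weekly cycle should win
```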
Multiple Lag Features Together
In practice, we rarely use just one lag.
Models work best when given a range of memory:
- Short-term (lag-1, lag-2)
- Medium-term (lag-7)
- Long-term (lag-14, lag-30)
# NaN-pad each lag so no future values wrap around to the start
lag_2 = np.concatenate(([np.nan] * 2, usage[:-2]))
lag_14 = np.concatenate(([np.nan] * 14, usage[:-14]))
This gives the model a richer historical context.
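Putting the lags to work: one sketch of a forecasting setup is to stack several lags into a feature matrix, drop the rows with no history, and fit ordinary least squares (NumPy only; the lag choices mirror the list above, and the in-sample RMSE is just a sanity check, not a proper evaluation):

```python
import numpy as np

np.random.seed(4)
days = np.arange(180)
usage = 120 + 0.2*days + 15*np.sin(2*np.pi*days/7) + np.random.normal(0, 4, 180)

lags = [1, 2, 7, 14]
max_lag = max(lags)

# Each column holds the series shifted back by one of the lags,
# aligned so row t of X predicts usage[t + max_lag]
X = np.column_stack([usage[max_lag - k : len(usage) - k] for k in lags])
y = usage[max_lag:]

# Add an intercept column and fit ordinary least squares
X = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

rmse = np.sqrt(np.mean((y - X @ coef) ** 2))
print(round(rmse, 2))
```

For honest evaluation you would split the data in time order (train on early days, test on later ones) rather than score in-sample.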
Why Lag Features Work So Well
Lag features:
- Convert time dependency into columns
- Require no complex math
- Work with almost every ML model
That is why they are the backbone of:
- Linear regression forecasting
- Tree-based models
- Gradient boosting
Common Mistakes With Lag Features
- Using future values (data leakage)
- Using too many lags without enough data
- Ignoring seasonal lags like 7 (weekly, for daily data) or 12 (yearly, for monthly data)
Lag features must respect the direction of time.
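A concrete illustration of the leakage risk: `np.roll` wraps the array around, so the last observation lands in the first position of the "lag" column, which is the future as seen from day 0. A NaN-padded shift avoids this (toy numbers for illustration):

```python
import numpy as np

usage = np.array([10.0, 12.0, 11.0, 13.0, 99.0])  # toy series; 99 is the last day

# Wrong: np.roll wraps around, so position 0 receives the LAST value (future data)
leaky = np.roll(usage, 1)
print(leaky[0])  # 99.0 — the final observation leaked to the start

# Right: pad with NaN so day 0 simply has no lag-1 value
safe = np.concatenate(([np.nan], usage[:-1]))
print(safe[0])   # nan
```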
Practice Questions
Q1. Why is lag-1 often the strongest feature?
Q2. Why is lag-7 important for daily data?
Key Takeaways
- Lag features provide memory
- Different lags capture different time scales
- They are essential for ML forecasting
Next lesson: we’ll build rolling window features to summarize recent behavior instead of single points.