Train–Test Split for Time Series
Before any model is trained, one decision silently determines whether the results are meaningful or completely misleading.
That decision is how we split the data.
In time series, splitting data the wrong way does more than reduce accuracy: it breaks the logic of time itself.
The Real-World Situation
Imagine you are forecasting daily electricity demand for a city.
You have two years of historical data and want to predict future usage.
There is only one rule the real world follows:
The future is never allowed to influence the past.
Time series models must obey this rule exactly.
The Common (and Dangerous) Mistake
In regular machine learning, data is often split randomly.
That approach completely fails for time series.
Random splitting causes the model to see future information during training.
This is called data leakage.
Our Example Data
We continue with the same electricity usage example.
import numpy as np

np.random.seed(10)
days = np.arange(200)
# slow trend + weekly seasonality (period 7) + Gaussian noise (std 5)
usage = 130 + 0.15*days + 12*np.sin(2*np.pi*days/7) + np.random.normal(0, 5, 200)
This is a realistic series:
- Slow upward trend
- Weekly seasonality
- Random noise
What a Random Split Looks Like (Wrong)
A random split mixes past and future together.
Test points end up scattered across the entire timeline, interleaved with training points.
Why this is wrong:
- The model learns patterns from the future
- Evaluation becomes overly optimistic
- Real deployment will fail
This mistake is extremely common and very costly.
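The leakage is easy to see directly. The sketch below shuffles the day indices the way a random 80/20 split would, then counts how many "test" days actually fall before the last training day:

```python
import numpy as np

np.random.seed(10)
days = np.arange(200)

# A random 80/20 split shuffles the day indices before cutting.
shuffled = np.random.permutation(days)
train_idx = np.sort(shuffled[:160])
test_idx = np.sort(shuffled[160:])

# Count test days that fall BEFORE the last training day: each one
# means the model trained on data from that test day's future.
leaks = int(np.sum(test_idx < train_idx.max()))
print(f"{leaks} of {len(test_idx)} test days precede the last training day")
```

Nearly every test day ends up earlier than some training day, so the model is evaluated on a "future" it has effectively already seen.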
The Correct Way: Time-Based Split
Time series must be split chronologically.
Training uses the past.
Testing uses the future.
Nothing crosses that boundary.
Chronological Split in Code
split_point = int(len(usage) * 0.8)  # 80% of 200 days -> index 160
train = usage[:split_point]          # days 0-159: the past
test = usage[split_point:]           # days 160-199: the "future"
The first 80% of the series (days 0–159) becomes the training set, and the final 20% (days 160–199) becomes the test set.
Now the logic is preserved:
- Training sees only historical data
- Testing represents unseen future
- Evaluation reflects real performance
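The same three lines can be wrapped in a small reusable helper. This is a sketch, and the function name `time_split` is our own, not a library API:

```python
import numpy as np

def time_split(series, train_frac=0.8):
    """Chronological split: everything before the cut is training data,
    everything after is test data. No index crosses the boundary."""
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

# Simplified stand-in for the 200-day demand series
usage = 130 + 0.15 * np.arange(200)
train, test = time_split(usage)
print(len(train), len(test))  # → 160 40
```

Keeping the cut as a single index guarantees the past/future boundary by construction, rather than relying on the caller to avoid shuffling.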
Why This Matters So Much
Forecasting models are judged on how well they predict the future.
If future data leaks into training:
- Accuracy numbers become meaningless
- Models appear better than they are
- Business decisions become risky
Correct splitting protects you from false confidence.
How Much Data Should Be Used for Testing?
There is no single rule, but common choices are:
- Last 20% of data
- Last 30 days
- Last full season (weekly, monthly, yearly)
The test set should represent the future period you actually care about.
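On a daily series, each of these choices is one line of index arithmetic. The array below is a placeholder standing in for 200 days of demand data:

```python
import numpy as np

usage = np.zeros(200)  # placeholder for 200 days of demand data
n = len(usage)

test_last_20pct = usage[int(n * 0.8):]  # last 20% of the series
test_last_30d = usage[-30:]             # last 30 days
test_last_week = usage[-7:]             # last full weekly season
print(len(test_last_20pct), len(test_last_30d), len(test_last_week))  # → 40 30 7
```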
What This Enables Later
Once splitting is done correctly, you can safely:
- Evaluate forecasting accuracy
- Compare models honestly
- Trust real-world deployment results
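Chronological splitting also extends naturally to cross-validation. As a sketch (assuming scikit-learn is installed), `TimeSeriesSplit` produces several forward-rolling train/test folds, each of which respects the time boundary:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

usage = np.arange(200)  # stand-in for the demand series
tscv = TimeSeriesSplit(n_splits=4)

for fold, (train_idx, test_idx) in enumerate(tscv.split(usage)):
    # In every fold, all test indices come strictly after all train indices.
    assert train_idx.max() < test_idx.min()
    print(f"fold {fold}: train ends day {train_idx.max()}, "
          f"test spans days {test_idx.min()}-{test_idx.max()}")
```

Each fold trains on a longer prefix of history and tests on the block that immediately follows it, which gives several honest estimates of forecasting accuracy instead of one.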
Practice Questions
Q1. Why is random splitting dangerous in time series?
Q2. What should the test set represent?
Key Takeaways
- Time series must be split chronologically
- Random splitting breaks forecasting logic
- Correct splits lead to trustworthy models
Next lesson: we’ll apply this split while training our first regression-based forecasting model.