Model Diagnostics & Residual Analysis
Building a forecasting model is not the end of the job. A model can produce plausible-looking forecasts and still be badly wrong.
This lesson focuses on one critical question:
How do we know if a time series model is actually reliable?
A Real-World Problem
Imagine a company forecasting monthly sales.
The model gives clean forecasts, the numbers look smooth, and management is happy. Then problems appear:
- Inventory shortages occur
- Sales spikes are missed
- Unexpected drops are not captured
The issue is not the act of forecasting; it is the quality of the model behind the forecasts.
Diagnostics help us verify whether the model truly learned the pattern or is just producing numbers.
What Are Residuals?
Residuals are the errors made by the model.
Residual = Actual Value − Predicted Value
If a model is good, residuals should look like random noise:
- Centered around zero
- No visible trend
- No seasonality
- No other structure, such as correlation between successive residuals
If residuals show patterns, the model missed something important.
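As a minimal illustration of the formula above, the sketch below computes residuals for four months. The sales figures are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical actual and predicted sales for four months (illustrative values only)
actual = np.array([100.0, 110.0, 120.0, 130.0])
predicted = np.array([98.0, 112.0, 119.0, 131.0])

# Residual = Actual Value - Predicted Value
residuals = actual - predicted

print(residuals)         # the four residuals are 2, -2, 1 and -1
print(residuals.mean())  # 0.0
```

Here the residuals alternate in sign and average to zero, which is what a well-behaved model tends to produce. Four points are far too few to judge a real model; the example only shows the calculation.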
Example Setup: Monthly Sales Data
We will simulate a realistic sales time series:
- Upward trend
- Seasonal pattern
- Random noise
Create the Time Series
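import numpy as np
import matplotlib.pyplot as plt
np.random.seed(10)  # fix the seed so the simulation is reproducible
time = np.arange(120)  # 120 months (10 years)
trend = time * 0.3  # steady upward trend
seasonal = 15 * np.sin(2 * np.pi * time / 12)  # yearly cycle with a 12-month period
noise = np.random.normal(0, 5, 120)  # random noise: mean 0, standard deviation 5
sales = trend + seasonal + noise
# Plot the simulated series
plt.figure(figsize=(9, 4))
plt.plot(sales)
plt.title("Monthly Sales Data")
plt.show()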
This looks like typical business data:
- Growth over time
- Repeating seasonal spikes
- Random fluctuations
A Simple Forecast (Conceptual)
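Assume a model captures the trend and seasonality well. To keep the example simple, we simulate its predictions using the true trend and seasonal components, so the only thing left unexplained is the random noise. A model fitted to real data would only approximate these components.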
prediction = trend + seasonal
plt.figure(figsize=(9,4))
plt.plot(sales, label="Actual")
plt.plot(prediction, label="Predicted")
plt.legend()
plt.show()
The prediction looks close, but closeness alone is not enough.
Residual Time Series
Now we examine residuals.
residuals = sales - prediction
plt.figure(figsize=(9,4))
plt.plot(residuals)
plt.title("Residuals Over Time")
plt.show()
What we want to see:
- No trend
- No repeating cycles
- Centered around zero
If residuals show structure, the model is incomplete.
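The visual check above can be backed by two numbers: the mean of the residuals and the slope of a straight line fitted through them. The sketch below is one possible way to do this, not a standard procedure. The helper `residual_mean_and_slope` and the "model that ignores the trend" are hypothetical and introduced only for comparison.

```python
import numpy as np

def residual_mean_and_slope(residuals):
    """Return the mean of the residuals and the slope of a line fitted through them."""
    residuals = np.asarray(residuals, dtype=float)
    t = np.arange(len(residuals))
    slope, _intercept = np.polyfit(t, residuals, 1)  # degree-1 least-squares fit
    return residuals.mean(), slope

# Rebuild the lesson's series so this snippet runs on its own
np.random.seed(10)
time = np.arange(120)
trend = time * 0.3
seasonal = 15 * np.sin(2 * np.pi * time / 12)
noise = np.random.normal(0, 5, 120)
sales = trend + seasonal + noise

good_residuals = sales - (trend + seasonal)  # prediction used in the lesson
bad_residuals = sales - seasonal             # hypothetical model that ignores the trend

print("complete model:", residual_mean_and_slope(good_residuals))
print("trend ignored: ", residual_mean_and_slope(bad_residuals))
```

For the complete model, both the mean and the slope should be small relative to the noise level. When the trend is ignored, the slope should come out close to the 0.3 used to build the series, which signals that a trend is still sitting in the residuals.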
Residual Distribution
Residuals should be distributed roughly symmetrically around zero. A histogram of the residuals is a quick way to check this.
Interpretation:
- Bell-shaped → good sign
- Skewed → errors are larger in one direction, a sign of possible bias
- Heavy tails → extreme errors exist
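The interpretation above can also be checked numerically. The sketch below computes the mean, skewness, and excess kurtosis of the residuals using plain NumPy moments. The helper name `distribution_summary` is hypothetical, and these statistics are a rough complement to a histogram rather than a formal normality test.

```python
import numpy as np

def distribution_summary(residuals):
    """Mean, skewness and excess kurtosis from sample moments."""
    residuals = np.asarray(residuals, dtype=float)
    centered = residuals - residuals.mean()
    m2 = np.mean(centered ** 2)  # variance
    m3 = np.mean(centered ** 3)  # third central moment
    m4 = np.mean(centered ** 4)  # fourth central moment
    return {
        "mean": residuals.mean(),
        "skewness": m3 / m2 ** 1.5,           # about 0 for a symmetric distribution
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # about 0 for a normal distribution
    }

# In the lesson's setup the residuals equal the simulated noise term
np.random.seed(10)
residuals = np.random.normal(0, 5, 120)

summary = distribution_summary(residuals)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A skewness far from zero points to asymmetric errors, and a clearly positive excess kurtosis points to heavy tails. With only 120 observations, moderate deviations from zero are expected even for well-behaved residuals.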
Autocorrelation of Residuals
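Residuals should not be correlated with past residuals. The autocorrelation function (ACF) measures this correlation at each lag.
If the residuals are correlated, the model has failed to capture some of the time dependence in the data.
Good residuals:
- ACF values close to zero at every lag above 0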
- No significant spikes
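To make "close to zero" and "significant spikes" concrete, the sketch below computes the sample ACF with plain NumPy and compares it against the usual approximate 95% band of ±1.96/√n for white noise. The helper `acf` and the choice of 24 lags are assumptions made for this illustration; libraries such as statsmodels provide ready-made ACF plots.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.sum(x ** 2)
    return np.array([np.sum(x[: len(x) - k] * x[k:]) / denom
                     for k in range(max_lag + 1)])

# In the lesson's setup the residuals equal the simulated noise term
np.random.seed(10)
residuals = np.random.normal(0, 5, 120)

max_lag = 24  # assumed choice: two seasonal cycles
r = acf(residuals, max_lag)
band = 1.96 / np.sqrt(len(residuals))  # approximate 95% band under white noise

# Lags (excluding lag 0) whose autocorrelation falls outside the band
lags_outside = [k for k in range(1, max_lag + 1) if abs(r[k]) > band]
print("band: +/-", round(band, 3))
print("lags outside the band:", lags_outside)
```

Because the band is a 95% band, roughly one lag in twenty can fall outside it by chance even for purely random residuals. A cluster of large values, or a spike at the seasonal lag, is the pattern to worry about.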
What Diagnostics Tell Us
Diagnostics answer critical questions:
- Did we remove trend?
- Did we remove seasonality?
- Is noise truly random?
If the answer is “no” to any of these, the model must be improved.
Common Diagnostic Failures
- Residual trend → trend not fully captured (underfitting)
- Residual seasonality → missing seasonal terms
- Correlated residuals → wrong model order, for example too few lag terms
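The second failure in the list can be reproduced with the lesson's simulated data. The sketch below compares the residuals of the complete prediction with those of a hypothetical model that omits the seasonal term, using the correlation between the residuals and a copy of themselves shifted by 12 months. The helper `lag_correlation` is an assumption made for this illustration.

```python
import numpy as np

def lag_correlation(x, lag):
    """Correlation between a series and itself shifted by `lag` steps."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[:-lag], x[lag:])[0, 1]

# Rebuild the lesson's series so this snippet runs on its own
np.random.seed(10)
time = np.arange(120)
trend = time * 0.3
seasonal = 15 * np.sin(2 * np.pi * time / 12)
noise = np.random.normal(0, 5, 120)
sales = trend + seasonal + noise

complete_residuals = sales - (trend + seasonal)  # prediction used in the lesson
no_season_residuals = sales - trend              # hypothetical model without seasonal terms

print("lag-12 correlation, complete model:   ",
      round(lag_correlation(complete_residuals, 12), 3))
print("lag-12 correlation, seasonality missed:",
      round(lag_correlation(no_season_residuals, 12), 3))
```

When the seasonal term is left out, the residuals repeat every 12 months, so the lag-12 correlation should be strongly positive. For the complete model it should stay near zero.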
Practice Questions
Q1. Can a model have good forecasts but bad residuals?
Q2. Should residuals show seasonality?
Key Takeaways
- Residuals reveal what the model failed to learn
- Patterns in residuals mean the model is missing structure in the data
- Good diagnostics = trustworthy forecasts
Next lesson: evaluating models quantitatively using error metrics.