Time Series Lesson 19 – Diagnostics | Dataplexa

Model Diagnostics & Residual Analysis

Building a forecasting model is not the end of the job. A model can produce plausible-looking forecasts and still be badly wrong.

This lesson focuses on one critical question:

How do we know if a time series model is actually reliable?


A Real-World Problem

Imagine a company forecasting monthly sales.

The model produces clean forecasts, the numbers look smooth, and management is happy. Then problems start to appear:

  • Inventory shortages occur
  • Sales spikes are missed
  • Unexpected drops are not captured

The problem is not the act of forecasting but the quality of the model behind the forecasts.

Diagnostics help us verify whether the model truly learned the pattern or is just producing numbers.


What Are Residuals?

Residuals are the errors made by the model.

Residual = Actual Value − Predicted Value

If a model is good, its residuals should look like pure random noise:

  • No visible trend
  • No seasonality
  • No other leftover structure

If residuals show patterns, the model missed something important.


Example Setup: Monthly Sales Data

We will simulate a realistic sales time series:

  • Upward trend
  • Seasonal pattern
  • Random noise

Create the Time Series

Python: Simulated Sales Data
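import numpy as np
import matplotlib.pyplot as plt

# Fix the random seed so the simulated series is reproducible
np.random.seed(10)
time = np.arange(120)  # 120 months = 10 years

# Components of the series
trend = time * 0.3                              # steady growth
seasonal = 15 * np.sin(2 * np.pi * time / 12)   # 12-month cycle
noise = np.random.normal(0, 5, 120)             # random noise, std dev 5

sales = trend + seasonal + noise

plt.figure(figsize=(9,4))
plt.plot(sales)
plt.title("Monthly Sales Data")
plt.show()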

This looks like typical business data:

  • Growth over time
  • Repeating seasonal spikes
  • Random fluctuations

A Simple Forecast (Conceptual)

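Assume a model captures trend and seasonality reasonably well, but not perfectly.

We simulate its predictions as the true trend plus the true seasonal component, so the only thing the model misses is the random noise: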

Python: Model Prediction
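# Idealized prediction: the true trend and seasonality, without the noise
prediction = trend + seasonal

plt.figure(figsize=(9,4))
plt.plot(sales, label="Actual")
plt.plot(prediction, label="Predicted")
plt.title("Actual vs Predicted Sales")
plt.legend()
plt.show()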

The prediction looks close, but closeness alone is not enough.


Residual Time Series

Now we examine residuals.

Python: Residuals
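# Residual = actual value minus predicted value
residuals = sales - prediction

plt.figure(figsize=(9,4))
plt.plot(residuals)
plt.axhline(0, color="gray", linestyle="--")  # zero reference line
plt.title("Residuals Over Time")
plt.show()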

What we want to see:

  • No trend
  • No repeating cycles
  • Centered around zero

If residuals show structure, the model is incomplete.
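
The visual checks above can be backed by two simple numbers. The sketch below is one possible way to do it, not a standard recipe: it recreates the lesson's simulated series, compares the residual mean with its standard error, and fits a straight line to the residuals with `np.polyfit` to look for leftover trend. The variable names `std_err`, `slope`, and `intercept` are illustrative choices.

```python
import numpy as np

# Recreate the lesson's simulated series so this block runs on its own
np.random.seed(10)
time = np.arange(120)
trend = time * 0.3
seasonal = 15 * np.sin(2 * np.pi * time / 12)
noise = np.random.normal(0, 5, 120)
sales = trend + seasonal + noise
prediction = trend + seasonal
residuals = sales - prediction

# Centered around zero? Compare the mean with its standard error
mean = residuals.mean()
std_err = residuals.std(ddof=1) / np.sqrt(len(residuals))

# Leftover trend? Fit a straight line; the slope should be close to zero
slope, intercept = np.polyfit(time, residuals, 1)

print(f"residual mean   = {mean:.3f} (standard error {std_err:.3f})")
print(f"slope per month = {slope:.4f}")
```

A mean within roughly two standard errors of zero and a slope near zero are consistent with the "no trend, centered around zero" picture. A clearly non-zero slope would suggest the model has not fully captured the trend.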


Residual Distribution

Residuals should be distributed roughly symmetrically around zero. A histogram of the residuals is the usual way to check this.

Interpretation:

  • Bell-shaped → good sign
  • Skewed → errors lean to one side, which suggests bias
  • Heavy tails → extreme errors exist
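
The interpretation above can also be checked numerically. The following is a minimal sketch, assuming sample skewness and excess kurtosis are acceptable summaries of "skewed" and "heavy tails"; the helper `skew_and_excess_kurtosis` is a hypothetical name written for this example, not part of NumPy.

```python
import numpy as np

def skew_and_excess_kurtosis(x):
    """Return (skewness, excess kurtosis) of a sample (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()           # standardize the sample
    skewness = np.mean(z ** 3)             # about 0 for a symmetric distribution
    excess_kurtosis = np.mean(z ** 4) - 3  # about 0 for a normal distribution
    return skewness, excess_kurtosis

# Recreate the lesson's residuals so this block runs on its own
np.random.seed(10)
time = np.arange(120)
trend = time * 0.3
seasonal = 15 * np.sin(2 * np.pi * time / 12)
noise = np.random.normal(0, 5, 120)
sales = trend + seasonal + noise
residuals = sales - (trend + seasonal)

skewness, excess_kurtosis = skew_and_excess_kurtosis(residuals)
print(f"skewness        = {skewness:.3f}")
print(f"excess kurtosis = {excess_kurtosis:.3f}")
```

Values near zero for both are consistent with a bell shape. A large positive or negative skewness points to one-sided errors, and a large positive excess kurtosis points to heavy tails. With only 120 observations, moderate deviations from zero are expected by chance.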

Autocorrelation of Residuals

Residuals should not be correlated with past residuals.

If they are, the model failed to capture time dependence.

Good residuals:

  • ACF values close to zero at every lag above 0
  • No spikes outside the confidence band
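
As a rough illustration of this check, the sketch below computes the sample autocorrelation function (ACF) of the lesson's residuals with plain NumPy and compares each lag with the approximate 95% band of ±1.96/√n that applies to white noise. The `acf` helper and the choice of 24 lags are assumptions made for this example; libraries such as statsmodels offer ready-made ACF plots.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.sum(x ** 2)
    n = len(x)
    return np.array([np.sum(x[k:] * x[:n - k]) / denom
                     for k in range(max_lag + 1)])

# Recreate the lesson's residuals so this block runs on its own
np.random.seed(10)
time = np.arange(120)
trend = time * 0.3
seasonal = 15 * np.sin(2 * np.pi * time / 12)
noise = np.random.normal(0, 5, 120)
sales = trend + seasonal + noise
residuals = sales - (trend + seasonal)

max_lag = 24
r = acf(residuals, max_lag)

# Approximate 95% band for the ACF of white noise
bound = 1.96 / np.sqrt(len(residuals))

# Lags (excluding lag 0, which is always 1) that fall outside the band
significant = [k for k in range(1, max_lag + 1) if abs(r[k]) > bound]
print(f"95% band: +/-{bound:.3f}")
print("lags outside the band:", significant)
```

Because the band is a 95% interval, about one lag in twenty can fall outside it by chance even for truly random residuals. Many spikes, or spikes at meaningful lags such as 12 for monthly data, are the warning sign.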

What Diagnostics Tell Us

Diagnostics answer critical questions:

  • Did we remove trend?
  • Did we remove seasonality?
  • Is noise truly random?

If the answer is “no” to any of these, the model must be improved.


Common Diagnostic Failures

  • Residual trend → underfitting
  • Residual seasonality → missing seasonal terms
  • Correlated residuals → wrong model order
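
To make the second failure concrete, the sketch below builds a deliberately incomplete model on the lesson's simulated data: it knows the trend but ignores seasonality. The names `trend_only_prediction` and `autocorr_at` are hypothetical, introduced only for this illustration.

```python
import numpy as np

def autocorr_at(x, lag):
    """Sample autocorrelation of x at one positive lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.sum(x[lag:] * x[:-lag]) / np.sum(x ** 2)

# Recreate the lesson's simulated series so this block runs on its own
np.random.seed(10)
time = np.arange(120)
trend = time * 0.3
seasonal = 15 * np.sin(2 * np.pi * time / 12)
noise = np.random.normal(0, 5, 120)
sales = trend + seasonal + noise

# Incomplete model: captures the trend, misses the 12-month cycle
trend_only_prediction = trend
bad_residuals = sales - trend_only_prediction  # seasonality is left behind

bound = 1.96 / np.sqrt(len(bad_residuals))  # approximate 95% white-noise band
r6 = autocorr_at(bad_residuals, 6)    # half a cycle apart
r12 = autocorr_at(bad_residuals, 12)  # one full cycle apart

print(f"95% band: +/-{bound:.3f}")
print(f"ACF at lag 6  = {r6:.3f}")
print(f"ACF at lag 12 = {r12:.3f}")
```

In this setup the residuals still contain the seasonal sine wave, so the autocorrelation is strongly negative at lag 6 and strongly positive at lag 12, both far outside the band. That is the signature of missing seasonal terms.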

Practice Questions

Q1. Can a model produce forecasts that look good and still have bad residuals?

Yes. A close visual fit can hide structural problems that only the residuals reveal.

Q2. Should residuals show seasonality?

No. Seasonality should be captured by the model, not left in residuals.

Key Takeaways

  • Residuals reveal what the model missed
  • Patterns in residuals mean the model is incomplete
  • Clean diagnostics are a precondition for trustworthy forecasts

Next lesson: evaluating models quantitatively using error metrics.