Model Monitoring
Deploying a machine learning model does not finish the job. In fact, deployment is only the beginning of a model's working life.
Once a model starts making predictions on real users and real data, its performance can change over time.
This lesson explains Model Monitoring — the practice of continuously checking whether a deployed model is still performing well and behaving as expected.
Why Model Monitoring Is Necessary
Machine learning models are trained on historical data.
But real-world data is never static. Customer behavior changes, markets shift, and new patterns appear.
If a model is not monitored, it may silently start making wrong predictions without anyone noticing.
Monitoring helps us detect problems early before they cause business damage.
What Can Go Wrong After Deployment?
Even a well-trained model can fail after deployment.
One common issue is data drift, where incoming data no longer looks like training data.
Another issue is concept drift, where the relationship between features and target changes.
Monitoring exists to catch these problems in time.
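As a small illustration, data drift can be simulated with synthetic numbers (these are made-up values, not taken from the Dataplexa dataset): we draw "training" incomes from one distribution and "production" incomes from a shifted one, then compare their means.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "training" incomes vs. "production" incomes whose mean has shifted
train_income = rng.normal(loc=50_000, scale=10_000, size=1_000)
prod_income = rng.normal(loc=42_000, scale=10_000, size=1_000)

# A large gap between the two means is a simple data-drift signal
mean_shift = abs(train_income.mean() - prod_income.mean())
print(f"Mean shift in income: {mean_shift:.0f}")
```

In a real system the "production" sample would come from recent model inputs rather than a random generator, but the comparison logic is the same.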
Monitoring Using Our Dataset
We continue using the same dataset (Dataplexa ML Housing & Customer Dataset) to understand monitoring concepts.
Imagine the model is deployed in a bank and receives loan applications every day.
We periodically compare recent prediction data with the original training distribution.
Tracking Prediction Accuracy
If actual outcomes are available, we can measure accuracy over time.
For example, we compare predicted loan approvals with real approval decisions.
from sklearn.metrics import accuracy_score

y_true = actual_outcomes            # real approval decisions, once known
y_pred = model.predict(new_data)    # the model's predictions on recent data
print(accuracy_score(y_true, y_pred))
A steady drop in accuracy is a strong signal that something has changed.
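To see what "a steady drop" looks like in practice, accuracy can be computed per batch (for example, weekly) and the series inspected for a downward trend. The labels below are hypothetical, for illustration only:

```python
from sklearn.metrics import accuracy_score

# Hypothetical weekly batches of (true labels, predicted labels)
weekly_batches = [
    ([1, 1, 0, 1, 0], [1, 1, 0, 1, 0]),  # week 1: all correct
    ([1, 0, 0, 1, 1], [1, 0, 0, 1, 0]),  # week 2: one mistake
    ([0, 1, 1, 0, 1], [1, 1, 0, 0, 0]),  # week 3: three mistakes
]

# One accuracy value per week; a consistent decline is the warning sign
accuracies = [accuracy_score(t, p) for t, p in weekly_batches]
print(accuracies)  # each later week is lower than the one before
```

A single bad week can be noise; the signal to act on is several consecutive batches trending downward.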
Monitoring Input Data Distribution
Sometimes labels are not immediately available.
In such cases, we monitor input features themselves.
For example, if average income suddenly drops compared to training data, the model may behave unpredictably.
import numpy as np

# Compare the average income in recent data with the training data
print(np.mean(new_data["income"]))
print(np.mean(training_data["income"]))
Large differences indicate possible data drift.
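Comparing means is a useful first check, but it misses changes in shape or spread. One common refinement, sketched here on synthetic data and assuming scipy is available, is a two-sample Kolmogorov–Smirnov test between the training and recent feature values:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Synthetic income samples: the production data has drifted downward
train_income = rng.normal(50_000, 10_000, size=500)
new_income = rng.normal(44_000, 10_000, size=500)

# Two-sample Kolmogorov-Smirnov test: a small p-value suggests the two
# samples come from different distributions, i.e. possible data drift
result = ks_2samp(train_income, new_income)
print(f"KS statistic = {result.statistic:.3f}, p-value = {result.pvalue:.2e}")
```

The exact test and threshold are a design choice; the point is to replace eyeballing averages with a repeatable statistical check.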
Real-World Monitoring Example
In production systems, monitoring is often automated.
Dashboards track metrics such as:
• Prediction confidence
• Feature value ranges
• Error rates over time
• Request volumes
Alerts are triggered when metrics cross safe thresholds.
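A minimal sketch of such alerting logic is shown below. The metric names and threshold values are hypothetical; real thresholds depend on the business context:

```python
# Hypothetical safe thresholds; real values depend on the business context
THRESHOLDS = {"accuracy": 0.85, "mean_confidence": 0.60}

def check_alerts(metrics: dict) -> list:
    """Return the names of metrics that fell below their safe threshold."""
    return [name for name, limit in THRESHOLDS.items()
            if metrics.get(name, 0.0) < limit]

# Example: accuracy has crossed its threshold, confidence is still fine
print(check_alerts({"accuracy": 0.79, "mean_confidence": 0.90}))
```

In production, a function like this would run on a schedule and push its output to a dashboard or paging system rather than printing it.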
When Should a Model Be Retrained?
Monitoring does not always mean immediate retraining.
Retraining is required when:
• Accuracy drops consistently
• Input data shifts significantly
• Business rules change
Monitoring helps decide the right time to retrain.
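The criteria above can be combined into a simple decision rule. This is a sketch under assumed settings (a 0.85 accuracy floor and a three-check window, both hypothetical):

```python
def should_retrain(accuracy_history: list,
                   drift_detected: bool,
                   min_accuracy: float = 0.85) -> bool:
    """Retrain if accuracy stayed below the floor for the last 3 checks,
    or if significant input drift has already been flagged."""
    recent = accuracy_history[-3:]
    persistent_drop = len(recent) == 3 and all(a < min_accuracy for a in recent)
    return persistent_drop or drift_detected

# Three consecutive low readings trigger retraining
print(should_retrain([0.90, 0.84, 0.83, 0.82], drift_detected=False))  # True
# A single dip does not
print(should_retrain([0.90, 0.84, 0.88], drift_detected=False))        # False
```

Requiring several consecutive low readings, rather than reacting to one, keeps the system from retraining on random noise.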
Mini Practice
Compare feature averages between training data and new data and identify which feature changed the most.
Exercises
Exercise 1:
What is data drift?
Exercise 2:
Why is monitoring important even for accurate models?
Quick Quiz
Q1. Can monitoring prevent silent model failure?
In the next lesson, we move into a different learning paradigm by introducing Reinforcement Learning.