Overfitting and Underfitting
In the previous lesson, we examined bias and variance and why every deep learning model must balance them.
In this lesson, we connect that theory to real training behavior through two common failure modes: overfitting and underfitting.
These two problems explain why models fail even when code runs without errors.
What Is Underfitting?
Underfitting occurs when a model is too simple to capture the true structure of the data.
Such a model fails to learn meaningful patterns, even from the training data itself.
In practical terms, an underfit model performs poorly everywhere — on training data and on unseen data.
Underfitting is a direct consequence of high bias.
Why Underfitting Happens
Underfitting commonly occurs when the network is too shallow, has too few neurons, is trained for too few epochs, or relies on overly restrictive assumptions about the data.
Some underfitting is normal early in training, but it must be resolved before the model is deployed.
What Is Overfitting?
Overfitting occurs when a model learns the training data too well — including noise and random fluctuations.
The model appears highly accurate during training, but performs poorly on new, unseen data.
This happens because the model memorizes instead of learning general patterns.
Overfitting is a direct consequence of high variance.
Why Overfitting Happens
Overfitting usually occurs when the model is very deep, has many parameters, is trained for too long, or lacks regularization.
Deep learning models are powerful, but that power must be controlled carefully.
Real-World Analogy
Think of preparing for an exam.
If you only memorize past questions, you may score well when the same questions appear, but you will fail when the questions change.
That is overfitting.
If you study too little and understand nothing, you fail all questions.
That is underfitting.
True learning lies between these extremes.
Training vs Validation Performance
Overfitting and underfitting are best diagnosed by comparing training and validation performance.
When both training and validation loss are high, the model is underfitting.
When training loss is low but validation loss is high, the model is overfitting.
This gap between the two losses is a critical signal in deep learning practice.
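As a minimal sketch of this diagnostic, the snippet below trains a model on a small synthetic regression dataset and reads the final training and validation losses from the Keras history object. The dataset, the 64-unit architecture, and the epoch count are illustrative assumptions, not part of this lesson.

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Illustrative synthetic regression data (an assumption for this sketch)
X = np.random.rand(1000, 10)
y = X.sum(axis=1) + 0.1 * np.random.randn(1000)

model = Sequential([
    Dense(64, activation="relu"),
    Dense(1)
])
model.compile(optimizer="adam", loss="mse")

# Hold out 20% of the data as a validation set
history = model.fit(X, y, epochs=50, validation_split=0.2, verbose=0)

train_loss = history.history["loss"][-1]
val_loss = history.history["val_loss"][-1]
print(f"train loss: {train_loss:.4f}, val loss: {val_loss:.4f}")

If both losses stay high, the model is underfitting; if the training loss keeps dropping while the validation loss stalls or rises, the model is overfitting.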
How Model Capacity Affects Fit
Model capacity refers to the ability of a neural network to represent complex functions.
Low-capacity models tend to underfit.
High-capacity models tend to overfit unless properly controlled.
Choosing the right capacity is one of the most important design decisions.
Code Perspective
Even small architectural changes can shift a model from underfitting to overfitting.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Underfitting-prone model: a single small hidden layer
model_small = Sequential([
    Dense(8, activation="relu"),
    Dense(1)
])

# Overfitting-prone model: two wide hidden layers and no regularization
model_large = Sequential([
    Dense(256, activation="relu"),
    Dense(256, activation="relu"),
    Dense(1)
])
Both models are valid, but their behavior during training will differ significantly.
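To make that difference concrete, both models can be compiled and trained the same way and their losses compared. The sketch below reuses the illustrative X and y from the earlier snippet, so it is an assumed setup rather than a prescribed workflow.

# Reuses the illustrative X, y from the earlier sketch (an assumption)
for name, model in [("small", model_small), ("large", model_large)]:
    model.compile(optimizer="adam", loss="mse")
    history = model.fit(X, y, epochs=100, validation_split=0.2, verbose=0)
    print(name,
          "train:", round(history.history["loss"][-1], 4),
          "val:", round(history.history["val_loss"][-1], 4))

The exact numbers will vary, but the pattern to watch is the one described above: whether each model's validation loss tracks its training loss or pulls away from it.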
Why This Lesson Is Critical
Most deep learning failures are not caused by bugs, but by poor control of overfitting and underfitting.
Understanding this lesson deeply is essential before learning regularization techniques.
From the next lessons onward, we will actively control these behaviors in practice.
Exercises
Exercise 1:
What is the key difference between overfitting and underfitting?
Exercise 2:
Why do deep models overfit more easily?
Quick Quiz
Q1. Which situation indicates overfitting?
Q2. What usually helps reduce overfitting?
In the next lesson, we will introduce regularization and learn concrete techniques to control overfitting in deep neural networks.