DL Lesson 10 – Overfitting and Underfitting

Overfitting and Underfitting

In the previous lesson, we examined bias and variance and saw why every deep learning model must balance the two.

In this lesson, we connect that theory to real training behavior by understanding overfitting and underfitting.

These two problems explain why models fail even when code runs without errors.


What Is Underfitting?

Underfitting occurs when a model is too simple to capture the true structure of the data.

Such a model fails to learn meaningful patterns, even from the training data itself.

In practical terms, an underfit model performs poorly everywhere — on training data and on unseen data.

Underfitting is a direct consequence of high bias.


Why Underfitting Happens

Underfitting commonly occurs when the network is too shallow, has very few neurons, is trained for too few epochs, or relies on overly restrictive assumptions.

Some underfitting is expected early in training, but it must be resolved before deployment.
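
To see underfitting in code, here is a rough sketch using synthetic data and an arbitrary tiny model of our own choosing (not part of the course code). A model with too little capacity stays inaccurate even on the data it was trained on:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Synthetic non-linear data (illustrative only)
X = np.random.uniform(-3, 3, size=(1000, 1))
y = np.sin(X) + 0.1 * np.random.randn(1000, 1)

# A single linear unit cannot represent a sine curve: high bias
underfit_model = Sequential([Dense(1)])
underfit_model.compile(optimizer="adam", loss="mse")

history = underfit_model.fit(X, y, epochs=50, verbose=0)
print("Final training loss:", history.history["loss"][-1])

Even after many epochs, the training loss stays well above the noise level, which is the signature of underfitting.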


What Is Overfitting?

Overfitting occurs when a model learns the training data too well — including noise and random fluctuations.

The model appears highly accurate during training, but performs poorly on new, unseen data.

This happens because the model memorizes instead of learning general patterns.

Overfitting is a direct consequence of high variance.


Why Overfitting Happens

Overfitting usually occurs when the model is very deep, has many parameters, is trained for too long, or lacks regularization.

Deep learning models are powerful, but that power must be controlled carefully.
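
To make "many parameters" concrete, here is a small sketch that counts the parameters of a shallow and a deep network. The input size of 20 features is an arbitrary assumption made only for this example:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

shallow = Sequential([Dense(8, activation="relu"), Dense(1)])
deep = Sequential([
    Dense(256, activation="relu"),
    Dense(256, activation="relu"),
    Dense(1)
])

# Build both models for a hypothetical input with 20 features
shallow.build(input_shape=(None, 20))
deep.build(input_shape=(None, 20))

print("Shallow model parameters:", shallow.count_params())  # a few hundred
print("Deep model parameters:", deep.count_params())        # tens of thousands

More parameters mean more capacity to memorize noise, which is why deep models must be controlled carefully.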


Real-World Analogy

Think of preparing for an exam.

If you only memorize past questions, you may score well when the same questions appear, but you will fail when the questions change.

That is overfitting.

If you study too little and understand nothing, you fail all questions.

That is underfitting.

True learning lies between these extremes.


Training vs Validation Performance

Overfitting and underfitting are best diagnosed by comparing training and validation performance.

When both training and validation loss are high, the model is underfitting.

When training loss is low but validation loss is high, the model is overfitting.

This gap between the two losses is a critical signal in deep learning practice.
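
Keras records both losses for us when we pass a validation split. The data, model, and hyperparameters below are arbitrary placeholders chosen only to illustrate the diagnostic pattern:

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Synthetic regression data (illustrative only)
X = np.random.randn(500, 10)
y = X.sum(axis=1, keepdims=True) + 0.5 * np.random.randn(500, 1)

model = Sequential([Dense(64, activation="relu"), Dense(1)])
model.compile(optimizer="adam", loss="mse")

# Hold out 20% of the data as a validation set
history = model.fit(X, y, validation_split=0.2, epochs=30, verbose=0)

train_loss = history.history["loss"][-1]
val_loss = history.history["val_loss"][-1]
print(f"train loss: {train_loss:.3f}  validation loss: {val_loss:.3f}")

# Both losses high                      -> likely underfitting
# Training low, validation much higher  -> likely overfitting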


How Model Capacity Affects Fit

Model capacity refers to the ability of a neural network to represent complex functions.

Low-capacity models tend to underfit.

High-capacity models tend to overfit unless properly controlled.

Choosing the right capacity is one of the most important design decisions.


Code Perspective

Even small architectural changes can shift a model from underfitting to overfitting.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Underfitting-prone model
model_small = Sequential([
    Dense(8, activation="relu"),
    Dense(1)
])

# Overfitting-prone model
model_large = Sequential([
    Dense(256, activation="relu"),
    Dense(256, activation="relu"),
    Dense(1)
])

Both models are valid, but their behavior during training will differ significantly.


Why This Lesson Is Critical

Most deep learning failures are not caused by bugs, but by poor control of overfitting and underfitting.

Understanding this lesson deeply is essential before learning regularization techniques.

In the lessons that follow, we will actively control these behaviors in practice.


Exercises

Exercise 1:
What is the key difference between overfitting and underfitting?

Underfitting fails to learn patterns, while overfitting memorizes noise.

Exercise 2:
Why do deep models overfit more easily?

Because they have many parameters and high learning capacity.

Quick Quiz

Q1. Which situation indicates overfitting?

Low training loss and high validation loss.

Q2. What usually helps reduce overfitting?

Regularization, early stopping, and simpler models.
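
As a brief preview of what is coming, here is a hedged sketch of early stopping in Keras. The settings are arbitrary, and the commented-out fit call assumes a model and data like those in the earlier sketches:

from tensorflow.keras.callbacks import EarlyStopping

# Stop training when validation loss has not improved for 5 epochs,
# and restore the weights from the best epoch seen so far
early_stop = EarlyStopping(monitor="val_loss", patience=5,
                           restore_best_weights=True)

# model.fit(X, y, validation_split=0.2, epochs=100, callbacks=[early_stop])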

In the next lesson, we will introduce regularization and learn concrete techniques to control overfitting in deep neural networks.