Loss Functions in Deep Learning
So far, we have learned how neural networks make predictions using forward propagation and how gradients flow backward using backpropagation.
Now comes a very important question: How does the network know whether it is doing well or badly?
The answer is the loss function.
What Is a Loss Function?
A loss function measures how far the model’s prediction is from the actual correct value.
In simple words:
Loss = Mistake made by the model
The goal of training is very simple: minimize this loss.
Real-World Example
Imagine you are throwing darts at a target.
If the dart hits the center, your mistake is zero. If it lands far away, your mistake is large.
A loss function plays the same role — it tells the model how bad the throw was.
Using Our Deep Learning Dataset
From this lesson onward, we will use a single dataset consistently throughout this Deep Learning module.
Dataplexa Deep Learning Master Dataset
This dataset is designed so we can practice regression, classification, CNNs, RNNs, and other advanced deep learning concepts later in the module.
Loading the Dataset
import pandas as pd
# Load the Dataplexa Deep Learning Master Dataset
df = pd.read_csv("dataplexa_deep_learning_master_dataset.csv")
# Preview the first five rows
df.head()
At this stage, we will only observe the data. No preprocessing yet.
Why Loss Functions Matter
Neural networks do not understand language, logic, or common sense.
They only understand numbers.
The loss function converts “how wrong the prediction is” into a numerical value that optimization algorithms can minimize.
Common Types of Loss Functions
Different problems require different loss functions.
Let us understand this conceptually first.
1. Mean Squared Error (MSE)
Used mostly for regression problems.
It squares the difference between each prediction and the actual value, then averages these squared errors. Squaring heavily penalizes large errors.
import numpy as np
# Actual values and the model's predictions
y_true = np.array([100, 150, 200])
y_pred = np.array([110, 140, 190])
# Mean of the squared differences
mse = np.mean((y_true - y_pred) ** 2)
mse
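With these numbers, the errors are -10, 10, and 10; each squared error is 100, so the MSE is 100.0.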
If predictions are far from actual values, MSE grows quickly.
2. Binary Cross-Entropy
Used for binary classification problems (such as Yes/No, Spam/Not Spam).
Instead of measuring a numeric distance, it measures how far the predicted probability is from the true label.
This loss becomes very high when the model is confident but wrong.
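To make this concrete, here is a minimal NumPy sketch of binary cross-entropy; the labels and predicted probabilities below are invented for illustration.
import numpy as np
# Actual labels (1 = spam, 0 = not spam) and predicted probabilities
y_true = np.array([1, 0, 1])
y_pred = np.array([0.9, 0.2, 0.6])
# Clip predictions away from 0 and 1 to avoid log(0)
y_pred = np.clip(y_pred, 1e-15, 1 - 1e-15)
# Average of -log(probability assigned to the correct class)
bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
bce
Try changing the first prediction from 0.9 to 0.01 (confident but wrong) and watch the loss jump.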
How Loss Connects to Training
During training:
1. The model predicts an output
2. The loss function calculates the error
3. Backpropagation adjusts weights
4. Loss reduces step by step
This cycle repeats thousands of times.
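To see the cycle end to end, here is a minimal sketch that fits a single weight with gradient descent on MSE; the toy data, learning rate, and number of steps are arbitrary choices for illustration.
import numpy as np
# Toy data: y = 2x, so the ideal weight is 2.0
x = np.array([1.0, 2.0, 3.0])
y_true = np.array([2.0, 4.0, 6.0])
w = 0.0    # start with a poor weight
lr = 0.01  # learning rate
for step in range(200):
    y_pred = w * x                              # 1. the model predicts an output
    loss = np.mean((y_true - y_pred) ** 2)      # 2. the loss function measures the error
    grad = np.mean(2 * (y_pred - y_true) * x)   # 3. gradient of the loss with respect to w
    w -= lr * grad                              # 4. the update nudges w to reduce the loss
w, loss    # w approaches 2.0 and the loss approaches 0
In a real network, backpropagation computes this gradient for millions of weights at once, but the loop is the same idea.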
Mini Practice
Think about this carefully:
If two models make the same number of wrong predictions, but one makes very large mistakes, which model should be penalized more?
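Once you have an answer in mind, you can check it numerically with MSE; the numbers below are invented for illustration.
import numpy as np
y_true = np.array([100, 100, 100])
pred_a = np.array([105, 95, 103])    # model A: three small mistakes
pred_b = np.array([150, 60, 130])    # model B: three large mistakes
# Same number of wrong predictions, very different MSE
np.mean((y_true - pred_a) ** 2), np.mean((y_true - pred_b) ** 2)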
Exercises
Exercise 1:
What is the main purpose of a loss function?
Exercise 2:
Why does Mean Squared Error penalize large errors more?
Quick Quiz
Q1. Loss functions guide which process?
Q2. Can one loss function be used for all problems?
In the next lesson, we will go deeper into gradient descent variants and see how loss values actually get minimized.