ML Lesson 41 – Loss Functions | Dataplexa

Loss Functions

In the previous lesson, we learned how regularization helps control model complexity and reduce overfitting.

In this lesson, we focus on another core concept that directly controls how a model learns: Loss Functions.

A loss function tells the model how wrong its predictions are. Without a loss function, a model has no direction for improvement.


What Is a Loss Function?

A loss function measures the difference between predicted values and actual values.

Each prediction produces a loss value.

The model’s goal during training is to minimize this loss.

The smaller the loss, the better the model is performing.
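As a minimal sketch (with made-up numbers), here is a loss computed for a single prediction using squared error:

```python
# Hypothetical example: squared-error loss for one prediction
actual = 250_000      # true value (e.g., a house price)
predicted = 240_000   # model's prediction

loss = (actual - predicted) ** 2  # squared difference
print(loss)  # 100000000
```

A perfect prediction would give a loss of 0; the further the prediction drifts from the actual value, the larger the loss grows.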


Why Loss Functions Matter

Loss functions act like a compass.

They guide the model toward better predictions by showing how far it is from the target.

Choosing the wrong loss function can lead to poor learning, even if the model architecture is correct.


Loss Functions and Our Dataset

Using the Dataplexa ML dataset, we predict whether a loan is approved.

This is a classification problem.

The loss function must penalize wrong approvals and wrong rejections appropriately.

That is why classification models use loss functions designed for probabilities, such as log loss and cross-entropy, rather than regression losses.


Mean Squared Error (Regression)

Mean Squared Error (MSE) is commonly used for regression tasks.

It calculates the average of squared differences between predicted and actual values.

Squaring the error penalizes large mistakes more strongly.

from sklearn.metrics import mean_squared_error

# Example regression targets and predictions
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

mse = mean_squared_error(y_true, y_pred)
print(mse)

MSE is sensitive to outliers, which can sometimes be a drawback.
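The outlier sensitivity is easy to demonstrate. In this sketch (hypothetical values), two prediction sets differ only in a single point, yet the one large miss dominates the score:

```python
from sklearn.metrics import mean_squared_error

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred_good = [2.9, 5.1, 2.4, 7.2]      # small errors everywhere
y_pred_outlier = [2.9, 5.1, 2.4, 17.0]  # one large miss on the last point

print(mean_squared_error(y_true, y_pred_good))     # small loss
print(mean_squared_error(y_true, y_pred_outlier))  # inflated by the single outlier
```

Because the last error (10.0) is squared to 100, it overwhelms the three small errors. When outliers are expected, alternatives such as mean absolute error are often considered.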


Log Loss (Classification)

Log Loss is widely used for binary classification problems.

Instead of just checking whether predictions are right or wrong, it evaluates prediction confidence.

Confident wrong predictions are penalized heavily.

from sklearn.metrics import log_loss

# Example binary labels and predicted probabilities for the positive class
y_true = [0, 1, 1, 0]
y_pred_prob = [0.1, 0.9, 0.8, 0.3]

loss = log_loss(y_true, y_pred_prob)
print(loss)

Lower log loss indicates better performance.
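The penalty for overconfidence can be seen directly from the formula. For a single positive example (true label 1), log loss is -log(p), where p is the predicted probability of the positive class:

```python
import math

# Log loss for a single positive example (y = 1) at different confidence levels
for p in (0.9, 0.6, 0.1):
    print(p, -math.log(p))
```

A confident correct prediction (p = 0.9) costs about 0.105, an uncertain one (p = 0.6) about 0.511, and a confident wrong one (p = 0.1) about 2.303, more than twenty times the first.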


Cross-Entropy Loss (Neural Networks)

Cross-entropy loss is the standard loss function for classification in neural networks.

It measures how well the predicted probability distribution matches the true distribution.

This loss function works well with softmax and sigmoid activations.
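The softmax-plus-cross-entropy pipeline can be sketched from scratch. This example uses hypothetical raw scores (logits) for a 3-class problem:

```python
import math

# Hypothetical raw scores for a 3-class problem; class 0 is the true class
logits = [2.0, 1.0, 0.1]
true_class = 0

# Softmax: exponentiate and normalize into a probability distribution
exps = [math.exp(z) for z in logits]
total = sum(exps)
probs = [e / total for e in exps]

# Cross-entropy: negative log probability assigned to the true class
loss = -math.log(probs[true_class])
print(probs, loss)
```

The closer the predicted distribution concentrates on the true class, the lower the cross-entropy, which is exactly the behavior described above.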


How Loss Guides Learning

During training, the model computes the loss on its predictions after each pass over the data (or each batch).

The loss is then used to update model weights in a direction that reduces error.

This process is repeated thousands of times until the model converges.
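This loop can be sketched with a one-parameter model and gradient descent. The example below (toy data, assuming the true relationship y = 2x) repeatedly moves the weight in the direction that reduces squared error:

```python
# Minimal gradient-descent sketch: fit y = w * x by minimizing squared error
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # generated from the true relationship y = 2x

w = 0.0    # initial weight
lr = 0.05  # learning rate

for _ in range(200):
    # Gradient of mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad  # step in the direction that reduces the loss

print(round(w, 3))  # converges toward 2.0
```

How the step is computed and scheduled is the job of the optimizer, covered in the next lesson.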


Real-World Interpretation

In loan approval systems, loss functions help balance risk.

Approving a risky loan may have a higher cost than rejecting a safe one.

Loss functions allow us to encode this importance into training.
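One common way to encode asymmetric costs is to weight examples during loss computation. This sketch uses hypothetical loan labels and an assumed 2x weight on defaults, via the `sample_weight` parameter of scikit-learn's `log_loss`:

```python
from sklearn.metrics import log_loss

# Hypothetical loan outcomes: 1 = repaid, 0 = defaulted
y_true = [1, 0, 1, 0]
y_prob = [0.8, 0.4, 0.7, 0.6]  # predicted probability of repayment

# Assumed cost model: mistakes on defaults count twice as much
weights = [1.0 if y == 1 else 2.0 for y in y_true]

plain = log_loss(y_true, y_prob)
weighted = log_loss(y_true, y_prob, sample_weight=weights)
print(plain, weighted)
```

Here the weighted loss is higher because the model is least confident on the default cases, and those errors now count double. The same idea carries into training via class weights or custom loss functions.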


Mini Practice

Compare MSE and Log Loss on a small sample of predictions.

Observe how each loss penalizes mistakes differently.


Exercises

Exercise 1:
Why is log loss preferred over accuracy during training?

Because it considers prediction confidence, not just correctness.

Exercise 2:
Why does MSE penalize large errors more?

Because errors are squared, increasing their impact.

Quick Quiz

Q1. Does minimizing loss always improve accuracy?

Usually yes, but not always due to class imbalance or noise.

In the next lesson, we study Optimizers, which determine how model weights are updated during training.