AI Course
Regularization
Regularization is a technique used to prevent machine learning models from becoming too complex. It helps control overfitting by penalizing large weights, which discourages the model from relying too heavily on any single feature.
When a model becomes too flexible, it starts memorizing training data instead of learning general patterns. Regularization adds a penalty that keeps the model simpler and more stable.
Why Regularization Is Needed
Complex models can fit training data extremely well but fail on new data. Regularization limits this behavior and improves generalization, as the short sketch after the list below illustrates.
- Prevents overfitting
- Improves generalization
- Controls model complexity
- Stabilizes learning
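To make this concrete, here is a minimal sketch using synthetic data invented for illustration (not part of the lesson's examples). It fits the same high-degree polynomial with and without an L2 penalty; the penalized model typically scores a little lower on the training data but noticeably higher on unseen data.

import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Illustrative noisy data: y follows a sine curve plus noise.
rng = np.random.RandomState(0)
X = rng.uniform(0, 1, size=(40, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.3, size=40)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A high-degree polynomial with no penalty can memorize the noise...
plain = make_pipeline(PolynomialFeatures(degree=12), LinearRegression())
plain.fit(X_train, y_train)
# ...while the same polynomial with an L2 penalty stays smoother.
penalized = make_pipeline(PolynomialFeatures(degree=12), Ridge(alpha=1.0))
penalized.fit(X_train, y_train)

print("no penalty  train/test R^2:", plain.score(X_train, y_train), plain.score(X_test, y_test))
print("with Ridge  train/test R^2:", penalized.score(X_train, y_train), penalized.score(X_test, y_test))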
Real-World Connection
Think of studying for an exam. If you memorize every question exactly, you may fail when questions change slightly. Regularization is like focusing on core concepts instead of memorization.
How Regularization Works
Regularization adds an extra term to the loss function that penalizes large model weights. This encourages the model to keep weights small and balanced.
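In symbols, the penalized loss is the original loss plus a penalty weight (often written lambda or alpha) times the size of the weights. A minimal sketch of that computation follows; the names penalized_loss and lam are illustrative, not a library function.

import numpy as np

# Hypothetical helper (illustrative names, not part of any library).
def penalized_loss(w, X, y, lam, penalty="l2"):
    base = np.mean((X @ w - y) ** 2)            # ordinary squared-error loss
    if penalty == "l1":
        return base + lam * np.sum(np.abs(w))   # L1 adds the sum of |w|
    return base + lam * np.sum(w ** 2)          # L2 adds the sum of squared w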
L1 Regularization (Lasso)
L1 Regularization adds the absolute value of weights to the loss function. It can reduce some weights to zero, effectively performing feature selection.
from sklearn.linear_model import Lasso
import numpy as np
# Two identical features: only one is needed to predict y.
X = np.array([[1, 1], [2, 2], [3, 3]])
y = np.array([1, 2, 3])
# alpha sets the strength of the L1 penalty.
model = Lasso(alpha=0.1)
model.fit(X, y)
print(model.coef_)
Because the two features are identical, Lasso keeps one and drives the other's weight to zero, effectively removing it from the model.
L2 Regularization (Ridge)
L2 Regularization adds the squared value of weights to the loss function. It shrinks weights but does not remove them completely.
from sklearn.linear_model import Ridge
# Reuses X and y from the Lasso example; alpha sets the L2 penalty strength.
model = Ridge(alpha=1.0)
model.fit(X, y)
print(model.coef_)
Both features are retained: Ridge splits the weight evenly between the two identical features instead of dropping one.
L1 vs L2 Regularization
- L1 removes irrelevant features (see the comparison sketch after this list)
- L2 reduces feature influence smoothly
- L1 produces sparse models
- L2 improves numerical stability
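To see the contrast concretely, here is a small sketch on invented data with one informative feature and one pure-noise feature. With these settings, Lasso typically drives the noise weight exactly to zero, while Ridge only shrinks it.

import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Illustrative data: only feature 0 actually matters.
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] + rng.normal(0, 0.1, size=100)

# Lasso typically zeroes the weight on the noise feature entirely...
print("Lasso coefficients:", Lasso(alpha=0.1).fit(X, y).coef_)
# ...while Ridge keeps both weights, shrinking them smoothly.
print("Ridge coefficients:", Ridge(alpha=1.0).fit(X, y).coef_)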
Elastic Net Regularization
Elastic Net combines both L1 and L2 regularization. It is useful when there are many correlated features.
from sklearn.linear_model import ElasticNet
# l1_ratio mixes the penalties: 1.0 is pure L1, 0.0 is pure L2.
model = ElasticNet(alpha=0.1, l1_ratio=0.5)
model.fit(X, y)
print(model.coef_)
Elastic Net balances feature selection and weight shrinkage.
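One reason Elastic Net suits correlated features is its grouping behavior: where Lasso arbitrarily keeps one of two identical features, the L2 component of Elastic Net shares the weight between them. A small sketch reusing the tiny dataset from the earlier examples:

import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

# Same tiny dataset as above: two identical features.
X = np.array([[1, 1], [2, 2], [3, 3]])
y = np.array([1, 2, 3])

# Lasso arbitrarily keeps one of the two identical features...
print("Lasso:      ", Lasso(alpha=0.1).fit(X, y).coef_)
# ...while Elastic Net's L2 component shares the weight between them.
print("Elastic Net:", ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y).coef_)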
When to Use Regularization
Regularization helps most in the situations below; a sketch for choosing the penalty strength follows the list.
- When the dataset is small
- When the model overfits
- When features are highly correlated
- When interpretability matters
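In practice the penalty strength alpha is usually chosen by cross-validation rather than set by hand. A minimal sketch using scikit-learn's RidgeCV on invented data:

import numpy as np
from sklearn.linear_model import RidgeCV

# Illustrative data: two useful features and one useless one.
rng = np.random.RandomState(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, 0.5, 0.0]) + rng.normal(0, 0.2, size=50)

# RidgeCV tries each candidate alpha using built-in cross-validation.
model = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0])
model.fit(X, y)
print("chosen alpha:", model.alpha_)
print("coefficients:", model.coef_)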
Practice Questions
Practice 1: What technique prevents overfitting by penalizing complexity?
Practice 2: Which regularization removes features completely?
Practice 3: Which regularization shrinks weights but keeps all features?
Quick Quiz
Quiz 1: What does regularization mainly help reduce?
Quiz 2: Which model uses L1 regularization?
Quiz 3: Which regularization combines L1 and L2?
Coming up next: Decision Trees — how models learn decisions step by step.