Linear Algebra (Basics) for Machine Learning
So far, we prepared and understood our data using visualization and statistics. Now we answer a very important question: how does a machine actually learn from this data?
The answer is linear algebra. Machine Learning models do not see rows and columns the way humans do. They see vectors, matrices, and mathematical transformations.
Why Linear Algebra Is Essential in ML
Every dataset is internally converted into numbers arranged in a matrix. Every model learns by performing operations on that matrix.
When a regression model predicts a price or a classifier predicts yes/no, it is performing matrix multiplication behind the scenes.
Understanding the basics removes fear and confusion when models become complex.
Our Dataset as a Matrix
We continue using the same dataset introduced in Lesson 4:
Dataplexa ML Housing & Customer Dataset
Each row represents one observation (one customer or one house). Each column represents one feature.
import pandas as pd
import numpy as np

# Load the dataset and separate the features (X) from the target column (y).
df = pd.read_csv("dataplexa_ml_housing_customer_dataset.csv")
X = df.drop("purchase_decision", axis=1)
y = df["purchase_decision"]
X.shape, y.shape
Here, X is a matrix of features and y is a vector of targets.
What Is a Vector in ML?
A vector is a list of numbers. In ML, a single data point is represented as a vector.
For example, one customer can be represented as:
# Extract the first observation as a NumPy vector.
sample_vector = X.iloc[0].values
sample_vector
This vector contains information like income, house size, and other features.
The model learns by assigning importance (weights) to each element of this vector.
Understanding Matrices
When we stack many vectors together, we get a matrix. That matrix represents the entire dataset.
Most ML algorithms operate on matrices, not individual values.
# The underlying NumPy matrix of the entire feature table.
matrix_X = X.values
matrix_X[:5]  # first five rows
This matrix is the mathematical form of our dataset.
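To see the "stacking" idea concretely, here is a minimal sketch with three made-up customer vectors (the numbers are purely illustrative, not from the dataset):

```python
import numpy as np

# Three "customers", each a vector of two features (e.g. income, house size).
# These values are invented purely for illustration.
v1 = np.array([45000.0, 120.0])
v2 = np.array([62000.0, 95.0])
v3 = np.array([51000.0, 150.0])

# Stacking the row vectors yields the dataset matrix:
# one row per observation, one column per feature.
X = np.vstack([v1, v2, v3])
X.shape  # (3, 2): 3 observations, 2 features
```

This is exactly the relationship between `X.iloc[0].values` (one vector) and `X.values` (the whole matrix) above.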
Dot Product — The Core of Prediction
At the heart of linear models is the dot product.
A prediction is made by multiplying input features with learned weights and summing them.
# Random weights stand in for the weights a trained model would learn.
weights = np.random.rand(X.shape[1])
prediction = np.dot(sample_vector, weights)
prediction
This simple operation is what drives linear regression and logistic regression.
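The same idea scales from one row to the whole dataset: a single matrix-vector product computes every prediction at once. A small sketch with synthetic stand-in data (random features and weights, not the real dataset):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: 5 observations, 3 features, plus one weight vector.
X = rng.random((5, 3))
weights = rng.random(3)

# One matrix-vector product predicts for every row at once...
all_predictions = X @ weights

# ...and agrees with taking each row's dot product separately.
row_by_row = np.array([np.dot(row, weights) for row in X])
np.allclose(all_predictions, row_by_row)  # True
```

This is why libraries favor matrix operations: one vectorized call replaces a loop over rows.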
Real-World Meaning
Think of weights as importance scores.
If house size has a higher weight than age, the model considers size more important than age when predicting price.
Linear algebra is how machines combine information logically.
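A tiny worked example makes the "importance score" intuition concrete. The features and weights below are made up for illustration only:

```python
import numpy as np

# Hypothetical features for one house: [size in square meters, age in years].
features = np.array([120.0, 30.0])

# Made-up weights: size pushes the price up strongly, age pulls it down slightly.
weights = np.array([2000.0, -500.0])

# prediction = 120 * 2000 + 30 * (-500)
price = np.dot(features, weights)
print(price)  # 225000.0
```

Because the size weight is much larger in magnitude, the size term dominates the sum: that is what "the model considers size more important" means mathematically.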
Matrix Operations in Training
During training, models repeatedly:
1. Compare predictions with actual values.
2. Measure the error.
3. Adjust the weights using gradients.
All these steps are matrix operations performed efficiently by computers.
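The three steps above can be sketched as a minimal gradient-descent loop on synthetic data. This is a simplified illustration of the idea, not the actual internals of any library:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data generated from "true" weights we hope to recover.
true_w = np.array([2.0, -1.0])
X = rng.random((100, 2))
y = X @ true_w

w = np.zeros(2)   # start knowing nothing
lr = 0.5          # learning rate (step size)

for _ in range(500):
    pred = X @ w                  # 1. predict (matrix-vector product)
    error = pred - y              # 2. measure the error
    grad = X.T @ error / len(y)   # 3. gradient of mean squared error...
    w -= lr * grad                #    ...used to adjust the weights

print(w)  # close to [2.0, -1.0]
```

Every line in the loop is a matrix or vector operation, which is why NumPy can run thousands of these updates quickly.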
Why You Don’t Manually Do the Math
You do not perform these matrix computations by hand for large datasets. Libraries like NumPy and scikit-learn do it for you.
But understanding what they do internally helps you debug, optimize, and trust your models.
Mini Practice
Look at a single row in the dataset.
Ask yourself:
How many features does this vector have?
How would changing one value affect the prediction?
Exercises
Exercise 1:
Why is a dataset represented as a matrix?
Exercise 2:
What does a weight represent in a model?
Exercise 3:
Why is dot product important?
Quick Quiz
Q1. Do ML models understand columns and rows like humans?
Q2. Is linear algebra only used in linear models?
In the next lesson, we move from vectors and matrices into Probability for Machine Learning, which explains uncertainty and decision-making.