AI Course
Decision Trees
Decision Trees are supervised machine learning models that make decisions in a way that closely resembles human thinking. Instead of complex mathematical formulas, they use simple rules and conditions to arrive at an answer.
Because of this structure, decision trees are one of the easiest machine learning models to understand and explain.
Why Decision Trees Are Important
Many real-world decisions are made step by step. Decision trees follow the same approach, making them highly interpretable and practical.
Examples include:
- Loan approval systems
- Medical diagnosis
- Customer churn prediction
- Fraud detection
What Is a Decision Tree?
A decision tree is a flowchart-like structure where data is split at each step based on a condition.
- Root Node: Starting point of the tree
- Decision Nodes: Nodes where conditions are applied
- Leaf Nodes: Final predictions
Each path from root to leaf represents a decision rule.
Real-World Connection
Consider how a bank evaluates a loan application:
- Is the credit score above a threshold?
- Is the income stable?
- Does the applicant have existing loans?
Each answer leads to another question until a final decision is reached. This is exactly how a decision tree operates.
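This question-by-question flow can be sketched as a chain of if/else checks. The thresholds and rules below are made up for illustration; they are not real bank policy:

```python
def loan_decision(credit_score, stable_income, has_existing_loans):
    """Toy loan evaluation written as one path through a decision tree."""
    if credit_score < 650:          # root node: credit score check
        return "Reject"
    if not stable_income:           # decision node: income stability
        return "Reject"
    if has_existing_loans:          # decision node: existing debt
        return "Review manually"
    return "Approve"                # leaf node: final prediction

print(loan_decision(720, True, False))  # Approve
```

Each if statement plays the role of a decision node, and each return statement is a leaf.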
How Decision Trees Split Data
Decision trees split data using metrics that measure how well a split separates classes.
- Gini Impurity
- Entropy
- Information Gain
The goal is to create splits that make each group as pure as possible.
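As a minimal sketch of these metrics, Gini impurity and entropy measure how mixed a group of labels is, and information gain measures how much a split reduces that mixing:

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini impurity: 1 - sum(p_i^2). 0 means the group is pure."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Entropy: -sum(p_i * log2(p_i)). 0 means the group is pure."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy of the parent minus the weighted entropy of the children."""
    n = len(parent)
    return (entropy(parent)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

print(gini([0, 0, 1, 1]))     # 0.5  (evenly mixed group)
print(gini([1, 1, 1]))        # 0.0  (pure group)
print(entropy([0, 0, 1, 1]))  # 1.0  (maximally mixed for two classes)
```

A split that separates [0, 0, 1, 1] into [0, 0] and [1, 1] has an information gain of 1.0, the best possible here, which is exactly the kind of split the tree searches for.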
Simple Decision Tree Example
The following example demonstrates how to train a decision tree classifier using Python.
from sklearn.tree import DecisionTreeClassifier

# Sample data: [age, income] for each applicant
X = [[25, 50000], [30, 60000], [45, 80000], [35, 65000], [50, 90000]]
y = [0, 0, 1, 0, 1]  # 0 = Reject, 1 = Approve

# Create the model (random_state fixed for reproducible splits)
model = DecisionTreeClassifier(random_state=0)

# Train the model on the labeled examples
model.fit(X, y)

# Predict the class for a new applicant: age 40, income 70000
result = model.predict([[40, 70000]])
print(result)  # [0]
With this toy dataset, the model predicts rejection ([0]) for the new applicant: the learned split places age 40 and income 70,000 on the same side as the rejected training examples.
How the Tree Makes This Decision
The model checks one condition at a time. For example, it may first check income, then age, and finally reach a leaf node with a decision.
Unlike SVM, which relies on distances to a decision boundary, or logistic regression, which models probabilities, decision trees make predictions through simple threshold comparisons.
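You can inspect the rules a trained tree learned with scikit-learn's export_text helper. This retrains the toy model from above; the feature names passed in are just labels for readability:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Same toy data as before: [age, income] per applicant
X = [[25, 50000], [30, 60000], [45, 80000], [35, 65000], [50, 90000]]
y = [0, 0, 1, 0, 1]  # 0 = Reject, 1 = Approve

model = DecisionTreeClassifier(random_state=0)
model.fit(X, y)

# Print the learned decision rules as an indented text diagram
print(export_text(model, feature_names=["age", "income"]))
```

The printed rules show exactly which threshold the root node tests and which class each leaf predicts, which is the interpretability advantage discussed above.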
Advantages of Decision Trees
- Easy to understand and explain
- Can handle both numerical and categorical data
- No need for feature scaling
Limitations of Decision Trees
- Prone to overfitting
- Unstable with small data changes
- Lower accuracy compared to ensemble methods
Practice Questions
Practice 1: What do decision trees use to make predictions?
Practice 2: Where does a decision tree give the final output?
Practice 3: Name one metric used to split data.
Quick Quiz
Quiz 1: What is the starting point of a decision tree?
Quiz 2: What is a major drawback of decision trees?
Quiz 3: What is one major strength of decision trees?
Coming up next: Random Forest — combining multiple decision trees to build stronger models.