Architecture of Classic CNNs
In the previous lesson, we saw how filters and feature maps allow Convolutional Neural Networks to detect patterns in images.
Now we move one level higher and study how these components are arranged together to form complete CNN architectures.
This lesson explains how CNNs are structured end-to-end and why this structure works so well.
What Is a CNN Architecture?
A CNN architecture defines how different layers are stacked to transform raw input images into final predictions.
It specifies:
• The order of layers
• The number of filters
• The type of pooling
• The transition to dense layers
Basic CNN Building Blocks
Most classic CNNs follow a repeating pattern:
Convolution → Activation → Pooling
This block is repeated multiple times to gradually extract higher-level features.
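The effect of one Convolution → Activation → Pooling block on spatial size can be checked with simple arithmetic. Below is a minimal sketch; the helper functions and the 224x224 input size are illustrative assumptions (a 3x3 "valid" convolution with stride 1, followed by 2x2 max pooling with stride 2), not part of any particular library.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    # Output spatial size of a convolution ("valid" when padding=0).
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, window=2, stride=2):
    # Output spatial size of max pooling.
    return (size - window) // stride + 1

# One Convolution -> Activation -> Pooling block on a 224x224 input:
s = conv_out(224)   # 3x3 convolution: 224 -> 222
s = pool_out(s)     # 2x2 max pooling: 222 -> 111
print(s)            # 111
```

Each repetition of the block roughly halves the spatial size while the filters extract progressively higher-level features.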
Typical CNN Flow
A standard CNN processes data as follows:
1. Input Image
2. Convolution Layers (feature extraction)
3. Pooling Layers (spatial reduction)
4. Flattening
5. Fully Connected Layers
6. Output Layer
Each stage has a specific purpose in learning meaningful representations.
Why CNNs Are Deep
Depth allows CNNs to learn complex patterns.
Shallow networks only detect simple edges.
Deep networks combine multiple simple patterns to detect objects and concepts.
This hierarchical learning is the key strength of CNNs.
Classic CNN Design Philosophy
Classic architectures follow a few important principles:
• Small convolution filters
• Gradual increase in filters
• Periodic spatial downsampling
• Dense layers at the end
These ideas became the foundation for modern deep learning models.
Example: Simple CNN Architecture
Below is a minimal CNN used for image classification.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),  # feature extraction
    MaxPooling2D((2, 2)),                                              # spatial reduction
    Conv2D(64, (3, 3), activation="relu"),                             # more filters, deeper features
    MaxPooling2D((2, 2)),
    Flatten(),                                                         # feature maps -> vector
    Dense(128, activation="relu"),
    Dense(10, activation="softmax"),                                   # 10-class output
])
This structure is simple, yet it already captures the core design of classic CNN architectures.
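As a sanity check, the parameter count of each convolution layer in the snippet above can be computed by hand: each filter has kernel_height × kernel_width × input_channels weights plus one bias. A small sketch (the helper function is illustrative, not a library API):

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    # Weights per filter = kernel_h * kernel_w * in_channels, plus one bias each.
    return (kernel_h * kernel_w * in_channels + 1) * filters

print(conv_params(3, 3, 3, 32))   # first conv (RGB input):  896
print(conv_params(3, 3, 32, 64))  # second conv:             18496
```

These values match what Keras reports in model.summary() for the corresponding layers.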
Why Pooling Is Used Between Convolutions
Pooling layers reduce spatial dimensions.
This helps by:
• Reducing computation
• Reducing overfitting
• Making features more robust to small shifts in position
Classic CNNs rely heavily on max pooling.
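To make max pooling concrete, here is a minimal NumPy sketch on a hypothetical 4x4 feature map: each non-overlapping 2x2 window keeps only its strongest activation.

```python
import numpy as np

# Hypothetical 4x4 feature map (values chosen for illustration).
fmap = np.array([
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 5, 7],
    [1, 1, 3, 8],
])

# 2x2 max pooling with stride 2: reshape into 2x2 blocks,
# then take the maximum inside each block.
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 2]
               #  [2 8]]
```

The output is half the size in each spatial dimension, and shifting a strong activation by one pixel within its window would not change the result, which is the position-robustness mentioned above.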
Transition to Fully Connected Layers
After convolutional layers, the network has extracted high-level features.
Flattening converts feature maps into vectors.
Dense layers then perform final reasoning and classification.
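The length of the flattened vector follows directly from the shape of the last feature maps. For the example model earlier in this lesson (224x224 input, two 3x3 "valid" convolutions and two 2x2 poolings), the final maps are 54x54 with 64 channels; a quick arithmetic sketch:

```python
# Final feature-map shape of the example model: 54 x 54 spatial, 64 channels.
height, width, channels = 54, 54, 64

# Flatten simply unrolls the maps into one long vector.
flat_length = height * width * channels
print(flat_length)  # 186624
```

This vector is what the first Dense layer receives as input.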
Real-World Intuition
Think of CNNs like human vision:
Eyes detect edges first.
Brain combines edges into shapes.
Higher reasoning identifies objects.
CNN architectures mimic this process in a mathematical way.
Mini Practice
Think carefully:
Why do CNNs place dense layers only at the end instead of the beginning?
Exercises
Exercise 1:
What is the main role of convolution layers?
Exercise 2:
Why are pooling layers important in CNNs?
Quick Quiz
Q1. Where do dense layers usually appear in CNNs?
Q2. Why do CNNs increase filters in deeper layers?
In the next lesson, we will study LeNet, the first successful CNN architecture, and understand how it influenced modern deep learning models.