DL Lesson 36 – Classic CNNs | Dataplexa

Architecture of Classic CNNs

In the previous lesson, we saw how filters and feature maps allow Convolutional Neural Networks to detect patterns in images.

Now we move one level higher and study how these components are arranged together to form complete CNN architectures.

This lesson explains how CNNs are structured end-to-end and why this structure works so well.


What Is a CNN Architecture?

A CNN architecture defines how different layers are stacked to transform raw input images into final predictions.

It specifies:

• The order of layers
• The number of filters in each layer
• The type of pooling
• The transition to dense layers


Basic CNN Building Blocks

Most classic CNNs follow a repeating pattern:

Convolution → Activation → Pooling

This block is repeated multiple times to gradually extract higher-level features.
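To make the effect of one block concrete, here is a small sketch of how a single Convolution → Activation → Pooling block shrinks the spatial size. It assumes a 3×3 convolution with "valid" padding (no border padding) and 2×2 max pooling with stride 2; the function name is ours, purely for illustration.

```python
def conv_pool_block_shape(h, w, kernel=3, pool=2):
    """Spatial size after one 'valid' conv followed by pooling."""
    h, w = h - (kernel - 1), w - (kernel - 1)  # convolution shrinks each side by kernel-1
    return h // pool, w // pool                # pooling divides by the pool size

print(conv_pool_block_shape(224, 224))  # (111, 111)
print(conv_pool_block_shape(111, 111))  # (54, 54)
```

Stacking blocks therefore shrinks the spatial dimensions quickly, which is exactly what lets deeper layers see larger portions of the original image.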


Typical CNN Flow

A standard CNN processes data as follows:

1. Input Image
2. Convolution Layers (feature extraction)
3. Pooling Layers (spatial reduction)
4. Flattening
5. Fully Connected Layers
6. Output Layer

Each stage has a specific purpose in learning meaningful representations.


Why CNNs Are Deep

Depth allows CNNs to learn complex patterns.

Shallow networks can only detect simple patterns such as edges.

Deep networks combine multiple simple patterns to detect objects and concepts.

This hierarchical learning is the key strength of CNNs.


Classic CNN Design Philosophy

Classic architectures follow a few important principles:

• Small convolution filters
• Gradual increase in the number of filters
• Periodic spatial downsampling
• Dense layers at the end

These ideas became the foundation for modern deep learning models.


Example: Simple CNN Architecture

Below is a minimal CNN used for image classification.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    # Block 1: 32 filters of size 3x3, followed by 2x2 max pooling
    Conv2D(32, (3, 3), activation="relu", input_shape=(224, 224, 3)),
    MaxPooling2D((2, 2)),

    # Block 2: more filters (64) to capture more complex features
    Conv2D(64, (3, 3), activation="relu"),
    MaxPooling2D((2, 2)),

    # Classifier head: flatten the feature maps, then classify
    Flatten(),
    Dense(128, activation="relu"),
    Dense(10, activation="softmax")   # 10 output classes
])

This structure is simple, yet it captures the core pattern behind classic CNN architectures.
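It can be instructive to trace the shapes through this model by hand. The sketch below assumes the Keras defaults for these layers (stride-1 "valid" convolutions, and pooling stride equal to the pool size); shapes are written as (height, width, channels).

```python
# Shape trace for the example model above
shape = (224, 224, 3)                              # input image
shape = (shape[0] - 2, shape[1] - 2, 32)           # Conv2D(32, (3,3)) -> (222, 222, 32)
shape = (shape[0] // 2, shape[1] // 2, shape[2])   # MaxPooling2D     -> (111, 111, 32)
shape = (shape[0] - 2, shape[1] - 2, 64)           # Conv2D(64, (3,3)) -> (109, 109, 64)
shape = (shape[0] // 2, shape[1] // 2, shape[2])   # MaxPooling2D     -> (54, 54, 64)
flat = shape[0] * shape[1] * shape[2]              # Flatten          -> 186624 values
print(shape, flat)
```

Notice how the spatial size shrinks at each stage while the number of channels grows, mirroring the design philosophy described earlier.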


Why Pooling Is Used Between Convolutions

Pooling layers reduce spatial dimensions.

This helps by:

• Reducing computation
• Helping reduce overfitting by shrinking the representation
• Making features more robust to small shifts in position

Classic CNNs rely heavily on max pooling.
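A minimal NumPy sketch of 2×2 max pooling makes the operation concrete. This is an illustrative implementation, not the one Keras uses internally; it assumes an even-sized single-channel feature map.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a (H, W) feature map (H, W even)."""
    h, w = x.shape
    # Group pixels into 2x2 blocks, then take the maximum of each block
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1, 3, 2, 0],
                 [4, 2, 1, 1],
                 [0, 1, 5, 6],
                 [2, 2, 7, 8]])
print(max_pool_2x2(fmap))
# [[4 2]
#  [2 8]]
```

Each output value keeps only the strongest activation in its 2×2 neighborhood, which is why the result is insensitive to exactly where inside that neighborhood the feature appeared.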


Transition to Fully Connected Layers

After convolutional layers, the network has extracted high-level features.

Flattening converts feature maps into vectors.

Dense layers then perform final reasoning and classification.
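This transition is also where most of a classic CNN's parameters live. A quick back-of-the-envelope calculation for the example model (assuming the final feature maps are 54×54×64 under "valid" padding) shows why:

```python
# Flattening turns the final feature maps into one long vector
flat_len = 54 * 54 * 64               # 186624 values
# Dense(128) needs one weight per (input, unit) pair, plus one bias per unit
dense_params = flat_len * 128 + 128   # 23,888,000 parameters
print(flat_len, dense_params)
```

This is one reason dense layers appear only at the end: applying them to the full-resolution input would require vastly more parameters than applying them to the compact features the convolutional stages produce.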


Real-World Intuition

Think of CNNs like human vision:

Eyes detect edges first.

Brain combines edges into shapes.

Higher reasoning identifies objects.

CNN architectures mimic this process in a mathematical way.


Mini Practice

Think carefully:

Why do CNNs place dense layers only at the end instead of the beginning?


Exercises

Exercise 1:
What is the main role of convolution layers?

To extract meaningful features from input data.

Exercise 2:
Why are pooling layers important in CNNs?

They reduce spatial size and improve generalization.

Quick Quiz

Q1. Where do dense layers usually appear in CNNs?

At the end of the network.

Q2. Why do CNNs increase filters in deeper layers?

To learn more complex and abstract features.

In the next lesson, we will study LeNet, the first successful CNN architecture, and understand how it influenced modern deep learning models.