DL Lesson 39 – VGG | Dataplexa

VGG Architecture

In the previous lesson, we studied AlexNet and saw how it revolutionized deep learning.

Now we move to another important milestone — VGG networks.

VGG showed that depth and simplicity can outperform complex designs.


What Is VGG?

VGG is a family of deep convolutional neural networks developed by the Visual Geometry Group (VGG) at the University of Oxford.

The most popular versions are:

VGG-16
VGG-19

The numbers indicate the total number of layers with weights.


The Core Idea Behind VGG

The main idea of VGG is very simple:

Use many small convolution filters instead of a few large ones

Instead of using large filters like the 11×11 kernels in AlexNet's first layer, VGG uses repeated 3×3 convolutions.

This makes the network deeper, more expressive, and easier to optimize.


Why Small Filters Work Better

Stacking multiple 3×3 convolutions achieves the same receptive field as a single large filter: two stacked 3×3 layers cover a 5×5 region, and three stacked 3×3 layers cover a 7×7 region.

But it has two major advantages:

More non-linearities (one ReLU after each convolution)
Fewer parameters overall

This improves learning while actually reducing the parameter count.
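A quick back-of-the-envelope check makes the savings concrete. This assumes C input and C output channels and ignores biases; the channel count of 256 below is just an illustrative choice.

# Parameters of a conv layer = kernel_h * kernel_w * in_channels * out_channels
C = 256  # illustrative channel count

one_7x7 = 7 * 7 * C * C          # one 7x7 convolution
three_3x3 = 3 * (3 * 3 * C * C)  # three stacked 3x3 convolutions (same 7x7 receptive field)

print(one_7x7)    # 3211264
print(three_3x3)  # 1769472 -> roughly 45% fewer parameters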


High-Level VGG Architecture

VGG follows a very uniform design:

Convolution → ReLU → Convolution → ReLU → Pooling (repeated multiple times)

At the end, fully connected layers perform classification.


VGG Block Structure

A typical VGG block looks like this:

Conv (3x3) → ReLU
Conv (3x3) → ReLU
Max Pooling

This block structure is repeated again and again, making the network deep and consistent.
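To make this repetition explicit in code, the block can be written as a small helper function and called once per stage. This is a sketch; the name vgg_block and its arguments are ours, not part of Keras.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Input

def vgg_block(model, filters, num_convs):
    # Stack num_convs 3x3 convolutions, then downsample with 2x2 max pooling
    for _ in range(num_convs):
        model.add(Conv2D(filters, (3, 3), activation="relu", padding="same"))
    model.add(MaxPooling2D((2, 2)))

model = Sequential([Input(shape=(224, 224, 3))])
vgg_block(model, 64, 2)   # two conv layers with 64 filters, then pool
vgg_block(model, 128, 2)  # two conv layers with 128 filters, then pool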


VGG in Code (Simplified)

Below is a simplified VGG-style model using Keras.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    # Block 1: two 3x3 convolutions, then halve spatial size 224 -> 112
    Conv2D(64, (3,3), activation="relu", padding="same", input_shape=(224,224,3)),
    Conv2D(64, (3,3), activation="relu", padding="same"),
    MaxPooling2D((2,2)),

    # Block 2: double the filters (64 -> 128), then halve again 112 -> 56
    Conv2D(128, (3,3), activation="relu", padding="same"),
    Conv2D(128, (3,3), activation="relu", padding="same"),
    MaxPooling2D((2,2)),

    # Classifier head: flatten, two large dense layers, 1000-way softmax
    Flatten(),
    Dense(4096, activation="relu"),
    Dense(4096, activation="relu"),
    Dense(1000, activation="softmax")
])

This code shows the clean and repetitive structure that defines VGG.
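To see where the parameters live, print a summary. One caveat about this simplified sketch: full VGG-16 applies five pooling stages, shrinking the feature map to 7×7×512 before the Flatten, while our two-block version stops at 56×56×128, so its first Dense layer is far larger than in the real network.

model.summary()  # prints each layer's output shape and parameter count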


VGG-16 vs VGG-19

The difference is depth:

VGG-16 has 16 weight layers (13 convolutional + 3 fully connected)
VGG-19 has 19 weight layers (16 convolutional + 3 fully connected)

Deeper networks can learn more complex patterns, but they also require more computation.
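Both variants ship with Keras, so you can build them and compare their sizes directly. Passing weights=None constructs the architecture without downloading pretrained weights.

from tensorflow.keras.applications import VGG16, VGG19

vgg16 = VGG16(weights=None)  # 13 conv + 3 dense weight layers
vgg19 = VGG19(weights=None)  # 16 conv + 3 dense weight layers

print(vgg16.count_params())  # about 138 million
print(vgg19.count_params())  # about 144 million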


Strengths of VGG

VGG is easy to understand, easy to implement, and very consistent.

Because of this, VGG is often used as a baseline model in research and practice.
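A common baseline setup reuses VGG-16's convolutional layers as a frozen, pretrained feature extractor and trains only a small new classifier on top. Here is a minimal sketch; the 10-class head is illustrative, not part of VGG.

from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pretrained convolutional weights

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(10, activation="softmax")  # illustrative 10-class task
])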


Limitations of VGG

Despite its success, VGG has drawbacks:

A very large number of parameters (about 138 million in VGG-16)
High memory usage
Slow training and inference

These limitations led to more efficient architectures later.


Real-World Understanding

Think of VGG like building knowledge step by step, using small and simple ideas, but repeating them deeply.

This disciplined approach leads to strong understanding.


Mini Practice

Think carefully:

Why do you think VGG avoids complex tricks and focuses on simplicity?


Exercises

Exercise 1:
What is the key idea behind VGG architecture?

Using many small 3×3 convolution filters stacked deeply.

Exercise 2:
Why does stacking small filters increase non-linearity?

Each convolution adds a ReLU activation, increasing expressive power.

Quick Quiz

Q1. What does VGG stand for?

Visual Geometry Group.

Q2. What is a major drawback of VGG?

High memory and computational cost.

In the next lesson, we will study ResNet and learn how residual connections solve the problem of very deep networks.