Generative AI Course
GAN Basics
So far, you have learned how generative models can capture structure with autoencoders and model probability with VAEs.
Those approaches are powerful, but they still rely on reconstruction-based learning.
Now we introduce a completely different idea — one that changed generative AI forever: Generative Adversarial Networks (GANs).
The Core Problem GANs Solve
Imagine you want to generate images that look truly real.
Not just blurry reconstructions, but outputs that can fool a human.
VAEs struggle here because:
- They optimize likelihood, not realism
- They tend to produce smooth, blurry outputs
GANs were introduced to answer one question:
Can a model learn by competing instead of reconstructing?
The GAN Idea (Intuition First)
A GAN consists of two neural networks:
- Generator – creates fake data
- Discriminator – judges real vs fake
They are trained together in a competitive game.
The generator tries to fool the discriminator. The discriminator tries to catch the generator.
Over time, both improve.
How Engineers Think About GANs
Engineers do not start with equations.
They think in terms of roles:
- The generator is like a counterfeiter
- The discriminator is like a detective
If the detective becomes strong, the counterfeiter must improve.
If the counterfeiter becomes strong, the detective must learn finer details.
This feedback loop is what drives GAN learning.
High-Level GAN Training Loop
Before touching code, understand the loop:
- Sample real data
- Generate fake data
- Train discriminator
- Train generator
- Repeat
Every GAN implementation follows this pattern, even if the architecture changes.
Why GANs Are Hard to Train
GANs do not minimize a single loss function.
Instead, they solve a minimax game.
This leads to common challenges:
- Mode collapse
- Unstable training
- Vanishing gradients
Understanding these problems early will help you debug GANs later.
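The minimax game mentioned above has a standard formulation, from the original GAN paper: the discriminator D maximizes the value function while the generator G minimizes it.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)]
  + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
```

The first term rewards D for labeling real data as real; the second rewards D for labeling generated data G(z) as fake, which is exactly what G is fighting against.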
A Minimal GAN Architecture
Let’s build a very simple GAN to understand structure, not to achieve perfect results.
Our goal here is clarity.
import torch
import torch.nn as nn
Defining the Generator
The generator converts random noise into structured output.
Noise is important because it gives the model freedom to generate diverse samples.
class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(100, 256),   # project 100-dim noise to a hidden layer
            nn.ReLU(),
            nn.Linear(256, 784),   # 784 = 28 x 28, a flattened image
            nn.Tanh()              # squash outputs into [-1, 1]
        )

    def forward(self, z):
        return self.model(z)
Here:
- Input is random noise (latent vector)
- Output mimics real data shape
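A quick sanity check makes this concrete. The sketch below rebuilds the same generator architecture compactly as a bare `Sequential`: noise goes in, a data-shaped batch comes out.

```python
import torch
import torch.nn as nn

# Same architecture as the Generator above, written as a bare Sequential
G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())

z = torch.randn(16, 100)   # a batch of 16 latent noise vectors
fake = G(z)

print(fake.shape)          # torch.Size([16, 784]) — e.g. 16 flattened 28x28 images
```

Because of the final Tanh, every value in `fake` lies in [-1, 1], so real training data is usually normalized to the same range.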
Defining the Discriminator
The discriminator acts like a binary classifier.
Its job is not to generate, but to judge authenticity.
class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(784, 256),   # take a flattened image as input
            nn.LeakyReLU(0.2),     # leaky slope keeps gradients alive for negative inputs
            nn.Linear(256, 1),     # single realness score
            nn.Sigmoid()           # squash the score into (0, 1)
        )

    def forward(self, x):
        return self.model(x)
Notice how:
- The discriminator outputs a probability
- It does not know about labels like cats or dogs
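You can verify both points with a short sketch, rebuilding the same discriminator compactly: whatever 784-dimensional input it sees, it returns one score per sample, strictly between 0 and 1.

```python
import torch
import torch.nn as nn

# Same architecture as the Discriminator above, written as a bare Sequential
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())

x = torch.randn(8, 784)    # any 784-dim inputs, real or fake
p = D(x)

print(p.shape)             # torch.Size([8, 1]) — one realness score per input
```

The Sigmoid guarantees each score falls in (0, 1), which is what lets us treat it as a probability of "real".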
Why Training Is Adversarial
The discriminator learns from both:
- Real data → label 1
- Fake data → label 0
The generator learns indirectly by trying to make fake data that gets labeled as real.
This indirect feedback is what makes GANs unique.
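One full adversarial update can be sketched as follows. This is a minimal illustration, not a complete training script: random tensors stand in for a real data batch, and binary cross-entropy is used, matching the standard GAN objective.

```python
import torch
import torch.nn as nn

# The lesson's two networks, written compactly as Sequentials
G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)

real = torch.randn(64, 784)            # stand-in for a batch of real data
fake = G(torch.randn(64, 100))         # generate fake data from noise

# Discriminator step: real -> 1, fake -> 0 (detach so G is not updated here)
opt_d.zero_grad()
loss_d = (bce(D(real), torch.ones(64, 1))
          + bce(D(fake.detach()), torch.zeros(64, 1)))
loss_d.backward()
opt_d.step()

# Generator step: indirect feedback — G improves only by pushing D(fake) toward 1
opt_g.zero_grad()
loss_g = bce(D(fake), torch.ones(64, 1))
loss_g.backward()
opt_g.step()
```

Note the two tricks that make the roles explicit: `fake.detach()` stops the discriminator's loss from updating the generator, and the generator's loss uses real labels (ones) for fake data, since fooling D is its only training signal.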
Where GANs Are Used in Practice
GANs power or influence:
- Image generation
- Super-resolution
- Style transfer
- Data augmentation
Even some modern diffusion pipelines borrow adversarial ideas from GAN training.
Common Beginner Mistakes
- Training the generator too fast relative to the discriminator
- Ignoring discriminator accuracy as a balance signal
- Using a poor noise distribution (standard Gaussian noise is the usual choice)
GAN debugging is about balance, not perfection.
Practice
Which network creates fake data?
Which network judges real vs fake?
What is the generator input?
Quick Quiz
GAN training is best described as:
How many networks does a GAN have?
GANs are especially good at generating:
Recap: GANs train two competing networks to generate highly realistic data through adversarial learning.
Next up: DCGAN — stabilizing GANs with convolutional architectures.