GenAI Lesson 20 – VAE | Dataplexa

Variational Autoencoders (VAE)

In the previous lesson, you learned how autoencoders compress data and reconstruct it through a bottleneck.

That works well for learning representations, but it still leaves one major limitation:

Autoencoders do not truly generate new data.

They reconstruct what they have already seen.

To move from compression to generation, we need a probabilistic approach. This is where Variational Autoencoders come in.

The Core Problem VAEs Solve

Imagine you want to:

  • Generate new images
  • Create new samples similar to training data
  • Explore a smooth latent space

A standard autoencoder cannot do this reliably.

Its latent space is unordered and discontinuous, which makes sampling unpredictable.

VAEs solve this by forcing the latent space to follow a structured probability distribution.

How Engineers Think About VAEs

Engineers do not start by saying:

“Let’s add probability because it sounds advanced.”

They ask:

How can we sample new points and still get meaningful outputs?

VAEs answer this by learning a distribution instead of a single point.

Key Difference: Autoencoder vs VAE

The most important mental shift is this:

  • Autoencoder → deterministic encoding
  • VAE → probabilistic encoding

Instead of mapping input → one latent vector, VAEs map input → a distribution defined by a mean and variance.
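This contrast can be sketched in a few lines. A minimal illustration, assuming PyTorch is available; the sizes 784 and 32 anticipate the model built later in this lesson, and the names `ae_encoder`, `vae_mu`, and `vae_logvar` are purely illustrative:

```python
import torch
import torch.nn as nn

# Deterministic: one input -> one latent vector
ae_encoder = nn.Linear(784, 32)

# Probabilistic: one input -> parameters of a distribution
vae_mu = nn.Linear(784, 32)
vae_logvar = nn.Linear(784, 32)

x = torch.randn(1, 784)
z_ae = ae_encoder(x)                    # a single point in latent space
mu, logvar = vae_mu(x), vae_logvar(x)   # a whole Gaussian, not a point
```

The autoencoder commits to one point per input; the VAE describes a region of latent space that can be sampled from.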

VAE Architecture Overview

A VAE still has an encoder and decoder, but the encoder now outputs:

  • Mean (μ)
  • Log variance (log σ²)

From these, the model samples a latent vector.

This sampling step is what enables generation.

Why Sampling Is Tricky

Sampling is a random operation with no well-defined gradient, which breaks backpropagation through the encoder.

VAEs solve this using the reparameterization trick.

This is a critical concept for GenAI, so pay close attention to the logic.

Implementing a Simple VAE

Before coding, define the goal clearly:

We want to encode inputs into a smooth latent distribution and generate new samples from it.


import torch
import torch.nn as nn
  

We will extend what you already know from autoencoders.

Defining the Encoder

Notice that the encoder now produces two outputs.


class VAE(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: 784-dim input (e.g. a flattened 28x28 image) -> 256 hidden units
        self.fc1 = nn.Linear(784, 256)
        # Two heads: the mean and the log variance of the latent distribution
        self.fc_mu = nn.Linear(256, 32)
        self.fc_logvar = nn.Linear(256, 32)
        # Decoder: latent vector back to the original 784 dimensions
        self.fc2 = nn.Linear(32, 256)
        self.fc3 = nn.Linear(256, 784)

Here:

  • fc_mu learns the mean
  • fc_logvar learns the log variance

Reparameterization Trick

Instead of sampling directly, we sample noise and transform it.


# Defined inside the VAE class
def reparameterize(self, mu, logvar):
    std = torch.exp(0.5 * logvar)   # convert log variance to standard deviation
    eps = torch.randn_like(std)     # sample noise from N(0, 1)
    return mu + eps * std           # z = mu + sigma * eps

What is happening internally:

  • Noise is sampled independently
  • Distribution structure is preserved
  • Gradients can still flow
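That last point is worth verifying. A small standalone check (the function body is copied from above and run outside the class purely for illustration):

```python
import torch

def reparameterize(mu, logvar):
    std = torch.exp(0.5 * logvar)
    eps = torch.randn_like(std)
    return mu + eps * std

mu = torch.zeros(4, requires_grad=True)
logvar = torch.zeros(4, requires_grad=True)

z = reparameterize(mu, logvar)
z.sum().backward()

# dz/dmu = 1 for every element, so gradients reach the encoder parameters
print(mu.grad)  # tensor([1., 1., 1., 1.])
```

Because the randomness lives entirely in `eps`, the sample is a deterministic, differentiable function of μ and log σ², and backpropagation works as usual.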

Forward Pass Logic

The forward pass connects all parts together.


# Defined inside the VAE class
def forward(self, x):
    h = torch.relu(self.fc1(x))
    mu = self.fc_mu(h)                   # mean of the latent distribution
    logvar = self.fc_logvar(h)           # log variance of the latent distribution
    z = self.reparameterize(mu, logvar)  # sample a latent vector
    h2 = torch.relu(self.fc2(z))
    return self.fc3(h2), mu, logvar      # reconstruction plus distribution parameters

At this stage, the model can both:

  • Reconstruct inputs
  • Generate new samples
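Generation only needs the decoder half: sample z from the prior N(0, I) and decode it. A sketch assembling the pieces above into one class (the `decode` helper is my addition to expose the decoder half; the model here is untrained, so its outputs are noise, but the mechanics are identical after training):

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc_mu = nn.Linear(256, 32)
        self.fc_logvar = nn.Linear(256, 32)
        self.fc2 = nn.Linear(32, 256)
        self.fc3 = nn.Linear(256, 784)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def decode(self, z):
        # Decoder half only: latent vector -> 784-dim output
        return self.fc3(torch.relu(self.fc2(z)))

    def forward(self, x):
        h = torch.relu(self.fc1(x))
        mu = self.fc_mu(h)
        logvar = self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

model = VAE()

# Reconstruction: encode a real input, then decode it
x = torch.randn(2, 784)
recon, mu, logvar = model(x)

# Generation: no input needed, just sample z from the prior and decode
z = torch.randn(2, 32)
new_samples = model.decode(z)
```

Note that generation never touches the encoder; this is exactly what a plain autoencoder cannot do reliably, because nothing constrains its latent space to be meaningful at arbitrary points.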

Training Objective

VAEs use a combined loss function:

  • Reconstruction loss
  • KL divergence

KL divergence forces the latent space to match a normal distribution.

This is what enables smooth interpolation and sampling.
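Both terms can be written in a few lines. A minimal sketch, assuming the model's forward pass returns (reconstruction, mu, logvar) as above; MSE is used for the reconstruction term to match the raw decoder output (binary cross-entropy with a sigmoid output layer is another common choice), and the KL term uses the standard closed form for KL(N(μ, σ²) ‖ N(0, 1)):

```python
import torch
import torch.nn.functional as F

def vae_loss(recon, x, mu, logvar):
    # Reconstruction term: how well the decoder reproduces the input
    recon_loss = F.mse_loss(recon, x, reduction="sum")
    # KL term: closed form for KL(N(mu, sigma^2) || N(0, 1)), summed over dimensions
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl

# Sanity check: when mu = 0 and logvar = 0, the latent distribution
# is exactly N(0, 1), so the KL term vanishes
mu = torch.zeros(1, 32)
logvar = torch.zeros(1, 32)
kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
```

The KL term pulls every encoded distribution toward N(0, 1), which is why sampling z from that same prior at generation time lands in a region the decoder understands.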

Why VAEs Matter in GenAI

VAEs introduce three critical GenAI ideas:

  • Latent space structure
  • Probabilistic generation
  • Controlled sampling

These ideas directly appear in:

  • Diffusion models
  • Image generation pipelines
  • Modern generative architectures

Practice

What does a VAE encoder output?



Which trick enables backpropagation?



Which term enforces latent regularization?



Quick Quiz

VAEs differ from autoencoders by being:





VAEs organize which space?





Main advantage of VAEs is:





Recap: Variational Autoencoders introduce probabilistic latent spaces that enable true data generation.

Next up: GAN Basics — competing networks and adversarial learning.