GenAI Lesson 4 – Gen vs Disc | Dataplexa

Generative vs Discriminative Models

In this lesson, we’ll clear up one of the most common confusions in AI and GenAI.

Many engineers use models every day without realizing that they fall into two very different categories: discriminative and generative.

Understanding this difference changes how you design systems, choose models, and debug failures.

The Core Question Models Try to Answer

Every machine learning model is trying to answer a question.

The difference between discriminative and generative models lies in which question they answer.

Discriminative models ask:

“Given this input, what is the correct output?”

Generative models ask:

“Given this context, what could come next?”

That single shift changes everything.

Discriminative Models: Choosing Between Options

Discriminative models learn the boundary between classes.

They focus on separating data into predefined categories.

Typical tasks include:

  • Spam vs not spam
  • Fraud vs normal
  • Positive vs negative sentiment

The model does not care how the data was generated. It only cares about making the right decision.

A Discriminative Example (Classification)

Let’s look at a simple text classifier.


from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny toy dataset: two spam and two non-spam messages
texts = [
    "your account has been suspended",
    "meeting scheduled for tomorrow",
    "win money now",
    "project update attached"
]

labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

# Turn each text into a vector of word counts
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# Fit a linear classifier that separates the two classes
model = LogisticRegression()
model.fit(X, labels)

# Classify an unseen message using the vocabulary learned above
prediction = model.predict(
    vectorizer.transform(["free money offer"])
)

print(prediction)
  

Running this prints:

[1]

Here’s what the model is doing internally:

It learns a boundary that separates “spam-like” text from “normal” text.

It does not generate emails. It does not rewrite messages. It only chooses a class, in this case 1 (spam).

Limitations of Discriminative Models

Discriminative models are extremely powerful, but they are limited by design.

They cannot:

  • Create new content
  • Explain reasoning in natural language
  • Handle open-ended outputs

If the answer is not in the predefined label set, the model simply cannot express it.
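To make the label-set limitation concrete, the sketch below retrains the same toy spam classifier and inspects everything it is able to output. It assumes scikit-learn is installed, and the off-topic test sentence is made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = [
    "your account has been suspended",
    "meeting scheduled for tomorrow",
    "win money now",
    "project update attached",
]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

vectorizer = CountVectorizer()
model = LogisticRegression()
model.fit(vectorizer.fit_transform(texts), labels)

# The only outputs the model can ever produce are the labels seen in training.
print(model.classes_)

# Even for an off-topic input (illustrative sentence), the model can only
# spread probability across those same two labels.
probs = model.predict_proba(
    vectorizer.transform(["what is the capital of France"])
)
print(probs)
```

Whatever the input, the result is one row of two probabilities that sum to 1. There is no way for this model to answer “neither” or to produce a sentence.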

Generative Models: Modeling the Data Itself

Generative models take a different approach.

Instead of learning decision boundaries, they learn how the data itself is structured.

They try to learn the underlying probability distribution of the data.

This allows them to:

  • Generate new samples
  • Fill in missing pieces
  • Continue sequences

In language models, this means learning how text flows.

A Generative Example (Next-Token Prediction)

Let’s simulate the generative mindset with a simple example.


import random

context = ["generative", "models", "learn"]
possible_next_tokens = ["patterns", "distributions", "rules"]

next_token = random.choice(possible_next_tokens)
print(" ".join(context + [next_token]))
  

Sample output (it varies between runs because the choice is random):

generative models learn patterns

This toy program is not choosing between classes.

It is continuing a sequence.

Large language models do this with learned probabilities, not random choice.
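As a rough illustration of what “learned probabilities” can mean, here is a minimal bigram sketch that estimates P(next | previous token) by counting over a tiny corpus. The corpus is made up for this example, and using only one token of context is a deliberate simplification. Real language models condition on much longer contexts and use neural networks rather than counts.

```python
import random
from collections import Counter, defaultdict

# Tiny illustrative corpus (made up for this sketch).
corpus = [
    "generative models learn patterns",
    "generative models learn distributions",
    "generative models learn patterns from data",
]

# Count how often each token follows each preceding token.
follow_counts = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        follow_counts[prev][nxt] += 1

def next_token_probs(prev):
    """Estimate P(next | prev) from the bigram counts."""
    counts = follow_counts[prev]
    total = sum(counts.values())
    return {token: count / total for token, count in counts.items()}

probs = next_token_probs("learn")
print(probs)

# Sample the next token in proportion to its learned probability.
candidates, weights = zip(*probs.items())
next_token = random.choices(candidates, weights=weights, k=1)[0]
print("generative models learn", next_token)
```

In this corpus “patterns” follows “learn” twice and “distributions” once, so the sketch assigns them probabilities of about 0.67 and 0.33 instead of treating every option as equally likely.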

Probability Perspective (Why the Difference Matters)

From a mathematical point of view:

Discriminative models learn:

P(label | input)

Generative models learn:

P(input) or P(next | context)

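This is why generative models can create data, while discriminative models cannot. A model of the data distribution can be sampled to produce new data. A model of P(label | input) can only score labels for an input it has already been given.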
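The two generative forms above are connected by the chain rule of probability. As a short sketch, write the tokens of an input as $x_1, \dots, x_n$ (notation introduced here for illustration, not used elsewhere in the lesson):

```latex
P(x_1, \dots, x_n) = \prod_{t=1}^{n} P(x_t \mid x_1, \dots, x_{t-1})
```

Each factor on the right is a “next given context” probability, so a model that learns P(next | context) is implicitly modeling P(input) for whole sequences.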

Why Generative Models Feel “Creative”

Generative models don’t return a single fixed answer.

They produce a probability distribution over possible outputs.

Sampling from that distribution introduces variability.

That variability is what we experience as creativity.
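A small sketch of this idea compares always picking the most likely token with sampling from the distribution. The probabilities and the helper names `pick_most_likely` and `sample` are invented for illustration.

```python
import random

# Hypothetical next-token probabilities, made up for illustration.
next_token_probs = {"patterns": 0.6, "distributions": 0.3, "rules": 0.1}

def pick_most_likely(probs):
    """Always return the highest-probability token (no variability)."""
    return max(probs, key=probs.get)

def sample(probs):
    """Draw one token in proportion to its probability."""
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

greedy_outputs = [pick_most_likely(next_token_probs) for _ in range(5)]
sampled_outputs = [sample(next_token_probs) for _ in range(5)]

print("greedy: ", greedy_outputs)   # the same token five times
print("sampled:", sampled_outputs)  # can differ from run to run
```

The greedy list is identical on every run, while the sampled list changes. Both come from the same distribution, and only the sampling step adds the variety.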

Where Each Type Is Used in Real Systems

In real production systems, the two types are often used together.

Discriminative models are commonly used for:

  • Filtering
  • Ranking
  • Safety classification

Generative models are used for:

  • Content creation
  • Conversation
  • Summarization

A chatbot might generate a response and then pass it through discriminative filters.
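The generate-then-filter pattern can be sketched as follows. Everything here is a hypothetical stand-in: `generate_reply` takes the place of a language model call, `is_safe` takes the place of a trained safety classifier, and the replies and blocked terms are invented for illustration.

```python
import random

# Hypothetical stand-ins: a real system would call a language model and a
# trained safety classifier instead of these hard-coded values.
CANDIDATE_REPLIES = [
    "Sure, here is a summary of your meeting notes.",
    "Click here to win money now!",
]
BLOCKED_TERMS = {"win money"}
FALLBACK_REPLY = "Sorry, I can't help with that."

def generate_reply(prompt):
    """Generative step (placeholder): produce an open-ended reply."""
    return random.choice(CANDIDATE_REPLIES)

def is_safe(text):
    """Discriminative step (placeholder): return a yes/no decision."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

def respond(prompt):
    """Generate a reply, then return it only if the filter accepts it."""
    reply = generate_reply(prompt)
    return reply if is_safe(reply) else FALLBACK_REPLY

print(respond("Summarize my notes"))
```

The design point is the division of labor: the generative step has an open-ended output space, while the filter reduces each candidate to a single accept-or-reject decision that is easy to test.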

Engineering Trade-offs

Discriminative models:

  • Usually faster at inference
  • Usually cheaper to train and run
  • Easier to evaluate, since outputs can be scored against known labels

Generative models:

  • More flexible
  • Harder to control
  • Require system-level guardrails

Good GenAI systems combine both intelligently.

Practice

Which type of model chooses between predefined classes?



Which type of model can create new content?



What do generative models learn to enable generation?



Quick Quiz

Spam detection is an example of:





Text generation primarily relies on:





In real GenAI systems, which models are typically used?





Recap: Discriminative models choose between options, while generative models learn data distributions and create new outputs.

Next up: We’ll dive into the key concepts that power GenAI, including tokens, parameters, context, and sampling.