GenAI Lesson 11 – Word Embeddings | Dataplexa

Word Embeddings

To understand modern Generative AI systems, you must first understand how machines represent meaning.

Raw text has no meaning to a computer. Before any reasoning, search, or generation can happen, text must be converted into numbers.

That numerical representation is called an embedding.

Why Word Embeddings Exist

Computers cannot compare words like humans do.

To a machine, the words “king” and “queen” are just strings unless we encode their relationships numerically.

Word embeddings solve this by mapping words into vectors that capture semantic meaning.

Thinking Before Coding

Ask yourself:

How can a machine know that “cat” is closer to “dog” than to “car”?

The answer is distance in vector space.
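That idea can be sketched with plain Euclidean distance between hand-picked 2-D vectors (the numbers here are illustrative, not learned embeddings):

```python
import math

# Illustrative 2-D vectors, chosen by hand for this sketch.
cat = (0.9, 0.1)
dog = (0.85, 0.15)
car = (0.1, 0.9)

def euclidean(a, b):
    # Straight-line distance between two points in vector space.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(euclidean(cat, dog))  # small: "cat" is near "dog"
print(euclidean(cat, car))  # large: "cat" is far from "car"
```

Because "cat" and "dog" were placed close together, the machine can now "know" they are related simply by comparing distances.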

What Is a Word Embedding?

A word embedding is a dense numerical vector that represents a word’s meaning based on context.

Words that appear in similar contexts end up close together in embedding space.

From Text to Numbers

Before embeddings existed, text was often represented using one-hot encoding.

Why One-Hot Encoding Fails


# Each word gets its own dimension, so vector length equals vocabulary size.
words = ["cat", "dog", "car"]
one_hot = {
    "cat": [1, 0, 0],
    "dog": [0, 1, 0],
    "car": [0, 0, 1]
}

print(one_hot["cat"])
  

This prints:

[1, 0, 0]

This representation has two major problems:

  • No notion of similarity
  • Extremely high dimensionality (one dimension per vocabulary word)

“Cat” and “dog” are just as far apart as “cat” and “car.”
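This failure is easy to verify: every pair of distinct one-hot vectors is orthogonal, so their dot product is always zero and no similarity signal exists.

```python
import numpy as np

# One-hot vectors are mutually orthogonal: every pair has dot product 0,
# so no pair is "more similar" than any other.
one_hot = {"cat": [1, 0, 0], "dog": [0, 1, 0], "car": [0, 0, 1]}

print(np.dot(one_hot["cat"], one_hot["dog"]))  # 0
print(np.dot(one_hot["cat"], one_hot["car"]))  # 0
```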

Dense Vector Representation

Embeddings solve both problems by using dense, low-dimensional vectors.

Each dimension captures some latent semantic property rather than identifying a single word.

Embedding Example


# Dense 2-D vectors: semantically similar words get nearby vectors.
embeddings = {
    "cat":  [0.9, 0.1],
    "dog":  [0.85, 0.15],
    "car":  [0.1, 0.9]
}

print(embeddings["cat"])
  

This prints:

[0.9, 0.1]

Now similarity can be measured mathematically.

Measuring Similarity

Similarity between embeddings is usually computed using cosine similarity.

Cosine Similarity Example


import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between the two vectors: 1.0 means identical direction.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))
print(cosine_similarity(embeddings["cat"], embeddings["car"]))
  

This prints approximately:

0.998
0.220

The similarity score is higher for semantically closer words.

This numerical difference is what powers semantic search.
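As a minimal sketch of that idea, semantic search over this toy vocabulary just ranks words by cosine similarity to a query vector (`semantic_search` here is an illustrative helper, not a library function):

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy vocabulary with the 2-D embeddings from the lesson.
embeddings = {
    "cat": [0.9, 0.1],
    "dog": [0.85, 0.15],
    "car": [0.1, 0.9],
}

def semantic_search(query_vec, embeddings, top_k=2):
    """Rank vocabulary words by cosine similarity to the query vector."""
    scored = [(word, cosine_similarity(query_vec, vec))
              for word, vec in embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# A query vector pointing toward the "animal" region retrieves
# "cat" and "dog" ahead of "car".
print(semantic_search([0.8, 0.2], embeddings))
```

Real systems do exactly this, only with learned embeddings of hundreds of dimensions and millions of documents instead of three words.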

How Word Embeddings Are Learned

Word embeddings are learned during training by observing word co-occurrence across large text corpora.

Words that appear in similar contexts receive similar vectors.

Context-Based Learning Intuition

If two words often appear next to the same neighbors, their meanings are related.

This idea underpins models like Word2Vec and GloVe.
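The intuition can be made concrete with a tiny co-occurrence count over a toy corpus (a simplified sketch; models like Word2Vec and GloVe learn vectors from billions of such contexts rather than raw counts):

```python
from collections import defaultdict

# Toy corpus: "cat" and "dog" appear with the same neighbors; "car" does not.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the car drove on the road",
]

def cooccurrence_counts(sentences, window=2):
    """Count how often each word appears within `window` words of another."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
                if i != j:
                    counts[word][tokens[j]] += 1
    return counts

counts = cooccurrence_counts(corpus)
# "cat" and "dog" share the neighbor profile {"the", "sat", "on"},
# which is exactly the signal embedding models exploit.
print(dict(counts["cat"]))
print(dict(counts["dog"]))
```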

Why Word Embeddings Matter in GenAI

Word embeddings enable:

  • Semantic search
  • Clustering
  • Recommendation systems
  • Retrieval-augmented generation

Without embeddings, GenAI would be keyword-based and brittle.

Limitations of Word Embeddings

Single-word embeddings have constraints:

  • No understanding of full sentence meaning
  • Same embedding regardless of context
  • Cannot capture word sense ambiguity
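The second limitation is easy to see: a static embedding table returns the same vector for a word no matter which sense is meant (the vector below is made up for illustration):

```python
# Static word embeddings assign exactly one vector per word.
# Hypothetical 2-D vector for "bank"; a real model would learn it.
static_embeddings = {"bank": [0.4, 0.6]}

sentence_1 = "she sat on the river bank"
sentence_2 = "she deposited money at the bank"

# Both contexts retrieve the identical vector, so the geographical
# and financial senses of "bank" cannot be distinguished.
vec_1 = static_embeddings["bank"]
vec_2 = static_embeddings["bank"]
print(vec_1 == vec_2)  # True
```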

These limitations lead to sentence and document embeddings, which you will learn next.

Practice

What do word embeddings represent words as?

What property allows machines to compare word meaning?

Which metric is commonly used to compare embeddings?

Quick Quiz

Word embeddings are best described as:

Embeddings are learned primarily from:

Which application relies heavily on embeddings?

Recap: Word embeddings transform text into numerical meaning that machines can compare and reason about.

Next up: We move from words to full sentences — capturing meaning beyond individual tokens.