Generative AI Course
Word Embeddings
To understand modern Generative AI systems, you must first understand how machines represent meaning.
Raw text has no meaning to a computer. Before any reasoning, search, or generation can happen, text must be converted into numbers.
That numerical representation is called an embedding.
Why Word Embeddings Exist
Computers cannot compare words like humans do.
To a machine, the words “king” and “queen” are just strings unless we encode their relationships numerically.
Word embeddings solve this by mapping words into vectors that capture semantic meaning.
Thinking Before Coding
Ask yourself:
How can a machine know that “cat” is closer to “dog” than to “car”?
The answer is distance in vector space.
What Is a Word Embedding?
A word embedding is a dense numerical vector that represents a word’s meaning based on context.
Words that appear in similar contexts end up close together in embedding space.
From Text to Numbers
Before embeddings existed, text was often represented using one-hot encoding.
Why One-Hot Encoding Fails
words = ["cat", "dog", "car"]

# One-hot: each word gets its own dimension in the vocabulary
one_hot = {
    "cat": [1, 0, 0],
    "dog": [0, 1, 0],
    "car": [0, 0, 1],
}

print(one_hot["cat"])  # [1, 0, 0]
This representation has two major problems:
- No notion of similarity: every pair of distinct vectors is orthogonal
- Extremely high dimensionality: one dimension per vocabulary word
“Cat” and “dog” are just as far apart as “cat” and “car.”
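We can check this directly. A minimal sketch, reusing the `one_hot` vectors above: the Euclidean distance between any two distinct one-hot vectors is always the same, so the representation carries no similarity signal at all.

```python
import numpy as np

# One-hot vectors from above, as NumPy arrays so we can subtract them
one_hot = {
    "cat": np.array([1, 0, 0]),
    "dog": np.array([0, 1, 0]),
    "car": np.array([0, 0, 1]),
}

# Euclidean distance between every pair of distinct words is sqrt(2)
print(np.linalg.norm(one_hot["cat"] - one_hot["dog"]))  # 1.4142...
print(np.linalg.norm(one_hot["cat"] - one_hot["car"]))  # 1.4142...
```

No matter which pair we pick, the distance is identical, which is exactly the "no notion of similarity" problem.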
Dense Vector Representation
Embeddings solve this by using dense vectors.
Each dimension captures latent semantic information.
Embedding Example
# Dense 2-D embeddings (toy values for illustration, not learned)
embeddings = {
    "cat": [0.9, 0.1],
    "dog": [0.85, 0.15],
    "car": [0.1, 0.9],
}

print(embeddings["cat"])  # [0.9, 0.1]
print(embeddings["cat"])
Now similarity can be measured mathematically.
Measuring Similarity
Similarity between embeddings is usually computed using cosine similarity.
Cosine Similarity Example
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between the two vectors (1.0 = same direction)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # ≈ 0.998
print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # ≈ 0.220
The similarity score is higher for semantically closer words.
This numerical difference is what powers semantic search.
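A minimal sketch of that idea, reusing the toy embeddings and cosine similarity from above: semantic search is essentially "rank every stored item by its similarity to the query." The `semantic_search` helper below is invented for illustration, not a library function.

```python
import numpy as np

# Toy 2-D embeddings from earlier (illustrative values, not learned)
embeddings = {
    "cat": [0.9, 0.1],
    "dog": [0.85, 0.15],
    "car": [0.1, 0.9],
}

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def semantic_search(query, store):
    # Score every other word against the query, highest similarity first
    scores = {
        word: cosine_similarity(store[query], vec)
        for word, vec in store.items()
        if word != query
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(semantic_search("cat", embeddings))  # "dog" ranks above "car"
```

Real systems work the same way, just with learned embeddings, millions of items, and approximate nearest-neighbor indexes instead of a full sort.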
How Word Embeddings Are Learned
Word embeddings are learned during training by observing word co-occurrence.
Words that appear in similar contexts receive similar vectors.
Context-Based Learning Intuition
If two words often appear next to the same neighbors, their meanings are related.
This idea underpins models like Word2Vec and GloVe.
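As a rough sketch of the co-occurrence intuition (a toy three-sentence corpus and simple window counts, not Word2Vec or GloVe themselves): count how often each word appears near every other word. Rows of the resulting matrix already behave like crude embeddings, and "cat" and "dog" end up with similar rows because they share context words.

```python
from collections import defaultdict

# Toy corpus, invented for illustration
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the car drove on the road",
]

window = 2  # neighbors within this many positions count as "context"
cooc = defaultdict(lambda: defaultdict(int))

for sentence in corpus:
    tokens = sentence.split()
    for i, word in enumerate(tokens):
        # Count every neighbor inside the window around position i
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                cooc[word][tokens[j]] += 1

# "cat" and "dog" share context words such as "sat" and "on"
print(dict(cooc["cat"]))
print(dict(cooc["dog"]))
```

Word2Vec and GloVe go further, compressing this kind of co-occurrence information into low-dimensional dense vectors, but the underlying signal is the same.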
Why Word Embeddings Matter in GenAI
Word embeddings enable:
- Semantic search
- Clustering
- Recommendation systems
- Retrieval-augmented generation
Without embeddings, GenAI systems would fall back on brittle keyword matching.
Limitations of Word Embeddings
Single-word embeddings have constraints:
- No understanding of full sentence meaning
- Same embedding regardless of context
- Cannot capture word sense ambiguity
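The context blindness is easy to demonstrate. In this sketch (with an invented toy vector), a static embedding table is just a lookup, so "bank" receives the identical vector whether it means a riverbank or a financial institution.

```python
# Static word embeddings are a fixed lookup: one vector per word,
# regardless of the sentence the word appears in (toy vector for illustration)
embeddings = {"bank": [0.4, 0.6]}

sentence_1 = "she sat by the river bank"
sentence_2 = "he deposited cash at the bank"

# Both uses of "bank" map to exactly the same vector
vec_1 = embeddings["bank"]
vec_2 = embeddings["bank"]
print(vec_1 == vec_2)  # True
```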
These limitations motivate sentence and document embeddings, which you will learn about next.
Practice
What do word embeddings represent words as?
What property allows machines to compare word meaning?
Which metric is commonly used to compare embeddings?
Quick Quiz
Word embeddings are best described as:
Embeddings are learned primarily from:
Which application relies heavily on embeddings?
Recap: Word embeddings transform text into numerical meaning that machines can compare and reason about.
Next up: We move from words to full sentences — capturing meaning beyond individual tokens.