NLP Lesson 19 – GloVe | Dataplexa

GloVe – Global Vectors for Word Representation

In the previous lesson, you studied Word2Vec, which learns word embeddings from local context windows.

In this lesson, we explore GloVe, a powerful embedding method that combines global statistics with vector learning.

By the end of this lesson, you will clearly understand:

  • Why GloVe was introduced
  • How it differs from Word2Vec
  • How GloVe captures global meaning
  • When to use GloVe in NLP tasks

Why Do We Need GloVe?

Word2Vec learns word meaning based on local context windows. It looks at nearby words but does not directly use global word statistics.

Example problem:

Word2Vec may not fully capture how frequently two words co-occur across the entire corpus.

GloVe was introduced to solve this limitation by using global co-occurrence information.


What Is GloVe?

GloVe stands for Global Vectors.

It is a word embedding technique that learns vectors by analyzing how often words appear together across the entire dataset.

Key idea:

Word meaning comes from global word co-occurrence statistics.


Core Intuition Behind GloVe

GloVe builds a large matrix called the co-occurrence matrix.

Each cell represents:

  • How often word A appears near word B

Example:

  • “ice” appears often near “cold”
  • “fire” appears often near “hot”

GloVe learns vectors so that ratios of co-occurrence probabilities encode meaning.
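To make the co-occurrence idea concrete, here is a minimal sketch (not the GloVe implementation itself) that counts co-occurrences in a toy three-sentence corpus, assuming a symmetric window of size 1:

```python
from collections import defaultdict

# Toy corpus; a real co-occurrence matrix is built from the entire dataset
corpus = [
    "ice is cold",
    "fire is hot",
    "ice is not hot",
]

window = 1  # how many words on each side count as "near"

# cooc[(a, b)] = number of times word b appears within the window of word a
cooc = defaultdict(int)

for sentence in corpus:
    tokens = sentence.split()
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                cooc[(word, tokens[j])] += 1

print(cooc[("ice", "is")])   # how often "ice" appears next to "is"
print(cooc[("is", "cold")])  # how often "is" appears next to "cold"
```

With a symmetric window, the matrix is symmetric: the count for ("ice", "is") equals the count for ("is", "ice"). GloVe learns its vectors from exactly these kinds of counts, gathered over the whole corpus.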


Simple Example (Conceptual)

Consider the words:

  • king
  • queen
  • man
  • woman

GloVe captures global patterns such as:

king − man + woman ≈ queen

This happens because GloVe preserves global semantic relationships.
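The analogy above can be checked mechanically with vector arithmetic. The sketch below uses tiny hand-made 2-D vectors (purely illustrative, not real GloVe vectors) chosen so the relationship holds exactly:

```python
import numpy as np

# Hand-crafted toy vectors: dimension 0 ~ "royalty", dimension 1 ~ "female"
vectors = {
    "king":  np.array([2.0, 0.0]),
    "queen": np.array([2.0, 1.0]),
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([1.0, 1.0]),
}

# king - man + woman should land near queen
result = vectors["king"] - vectors["man"] + vectors["woman"]

# Find the word whose vector is closest to the result (Euclidean distance)
nearest = min(vectors, key=lambda w: np.linalg.norm(vectors[w] - result))
print(nearest)
```

Real GloVe vectors have hundreds of dimensions and the match is approximate rather than exact, but the mechanism is the same.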


How GloVe Works (High-Level Steps)

GloVe training follows these steps:

  • Scan the entire corpus
  • Build a word–word co-occurrence matrix
  • Apply a weighted least-squares objective
  • Learn word vectors that encode ratios of co-occurrences

Unlike Word2Vec, GloVe is not a prediction model. It is a matrix factorization–based approach.
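The weighted least-squares objective from the steps above is J = sum over pairs (i, j) of f(X_ij) * (w_i · w~_j + b_i + b~_j - log X_ij)^2, where X_ij is the co-occurrence count and f down-weights very frequent pairs. Here is a minimal sketch that evaluates this loss on random toy parameters (no training loop, just the objective):

```python
import numpy as np

def weighting(x, x_max=100.0, alpha=0.75):
    # GloVe's weighting function f: grows with the count, capped at 1
    return (x / x_max) ** alpha if x < x_max else 1.0

rng = np.random.default_rng(0)

V, d = 5, 3                       # toy vocabulary size and embedding dimension
X = rng.integers(1, 50, (V, V))   # toy co-occurrence counts (all nonzero here)

W = rng.normal(size=(V, d))       # word vectors
W_tilde = rng.normal(size=(V, d)) # context vectors
b = rng.normal(size=V)            # word biases
b_tilde = rng.normal(size=V)      # context biases

# Weighted least-squares loss over all co-occurring pairs (i, j)
J = 0.0
for i in range(V):
    for j in range(V):
        diff = W[i] @ W_tilde[j] + b[i] + b_tilde[j] - np.log(X[i, j])
        J += weighting(X[i, j]) * diff ** 2

print(J)
```

Training minimizes J with gradient descent, which pushes each dot product w_i · w~_j toward log X_ij; that is the sense in which GloVe factorizes the (log) co-occurrence matrix.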


Word2Vec vs GloVe (Key Difference)

Aspect                      Word2Vec         GloVe
Learning method             Predictive       Count-based + optimization
Context type                Local window     Global corpus
Uses co-occurrence matrix   No (implicit)    Yes (explicit)
Semantic relationships      Good             Very strong
Training speed              Fast             Slower (large matrix)

Why GloVe Produces Better Semantic Structure

Because GloVe uses global statistics, it captures:

  • Word similarity
  • Analogies
  • Long-range relationships

This makes GloVe especially useful for semantic-heavy NLP tasks.


Using Pretrained GloVe Embeddings

In practice, we usually do NOT train GloVe from scratch.

Instead, we use pretrained embeddings such as:

  • GloVe 50d, 100d, 200d, 300d
  • Trained on Wikipedia or Common Crawl

These embeddings already contain rich language knowledge.


Simple Code Example (Loading GloVe)

Let us see how to load pretrained GloVe vectors.

Where to run this code:

  • Google Colab (recommended)
  • Jupyter Notebook
Python Example: Loading GloVe Embeddings
import numpy as np

glove_path = "glove.6B.50d.txt"

# Map each word to its dense vector
embeddings = {}

with open(glove_path, "r", encoding="utf-8") as f:
    for line in f:
        values = line.split()
        word = values[0]                                  # first token is the word
        vector = np.asarray(values[1:], dtype="float32")  # the rest is the vector
        embeddings[word] = vector

# Show the first 10 dimensions of the vector for "king"
print(embeddings["king"][:10])

Output Explanation:

  • Each word maps to a dense numeric vector
  • The 50 dimensions together encode the word's meaning
  • Words with similar meanings have nearby vectors
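"Similar vectors" is usually measured with cosine similarity. Here is a small sketch of that measure, shown with toy 3-D vectors so it runs on its own; in practice you would pass rows from the `embeddings` dictionary loaded above (for example, `embeddings["king"]`):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors standing in for GloVe rows (illustrative values only)
king = np.array([0.8, 0.3, 0.1])
queen = np.array([0.7, 0.4, 0.1])
car = np.array([0.1, 0.9, 0.8])

print(cosine_similarity(king, queen))  # higher: related words
print(cosine_similarity(king, car))    # lower: unrelated words
```

Cosine similarity is preferred over raw distance here because embedding vectors differ in length, and direction, not magnitude, is what carries the semantic signal.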

Where GloVe Is Used

  • Text classification
  • Sentiment analysis
  • Named Entity Recognition
  • Machine translation
  • Semantic search

GloVe embeddings are widely used in both research and industry.


Assignment / Homework

Theory:

  • Explain how GloVe differs from Word2Vec
  • Explain what a co-occurrence matrix is

Practical:

  • Download GloVe embeddings from Stanford NLP
  • Load vectors for at least 5 words
  • Compare similarity between related words

Practice Questions

Q1. What does GloVe stand for?

Global Vectors for Word Representation.

Q2. What type of information does GloVe mainly use?

Global word co-occurrence statistics.

Quick Quiz

Q1. Which model is predictive?

Word2Vec.

Q2. Which model explicitly uses a co-occurrence matrix?

GloVe.

Quick Recap

  • GloVe uses global co-occurrence statistics
  • It combines count-based and embedding approaches
  • Produces strong semantic word vectors
  • Often used via pretrained embeddings

In the next lesson, we will study FastText, which improves embeddings by using subword information.