FastText – Subword-Based Word Embeddings
In the previous lesson, you learned about GloVe, which captures word meaning from global co-occurrence statistics.
In this lesson, we move one step further with FastText, a powerful embedding method that models the internal structure of words themselves.
FastText is especially useful when dealing with:
- Rare words
- Misspellings
- Morphologically rich languages
By the end of this lesson, you will understand why FastText improves on Word2Vec and GloVe.
Why FastText Was Introduced
Word2Vec and GloVe treat each word as a single, indivisible unit.
This causes problems when:
- A word was never seen during training (OOV words)
- A word appears very rarely
- Words share similar roots but are treated differently
Example:
- play
- playing
- played
Word2Vec learns a completely separate vector for each, so their shared root is lost. FastText solves this problem.
What Is FastText?
FastText is an extension of Word2Vec developed by Facebook AI Research (FAIR).
Instead of learning embeddings only for words, FastText learns embeddings for character n-grams.
A word vector is built by combining its subword vectors.
Core Idea of FastText (Very Important)
FastText breaks words into character-level pieces.
Example word:
“playing”
Character n-grams (n=3):
- pla
- lay
- ayi
- yin
- ing
FastText learns a vector for each n-gram, then sums them to create the word embedding. (The actual implementation also wraps each word in boundary markers, so “playing” becomes “&lt;playing&gt;” and contributes n-grams such as “&lt;pl” and “ng&gt;”, and it extracts n-grams over a range of lengths, typically 3 to 6.)
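The decomposition above is easy to reproduce. Here is a minimal sketch of the sliding-window n-gram extraction from the example (without the boundary markers and the 3–6 length range that the real FastText implementation uses):

```python
def char_ngrams(word, n=3):
    # Slide a window of size n across the word, exactly as in the
    # "playing" example. The actual FastText implementation also adds
    # "<" and ">" boundary markers and uses several n-gram lengths.
    return [word[i:i + n] for i in range(len(word) - n + 1)]

print(char_ngrams("playing"))  # ['pla', 'lay', 'ayi', 'yin', 'ing']
```

Note that “play” yields `['pla', 'lay']`, which overlap with the n-grams of “playing”; this overlap is exactly why related word forms end up with related vectors.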
Why Subwords Matter
Because subwords capture:
- Prefixes
- Suffixes
- Root words
This allows FastText to:
- Handle unseen words
- Understand misspellings
- Generalize better across languages
Word2Vec vs GloVe vs FastText
| Aspect | Word2Vec | GloVe | FastText |
|---|---|---|---|
| Uses subwords | No | No | Yes |
| Handles OOV words | No | No | Yes |
| Semantic quality | Good | Very strong | Very strong + robust |
| Best for | General NLP | Semantic tasks | Morphology-heavy tasks |
FastText in Real Life
FastText is widely used in:
- Search engines
- Chatbots
- Spell correction
- Low-resource languages
- Text classification
It is extremely effective for noisy real-world text.
Simple Code Example: FastText Embeddings
We will use the gensim library.
Where to run this code:
- Google Colab (recommended)
- Jupyter Notebook
```python
from gensim.models import FastText

# Tiny toy corpus: a list of tokenized sentences
sentences = [
    ["i", "love", "nlp"],
    ["nlp", "is", "powerful"],
    ["i", "enjoy", "learning"]
]

# Train a skip-gram FastText model (sg=1) with 50-dimensional vectors
model = FastText(
    sentences,
    vector_size=50,
    window=3,
    min_count=1,
    sg=1
)

print(model.wv["learning"])  # word seen during training
print(model.wv["learn"])     # unseen word: vector built from its n-grams
```
Output Explanation:
- FastText returns vectors even for unseen words
- This is possible due to subword modeling
Why FastText Handles Unseen Words
Even if a word was never seen during training, its character n-grams may have been seen.
FastText combines those n-grams to create a meaningful vector.
This makes FastText extremely powerful in practice.
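The mechanism can be sketched in a few lines. Here the n-gram lookup table is a hypothetical toy dictionary (real FastText stores these vectors in a hashed bucket table), and the unseen word's vector is built by averaging whatever of its n-grams are known:

```python
import numpy as np

def oov_vector(word, ngram_vectors, n=3):
    # Build a vector for a possibly-unseen word by averaging the
    # vectors of its known character n-grams (a simplified sketch
    # of what FastText does internally).
    grams = [word[i:i + n] for i in range(len(word) - n + 1)]
    known = [ngram_vectors[g] for g in grams if g in ngram_vectors]
    if not known:
        # No known n-grams at all: fall back to a zero vector.
        return np.zeros(next(iter(ngram_vectors.values())).shape)
    return np.mean(known, axis=0)

# Toy table with 2-dimensional vectors (values made up for illustration)
table = {"pla": np.array([1.0, 0.0]), "lay": np.array([0.0, 1.0])}
print(oov_vector("play", table))  # [0.5 0.5]
```

Even though “play” was never “trained” here, its two trigrams were, so it still receives a meaningful vector.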
When Should You Use FastText?
- Small datasets
- Noisy user-generated text
- Languages with rich morphology
- Applications needing robustness
FastText is often a better default choice than Word2Vec for production systems.
Assignment / Homework
Theory:
- Explain why FastText handles OOV words
- Explain the role of character n-grams
Practical:
- Train FastText on your own sentences
- Test vectors for misspelled words
- Compare results with Word2Vec
Where to practice:
- Google Colab
- Local Python (Anaconda + Jupyter)
Practice Questions
Q1. What problem does FastText mainly solve?
Q2. Does FastText use character information?
Quick Quiz
Q1. Which model supports unseen words?
Q2. FastText is an extension of which model?
Quick Recap
- FastText uses character-level subwords
- It handles rare and unseen words
- More robust than Word2Vec and GloVe
- Ideal for real-world noisy text
In the next lesson, we move to Text Classification Basics, where embeddings meet real NLP tasks.