NLP Lesson 20 – FastText | Dataplexa

FastText – Subword-Based Word Embeddings

In the previous lesson, you learned about GloVe, which captures word meaning using global co-occurrence statistics.

In this lesson, we move one step further with FastText, a powerful embedding method that models the internal structure of words themselves.

FastText is especially useful when dealing with:

  • Rare words
  • Misspellings
  • Morphologically rich languages

By the end of this lesson, you will clearly understand why FastText improves over Word2Vec and GloVe.


Why FastText Was Introduced

Word2Vec and GloVe treat each word as a single, indivisible unit.

This causes problems when:

  • A word was never seen during training (OOV words)
  • A word appears very rarely
  • Words share similar roots but are treated differently

Example:

  • play
  • playing
  • played

Word2Vec treats them as unrelated tokens. FastText solves this problem.


What Is FastText?

FastText is an extension of Word2Vec developed by Facebook AI.

Instead of learning embeddings only for whole words, FastText also learns embeddings for character n-grams (short sequences of characters inside a word).

A word vector is built by combining its subword vectors.


Core Idea of FastText (Very Important)

FastText breaks words into character-level pieces.

Example word:

“playing”

Character n-grams (n=3):

  • pla
  • lay
  • ayi
  • yin
  • ing

(In practice, FastText also wraps each word in boundary markers < and >, so n-grams such as <pl and ng> are included as well. We omit them here for simplicity.)

FastText learns a vector for each n-gram, then sums these subword vectors (together with a vector for the word itself, when it is in the vocabulary) to create the word embedding.
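The steps above can be sketched in a few lines of Python. The helper name char_ngrams and the random toy vectors are purely illustrative (they are not gensim's internals), and boundary markers are omitted:

```python
import numpy as np

def char_ngrams(word, n=3):
    # Slide a window of n characters across the word
    return [word[i:i + n] for i in range(len(word) - n + 1)]

print(char_ngrams("playing"))  # ['pla', 'lay', 'ayi', 'yin', 'ing']

# Toy n-gram vectors; FastText learns these during training
rng = np.random.default_rng(0)
ngram_vectors = {g: rng.normal(size=4) for g in char_ngrams("playing")}

# The word vector is built by summing its subword vectors
word_vector = sum(ngram_vectors.values())
print(word_vector.shape)  # (4,)
```

Because "played" and "playing" share n-grams like "pla" and "lay", their summed vectors automatically end up related.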


Why Subwords Matter

Because subwords capture:

  • Prefixes
  • Suffixes
  • Root words

This allows FastText to:

  • Handle unseen words
  • Understand misspellings
  • Generalize better across languages

Word2Vec vs GloVe vs FastText

Aspect             | Word2Vec    | GloVe          | FastText
-------------------|-------------|----------------|-----------------------
Uses subwords      | No          | No             | Yes
Handles OOV words  | No          | No             | Yes
Semantic quality   | Good        | Very strong    | Very strong + robust
Best for           | General NLP | Semantic tasks | Morphology-heavy tasks

FastText in Real Life

FastText is widely used in:

  • Search engines
  • Chatbots
  • Spell correction
  • Low-resource languages
  • Text classification

It is extremely effective for noisy real-world text.


Simple Code Example: FastText Embeddings

We will use the gensim library.

Where to run this code:

  • Google Colab (recommended)
  • Jupyter Notebook

Python Example: Training FastText
from gensim.models import FastText

# Toy corpus: each sentence is a list of tokens
sentences = [
    ["i", "love", "nlp"],
    ["nlp", "is", "powerful"],
    ["i", "enjoy", "learning"]
]

model = FastText(
    sentences,
    vector_size=50,  # dimensionality of the word vectors
    window=3,        # context window size
    min_count=1,     # keep even words that appear only once
    sg=1             # use skip-gram (0 = CBOW)
)

print(model.wv["learning"])  # vector for a word seen in training
print(model.wv["learn"])     # works even though "learn" was never seen

Output Explanation:

  • FastText returns vectors even for unseen words
  • This is possible due to subword modeling

Why FastText Handles Unseen Words

Even if a word was never seen during training, its character n-grams may have been seen.

FastText combines those n-grams to create a meaningful vector.

This makes FastText extremely powerful in practice.


When Should You Use FastText?

  • Small datasets
  • Noisy user-generated text
  • Languages with rich morphology
  • Applications needing robustness

FastText is often a better default choice than Word2Vec for production systems.


Assignment / Homework

Theory:

  • Explain why FastText handles OOV words
  • Explain the role of character n-grams

Practical:

  • Train FastText on your own sentences
  • Test vectors for misspelled words
  • Compare results with Word2Vec

Where to practice:

  • Google Colab
  • Local Python (Anaconda + Jupyter)

Practice Questions

Q1. What problem does FastText mainly solve?

Handling rare and unseen (OOV) words using subwords.

Q2. Does FastText use character information?

Yes, it uses character n-grams.

Quick Quiz

Q1. Which model supports unseen words?

FastText.

Q2. FastText is an extension of which model?

Word2Vec.

Quick Recap

  • FastText uses character-level subwords
  • It handles rare and unseen words
  • More robust than Word2Vec and GloVe
  • Ideal for real-world noisy text

In the next lesson, we move to Text Classification Basics, where embeddings meet real NLP tasks.