NLP Lesson 20 – FastText | Dataplexa

FastText – Subword-Based Word Embeddings

In the previous lesson, you learned about GloVe, which captures word meaning using global co-occurrence statistics.

In this lesson, we move one step further with FastText, a powerful embedding method that models the internal structure of words themselves.

FastText is especially useful when dealing with:

  • Rare words
  • Misspellings
  • Morphologically rich languages

By the end of this lesson, you will clearly understand why FastText improves over Word2Vec and GloVe.


Why FastText Was Introduced

Word2Vec and GloVe treat each word as a single, indivisible unit.

This causes problems when:

  • A word was never seen during training (OOV words)
  • A word appears very rarely
  • Words share similar roots but are treated differently

Example:

  • play
  • playing
  • played

Word2Vec treats them as unrelated tokens. FastText solves this problem.


What Is FastText?

FastText is an extension of Word2Vec developed by Facebook AI.

Instead of learning embeddings only for whole words, FastText also learns embeddings for character n-grams (short sequences of characters inside a word).

A word vector is built by combining its subword vectors.


Core Idea of FastText (Very Important)

FastText breaks words into character-level pieces.

Example word:

“playing”

Character n-grams (n=3):

  • pla
  • lay
  • ayi
  • yin
  • ing

(In practice, FastText also wraps each word in boundary markers < and >, so n-grams such as <pl and ng> are included as well. We omit them here for simplicity.)

FastText learns a vector for each n-gram, then sums these subword vectors (together with a vector for the word itself, when it is in the vocabulary) to create the word embedding.
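The steps above can be sketched in a few lines of Python. The helper name char_ngrams and the random toy vectors are purely illustrative (they are not gensim's internals), and boundary markers are omitted:

```python
import numpy as np

def char_ngrams(word, n=3):
    # Slide a window of n characters across the word
    return [word[i:i + n] for i in range(len(word) - n + 1)]

print(char_ngrams("playing"))  # ['pla', 'lay', 'ayi', 'yin', 'ing']

# Toy n-gram vectors; FastText learns these during training
rng = np.random.default_rng(0)
ngram_vectors = {g: rng.normal(size=4) for g in char_ngrams("playing")}

# The word vector is built by summing its subword vectors
word_vector = sum(ngram_vectors.values())
print(word_vector.shape)  # (4,)
```

Because "played" and "playing" share n-grams like "pla" and "lay", their summed vectors automatically end up related.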


Why Subwords Matter

Because subwords capture:

  • Prefixes
  • Suffixes
  • Root words

This allows FastText to:

  • Handle unseen words
  • Understand misspellings
  • Generalize better across languages

Word2Vec vs GloVe vs FastText

Aspect             | Word2Vec    | GloVe          | FastText
-------------------|-------------|----------------|-----------------------
Uses subwords      | No          | No             | Yes
Handles OOV words  | No          | No             | Yes
Semantic quality   | Good        | Very strong    | Very strong + robust
Best for           | General NLP | Semantic tasks | Morphology-heavy tasks

FastText in Real Life

FastText is widely used in:

  • Search engines
  • Chatbots
  • Spell correction
  • Low-resource languages
  • Text classification

It is extremely effective for noisy real-world text.


Simple Code Example: FastText Embeddings

We will use the gensim library.

Where to run this code:

  • Google Colab (recommended)
  • Jupyter Notebook

Python Example: Training FastText
from gensim.models import FastText

# Toy corpus: each sentence is a list of tokens
sentences = [
    ["i", "love", "nlp"],
    ["nlp", "is", "powerful"],
    ["i", "enjoy", "learning"]
]

model = FastText(
    sentences,
    vector_size=50,  # dimensionality of the word vectors
    window=3,        # context window size
    min_count=1,     # keep even words that appear only once
    sg=1             # use skip-gram (0 = CBOW)
)

print(model.wv["learning"])  # vector for a word seen in training
print(model.wv["learn"])     # works even though "learn" was never seen

Output Explanation:

  • FastText returns vectors even for unseen words
  • This is possible due to subword modeling

Why FastText Handles Unseen Words

Even if a word was never seen during training, its character n-grams may have been seen.

FastText combines those n-grams to create a meaningful vector.

This makes FastText extremely powerful in practice.


When Should You Use FastText?

  • Small datasets
  • Noisy user-generated text
  • Languages with rich morphology
  • Applications needing robustness

FastText is often a better default choice than Word2Vec for production systems.


Assignment / Homework

Theory:

  • Explain why FastText handles OOV words
  • Explain the role of character n-grams

Practical:

  • Train FastText on your own sentences
  • Test vectors for misspelled words
  • Compare results with Word2Vec

Where to practice:

  • Google Colab
  • Local Python (Anaconda + Jupyter)

Practice Questions

Q1. What problem does FastText mainly solve?

Handling rare and unseen (OOV) words using subwords.

Q2. Does FastText use character information?

Yes, it uses character n-grams.

Quick Quiz

Q1. Which model supports unseen words?

FastText.

Q2. FastText is an extension of which model?

Word2Vec.

Quick Recap

  • FastText uses character-level subwords
  • It handles rare and unseen words
  • More robust than Word2Vec and GloVe
  • Ideal for real-world noisy text

In the next lesson, we move to Text Classification Basics, where embeddings meet real NLP tasks.