Part-of-Speech (POS) Tagging
When humans read a sentence, we naturally understand which words are nouns, verbs, adjectives, or prepositions.
Computers do not have this natural understanding. POS tagging helps machines identify the grammatical role of each word in a sentence.
In this lesson, you will learn what POS tagging is, why it is important in NLP pipelines, how it works internally, and how to perform POS tagging using Python.
What Is Part-of-Speech (POS) Tagging?
Part-of-Speech tagging is the process of assigning a grammatical label, such as noun, verb, or adjective, to each word in a sentence.
Each label represents the role the word plays in the sentence.
Example:
Sentence: “NLP makes machines understand language”
| Word | POS Tag | Role in Sentence |
|---|---|---|
| NLP | Noun | Subject |
| makes | Verb | Action |
| machines | Noun | Object |
| understand | Verb | Action |
| language | Noun | Object |
Why POS Tagging Is Important in NLP
POS tagging adds syntactic understanding to raw text.
It helps NLP systems:
- Understand sentence structure
- Improve text parsing
- Support Named Entity Recognition
- Improve machine translation
- Improve question answering systems
Without POS tags, a system sees only a sequence of words and has no information about the grammatical role each one plays.
Common POS Tags You Must Know
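Different libraries use different tag sets, but the basic idea remains the same. The tags below come from the Penn Treebank tag set, which NLTK's default English tagger uses.
| POS Tag | Description | Example |
|---|---|---|
| NN | Noun (singular or mass) | book, data |
| VB | Verb (base form) | run, learn |
| JJ | Adjective | good, powerful |
| RB | Adverb | quickly |
| PRP | Personal pronoun | he, she, it |
| IN | Preposition or subordinating conjunction | in, on, at |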
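The tags in this table follow the Penn Treebank convention, where related tags share a prefix: NN, NNS, NNP, and NNPS are all nouns, and VB, VBD, VBG, VBN, VBP, and VBZ are all verbs. A minimal sketch of collapsing fine-grained tags into coarse categories is shown below. The function name `coarse_category` and the category labels are illustrative assumptions, not part of any library.

```python
# Penn Treebank tags share prefixes, so a prefix check is enough
# to group them. The category names are illustrative labels only.
COARSE_PREFIXES = [
    ("NN", "noun"),
    ("VB", "verb"),
    ("JJ", "adjective"),
    ("RB", "adverb"),
    ("PRP", "pronoun"),
    ("IN", "preposition"),
]


def coarse_category(tag):
    """Map a fine-grained Penn Treebank tag to a coarse category."""
    for prefix, category in COARSE_PREFIXES:
        if tag.startswith(prefix):
            return category
    return "other"  # tags not covered by the table, e.g. DT (determiner)


for tag in ["NNS", "VBZ", "JJ", "RBR", "PRP$", "DT"]:
    print(tag, "->", coarse_category(tag))
```

A grouping like this is handy when a task only needs to know "is this a noun or a verb?" rather than the exact tense or number.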
POS Tagging vs Text Cleaning
POS tagging is performed after text cleaning.
The usual NLP flow so far:
- Raw text
- Text cleaning
- Tokenization
- POS tagging
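Removing noise such as HTML tags and stray symbols leads to more accurate POS tags. Steps that change the words themselves, such as lowercasing or stop-word removal, are better applied after tagging, because taggers rely on capitalization and function words as cues.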
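The flow above can be sketched as a small function that applies the steps in order. This is only an illustration under stated assumptions: `clean_text`, `simple_tokenize`, `run_pipeline`, and `placeholder_tagger` are hypothetical helpers written for this lesson, and the regex tokenizer is a simplified stand-in for a real tokenizer.

```python
import re


def clean_text(raw):
    """Remove HTML tags and collapse repeated whitespace."""
    no_html = re.sub(r"<[^>]+>", " ", raw)
    return re.sub(r"\s+", " ", no_html).strip()


def simple_tokenize(text):
    """Split text into word tokens and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def run_pipeline(raw, tagger):
    """Apply the steps in order: clean, tokenize, then tag."""
    tokens = simple_tokenize(clean_text(raw))
    return tagger(tokens)


def placeholder_tagger(tokens):
    # Hypothetical stand-in: labels every token "UNK" so the
    # sketch runs without downloading any tagger model.
    return [(token, "UNK") for token in tokens]


raw = "<p>NLP   makes machines understand language.</p>"
print(run_pipeline(raw, placeholder_tagger))
```

Because the tagger is passed in as an argument, a real tagger such as `nltk.pos_tag` (which also takes a list of tokens) can replace the placeholder once its resources are installed.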
POS Tagging Using NLTK (Practical Demo)
Now let us perform POS tagging using Python.
Where to run this code:
- Google Colab (recommended)
- Jupyter Notebook
- VS Code with Python
This example uses the NLTK library.
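```python
import nltk
from nltk import pos_tag
from nltk.tokenize import word_tokenize

# Download the tokenizer and tagger models (only needed once).
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')
# Recent NLTK releases look for these resource names instead.
nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng')

sentence = "NLP makes machines understand human language"
tokens = word_tokenize(sentence)  # split the sentence into word tokens
tags = pos_tag(tokens)            # list of (word, tag) pairs
print(tags)
```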
Output (wrapped across lines here for readability):
```text
[('NLP', 'NNP'),
('makes', 'VBZ'),
('machines', 'NNS'),
('understand', 'VB'),
('human', 'JJ'),
('language', 'NN')]
```
How to Read the Output
Each word is paired with a POS tag.
| Word | Tag | Meaning |
|---|---|---|
| NLP | NNP | Proper noun, singular |
| makes | VBZ | Verb, 3rd person singular present |
| machines | NNS | Plural noun |
| understand | VB | Verb, base form |
| human | JJ | Adjective |
| language | NN | Singular noun |
This structured information is extremely useful for higher-level NLP tasks.
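As one illustration of how this structure can be used, the sketch below filters and counts the (word, tag) pairs. The `tags` list is copied from the demo output above so the snippet runs on its own; the variable names are otherwise arbitrary.

```python
from collections import Counter

# Tagged output copied from the demo above.
tags = [("NLP", "NNP"), ("makes", "VBZ"), ("machines", "NNS"),
        ("understand", "VB"), ("human", "JJ"), ("language", "NN")]

# Noun tags all start with "NN" and verb tags with "VB".
nouns = [word for word, tag in tags if tag.startswith("NN")]
verbs = [word for word, tag in tags if tag.startswith("VB")]

# Count how often each exact tag occurs.
tag_counts = Counter(tag for _, tag in tags)

print("Nouns:", nouns)
print("Verbs:", verbs)
print("Tag counts:", dict(tag_counts))
```

The same pattern works on the result of `pos_tag` for any sentence, since it always returns a list of (word, tag) tuples.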
Real-Life Applications of POS Tagging
- Grammar checking tools
- Voice assistants
- Search engines
- Chatbots
- Machine translation
POS tagging acts as a foundation for understanding sentence meaning.
POS Tagging in Competitive Exams
Exam questions usually test:
- Definition of POS tagging
- Examples of POS tags
- Differences between nouns, verbs, and adjectives
Understanding the concept along with one worked example is usually enough to answer multiple-choice questions (MCQs).
Assignment / Homework
Practice Environment:
- Google Colab
- Jupyter Notebook
Tasks:
- Perform POS tagging on a paragraph
- Count the number of nouns and verbs
- Compare POS tags before and after cleaning
- Try a different sentence structure
Practice Questions
Q1. What does POS tagging identify?
Q2. Is POS tagging done before tokenization?
Quick Quiz
Q1. Which POS tag represents adjectives?
Q2. Why is POS tagging useful?
Quick Recap
- POS tagging assigns grammatical labels
- It improves language understanding
- Performed after cleaning and tokenization
- Used in NER, translation, QA systems
- Essential for advanced NLP pipelines