Introduction to Recurrent Neural Networks (RNNs)
Until now, you have worked with classic NLP techniques such as Bag of Words, TF-IDF, similarity, and clustering. These methods are powerful, but they share one major limitation: they do not understand word order or context.
This lesson introduces Recurrent Neural Networks (RNNs), the first deep learning model designed to handle sequential data like language.
Why Classic NLP Models Are Not Enough
Consider these two sentences:
- “I am happy”
- “I am not happy”
Classic models may treat them as very similar because most of the words overlap.
A human, however, immediately sees that their meanings are opposite. The difference comes from word order and context, which bag-of-words representations discard.
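This can be made concrete with a minimal sketch in pure Python. It computes cosine similarity over raw word counts, a simplified stand-in for any bag-of-words pipeline (the helper name `bow_cosine` is ours, not from a library):

```python
from collections import Counter
import math

def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity between simple bag-of-words count vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    vocab = set(ca) | set(cb)
    dot = sum(ca[w] * cb[w] for w in vocab)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b)

# The two sentences share 3 of 4 words, so the score is high (~0.87)
# even though their meanings are opposite.
print(bow_cosine("I am happy", "I am not happy"))
```

The score comes out around 0.87, i.e., the model considers the sentences nearly identical.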
The Core Problem: Sequence Understanding
Language is sequential:
- Words appear in order
- Earlier words affect later meaning
- Context builds gradually
Traditional ML models treat input as independent features, but sequences require memory.
This is exactly why RNNs were created.
What Is a Recurrent Neural Network?
A Recurrent Neural Network is a neural network designed to:
- Process sequences step by step
- Remember previous information
- Use past context to influence current output
Unlike traditional neural networks, RNNs have loops.
These loops allow information to persist.
Understanding the Idea of “Memory” in RNNs
At each step in a sequence, an RNN takes:
- The current input
- The hidden state from the previous step
It then produces:
- A new hidden state
- An output (optional)
The hidden state acts as the network’s memory.
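A single step of this update can be sketched in plain Python. The scalar weights below are arbitrary illustrative values; a real RNN learns matrices of them:

```python
import math

# Illustrative scalar weights (a trained RNN learns these as matrices).
W_x, W_h, b = 0.5, 0.8, 0.1

def rnn_step(x, h_prev):
    """One recurrent step: new hidden state from current input + previous hidden state."""
    return math.tanh(W_x * x + W_h * h_prev + b)

h = 0.0                      # empty memory before the sequence starts
for x in [1.0, 0.0, -1.0]:   # a toy input sequence, processed in order
    h = rnn_step(x, h)       # the hidden state carries context forward
print(h)
```

Each call mixes the new input with everything remembered so far, which is exactly the "memory" described above.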
RNN Processing: Step-by-Step Intuition
Imagine reading a sentence word by word.
- When you read the first word, you have little context
- As you read more words, understanding improves
- Each word updates your internal memory
An RNN works in the same way.
This makes RNNs suitable for:
- Text
- Speech
- Time series
How RNNs Differ from Feedforward Networks
| Aspect | Feedforward NN | RNN |
|---|---|---|
| Input handling | Fixed size | Variable-length sequences |
| Memory | No memory | Has memory (hidden state) |
| Order awareness | No | Yes |
| Use cases | Images, tabular data | Text, speech, sequences |
Where RNNs Are Used in NLP
RNNs are used in many language tasks:
- Sentiment analysis
- Language modeling
- Text generation
- Speech recognition
- Machine translation (early models)
They were the foundation of deep learning for NLP before transformers.
Basic RNN Architecture (Conceptual)
An RNN cell contains:
- Input layer
- Hidden state
- Output layer
Mathematically:
- New hidden state: h_t = f(x_t, h_{t-1}) — a function of the current input and the previous hidden state
The same weights are reused at every time step.
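The weight-sharing idea can be sketched with NumPy. The dimensions and random weights here are illustrative, not part of the lesson's model:

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, features = 4, 3

# One set of weights, created once and reused at every time step.
W_x = rng.normal(size=(hidden, features))
W_h = rng.normal(size=(hidden, hidden))
b = np.zeros(hidden)

def forward(sequence):
    """Run the recurrence h_t = tanh(W_x x_t + W_h h_{t-1} + b) over a sequence."""
    h = np.zeros(hidden)
    for x_t in sequence:        # same W_x, W_h, b applied at every step
        h = np.tanh(W_x @ x_t + W_h @ h + b)
    return h                    # final hidden state summarizes the whole sequence

seq = rng.normal(size=(10, features))  # 10 time steps, 3 features each
print(forward(seq).shape)              # (4,)
```

Because the same weights handle every position, the network can process sequences of any length without growing in size.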
Simple Code Example: RNN for Sequences
This example shows how to define a simple RNN layer.
Where to run:
- Google Colab
- Jupyter Notebook (with TensorFlow installed)
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense

model = Sequential()
model.add(SimpleRNN(32, input_shape=(10, 1)))  # 32 hidden units; sequences of 10 steps, 1 feature each
model.add(Dense(1))                            # single output value
model.summary()
```
How to Understand This Code
Key points:
- SimpleRNN(32): a hidden state of 32 units (the network's memory)
- input_shape=(10, 1): sequences of length 10, with 1 feature per time step
- The model processes each sequence step by step, updating the hidden state at every step
This is only a structural example. Actual NLP requires embeddings and real text data, which we will cover later.
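As a sanity check, the parameter counts that `model.summary()` prints can be reproduced by hand, assuming Keras's standard SimpleRNN parameterization (an input kernel, a recurrent kernel, and a bias):

```python
units, features = 32, 1

# SimpleRNN: input kernel (features x units) + recurrent kernel (units x units) + bias (units)
rnn_params = features * units + units * units + units    # 32 + 1024 + 32 = 1088

# Dense(1) on top of 32 units: 32 weights + 1 bias
dense_params = units * 1 + 1                             # 33

print(rnn_params, dense_params)  # 1088 33
```

Note that the recurrent kernel (units x units) dominates: the cost of memory grows quadratically with the hidden size.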
Limitations of Basic RNNs
Although RNNs introduced memory, they have serious problems:
- Difficulty learning long-term dependencies
- Vanishing gradient problem
- Training instability
These issues led to improved architectures like LSTM and GRU.
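The vanishing gradient problem comes down to simple arithmetic: backpropagating through T time steps multiplies the gradient by the recurrent weight (times the activation's derivative) T times, so any factor below 1 shrinks the signal exponentially. The 0.5 below is an arbitrary illustrative value:

```python
grad = 1.0
factor = 0.5            # |recurrent weight * tanh'| < 1 at each step (illustrative)
for step in range(50):  # gradient flowing back through 50 time steps
    grad *= factor
print(grad)             # ~8.9e-16: early steps receive almost no learning signal
```

This is why basic RNNs struggle to connect a word at the start of a long sentence to one at the end, and why LSTM and GRU add gating mechanisms to preserve the gradient.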
RNNs in Exams and Interviews
Common questions:
- Why do we need RNNs?
- How do RNNs store memory?
- Difference between RNN and feedforward networks
Focus on the idea of sequence + memory.
Assignment / Homework
Theory Task:
- Explain why word order matters in language
- Compare Bag of Words vs RNNs
Practical Task:
- Create a SimpleRNN model
- Experiment with different hidden sizes
Practice Environment:
- Google Colab
- Local Jupyter Notebook
Practice Questions
Q1. Why are RNNs better than TF-IDF for language understanding?
Q2. What acts as memory in an RNN?
Quick Quiz
Q1. Which type of data are RNNs designed for?
Q2. Do RNNs reuse weights at every time step?
Quick Recap
- Classic NLP ignores word order
- RNNs introduce memory
- They process sequences step by step
- Hidden state stores context
- Foundation for LSTM and GRU
Next lesson: RNNs for NLP – Practical Applications