CNN–LSTM Hybrid Models
Some time series problems require understanding both local patterns and long-term dependencies.
CNN–LSTM hybrid models combine the strengths of two architectures:
- CNNs extract short-term patterns
- LSTMs understand long-term sequence behavior
Why a Hybrid Model Is Needed
Consider real-world signals like:
- Daily electricity consumption
- Traffic volume across weeks
- Website load patterns
Each of these contains:
- Short spikes (local behavior)
- Long-term trends and cycles
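A CNN on its own only sees as far as its receptive field, so it tends to miss long-range dependencies. An LSTM fed the raw signal has to step through every sample, which makes training slow and sensitive to noise.
The hybrid model addresses both problems: the CNN condenses the input into local features, and the LSTM models how those features evolve over time.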
Real-World Example: Power Usage Forecasting
Imagine forecasting electricity demand for a city.
- Short-term spikes → sudden usage bursts
- Long-term pattern → daily and weekly cycles
The CNN detects local spikes. The LSTM understands how those spikes affect future demand.
Simulated Power Usage Signal
The simulated power usage signal combines two components:
- Base daily cycle
- Random short-term spikes
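A signal like this can be generated in a few lines. The sketch below is only illustrative: the function name `simulate_power_usage`, the base load of 50, the cycle amplitude of 20, the spike range, and the spike probability are all assumed values, not figures from real consumption data.

```python
import math
import random

def simulate_power_usage(n_days=7, steps_per_day=24, spike_prob=0.05, seed=0):
    """Return hourly usage values: a daily sine cycle plus random spikes."""
    rng = random.Random(seed)  # seeded so the signal is reproducible
    signal = []
    for t in range(n_days * steps_per_day):
        # Base daily cycle: one full sine period every `steps_per_day` samples.
        base = 50.0 + 20.0 * math.sin(2 * math.pi * t / steps_per_day)
        # Occasional short-term spike added on top of the base load.
        spike = rng.uniform(15.0, 30.0) if rng.random() < spike_prob else 0.0
        signal.append(base + spike)
    return signal

usage = simulate_power_usage()
print(len(usage), round(min(usage), 1), round(max(usage), 1))
```

Raising `spike_prob` makes the local bursts more frequent, while `steps_per_day` controls the length of the daily cycle the LSTM would need to pick up.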
How the CNN–LSTM Pipeline Works
- The CNN slides a small kernel over short windows of the input
- Each window's local pattern is converted into a feature vector
- The LSTM processes the resulting sequence of feature vectors
- A final dense layer produces the forecast
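Before a pipeline like this can be trained, the series usually has to be cut into (input window, next value) pairs. The helper below is a minimal sketch of that preparation step; `make_windows` is a hypothetical name, and predicting only the single next value is an assumption made to keep the example short.

```python
def make_windows(series, time_steps):
    """Split a series into (window, next_value) training pairs."""
    windows, targets = [], []
    for start in range(len(series) - time_steps):
        # Input: `time_steps` consecutive samples.
        windows.append(series[start:start + time_steps])
        # Target: the sample immediately after the window.
        targets.append(series[start + time_steps])
    return windows, targets

series = [10, 12, 11, 30, 13, 12, 14]
X, y = make_windows(series, time_steps=3)
for window, target in zip(X, y):
    print(window, "->", target)
```

Each window would still need a trailing feature axis to match the `(batch, time_steps, features)` layout that sequence layers typically expect.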
Conceptual CNN–LSTM Structure
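# Input shape: (batch, time_steps, features)
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Conv1D, MaxPooling1D, LSTM, Dense
time_steps, n_features = 48, 1  # example values (assumed)
input_series = Input(shape=(time_steps, n_features))
cnn = Conv1D(filters=64, kernel_size=5, activation="relu")(input_series)  # local patterns
cnn = MaxPooling1D(pool_size=2)(cnn)  # halve the sequence length
lstm = LSTM(64, return_sequences=False)(cnn)  # summarize the feature sequence
output = Dense(1)(lstm)  # single-value forecast
model = Model(inputs=input_series, outputs=output)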
Important idea:
- The CNN and pooling layers shorten and condense the sequence
- The LSTM can then focus on the meaningful structure
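How much the sequence is shortened follows from simple arithmetic. The sketch below traces the sequence length through a convolution with `kernel_size=5` and a pooling layer with `pool_size=2`, assuming no padding, a convolution stride of 1, and a hypothetical input window of 48 time steps; the helper names are illustrative.

```python
def conv1d_output_length(time_steps, kernel_size, stride=1):
    """Sequence length after a 1D convolution with no padding."""
    return (time_steps - kernel_size) // stride + 1

def pool1d_output_length(time_steps, pool_size):
    """Sequence length after non-overlapping pooling (stride == pool_size)."""
    return time_steps // pool_size

time_steps = 48  # assumed window length
after_conv = conv1d_output_length(time_steps, kernel_size=5)
after_pool = pool1d_output_length(after_conv, pool_size=2)
print(time_steps, "->", after_conv, "->", after_pool)
```

Under these assumptions the LSTM steps through fewer than half as many positions as the raw window contains, which is one reason the hybrid tends to train faster than an LSTM on the raw signal.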
Feature Extraction Effect
CNN-style filtering highlights spikes before the data is passed to the LSTM.
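The effect can be imitated by hand on a toy signal. In the sketch below, the kernel `[-1, 2, -1]` is hand-picked for illustration; a real CNN learns its kernels during training, so this only mimics what one learned filter might do. The signal values are made up.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` without padding (cross-correlation)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def relu(values):
    """Keep positive responses and zero out the rest."""
    return [max(0.0, v) for v in values]

# A slow upward trend with a single spike at index 4 (illustrative values).
signal = [10.0, 11.0, 12.0, 13.0, 30.0, 15.0, 16.0, 17.0]
spike_kernel = [-1.0, 2.0, -1.0]  # responds when a sample exceeds its neighbours

features = relu(conv1d_valid(signal, spike_kernel))
print(features)
```

The filter cancels the linear trend and responds only where a sample stands out from its neighbours, so the feature sequence handed to the LSTM marks the spike location clearly.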
What the LSTM Learns Here
- How spikes influence future demand
- Daily usage rhythm
- Delayed effects of abnormal activity
A plain CNN struggles to model these effects, because it only sees as far back as its receptive field reaches.
When to Use CNN–LSTM Models
- High-frequency signals
- Long forecasting horizons
- Complex temporal structure
When Not to Use Them
- Very small datasets
- Purely linear trends
- Simple seasonality
In those cases, simpler models usually perform as well or better, and they are easier to train and interpret.
Practice Questions
Q1. Why not feed the raw time series directly into an LSTM?
Q2. What happens if the CNN kernel (window) size is too large?
Next lesson: Attention-based models for time series.