Time Series Lesson 25 – Feature Eng | Dataplexa

Feature Engineering for Time Series

Forecasting models do not understand time the way humans do. They only understand numbers placed in columns.

Feature engineering is the process of translating time-related patterns into numerical signals that models can learn from.

In this lesson, we will take a real-world time series and carefully transform it into meaningful features, step by step.


A Real-World Example

Assume you are forecasting daily electricity consumption for a city.

Electricity usage depends on:

  • Day of the week
  • Recent usage patterns
  • Seasonal behavior

Our goal is to convert these ideas into usable features.


Raw Time Series

Below is a simulated daily electricity usage series. It contains:

  • A slow upward trend
  • Weekly usage patterns
  • Random fluctuations
Python: Raw Series
import numpy as np

np.random.seed(10)
days = np.arange(180)
trend = 0.15 * days
seasonal = 12 * np.sin(2 * np.pi * days / 7)
noise = np.random.normal(0, 3, size=180)

usage = 120 + trend + seasonal + noise

This is what most real-world data looks like — useful information is mixed together.


Why Feature Engineering Is Necessary

A machine learning model does not automatically know:

  • What happened yesterday
  • What usually happens on weekends
  • Whether recent values are increasing or decreasing

We must explicitly provide that information.


Lag Features (Recent Memory)

Lag features represent past values of the series.

They answer questions like:

  • What was usage yesterday?
  • What was usage two days ago?
Python: Lag Features
# Shift the series forward, padding with NaN where no history exists.
# (np.roll would wrap the last values around to the front, which
# fabricates history for the earliest days.)
lag_1 = np.concatenate(([np.nan], usage[:-1]))
lag_7 = np.concatenate(([np.nan] * 7, usage[:-7]))

What these lag features capture:

  • Lag-1 follows the original series closely
  • Lag-7 aligns with weekly patterns

These features give the model short-term and seasonal memory.
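One way to check that lag-7 really lines up with the weekly pattern is to correlate the series with its own lags. The sketch below rebuilds the simulated series from earlier in the lesson and compares the lag-1 and lag-7 correlations (the exact values depend on this simulation, not on real electricity data).

```python
import numpy as np

# Rebuild the simulated series from earlier in the lesson.
np.random.seed(10)
days = np.arange(180)
trend = 0.15 * days
seasonal = 12 * np.sin(2 * np.pi * days / 7)
noise = np.random.normal(0, 3, size=180)
usage = 120 + trend + seasonal + noise

# NaN-padded lags: the earliest rows have no history, so they stay NaN.
lag_1 = np.concatenate(([np.nan], usage[:-1]))
lag_7 = np.concatenate(([np.nan] * 7, usage[:-7]))

# Correlate the series with each lag, skipping the NaN warm-up rows.
corr_1 = np.corrcoef(usage[1:], lag_1[1:])[0, 1]
corr_7 = np.corrcoef(usage[7:], lag_7[7:])[0, 1]
print(round(corr_1, 2), round(corr_7, 2))
```

Because the weekly cycle repeats exactly every 7 days, the lag-7 correlation comes out higher than lag-1 here, which is exactly the "seasonal memory" the model needs.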


Rolling Statistics (Local Context)

Rolling features summarize recent behavior instead of single points.

They answer questions like:

  • Is usage increasing lately?
  • Is recent usage stable or volatile?
Python: Rolling Mean
window = 7
# Trailing 7-day mean: each value summarizes only the past week.
# (np.convolve with mode='same' would center the window on each day
# and leak information from days that have not happened yet.)
rolling_mean = np.full(len(usage), np.nan)
for i in range(window - 1, len(usage)):
    rolling_mean[i] = usage[i - window + 1 : i + 1].mean()

Rolling averages:

  • Smooth short-term noise
  • Reveal underlying direction

Models use this to understand local trends.
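The question "is recent usage stable or volatile?" suggests a second rolling feature the lesson has not coded yet: a trailing rolling standard deviation. The sketch below is one minimal way to build it on the simulated series, using the same trailing-window convention as the rolling mean.

```python
import numpy as np

# Rebuild the simulated series from earlier in the lesson.
np.random.seed(10)
days = np.arange(180)
trend = 0.15 * days
seasonal = 12 * np.sin(2 * np.pi * days / 7)
noise = np.random.normal(0, 3, size=180)
usage = 120 + trend + seasonal + noise

window = 7
# Trailing rolling std: volatility of the most recent week only,
# so the feature never peeks at future values.
rolling_std = np.full(len(usage), np.nan)
for i in range(window - 1, len(usage)):
    rolling_std[i] = usage[i - window + 1 : i + 1].std()
```

High rolling-std values flag volatile stretches; low values flag stable ones. A model can use this to trust recent lags more when the series is calm.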


Calendar Features

Some patterns are driven by the calendar, not past values.

Examples:

  • Weekdays vs weekends
  • Workdays vs holidays

Below is a simple weekend indicator.

Python: Day Type Feature
day_of_week = days % 7
is_weekend = (day_of_week >= 5).astype(int)

This feature helps the model learn:

  • Usage behaves differently on weekends
  • Patterns repeat weekly
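To see that the indicator carries real signal, we can compare average usage by day type. Note the assumption baked into the simulation: day 0 is treated as a weekday, so days 5 and 6 of each cycle play the role of the weekend, and the weekly sine happens to dip on those days.

```python
import numpy as np

# Rebuild the simulated series from earlier in the lesson.
np.random.seed(10)
days = np.arange(180)
trend = 0.15 * days
seasonal = 12 * np.sin(2 * np.pi * days / 7)
noise = np.random.normal(0, 3, size=180)
usage = 120 + trend + seasonal + noise

day_of_week = days % 7
is_weekend = (day_of_week >= 5).astype(int)

# Average usage by day type: in this simulation the weekly sine is
# negative on days 5 and 6, so "weekend" usage is noticeably lower.
weekday_mean = usage[is_weekend == 0].mean()
weekend_mean = usage[is_weekend == 1].mean()
print(round(weekday_mean, 1), round(weekend_mean, 1))
```

A clear gap between the two group means is exactly what makes a binary calendar feature worth giving to the model.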

Why These Features Work Together

Each feature captures a different aspect of time:

  • Lag features → recent memory
  • Rolling statistics → local behavior
  • Calendar features → external structure

Together, they convert time into learnable signals.
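As a closing sketch, here is one way to assemble all three feature types into a single matrix ready for a forecasting model, dropping the warm-up rows where lags and rolling statistics are undefined. The column layout and the NaN-dropping step are conventions chosen for this example, not the only option.

```python
import numpy as np

# Rebuild the simulated series from earlier in the lesson.
np.random.seed(10)
days = np.arange(180)
trend = 0.15 * days
seasonal = 12 * np.sin(2 * np.pi * days / 7)
noise = np.random.normal(0, 3, size=180)
usage = 120 + trend + seasonal + noise

# Lag features (recent memory).
lag_1 = np.concatenate(([np.nan], usage[:-1]))
lag_7 = np.concatenate(([np.nan] * 7, usage[:-7]))

# Rolling statistic (local context), trailing window only.
window = 7
rolling_mean = np.full(len(usage), np.nan)
for i in range(window - 1, len(usage)):
    rolling_mean[i] = usage[i - window + 1 : i + 1].mean()

# Calendar feature (external structure).
is_weekend = ((days % 7) >= 5).astype(float)

# One row per day, one column per feature; the target is usage itself.
X = np.column_stack([lag_1, lag_7, rolling_mean, is_weekend])
y = usage

# Drop the first 7 days, which contain NaNs from the lag-7 warm-up.
mask = ~np.isnan(X).any(axis=1)
X, y = X[mask], y[mask]
print(X.shape)  # (173, 4)
```

Any standard regressor can now be fit on `(X, y)`; the lag columns supply memory, the rolling mean supplies local trend, and the weekend flag supplies calendar structure.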


Practice Questions

Q1. Why is lag-7 especially useful for daily data?

Because it captures weekly seasonality.

Q2. Why use rolling features instead of only lag features?

Rolling features summarize recent behavior and reduce noise.

Key Takeaways

  • Models do not understand time by default
  • Feature engineering bridges that gap
  • Good features often matter more than complex models

Next lesson: we will build more powerful lag-based features.