Pandas Lesson 23 – Window Functions | Dataplexa

Window Functions in Pandas

So far, you have learned how to group, filter, and aggregate data. However, many real-world problems require calculations that look across rows without collapsing them.

This is where window functions become extremely useful.


What Are Window Functions?

A window function performs a calculation across a set of rows related to the current row, while keeping all rows intact.

Unlike groupby(), window functions:

  • Do not reduce the number of rows
  • Allow comparisons between neighboring rows
  • Enable running totals and rankings

Why Window Functions Matter

Common business use cases include:

  • Running total of sales
  • Comparing current row with previous row
  • Ranking products within a category

All of these can be solved using window functions.


Using rolling() for Moving Calculations

A rolling window applies calculations over a fixed number of rows.

Example: Calculate a 3-row moving average of sales amount.

import pandas as pd

sales = pd.read_csv("dataplexa_pandas_sales.csv")

sales["moving_avg"] = sales["sales_amount"].rolling(window=3).mean()

This calculates the average of the current row and the previous two rows.


Understanding Rolling Window Output

The first few rows may contain NaN values because there are not enough previous rows to complete the window.

This behavior is expected and normal.


Using expanding() for Cumulative Calculations

An expanding window starts from the first row and grows with each row.

Example: Running total of sales.

sales["running_total"] = sales["sales_amount"].expanding().sum()

Each row includes all previous rows in the calculation.


Using shift() to Compare Rows

The shift() function moves data up or down. This allows row-to-row comparisons.

Example: Compare current sales with previous day sales.

sales["previous_sales"] = sales["sales_amount"].shift(1)

sales["sales_change"] = sales["sales_amount"] - sales["previous_sales"]

Ranking Data with rank()

Window functions also help rank values without grouping.

Example: Rank sales amount across all records.

sales["sales_rank"] = sales["sales_amount"].rank(ascending=False)

Higher sales receive better ranks.


Ranking Within Groups

You can combine window functions with grouping logic.

Example: Rank products within each region.

sales["region_rank"] = (
    sales.groupby("region")["sales_amount"]
    .rank(ascending=False)
)

Each region now has its own ranking system.


Common Window Function Patterns

  • Rolling averages for trends
  • Running totals for growth analysis
  • Lag comparisons using shift()
  • Ranking without losing row data

Practice Exercise

Try the following:

  • Create a running total of sales
  • Calculate a 5-row moving average
  • Compare current sales with previous sales

What’s Next?

In the next lesson, you will learn about advanced aggregations for deeper analytical insights.