Pandas Lesson 18 – Pivot Tables | Dataplexa

Pivot Tables in Pandas

Pivot tables provide a powerful and easy way to summarize large datasets. They are similar to Excel pivot tables and are commonly used for reporting and analysis.

In this lesson, you will learn how to create pivot tables using Pandas to analyze sales data efficiently.


Loading the Dataset

We continue using the same dataset used throughout the Pandas course.

import pandas as pd

df = pd.read_csv("dataplexa_pandas_sales.csv")
df["order_date"] = pd.to_datetime(df["order_date"])

What Is a Pivot Table?

A pivot table allows you to:

  • Group data by rows and columns
  • Apply aggregation functions
  • Summarize data in a matrix format

Pivot tables are ideal when you want a structured summary rather than a long list of grouped values.


Basic Pivot Table

Let’s calculate total sales per region using a pivot table.

pd.pivot_table(
    df,
    values="sales_amount",
    index="region",
    aggfunc="sum"
)

This creates a summary showing total sales for each region.


Pivot Table with Rows and Columns

Now let’s analyze total sales by region and product.

pd.pivot_table(
    df,
    values="sales_amount",
    index="region",
    columns="product",
    aggfunc="sum"
)

Each row represents a region, each column represents a product, and each cell contains total sales.


Changing the Aggregation Function

You can change the aggregation function to calculate averages, counts, minimums, or maximums.

Example: average sales per product in each region.

pd.pivot_table(
    df,
    values="sales_amount",
    index="region",
    columns="product",
    aggfunc="mean"
)

Multiple Aggregations in Pivot Tables

You can apply multiple aggregation functions at once.

pd.pivot_table(
    df,
    values="sales_amount",
    index="product",
    aggfunc=["sum", "mean", "count"]
)

This shows total sales, average sales, and order count per product.


Handling Missing Values

Some combinations of rows and columns may not have data. You can replace missing values using fill_value.

pd.pivot_table(
    df,
    values="sales_amount",
    index="region",
    columns="product",
    aggfunc="sum",
    fill_value=0
)

Pivot Tables with Dates

Pivot tables work well with date-based analysis.

Example: monthly sales summary.

pd.pivot_table(
    df,
    values="sales_amount",
    index=df["order_date"].dt.month,
    aggfunc="sum"
)

Sorting Pivot Table Results

You can sort pivot table results just like a DataFrame.

pivot = pd.pivot_table(
    df,
    values="sales_amount",
    index="product",
    aggfunc="sum"
)

pivot.sort_values(by="sales_amount", ascending=False)

Pivot Tables vs GroupBy

Both pivot tables and GroupBy summarize data, but:

  • GroupBy is more flexible and code-oriented
  • Pivot tables are easier for structured summaries

In practice, both are used depending on the task.


Practice Exercise

Using the dataset:

  • Create a pivot table showing total sales by region
  • Create a pivot table with region as rows and product as columns
  • Calculate average sales per product
  • Fill missing values with zero

What’s Next?

In the next lesson, you will learn how to merge DataFrames, which allows you to combine multiple datasets into one.