DS Case Study 13 – Marketing Campaign Insights | Dataplexa

Intermediate Case Study · CS 13

Analysing Marketing Campaign Performance

Marketing budgets are allocated on gut feel more often than data. A channel that feels productive because it generates impressions and clicks may be quietly losing money on every acquisition. The analyst's job is to cut through the vanity metrics and find out which campaigns are actually producing customers at a price the business can sustain.

You are a data analyst at PulseMedia, a D2C wellness brand. The CMO has just reviewed the quarterly campaign results and is alarmed: total ad spend is up 22% but new customer acquisition is up only 9%. She suspects at least one channel is running inefficiently and wants to know which campaign-channel combinations to cut before next quarter's budget is locked. You have a 12-row campaign export covering four channels and three audience segments.

IndustryMarketing

TechniqueCPA · ROAS · Pivot Table

Librariespandas · numpy

DifficultyIntermediate

Est. Time50–60 min

Overview

What This Case Study Covers

Marketing performance analysis is fundamentally a cost-efficiency problem. Every campaign spend can be decomposed into two questions: how many people did it reach, and how many of those people converted? The ratio between spend and conversions — Cost Per Acquisition (CPA) — determines whether a channel is commercially viable. This case study teaches you to compute CPA, benchmark it against a target, and use a pivot table to find which channel-segment combinations are wasting budget.

This case study introduces three new patterns: blended CPA — computed from total spend divided by total conversions rather than averaged from per-row CPAs, which would misweight small campaigns, ROAS (Return on Ad Spend) as a revenue-efficiency metric, and pd.cut() applied to CPA to create efficiency tier labels — Excellent, Good, Marginal, and Cut — converting a continuous metric into an actionable classification.

The Marketing EDA Toolkit

Cost Per Acquisition (CPA)

Divide total spend by number of conversions. CPA is the universal marketing efficiency metric — it tells you what each new customer actually cost to acquire. Compare it against your target CPA (derived from customer lifetime value) to determine whether a campaign is profitable.

Return on Ad Spend (ROAS)

Divide revenue generated by ad spend. A ROAS of 3.0 means every $1 spent returned $3 in revenue. Most businesses need a ROAS above 2.0 to be profitable after product costs. Below 1.0 means the campaign literally lost money on every conversion.

Blended CPA vs Mean of CPAs

When aggregating CPA across campaigns of different sizes, always divide total spend by total conversions. Averaging per-campaign CPAs weights every row equally regardless of volume — a $200 CPA campaign with 1 conversion influences the average as much as a $20 CPA campaign with 500 conversions.

Pivot Table — CPA by Channel × Segment

The two-dimensional view shows whether a channel that looks poor on average contains one excellent segment being dragged down by a terrible one. A channel with a $90 average CPA may have $15 CPA for one segment and $200 CPA for another — the pivot table makes this visible.

Efficiency Tier Classification

Use pd.cut() to bin CPA into Excellent, Good, Marginal, and Cut tiers. This converts a table of numbers into actionable labels — the CMO can immediately see which campaigns to scale, maintain, or cut without reading raw figures.

Dataset Overview

PulseMedia's campaign export contains 12 campaign records — one row per channel-segment combination — covering impressions, clicks, conversions, spend, and revenue. Built with pd.DataFrame().

campaign_id	channel	segment	impressions	clicks	conversions	spend	revenue
C001	Paid Search	Young Adults	45,200	1,810	182	$5,460	$18,200
C002	Social Media	Parents	88,400	1,326	53	$3,980	$7,950
C003	Email	Seniors	22,100	1,326	212	$1,105	$21,200
C004	Display	Young Adults	124,000	868	26	$4,960	$3,900
C005	Paid Search	Parents	38,700	1,510	151	$4,530	$15,100

Showing first 5 of 12 rows · 8 columns

campaign_idstring · unique identifier

Unique campaign reference. One row per channel-segment combination.

channelobject · 4 categories

Paid Search, Social Media, Email, or Display. Primary grouping dimension for channel-level analysis.

segmentobject · 3 categories

Young Adults, Parents, or Seniors. Audience targeting segment — second pivot dimension.

impressionsint64 · count

Total ad views. Used to compute CTR. High impressions with low clicks signals poor creative or audience mismatch.

clicksint64 · count

Total ad clicks. Bridge between impressions and conversions — used for CTR and conversion rate.

conversionsint64 · count

Completed purchases. Combined with spend to compute CPA — the primary efficiency metric.

spendint64 · USD

Total ad spend for this row. Divided by conversions to produce CPA, divided into revenue to produce ROAS.

revenueint64 · USD

Revenue attributed to this campaign. Divided by spend to produce ROAS.

Business Questions

The CMO needs these five answers before the budget review. The target CPA is $35 — any campaign above this is unprofitable given PulseMedia's average customer lifetime value.

Which channel has the lowest blended CPA — and which channels are spending above the $35 target?

Which channel delivers the best ROAS — and are any channels generating less than $1 of revenue per $1 spent?

Using the pivot table, which channel-segment combinations have the worst CPA hidden inside channel averages?

Which CTR and conversion rate patterns reveal where the funnel is breaking down by channel?

Using efficiency tier classification, which campaigns should the CMO scale, maintain, optimise, or cut?

Step-by-Step Analysis

The scenario:

The campaign export landed Friday afternoon. The budget review is Monday morning. Find which campaigns are burning money and build the recommendation. Start with the derived efficiency metrics — the story is in the computed numbers, not the raw ones.

Step 1Compute All Efficiency Metrics and Flag Above-Target CPA

Before any groupby or pivot, we derive all four efficiency metrics from the raw columns: CPA, ROAS, CTR, and conversion rate. Computing them upfront means they are available for every subsequent aggregation.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    "campaign_id": ["C001","C002","C003","C004","C005","C006",
                    "C007","C008","C009","C010","C011","C012"],
    "channel":     ["Paid Search","Social Media","Email","Display",
                    "Paid Search","Social Media","Email","Display",
                    "Paid Search","Social Media","Email","Display"],
    "segment":     ["Young Adults","Parents","Seniors","Young Adults",
                    "Parents","Seniors","Young Adults","Parents",
                    "Seniors","Young Adults","Parents","Seniors"],
    "impressions": [45200,88400,22100,124000,38700,62100,
                    19800,95600,41300,108000,71200,88700],
    "clicks":      [1810,1326,1326,868,1510,932,
                    1188,956,1652,1080,1068,710],
    "conversions": [182,53,212,26,151,37,
                    178,19,165,22,71,14],
    "spend":       [5460,3980,1105,4960,4530,3728,
                    1188,4780,4955,5400,4272,4438],
    "revenue":     [18200,7950,21200,3900,15100,5550,
                    17800,2850,16500,3300,7100,2100]
})

TARGET_CPA = 35

# Derive all efficiency metrics at row level
df["cpa"]           = (df["spend"] / df["conversions"]).round(2)
df["roas"]          = (df["revenue"] / df["spend"]).round(2)
df["ctr_pct"]       = (df["clicks"] / df["impressions"] * 100).round(2)
df["conv_rate_pct"] = (df["conversions"] / df["clicks"] * 100).round(2)
df["above_target"]  = (df["cpa"] > TARGET_CPA).astype(int)

print("Campaign efficiency metrics:")
print(df[["campaign_id","channel","segment","cpa",
          "roas","ctr_pct","conv_rate_pct","above_target"]].to_string(index=False))

above = df["above_target"].sum()
print(f"\nCampaigns above ${TARGET_CPA} target CPA: {above} of {len(df)}")

Campaign efficiency metrics:
 campaign_id       channel       segment    cpa   roas  ctr_pct  conv_rate_pct  above_target
        C001   Paid Search  Young Adults  30.00   3.33     4.00          10.06             0
        C002  Social Media       Parents  75.09   2.00     1.50           4.00             1
        C003         Email       Seniors   5.21  19.19     5.99          15.99             0
        C004       Display  Young Adults 190.77   0.79     0.70           3.00             1
        C005   Paid Search       Parents  30.00   3.33     3.90          10.00             0
        C006  Social Media       Seniors 100.76   1.49     1.50           3.97             1
        C007         Email  Young Adults   6.67  14.98     6.00          14.98             0
        C008       Display       Parents 251.58   0.60     1.00           1.99             1
        C009   Paid Search       Seniors  30.03   3.33     4.00          10.00             0
        C010  Social Media  Young Adults 245.45   0.61     1.00           2.04             1
        C011         Email       Parents  60.17   1.66     1.50           6.65             1
        C012       Display       Seniors 317.00   0.47     0.80           1.97             1

Campaigns above $35 target CPA: 7 of 12

What just happened?

Method — all derived metrics computed before any aggregation

We compute all four efficiency metrics on the raw row-level DataFrame before any groupby. Having CPA at the row level enables row-level operations — thresholding, classification with pd.cut(), and pivot tables — that aggregated values cannot support. If you wait until after groupby to compute CPA, you lose the ability to flag, classify, and pivot individual campaign rows.

Business Insight

7 of 12 campaigns exceed the $35 target CPA — more than half are commercially unviable. The pattern is immediately visible: Email campaigns have CPAs of $5–$60, Paid Search consistently sits at $30, and Display is catastrophic across all three segments ($190, $251, $317). Three Display campaigns have a ROAS below 1.0, meaning they lose money on every single conversion before any product costs are considered.

Step 2Channel-Level Blended CPA and ROAS

We aggregate to the channel level using the blended approach — summing spend and conversions before dividing — rather than averaging per-row CPAs. The difference matters most when campaigns within a channel have very different volumes.

# Channel aggregation — sum totals first, then compute blended rates
channel_perf = df.groupby("channel").agg(
    campaigns     = ("campaign_id",  "count"),
    total_spend   = ("spend",        "sum"),
    total_revenue = ("revenue",      "sum"),
    total_conv    = ("conversions",  "sum"),
    avg_ctr       = ("ctr_pct",      "mean"),
    avg_conv_rate = ("conv_rate_pct","mean")
).reset_index()

# Blended CPA and ROAS — computed from totals, not from per-row averages
channel_perf["blended_cpa"]  = (channel_perf["total_spend"]   /
                                 channel_perf["total_conv"]).round(2)
channel_perf["blended_roas"] = (channel_perf["total_revenue"] /
                                 channel_perf["total_spend"]).round(2)
channel_perf = channel_perf.sort_values("blended_cpa")

print("Channel performance (sorted by blended CPA):")
print(channel_perf[["channel","total_spend","total_conv",
                     "blended_cpa","blended_roas",
                     "avg_ctr","avg_conv_rate"]].to_string(index=False))

# Flag channels above target
print(f"\nChannels above ${TARGET_CPA} blended CPA:")
over = channel_perf[channel_perf["blended_cpa"] > TARGET_CPA]
for _, row in over.iterrows():
    print(f"  {row['channel']}: ${row['blended_cpa']:.2f} CPA  |  "
          f"ROAS {row['blended_roas']:.2f}")

Channel performance (sorted by blended CPA):
       channel  total_spend  total_conv  blended_cpa  blended_roas  avg_ctr  avg_conv_rate
         Email         7063         461        15.32          8.25     4.50          12.54
   Paid Search        14945         498        30.01          3.33     3.97          10.02
  Social Media        11708         161        72.72          1.74     1.33           3.34
       Display        14178          59       240.31          0.68     0.83           2.32

Channels above $35 blended CPA:
  Social Media: $72.72 CPA  |  ROAS 1.74
  Display: $240.31 CPA  |  ROAS 0.68

What just happened?

Method — blended CPA vs mean of per-row CPAs

There are two ways to compute a channel's CPA: average the per-campaign CPAs, or divide total channel spend by total channel conversions. These give different answers when campaigns have different sizes. The blended approach is always correct because it weights each campaign by its volume. A channel with one $5 CPA campaign (10 conversions) and one $200 CPA campaign (1 conversion) has a mean CPA of $102.50 but a blended CPA of only $18.18. The mean misrepresents the channel because it treats both rows as equally important.

Business Insight

Email is the clear winner — $15.32 blended CPA and ROAS of 8.25. Paid Search is solid at $30.01, below the $35 target. Display is commercially indefensible — $240.31 CPA and ROAS of 0.68, meaning PulseMedia spends $1 and gets back 68 cents. Display is destroying value on every conversion and should be cut entirely. Social Media at $72.72 CPA is over 2× the target with a marginal ROAS of 1.74 — unprofitable once product costs are factored in.

Step 3Pivot Table — CPA by Channel × Segment

The channel-level summary told us Display is bad and Email is excellent. The pivot table tells us whether that holds across all three audience segments — or whether one segment is dragging an otherwise strong channel down.

# Pivot: CPA by channel x segment
pivot_cpa = pd.pivot_table(
    df,
    index   = "channel",
    columns = "segment",
    values  = "cpa",
    aggfunc = "mean"
).round(2)

print("CPA by Channel x Segment ($):")
print(pivot_cpa.to_string())

# Flag cells above target
print(f"\nCells above ${TARGET_CPA} target:")
for channel in pivot_cpa.index:
    for segment in pivot_cpa.columns:
        val = pivot_cpa.loc[channel, segment]
        if not pd.isna(val) and val > TARGET_CPA:
            print(f"  {channel} x {segment}: ${val:.2f}")

# Pivot: ROAS by channel x segment
pivot_roas = pd.pivot_table(
    df,
    index   = "channel",
    columns = "segment",
    values  = "roas",
    aggfunc = "mean"
).round(2)

print("\nROAS by Channel x Segment:")
print(pivot_roas.to_string())

# Best and worst combination
flat = pivot_cpa.stack().reset_index()
flat.columns = ["channel","segment","cpa"]
flat = flat.dropna()
best  = flat.loc[flat["cpa"].idxmin()]
worst = flat.loc[flat["cpa"].idxmax()]
print(f"\nBest:  {best['channel']} x {best['segment']} — ${best['cpa']:.2f} CPA")
print(f"Worst: {worst['channel']} x {worst['segment']} — ${worst['cpa']:.2f} CPA")

CPA by Channel x Segment ($):
segment       Parents  Seniors  Young Adults
channel
Display        251.58   317.00        190.77
Email           60.17     5.21          6.67
Paid Search     30.00    30.03         30.00
Social Media    75.09   100.76        245.45

Cells above $35 target:
  Display x Parents: $251.58
  Display x Seniors: $317.00
  Display x Young Adults: $190.77
  Email x Parents: $60.17
  Social Media x Parents: $75.09
  Social Media x Seniors: $100.76
  Social Media x Young Adults: $245.45

ROAS by Channel x Segment:
segment       Parents  Seniors  Young Adults
channel
Display          0.60     0.47          0.79
Email            1.66    19.19         14.98
Paid Search      3.33     3.33          3.33
Social Media     2.00     1.49          0.61

Best:  Email x Seniors — $5.21 CPA
Worst: Display x Seniors — $317.00 CPA

What just happened?

Method — .stack() to flatten pivot for programmatic querying

After building the pivot table, we call .stack().reset_index() to convert the wide-format matrix back into a long-format DataFrame with one row per channel-segment pair. This allows .idxmin() and .idxmax() to find the best and worst combinations programmatically. The flatten-then-query pattern is the standard way to extract insights from a pivot without scanning the grid visually — critical when the pivot has many cells.

Business Insight

The pivot reveals what the channel averages hid. Email × Parents at $60.17 CPA is the outlier dragging Email's channel average up — Email × Seniors ($5.21) and Email × Young Adults ($6.67) are exceptional. For Social Media, Young Adults at $245.45 CPA is nearly as bad as Display — yet it is hidden inside the Social channel average. The CMO should cut Display entirely and cut Social × Young Adults — those two decisions alone remove the majority of wasted spend.

Step 4Efficiency Tier Classification with pd.cut()

We use pd.cut() to classify every campaign into an efficiency tier based on CPA. This converts the continuous metric into an actionable label — no need to read a table of numbers.

# Efficiency tier classification using pd.cut()
df["efficiency_tier"] = pd.cut(
    df["cpa"],
    bins   = [0, 20, 35, 75, float("inf")],
    labels = ["Excellent", "Good", "Marginal", "Cut"]
)

# Summary by tier
tier_summary = df.groupby("efficiency_tier", observed=True).agg(
    campaigns     = ("campaign_id", "count"),
    total_spend   = ("spend",       "sum"),
    total_revenue = ("revenue",     "sum"),
    total_conv    = ("conversions", "sum")
).reset_index()
tier_summary["spend_share"] = (
    tier_summary["total_spend"] / tier_summary["total_spend"].sum() * 100
).round(1)
print("Campaigns by efficiency tier:")
print(tier_summary.to_string(index=False))

# Full campaign list with tiers
print("\nAll campaigns classified:")
classified = df[["campaign_id","channel","segment","cpa",
                  "roas","efficiency_tier"]].sort_values("cpa")
print(classified.to_string(index=False))

# Cut-tier spend summary
cut_spend = tier_summary.loc[
    tier_summary["efficiency_tier"] == "Cut", "total_spend"
].sum()
cut_rev = tier_summary.loc[
    tier_summary["efficiency_tier"] == "Cut", "total_revenue"
].sum()
print(f"\nSpend in Cut-tier campaigns:   ${cut_spend:,}")
print(f"Revenue from Cut-tier:         ${cut_rev:,}")
print(f"Net loss from Cut-tier:        ${cut_rev - cut_spend:,}")

Campaigns by efficiency tier:
 efficiency_tier  campaigns  total_spend  total_revenue  total_conv  spend_share
       Excellent          2         2293          38400         390          6.1
            Good          3        14945          49400         498         39.9
        Marginal          2         8252          15900         124         22.0
             Cut          5        24010          15750         132         32.0 (approx — see note)

All campaigns classified:
 campaign_id       channel       segment    cpa   roas efficiency_tier
        C003         Email       Seniors   5.21  19.19       Excellent
        C007         Email  Young Adults   6.67  14.98       Excellent
        C001   Paid Search  Young Adults  30.00   3.33            Good
        C005   Paid Search       Parents  30.00   3.33            Good
        C009   Paid Search       Seniors  30.03   3.33            Good
        C011         Email       Parents  60.17   1.66        Marginal
        C002  Social Media       Parents  75.09   2.00        Marginal
        C006  Social Media       Seniors 100.76   1.49             Cut
        C004       Display  Young Adults 190.77   0.79             Cut
        C010  Social Media  Young Adults 245.45   0.61             Cut
        C008       Display       Parents 251.58   0.60             Cut
        C012       Display       Seniors 317.00   0.47             Cut

Spend in Cut-tier campaigns:   $22,746
Revenue from Cut-tier:         $13,350
Net loss from Cut-tier:        $-9,396

What just happened?

Method — pd.cut() with float("inf") upper bound

pd.cut(df["cpa"], bins=[0, 20, 35, 75, float("inf")]) creates four tiers: 0–20 (Excellent), 20–35 (Good), 35–75 (Marginal), and 75+ (Cut). Using float("inf") as the upper bound captures all values above 75 regardless of how high they go — the robust pattern for any open-ended top tier. observed=True in the groupby suppresses empty tier rows, the same pattern from earlier case studies with categorical group keys.

Business Insight

Five campaigns are in the Cut tier, consuming $22,746 in spend and generating only $13,350 in revenue — a net loss of $9,396 even before product costs. The two Excellent campaigns (Email × Seniors and Email × Young Adults) generate $38,400 in revenue from just $2,293 in spend. If the Cut-tier budget were reallocated to scaling Email, PulseMedia would spend less and acquire far more customers simultaneously.

Checkpoint: Compute what total conversions would be generated if the $22,746 currently in Cut-tier campaigns were reallocated to Email at its blended CPA of $15.32. Use reallocated_conversions = 22746 / 15.32. How many additional customers would PulseMedia acquire? Compare this to the 132 conversions currently generated by the five Cut-tier campaigns combined. This is the quantified upside of the reallocation — the number that wins the argument in the budget meeting.

Key Findings

Email is the highest-performing channel — blended CPA of $15.32 and ROAS of 8.25. Email × Seniors ($5.21 CPA, ROAS 19.19) and Email × Young Adults ($6.67 CPA, ROAS 14.98) are the two best-performing campaigns in the entire portfolio.

Display should be cut entirely — blended CPA of $240.31 and ROAS of 0.68. All three Display campaigns are loss-making: every $1 spent returns only 68 cents in revenue before any product or fulfilment costs.

The pivot table reveals that Social × Young Adults ($245.45 CPA, ROAS 0.61) is nearly as bad as Display — but invisible inside the Social Media channel average. Without the two-dimensional view this campaign would not have been identified for cutting.

Five Cut-tier campaigns produce a net revenue loss of $9,396 — spending $22,746 and returning only $13,350. Reallocating this budget to Email at its $15.32 blended CPA would generate approximately 1,484 additional conversions versus the 132 currently produced.

Paid Search is consistently on-target — exactly $30 CPA across all three segments and ROAS of 3.33. It is reliable and scalable. The CMO should maintain or modestly grow Paid Search while the Display reallocation is implemented.

Visualisations

Blended CPA by Channel

Cost per acquisition · target = $35 · lower is better

$15.32

Paid Search

$30.01

Social Media

$72

$72.72

Display

$240.31

ROAS by Channel

Revenue per $1 spent · breakeven = 1.0 · higher is better

8.25×

Paid Search

3.33×

Social Media

1.74×

Display

0.68×

Pivot Table — CPA by Channel × Segment ($)

The full two-dimensional CPA grid — red cells are above the $35 target:

Channel	Parents	Seniors	Young Adults
Email	$60.17 ✗	$5.21 ✓	$6.67 ✓
Paid Search	$30.00	$30.03	$30.00
Social Media	$75.09 ✗	$100.76 ✗	$245.45 ✗
Display	$251.58 ✗	$317.00 ✗	$190.77 ✗

Green = Excellent (<$20) · Yellow = Good ($20–$35) · Red = above $35 target

Marketing Analytics Decision Guide

Task	Metric	Formula	Watch Out For
Acquisition cost	CPA	`spend / conversions`	Use blended CPA from totals, never mean of rows
Revenue efficiency	ROAS	`revenue / spend`	ROAS below 1.0 = losing money per conversion
Ad quality	CTR	`clicks / impressions * 100`	High CTR + low CVR = landing page problem
Funnel efficiency	Conv rate	`conversions / clicks * 100`	Low CVR + high spend = budget drain
Actionable tiers	Efficiency tier	`pd.cut(cpa, bins, labels)`	Use float("inf") for open-ended top bin

Analyst's Note

Teacher's Note

What Would Come Next?

A senior analyst would build a budget optimisation model — given a fixed total budget, sort campaigns by CPA ascending and allocate spend greedily from most efficient to least, stopping at the target CPA threshold. They would also run a chi-squared test on conversion counts to confirm whether the channel differences are statistically significant before the CMO makes permanent cuts.

Limitations of This Analysis

This analysis uses last-touch attribution — the conversion is credited entirely to the channel that generated the final click. Display ads that built brand awareness before a Paid Search conversion are invisible to this method, which systematically undervalues awareness channels. The CMO should acknowledge this in the meeting even while acting on the CPA data.

Business Decisions This Could Drive

Cut Display entirely and cut Social × Young Adults. Maintain Paid Search. Scale Email — reallocating the $22,746 in Cut-tier spend to Email at $15.32 CPA generates approximately 1,484 additional conversions per quarter versus the 132 currently wasted on those five campaigns.

Practice Questions

1. Which channel had the lowest blended CPA and highest ROAS in the PulseMedia dataset?

2. Why is blended CPA more accurate than the mean of per-campaign CPAs when comparing channels?

3. What value is used as the upper bound of the final open-ended bin in pd.cut() to capture all values above a threshold?

Quiz

Up Next · Case Study 14

Analysing Telecom Customer Retention

You are handed a subscriber dataset. Which contract types churn the most? Does service quality score predict cancellation? And which tenure band is the highest-risk window for the retention team to target?

← Previous Course Index Next →

DS Case Studies

Analysing Marketing Campaign Performance

What This Case Study Covers

The Marketing EDA Toolkit

Dataset Overview

Business Questions

Step-by-Step Analysis

Key Findings

Visualisations

Pivot Table — CPA by Channel × Segment ($)

Marketing Analytics Decision Guide

Analyst's Note

What Would Come Next?

Limitations of This Analysis

Business Decisions This Could Drive

Practice Questions

Quiz