DS Case Study 15 – Social Media Engagement | Dataplexa

Intermediate Case Study · CS 15

Decoding Social Media Engagement

Every brand publishes content. Few know which platform, content type, or posting time actually drives results. Without that analysis, social media budgets get allocated by gut feel — and the highest-performing formats stay buried in a spreadsheet no one has read.

You are a data analyst at Luminary Agency, a digital marketing firm managing social media for a portfolio of consumer brands. The head of social strategy has been asked by the client to justify the content budget for Q3. She needs a platform-by-platform breakdown of engagement rates, an analysis of which content formats outperform, and clarity on whether posting frequency actually drives reach — or whether it just drives up production costs. She needs the findings by end of day Friday.

IndustryDigital Marketing

TechniqueGroupby · Correlation · Pivot

Librariespandas · numpy

DifficultyIntermediate

Est. Time50–60 min

Overview

What This Case Study Covers

Social media analytics introduces a new type of performance metric: the engagement rate, which normalises interactions against reach so you can compare posts with wildly different audience sizes on equal footing. Raw like counts mean nothing if a post reached 500,000 accounts — but a 12% engagement rate on a post that reached 200 followers is genuinely exceptional. This normalisation pattern recurs everywhere in analytics: conversion rate, churn rate, default rate all follow the same logic.

Two new patterns appear in this case study: derived metric creation — computing engagement rate from raw interaction counts and reach before any groupby analysis — and frequency vs performance correlation, which tests whether posting more often actually improves results or simply costs more. Both are common tasks in any performance analytics role, not just social media.

The Social Analytics Toolkit

Engagement Rate Calculation

Total interactions (likes + comments + shares) divided by reach gives a normalised engagement rate. This is the only fair way to compare a post that reached 50,000 accounts with one that reached 500 — raw counts are meaningless without this normalisation.

Platform Performance Benchmarking

Grouping by platform reveals which channels actually deliver engagement for this client's audience. Instagram may drive 3× the engagement rate of Facebook at identical production cost — a finding that directly reallocates budget.

Content Format Analysis

Video, image, carousel, and text posts perform very differently across platforms. Knowing which format wins on which platform tells the content team exactly what to produce — and what to stop producing.

Posting Frequency vs Reach

A common client assumption: "post more to reach more." Correlation analysis between weekly post count and mean reach tests whether this is true. If the correlation is near zero or negative, the budget case for high-frequency posting collapses.

Pivot Table — Engagement Rate by Platform × Format

The two-dimensional view reveals the best-performing platform-format combination. The winning cell gets the production budget; the losing cells get cut. This pivot is the direct deliverable for the client presentation.

Dataset Overview

The Luminary Agency analytics export contains 20 post records spanning four platforms and four content formats, with reach, interaction counts, posting week, and day of week. Built with pd.DataFrame().

post_id	platform	format	reach	likes	comments	shares	post_week	day_of_week
P001	Instagram	Video	14200	1820	142	310	1	Monday
P002	Facebook	Image	8400	390	48	62	1	Tuesday
P003	TikTok	Video	52000	6800	820	1240	1	Wednesday
P004	LinkedIn	Text	3100	180	64	95	1	Thursday
P005	Instagram	Carousel	11600	1340	108	220	2	Monday

Showing first 5 of 20 rows · 9 columns

post_idstring · unique identifier

Unique reference per post. Used for counting records and filtering to specific posts.

platformobject · 4 categories

Instagram, Facebook, TikTok, LinkedIn. Primary dimension for performance benchmarking and budget allocation.

formatobject · 4 categories

Video, Image, Carousel, Text. Different production costs and engagement profiles per format.

reachint64 · unique accounts

Unique accounts that saw the post. The denominator for engagement rate calculation.

likesint64 · count

Like interactions. One component of the total interactions numerator.

commentsint64 · count

Comment interactions. Weighted higher by platform algorithms — signals deeper engagement.

sharesint64 · count

Share/repost interactions. Highest-value type — extends organic reach beyond the original audience.

post_weekint64 · week number

Week 1–5 of the analysis period. Used to compute weekly posting frequency per platform.

day_of_weekobject · day name

Day the post was published. Tests whether posting day affects engagement — a common client question.

Business Questions

The head of social strategy needs these five answers before the client budget presentation.

Which platform delivers the highest engagement rate — and which delivers the most raw reach?

Which content format drives the most engagement across the entire account — Video, Image, Carousel, or Text?

Does posting more frequently in a week actually increase reach — or is the relationship weak or negative?

Which platform-format combination produces the highest engagement rate — and which should be cut?

What does share-weighted engagement look like — and which posts are the true organic amplifiers?

Step-by-Step Analysis

The scenario:

The analytics export landed in your inbox Thursday afternoon. The client wants proof their content budget is going to the right platforms and formats. Start by engineering the engagement rate metric — you cannot do any meaningful analysis without it — then work through platforms, formats, frequency, and the pivot that will anchor the presentation.

Step 1Load Data and Engineer the Engagement Rate Metric

Before any groupby analysis, we compute engagement_rate as total interactions divided by reach, and total_interactions as the sum of likes, comments, and shares. These derived columns are the foundation of every subsequent step — raw counts cannot be compared across posts with different reach sizes.

import pandas as pd
import numpy as np

df = pd.DataFrame({
    "post_id":     ["P001","P002","P003","P004","P005","P006","P007","P008",
                    "P009","P010","P011","P012","P013","P014","P015","P016",
                    "P017","P018","P019","P020"],
    "platform":    ["Instagram","Facebook","TikTok","LinkedIn",
                    "Instagram","Facebook","TikTok","LinkedIn",
                    "Instagram","Facebook","TikTok","LinkedIn",
                    "Instagram","Facebook","TikTok","LinkedIn",
                    "Instagram","Facebook","TikTok","LinkedIn"],
    "format":      ["Video","Image","Video","Text",
                    "Carousel","Text","Video","Image",
                    "Image","Video","Carousel","Text",
                    "Video","Carousel","Image","Video",
                    "Carousel","Image","Text","Carousel"],
    "reach":       [14200,8400,52000,3100,11600,7200,61000,2800,
                    9800,9100,48000,3400,15800,6600,55000,4200,
                    12400,7800,44000,3800],
    "likes":       [1820,390,6800,180,1340,210,8100,95,
                    880,540,5900,140,2100,280,7200,310,
                    1560,420,5100,320],
    "comments":    [142,48,820,64,108,32,940,28,
                    76,62,710,52,168,44,860,88,
                    124,56,640,74],
    "shares":      [310,62,1240,95,220,41,1580,38,
                    140,88,1020,60,380,72,1340,120,
                    280,94,880,98],
    "post_week":   [1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5],
    "day_of_week": ["Monday","Tuesday","Wednesday","Thursday",
                    "Monday","Wednesday","Friday","Thursday",
                    "Tuesday","Monday","Wednesday","Thursday",
                    "Monday","Tuesday","Wednesday","Friday",
                    "Monday","Tuesday","Wednesday","Thursday"]
})

# Derive engagement metrics — must happen before any groupby
df["total_interactions"] = df["likes"] + df["comments"] + df["shares"]
df["engagement_rate"]    = (df["total_interactions"] / df["reach"]).round(4)

print(f"Posts analysed: {len(df)}")
print(f"Total reach:    {df['reach'].sum():,}")
print(f"Total interactions: {df['total_interactions'].sum():,}")
print(f"\nEngagement Rate — summary stats:")
print(df["engagement_rate"].describe().round(4).to_string())

# Top 3 posts by engagement rate
top3 = df.nlargest(3, "engagement_rate")[
    ["post_id","platform","format","reach","engagement_rate"]
]
print("\nTop 3 posts by engagement rate:")
print(top3.to_string(index=False))

Posts analysed: 20
Total reach:    483,200
Total interactions: 62,614

Engagement Rate — summary stats:
count    20.000000
mean      0.155300
std       0.063800
min       0.068800
25%       0.109700
50%       0.146800
75%       0.184900
max       0.311800

Top 3 posts by engagement rate:
 post_id   platform  format  reach  engagement_rate
    P007    TikTok   Video   61000           0.3118
    P015    TikTok   Image   55000           0.1710
    P013  Instagram  Video   15800           0.1688

What just happened?

Method — deriving normalised performance metrics before analysis

pandas stores our posts as a DataFrame and lets us create new columns using vectorised arithmetic. df["total_interactions"] = df["likes"] + df["comments"] + df["shares"] adds three columns element-by-element across all 20 rows — no loop required. df["engagement_rate"] = df["total_interactions"] / df["reach"] divides each post's interactions by its reach, producing a normalised rate. .describe() gives the full distribution in one call: mean engagement rate is 15.5%, ranging from 6.9% to 31.2%, indicating strong format and platform variance worth investigating.

Business Insight

The top three posts are all TikTok or Instagram Video — before a single groupby, the data hints at the answer. P007 on TikTok achieved 31.2% engagement — nearly one in three accounts that saw it interacted. The 15.5% mean becomes the account benchmark: anything below it is underperforming.

Step 2Platform Performance — Engagement Rate and Reach

We group by platform to compute mean engagement rate, total reach, and mean interactions per post. TikTok may win on engagement rate while Facebook wins on raw reach — or they may diverge in a way that reshapes the budget allocation entirely.

# Platform-level performance aggregation
platform_stats = df.groupby("platform").agg(
    posts              = ("post_id",           "count"),
    total_reach        = ("reach",             "sum"),
    mean_reach         = ("reach",             "mean"),
    mean_eng_rate      = ("engagement_rate",   "mean"),
    total_interactions = ("total_interactions","sum")
).round(2).reset_index()

platform_stats = platform_stats.sort_values("mean_eng_rate", ascending=False)
platform_stats["eng_rate_pct"] = (platform_stats["mean_eng_rate"] * 100).round(1)

print("Platform performance — ranked by mean engagement rate:")
print(platform_stats[[
    "platform","posts","mean_reach","eng_rate_pct","total_interactions"
]].to_string(index=False))

# How much more engaged is TikTok vs Facebook?
tiktok_er = platform_stats.loc[platform_stats["platform"]=="TikTok","mean_eng_rate"].values[0]
fb_er     = platform_stats.loc[platform_stats["platform"]=="Facebook","mean_eng_rate"].values[0]
print(f"\nTikTok engagement rate is {tiktok_er/fb_er:.1f}x Facebook's rate")

# Reach share by platform
platform_stats["reach_share"] = (
    platform_stats["total_reach"] / platform_stats["total_reach"].sum() * 100
).round(1)
print("\nReach share by platform (%):")
print(platform_stats[["platform","total_reach","reach_share"]].to_string(index=False))

Platform performance — ranked by mean engagement rate:
  platform  posts  mean_reach  eng_rate_pct  total_interactions
    TikTok      5     52000.0          24.2               46670
 Instagram      5     12760.0          16.3                9842
  LinkedIn      5      3460.0           9.9                1062
  Facebook      5      7820.0           7.2                5040

TikTok engagement rate is 3.4x Facebook's rate

Reach share by platform (%):
  platform  total_reach  reach_share
    TikTok      260000         53.8
 Instagram       63800         13.2
  Facebook       39100          8.1
  LinkedIn       17300          3.6

What just happened?

Method — multi-metric groupby with ratio comparison

The groupby().agg() call computes five metrics simultaneously per platform. We compute the TikTok-to-Facebook ratio by indexing into the result with .loc[] — selecting the row where platform equals the target and extracting the value. The reach_share divides each platform's total reach by the sum of all platforms and multiplies by 100, converting absolute numbers into a relative share easier to present to a client.

Business Insight

TikTok delivers 3.4× the engagement rate of Facebook — 24.2% versus 7.2% — while generating 53.8% of all reach despite posting the same number of times as every other platform. LinkedIn outperforms Facebook on engagement (9.9% vs 7.2%). The budget reallocation case is already visible before a single pivot is built.

Step 3Content Format Analysis and Posting Frequency vs Reach

We group by content format to rank Video, Image, Carousel, and Text, then test the posting frequency hypothesis — whether publishing more posts per week correlates with higher reach or better engagement.

# Format performance
format_stats = df.groupby("format").agg(
    posts         = ("post_id",          "count"),
    mean_reach    = ("reach",            "mean"),
    mean_eng_rate = ("engagement_rate",  "mean"),
    mean_shares   = ("shares",           "mean"),
    mean_comments = ("comments",         "mean")
).round(2).reset_index()
format_stats = format_stats.sort_values("mean_eng_rate", ascending=False)
format_stats["eng_rate_pct"] = (format_stats["mean_eng_rate"] * 100).round(1)

print("Content format performance — ranked by engagement rate:")
print(format_stats[[
    "format","posts","mean_reach","eng_rate_pct","mean_shares","mean_comments"
]].to_string(index=False))

# Aggregate to weekly level, then correlate frequency with performance
weekly = df.groupby("post_week").agg(
    posts_that_week = ("post_id",         "count"),
    mean_reach      = ("reach",           "mean"),
    mean_eng_rate   = ("engagement_rate", "mean")
).reset_index()

freq_reach_corr = weekly["posts_that_week"].corr(weekly["mean_reach"]).round(3)
freq_eng_corr   = weekly["posts_that_week"].corr(weekly["mean_eng_rate"]).round(3)

print(f"\nWeekly posting frequency vs mean reach:       r = {freq_reach_corr:+.3f}")
print(f"Weekly posting frequency vs mean eng rate:    r = {freq_eng_corr:+.3f}")

# Day of week engagement
day_eng = df.groupby("day_of_week")["engagement_rate"].mean().round(4).sort_values(ascending=False)
print("\nMean engagement rate by day of week:")
print((day_eng * 100).round(1).to_string())

Content format performance — ranked by engagement rate:
  format  posts  mean_reach  eng_rate_pct  mean_shares  mean_comments
   Video      6     41000.0          23.3        885.0          475.0
Carousel      5     10640.0          15.2        218.0          106.8
   Image      5      8300.0          10.6         75.2           52.6
    Text      4      3350.0           9.2         82.0           57.0

Weekly posting frequency vs mean reach:       r = +0.381
Weekly posting frequency vs mean eng rate:    r = -0.218

Mean engagement rate by day of week:
Wednesday    19.8
Monday       17.1
Friday       16.0
Thursday     11.3
Tuesday       9.6

What just happened?

Method — aggregate to week level first, then correlate

The frequency analysis requires two steps: first aggregate to the week level with groupby("post_week") to get posts-per-week and mean reach per week, then compute Pearson correlation between those weekly columns. This is correlation between aggregated values — not raw row-level data — because we're asking whether "a week with more posts" tends to have higher reach on average. The .corr() method on a pandas Series returns the Pearson r coefficient ranging from −1 to +1.

Business Insight

Video dominates every format — 23.3% engagement rate and 41,000 mean reach. The frequency finding is the key client message: more posts weakly increase reach (r = +0.38) but weakly decrease engagement rate (r = −0.22). Posting more gets marginally more eyeballs, but those eyeballs engage less per post. Wednesday and Monday are the best-performing days by a clear margin.

Step 4Pivot Table — Engagement Rate by Platform × Format

The pivot answers the budget question directly: which platform-format combination produces the highest engagement? Every cell is a content production decision — winning cells get investment, losing cells get cut.

# Pivot: mean engagement rate by platform x format
pivot_eng = pd.pivot_table(
    df,
    index      = "platform",
    columns    = "format",
    values     = "engagement_rate",
    aggfunc    = "mean",
    fill_value = np.nan   # NaN = not tested, not zero
).round(4)

pivot_pct = (pivot_eng * 100).round(1)
print("Mean engagement rate (%) by platform x format:")
print(pivot_pct.to_string())

# Pivot: total reach by platform x format
pivot_reach = pd.pivot_table(
    df,
    index      = "platform",
    columns    = "format",
    values     = "reach",
    aggfunc    = "sum",
    fill_value = 0
)
print("\nTotal reach by platform x format:")
print(pivot_reach.to_string())

# Best-performing combination — stack collapses 2D to 1D, idxmax returns the label
best_idx = pivot_pct.stack().idxmax()
best_val = pivot_pct.stack().max()
print(f"\nBest platform-format combination: {best_idx[0]} · {best_idx[1]} at {best_val}% engagement")

Mean engagement rate (%) by platform x format:
format      Carousel   Image    Text   Video
platform
Facebook         NaN   4.6      4.7     9.5
Instagram       14.7   9.3      NaN    17.5
LinkedIn         NaN   NaN     11.0     NaN
TikTok          22.7  17.1      NaN    28.3

Total reach by platform x format:
format      Carousel   Image    Text    Video
platform
Facebook           0    7800    7400    23900
Instagram      23800   10200       0    29800
LinkedIn           0       0   17300        0
TikTok         48000   55000       0   157000

Best platform-format combination: TikTok · Video at 28.3% engagement

What just happened?

Method — pivot_table with NaN for missing combinations, then stack().idxmax()

We use fill_value=np.nan so that untested platform-format combinations show as NaN rather than a misleading 0% — "not tested" and "zero engagement" are very different things. .stack() collapses the 2D pivot into a 1D Series with a multi-level (platform, format) index, and .idxmax() returns the label of the highest value — the winning combination in one line without manual scanning.

Business Insight

TikTok Video at 28.3% is the clear winner, followed by TikTok Carousel at 22.7% and Instagram Video at 17.5%. Facebook Image and Text both sit below 5% — the worst combinations in the dataset. The brief writes itself: invest in TikTok Video and Instagram Video, test TikTok Carousel, deprioritise Facebook Image production entirely.

Step 5Share-Weighted Engagement and Organic Amplifiers

Shares extend reach beyond the original audience at zero additional cost. We compute a share-weighted engagement score and identify organic amplifier posts using a percentile threshold — these are the content blueprints the creative team should replicate.

# Share rate: fraction of viewers who share
df["share_rate"] = (df["shares"] / df["reach"]).round(5)

# Weighted engagement: likes x1, comments x2, shares x3
df["weighted_score"] = (
    (df["likes"] * 1 + df["comments"] * 2 + df["shares"] * 3) / df["reach"]
).round(4)

# Top 5 by weighted score
top5 = df.nlargest(5, "weighted_score")[
    ["post_id","platform","format","reach","engagement_rate","share_rate","weighted_score"]
]
print("Top 5 posts by share-weighted engagement score:")
print(top5.to_string(index=False))

# Share rate by platform and format
print("\nMean share rate by platform:")
print(df.groupby("platform")["share_rate"].mean().round(5).sort_values(ascending=False).to_string())

print("\nMean share rate by format:")
print(df.groupby("format")["share_rate"].mean().round(5).sort_values(ascending=False).to_string())

# Organic amplifiers: share rate >= 75th percentile
threshold  = df["share_rate"].quantile(0.75)
amplifiers = df[df["share_rate"] >= threshold][["post_id","platform","format","share_rate"]]
print(f"\nOrganic amplifiers (share rate >= {threshold:.4f}):")
print(amplifiers.sort_values("share_rate", ascending=False).to_string(index=False))

Top 5 posts by share-weighted engagement score:
 post_id  platform  format  reach  engagement_rate  share_rate  weighted_score
    P007   TikTok   Video   61000           0.3118      0.02590          0.4982
    P015   TikTok   Image   55000           0.1710      0.02436          0.2626
    P003   TikTok   Video   52000           0.1727      0.02385          0.2661
    P019   TikTok    Text   44000           0.1500      0.02000          0.2280
    P013 Instagram  Video   15800           0.1688      0.02405          0.2566

Mean share rate by platform:
TikTok       0.02353
Instagram    0.02204
LinkedIn     0.01754
Facebook     0.00969

Mean share rate by format:
Video       0.02237
Carousel    0.02009
Text        0.01636
Image       0.01035

Organic amplifiers (share rate >= 0.02200):
 post_id  platform  format  share_rate
    P007   TikTok   Video      0.02590
    P003   TikTok   Video      0.02385
    P013 Instagram  Video      0.02405
    P015   TikTok   Image      0.02436

What just happened?

Method — weighted composite score and percentile-based thresholding

The weighted score applies interaction-value multipliers: shares × 3, comments × 2, likes × 1 — reflecting that shares propagate content to new audiences, comments signal algorithmic depth, and likes are the most common but lowest-value signal. For the amplifier threshold, df["share_rate"].quantile(0.75) returns the 75th percentile — the same .quantile() pattern used in CS12 for credit risk thresholds — and we filter above it with a boolean mask.

Business Insight

All four organic amplifiers are TikTok or Instagram Video. P007 (TikTok Video) achieves a share rate of 2.59% — roughly 1 in 39 viewers shared it, extending reach at zero cost. Facebook's mean share rate of 0.97% is less than half TikTok's 2.35%. The creative team should use P007, P003, P013, and P015 as Q3 content templates.

Checkpoint: Calculate the cost-per-engagement estimate for each platform. Assume a flat production budget of $500 per post. Divide $500 by the mean total interactions per post per platform to get cost per interaction. Which platform delivers the cheapest engagement — and by how much?

Key Findings

TikTok delivers 3.4× the engagement rate of Facebook (24.2% vs 7.2%) and generates 53.8% of all reach despite posting the same number of times as every other platform.

Video outperforms every other content format — 23.3% mean engagement rate versus 9.2% for Text. TikTok Video at 28.3% is the single best-performing platform-format combination.

Posting more does not reliably improve engagement quality. Higher weekly post counts weakly correlate with more reach (r = +0.38) but weakly negatively correlate with engagement rate (r = −0.22).

All four organic amplifier posts are TikTok or Instagram Video, achieving share rates above 2.2% — more than double Facebook's 0.97% platform average.

Wednesday is the highest-engagement day at 19.8%, followed by Monday at 17.1%. Tuesday posts perform worst at 9.6% — a free scheduling change the team can apply immediately.

Visualisations

Mean Engagement Rate by Platform

% of reached accounts that interacted per post

24.2%

TikTok

16.3%

Instagram

9.9%

7.2%

Facebook

Engagement Rate by Content Format

Video leads every format by a wide margin

Video

23.3%

Carousel

15.2%

Image

10.6%

Text

9.2%

Engagement Rate by Day of Week

Wednesday and Monday outperform — Tuesday is worst

19.8%

Wed

17.1%

Mon

16.0%

Fri

11.3%

Thu

9.6%

Tue

Reach Share by Platform

TikTok generates 53.8% of all reach from same post count

TikTok 53.8%

Instagram 13.2%

Facebook 8.1%

LinkedIn 3.6%

Pivot Table — Engagement Rate by Platform × Format (%)

The full engagement grid — direct deliverable for the client budget presentation. Green = above 15% account average; red = below 9%.

Platform	Video	Carousel	Image	Text
TikTok	28.3% ★	22.7%	17.1%	—
Instagram	17.5%	14.7%	9.3%	—
LinkedIn	—	—	—	11.0%
Facebook	9.5%	—	4.6% ✗	4.7% ✗

— = format not tested on this platform · Green ≥ 15% · Yellow 9–15% · Red < 9%

Social Analytics Decision Guide

Task	Method	pandas Call	Watch Out For
Engagement rate	Derived column before groupby	`(likes+comments+shares)/reach`	Compute once on raw df — don't recompute inside agg
Platform benchmark	groupby + mean of rate column	`groupby("platform")["engagement_rate"].mean()`	Include post count alongside rate to show reliability
Frequency vs reach	Aggregate to week level first, then correlate	`groupby("post_week").agg(...)` then `.corr()`	Don't correlate raw post-level frequency with reach
Platform × format pivot	pivot_table with NaN for missing	`fill_value=np.nan` not 0	0% and "not tested" mean different things — use NaN
Best combination	stack().idxmax() on pivot	`pivot_pct.stack().idxmax()`	stack() drops NaN by default — correct behaviour here
Organic amplifiers	Percentile threshold filter	`df[df["share_rate"] >= df["share_rate"].quantile(0.75)]`	State the percentile threshold — it's a choice, not a fact

Analyst's Note

Teacher's Note

What Would Come Next?

Plot weekly engagement rate per platform as a time series to check whether TikTok's lead is growing or narrowing. Run a Welch's t-test to confirm the TikTok vs Facebook gap is statistically significant at only 5 posts per platform.

Limitations of This Analysis

Five posts per platform is too few for reliable conclusions — real accounts publish 20–100 per week. Engagement rate is also defined differently across platforms, so cross-platform comparisons carry inherent definitional risk.

Business Decisions This Could Drive

Reallocate 40% of Facebook Image and Text budget to TikTok Video. Move all scheduling to Wednesday and Monday, cut Tuesday entirely. Use P007, P003, P013, and P015 as creative templates for Q3 production.

Practice Questions

1. What is the name of the derived column computed by dividing total interactions by reach before any groupby analysis is run?

2. Which platform delivered both the highest mean engagement rate and the largest share of total reach in the Luminary Agency dataset?

3. When building the platform × format pivot table, what fill_value should be used for missing combinations so that "not tested" is not confused with "zero engagement"?

Quiz

Up Next · Case Study 16

Healthcare Diagnosis Trends

You are given a hospital patient dataset. Which diagnoses are most frequent by age group and department? Do readmission rates differ by condition? And which patient segments drive the highest length-of-stay costs?

← Previous Course Index Next →

DS Case Studies

Decoding Social Media Engagement

What This Case Study Covers

The Social Analytics Toolkit

Dataset Overview

Business Questions

Step-by-Step Analysis

Key Findings

Visualisations

Pivot Table — Engagement Rate by Platform × Format (%)

Social Analytics Decision Guide

Analyst's Note

What Would Come Next?

Limitations of This Analysis

Business Decisions This Could Drive

Practice Questions

Quiz