MS SQL Lesson 32 – Execution Plans | Dataplexa

Execution Plans

When SQL Server receives a query it does not execute it directly — it first compiles it into an execution plan: a tree of physical operators that describes exactly how the data will be retrieved, joined, filtered, aggregated, and returned. Reading an execution plan is the most direct way to understand why a query runs at the speed it does. Everything covered in the Query Optimisation lesson — SARGability, index seeks vs scans, join strategies, statistics accuracy — manifests visibly in the execution plan. An execution plan tells you which table was scanned when it should have been seeked, which join was a nested loop when a hash match would have been better, and which operator is consuming 94% of the total query cost. This lesson covers how to obtain execution plans in three forms, how to read the most important operators, and how to use the warnings and cost annotations the optimiser includes to guide your tuning decisions.

Three Ways to Get an Execution Plan

SQL Server provides three plan types for different purposes. The estimated plan is generated by the optimiser without executing the query — useful for checking plan shape before running an expensive query or on a production system. The actual plan is captured after the query executes and includes real runtime metrics: actual row counts, actual executions, spill warnings, and memory grants. The difference between estimated and actual row counts is one of the most important diagnostics in the entire plan — a large discrepancy signals stale statistics. The live query statistics view shows the plan animating in real time as the query executes, row counts filling in as operators complete — useful for diagnosing long-running queries that have not finished yet.

-- Three plan retrieval methods

USE DataplexaStore;

-- METHOD 1: Estimated plan — no execution, no data returned
-- In SSMS: Ctrl+L  or  Query menu → Display Estimated Execution Plan
SET SHOWPLAN_ALL ON;    -- text form of estimated plan

SELECT
    p.ProductName,
    p.Category,
    SUM(oi.Quantity * oi.UnitPrice) AS revenue
FROM   Products    p
JOIN   OrderItems  oi ON p.ProductID = oi.ProductID
GROUP BY p.ProductName, p.Category
ORDER BY revenue DESC;

SET SHOWPLAN_ALL OFF;

-- METHOD 2: Actual plan — executes the query and captures runtime stats
-- In SSMS: Ctrl+M  or  Query menu → Include Actual Execution Plan
SET STATISTICS IO  ON;   -- shows logical reads per table
SET STATISTICS TIME ON;  -- shows parse, compile, and execute time

SELECT
    p.ProductName,
    p.Category,
    SUM(oi.Quantity * oi.UnitPrice) AS revenue
FROM   Products    p
JOIN   OrderItems  oi ON p.ProductID = oi.ProductID
GROUP BY p.ProductName, p.Category
ORDER BY revenue DESC;

SET STATISTICS IO  OFF;
SET STATISTICS TIME OFF;

-- STATISTICS IO output:
Table 'OrderItems'. Scan count 1, logical reads 1, physical reads 0,
read-ahead reads 0, lob logical reads 0.
Table 'Products'. Scan count 1, logical reads 1, physical reads 0,
read-ahead reads 0, lob logical reads 0.

-- STATISTICS TIME output:
SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 1 ms.
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms.

-- Query result:
ProductName Category revenue
──────────────────── ─────────── ───────
Monitor 27 inch Electronics 719.98
Notebook A5 Stationery 103.92
Desk Lamp Home Office 73.48
Mechanical Keyboard Electronics 81.80
USB-C Hub Electronics 44.99
Wireless Mouse Electronics 29.99

logical reads = 1 for both tables means SQL Server read exactly one 8KB page for each — on our small dataset this is expected; in production a query showing logical reads = 48,000 on a table is reading 375MB of data and is a candidate for optimisation
physical reads = 0 confirms both tables were in the buffer pool (memory cache) — physical reads only occur on a cold cache when pages are not yet in memory; in production high physical reads after a server restart or failover are expected and normalise as the buffer pool warms up
SET STATISTICS IO / TIME are the text equivalents of the graphical plan's I/O and time annotations — they are useful in automated scripts and when connecting via command line tools where the graphical plan is not available

Reading Plan Operators — the Core Set

An execution plan is a tree of operators, read right to left and bottom to top in the graphical view. Data flows from the leaf operators (table and index reads) through intermediate operators (joins, aggregations, sorts) to the root operator (SELECT or the DML statement). Each operator has an estimated cost percentage, estimated row counts, and in the actual plan, real row counts. The most important operators to recognise are: Clustered Index Seek (targeted row retrieval — fast), Clustered Index Scan (full table read — investigate), Nonclustered Index Seek (fast read via nonclustered index), Key Lookup (secondary read to retrieve columns not in the nonclustered index), Hash Match (join or aggregation using a hash table — good for large unordered inputs), Nested Loops (join by scanning the inner table once per outer row — good for small outer inputs with an indexed inner), Sort (explicit sort — expensive on large row counts, may spill to disk), and Parallelism / Gather Streams (query is using multiple CPU threads).

-- Queries that produce different plan operators — observe in SSMS graphical plan

-- Clustered Index SEEK — primary key lookup, single row
-- Plan: Clustered Index Seek (cost ~100% on a one-row result)
SELECT ProductID, ProductName, Price
FROM   Products
WHERE  ProductID = 5;

-- Clustered Index SCAN — no useful index for the predicate
-- Plan: Clustered Index Scan (reads entire Products table)
SELECT ProductID, ProductName, Price
FROM   Products
WHERE  YEAR(Price) = 36;        -- non-SARGable: function on column forces scan

-- Nonclustered Index Seek + Key Lookup — index found rows but needs more columns
-- Plan: Index Seek on IX_Orders_UserID → Key Lookup to PK_Orders for Status, TotalAmount
SELECT OrderID, Status, TotalAmount
FROM   Orders
WHERE  UserID = 2;

-- Hash Match (aggregate) — GROUP BY on unordered data
-- Plan: Clustered Index Scan → Hash Match (Aggregate)
SELECT Category, COUNT(*) AS product_count, AVG(Price) AS avg_price
FROM   Products
GROUP BY Category;

-- Nested Loops — small outer input (one order), indexed inner
-- Plan: Clustered Index Seek on Orders → Nested Loops → Index Seek on OrderItems
SELECT o.OrderID, o.Status, oi.ProductID, oi.Quantity
FROM   Orders     o
JOIN   OrderItems oi ON o.OrderID = oi.OrderID
WHERE  o.OrderID = 7;

-- Seek result:
ProductID ProductName Price
───────── ─────────── ─────
5 Desk Lamp 36.74

-- YEAR(Price) scan — no rows (YEAR on DECIMAL is invalid T-SQL, shown for concept):
-- Use WHERE Price BETWEEN 36 AND 37 for SARGable equivalent

-- UserID = 2 orders:
OrderID Status TotalAmount
─────── ───────── ───────────
2 delivered 512.92
6 delivered 80.99

-- Category aggregation:
Category product_count avg_price
─────────── ───────────── ─────────
Electronics 5 118.45
Home Office 3 222.24
Stationery 2 10.77

-- Order 7 line items:
OrderID Status ProductID Quantity
─────── ────────── ───────── ────────
7 processing 1 2
7 processing 3 1

The Index Seek + Key Lookup pattern on the UserID = 2 query is visible in SSMS — the nonclustered index IX_Orders_UserID finds the two matching OrderIDs efficiently, then the Key Lookup retrieves Status and TotalAmount from the clustered index for each; adding Status and TotalAmount to the INCLUDE clause of IX_Orders_UserID would eliminate the Key Lookup entirely
Hash Match on the GROUP BY is expected for small unsorted inputs — SQL Server builds an in-memory hash table keyed on Category and accumulates COUNT and AVG as it scans Products; for large tables this operator can spill to tempdb if the hash table exceeds available memory grant
Nested Loops for the single-order join is the optimal choice — the outer side returns exactly one row (OrderID = 7), and for each outer row SQL Server performs one seek into OrderItems; Nested Loops has very low overhead for small outer inputs with indexed inner tables

Estimated vs Actual Row Counts — the Most Important Diagnostic

Every operator in an execution plan has two row count annotations: the estimated rows (a triangle icon in SSMS) that the optimiser predicted based on statistics, and the actual rows (circle icon) that flowed through the operator at runtime. When these numbers match, the optimiser had accurate information and likely chose a good plan. When they diverge significantly — estimated 1 row, actual 50,000 rows — the optimiser made poor choices based on bad statistics, leading to wrong join strategies, insufficient memory grants, and serial execution where parallelism would have helped. Estimated vs actual row count divergence is the single most important thing to check when an execution plan looks unexpectedly bad.

-- Force a plan with stale statistics to demonstrate row count mismatch
-- (In production this occurs naturally after large data changes)

-- Step 1: check current statistics for Orders.Status
DBCC SHOW_STATISTICS ('Orders', 'IX_Orders_Status');

-- Step 2: artificially make statistics appear stale by loading new rows
-- then observing the plan before UPDATE STATISTICS runs
-- (In this lesson we query the DMV to show the concept)

-- Show estimated vs actual rows using the query plan DMV
-- This captures the most recent actual plan for a given query
SELECT TOP 5
    qs.execution_count,
    qs.total_logical_reads / qs.execution_count   AS avg_reads,
    qp.query_plan                                  AS plan_xml   -- XML plan
FROM       sys.dm_exec_query_stats   qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE      qp.dbid = DB_ID()
ORDER BY   qs.total_logical_reads DESC;

-- In SSMS: hover over any operator node to see the tooltip
-- Key tooltip fields to check:
--   Estimated Number of Rows  vs  Actual Number of Rows
--   Estimated Row Size        vs  Actual Row Size
--   Warnings                  (yellow triangle — spills, implicit conversions)
--   Output List               (which columns flow out of this operator)
PRINT 'Hover over operators in the SSMS graphical plan to see row count tooltips';

-- DBCC SHOW_STATISTICS output (condensed):
Name Updated Rows Rows Sampled Steps Density ...
─────────────────── ─────────────────────── ──── ──────────── ───── ───────
IX_Orders_Status 2024-03-28 15:10:00.000 4 4 3 0.25

-- Statistics histogram:
RANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS AVG_RANGE_ROWS
──────────── ────────── ─────── ─────────────────── ──────────────
delivered 0 2 0 1
processing 0 1 0 1
shipped 0 1 0 1

-- plan_xml returned as XML — open in SSMS for graphical rendering
(XML plan output omitted for display — open in SSMS XML editor)

The statistics histogram shows exactly 2 'delivered' rows, 1 'processing', and 1 'shipped' — when the optimiser builds a plan for WHERE Status = 'delivered' it estimates exactly 2 rows, which is accurate; after a bulk load of 50,000 'delivered' orders this histogram becomes wildly inaccurate until UPDATE STATISTICS runs
Density = 0.25 means the average selectivity of a Status value is 25% of the table — with 4 rows and 3 distinct values this is mathematically correct; in production low density values on high-cardinality columns guide the optimiser toward seeks over scans
qp.query_plan returns an XML execution plan that SSMS renders graphically when clicked — storing and comparing plans before and after an optimisation change is the most rigorous way to confirm the change had the intended effect

Plan Warnings — What the Yellow Triangle Means

SQL Server adds warning annotations to execution plan operators when it detects conditions that degraded performance or correctness. The most important warnings are: an implicit conversion warning (a type mismatch forced SQL Server to convert one side of a comparison, potentially disabling index use), a spill to tempdb warning (a Sort or Hash Match operator ran out of its memory grant and had to write intermediate data to disk — extremely expensive), a missing index hint embedded in the plan XML (the optimiser found a better index would have helped this specific query), and a cardinality estimate warning (the row count estimate was very poor due to statistics issues). These warnings are the optimiser's direct communication about what went wrong — they should always be investigated.

-- Trigger an implicit conversion warning — type mismatch in JOIN or WHERE
-- This generates a warning in the execution plan tooltip

-- UserID is INT — comparing to a VARCHAR literal triggers implicit conversion
-- Plan will show: "Type conversion in expression may affect CardinalityEstimate"
SELECT u.FullName, o.TotalAmount
FROM   Users  u
JOIN   Orders o ON o.UserID = u.UserID
WHERE  u.UserID = '2';          -- VARCHAR literal vs INT column — warning generated

-- Correct form — no warning, no conversion
SELECT u.FullName, o.TotalAmount
FROM   Users  u
JOIN   Orders o ON o.UserID = u.UserID
WHERE  u.UserID = 2;            -- INT literal — clean plan, no warning

-- Check plan cache for implicit conversion warnings in stored XML plans
SELECT TOP 10
    SUBSTRING(qt.text, 1, 100)  AS query_text,
    qp.query_plan
FROM       sys.dm_exec_query_stats   qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle)  qt
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE      CAST(qp.query_plan AS NVARCHAR(MAX))
               LIKE '%PlanAffectingConvert%'    -- XML attribute for implicit conversion
  AND      qt.dbid = DB_ID()
ORDER BY   qs.total_logical_reads DESC;

-- Both queries return same result:
FullName TotalAmount
──────────── ───────────
Bob Williams 512.92
Bob Williams 80.99

-- Plan cache warning search:
query_text query_plan
──────────────────────────────────────────── ──────────────────────
SELECT u.FullName, o.TotalAmount FROM Users <ShowPlanXML xmlns=...>
u JOIN Orders o ON o.UserID = u.UserID WHERE
u.UserID = '2'

Both queries return identical results — the implicit conversion is silent from a correctness standpoint, which is what makes it dangerous; it degrades performance without producing an error or any visible difference in the result set
PlanAffectingConvert in the plan XML is the specific attribute that indicates an implicit type conversion affected cardinality estimation — searching the plan cache for this attribute finds every query in the workload with this problem without having to run each one individually
The CAST approach to searching plan XML is a practical production technique — it is not fast (it scans the entire plan cache) so it should be run during off-peak hours or against a copy of the plan cache, not during high-load periods

Summary Table

Plan Element	What It Means	Action
Clustered Index Seek	Targeted row retrieval via PK	Ideal — no action needed
Clustered Index Scan	Full table read	Investigate — missing index or non-SARGable predicate
Key Lookup	Secondary read for missing columns	Add INCLUDE columns to eliminate it
Nested Loops	Row-by-row join — good for small inputs	Concern if outer input is large
Hash Match	Hash table join or aggregation	Watch for spill warning — may need more memory
Sort	Explicit reordering of rows	Eliminate with an ordered index when possible
Est. rows ≠ Actual rows	Statistics mismatch	UPDATE STATISTICS on the affected table
Yellow triangle warning	Implicit conversion, spill, or missing index	Always investigate — performance impact guaranteed

Practice Questions

Practice 1. What is the difference between an estimated and an actual execution plan?

Practice 2. A Key Lookup operator appears in an execution plan. What caused it and how is it eliminated?

Practice 3. An execution plan shows estimated rows = 1 but actual rows = 85,000 on a join operator. What does this mean and what should you do?

Practice 4. What does SET STATISTICS IO ON output and why is logical reads the key metric?

Practice 5. When is a Nested Loops join the best choice and when does it become a problem?

Quiz

Quiz 1. In the SSMS graphical plan, data flows in which direction?

Right to left
Left to right
Top to bottom
Bottom to top

Quiz 2. What does a yellow triangle warning on an execution plan operator indicate?

A performance warning — implicit conversion, spill, or missing index
Parallel operator
Operator cost exceeds 50%
Returned zero rows

Quiz 3. Which T-SQL command shows logical reads and physical reads per table without opening a graphical plan?

SET STATISTICS IO ON
SET SHOWPLAN_ALL ON
DBCC SHOW_STATISTICS
SET STATISTICS PROFILE ON

Quiz 4. A Hash Match operator in the plan has a spill warning. What caused it and what is the fix?

Hash table exceeded memory grant — spilled to tempdb. Fix: update statistics
Duplicate keys — rebuild index
tempdb ran out of space
Too many joins

Quiz 5. What XML attribute in a cached query plan indicates an implicit type conversion warning?

PlanAffectingConvert
ImplicitCast
TypeMismatch
ConversionWarning

Next up - Stored Procedures - Encapsulate your T-SQL logic into named, reusable, parameterised database objects that execute faster and protect against SQL injection.

← Previous Course Index Next →

MS SQL Server