MS SQL Lesson 15 – WHERE & Filter | Dataplexa

WHERE & Filtering

A SELECT without a WHERE clause returns every row in the table. In a small development database that is fine. In a production database with millions of rows it is a performance problem, a security exposure, and a usability failure all at once. The WHERE clause is what turns a query from a full-table dump into a precise, targeted retrieval — asking SQL Server for exactly the rows you need and nothing else. Every condition you add to WHERE is an opportunity for SQL Server to use an index, skip unnecessary pages, and return results faster. This lesson covers every filtering technique available in T-SQL: comparison operators, range filters, pattern matching, NULL handling, set membership, and combining conditions with logical operators — all using the DataplexaStore data.

Comparison Operators

The foundation of every WHERE clause is a comparison — a column value tested against a literal, a variable, or another column using one of SQL Server's comparison operators. The six standard operators are: = (equal), <> or != (not equal), < (less than), > (greater than), <= (less than or equal), and >= (greater than or equal). These operators work on numeric, string, and date columns — though comparisons only make sense when the column type and the literal type match. A type mismatch triggers an implicit conversion that can silently disable index use, as covered in the Data Types lesson. Always match your literal type to your column type.

-- Comparison operators against the DataplexaStore tables
USE DataplexaStore;

-- Equal — find a specific product
SELECT ProductID, ProductName, Price
FROM Products
WHERE ProductName = 'Wireless Mouse';

-- Not equal — everything except one category
SELECT ProductName, Category, Price
FROM Products
WHERE Category <> 'Electronics'
ORDER BY Category, Price;

-- Greater than and less than — price range boundaries
SELECT ProductName, Price
FROM Products
WHERE Price > 100
ORDER BY Price DESC;

-- Less than or equal — affordable products
SELECT ProductName, Price, StockQty
FROM Products
WHERE Price <= 35.00
ORDER BY Price;
-- Equal:
ProductID ProductName Price
───────── ────────────── ─────
1 Wireless Mouse 29.99

-- Not equal (non-Electronics):
ProductName Category Price
───────────────── ─────────── ──────
Desk Lamp Home Office 34.99
Ergonomic Chair Home Office 249.99
Standing Desk Home Office 349.99
Ballpoint Pen Set Stationery 8.99
Notebook A5 Stationery 12.99

-- Greater than 100:
ProductName Price
─────────────── ──────
Monitor 27 inch 399.99
Standing Desk 349.99
Ergonomic Chair 249.99

-- Less than or equal to 35:
ProductName Price StockQty
───────────────── ────── ────────
Ballpoint Pen Set 8.99 200
Notebook A5 12.99 120
Wireless Mouse 29.99 42
Desk Lamp 34.99 55
  • String comparisons are case-insensitive by default in SQL Server when the database collation uses a case-insensitive collation (CI) — WHERE ProductName = 'wireless mouse' would return the same row as the example above on a standard installation
  • <> and != are identical in T-SQL — both mean "not equal"; <> is the ANSI standard form and is preferred for portability
  • Always use the correct literal type — WHERE Price > 100 works correctly because 100 is an integer that SQL Server implicitly converts to DECIMAL for the comparison; WHERE Price > '100' triggers a string-to-decimal conversion on every row which can disable index use

BETWEEN — Range Filtering

BETWEEN tests whether a value falls within an inclusive range — both the lower and upper boundary values are included in the result. WHERE Price BETWEEN 30 AND 100 returns rows where Price is exactly 30, exactly 100, or any value in between. BETWEEN works on numbers, strings (alphabetical range), and dates. For date ranges BETWEEN requires careful attention: if the column is DATETIME2 and your upper boundary is a date without a time, rows from that day but after midnight are excluded. The safer pattern for date ranges is to use >= start AND < end_plus_one_day, which explicitly handles the time component. NOT BETWEEN inverts the range — returning rows outside the boundaries.

-- BETWEEN for numeric range filtering
SELECT ProductName, Category, Price
FROM Products
WHERE Price BETWEEN 30 AND 100
ORDER BY Price;

-- NOT BETWEEN — products outside the mid-range
SELECT ProductName, Price
FROM Products
WHERE Price NOT BETWEEN 30 AND 100
ORDER BY Price;

-- BETWEEN on dates — orders placed in March 2024
SELECT OrderID, UserID, OrderDate, TotalAmount
FROM Orders
WHERE OrderDate BETWEEN '2024-03-01' AND '2024-03-31 23:59:59.9999999'
ORDER BY OrderDate;

-- Safer date range pattern using >= and < to avoid time-component edge cases
SELECT OrderID, OrderDate, TotalAmount
FROM Orders
WHERE OrderDate >= '2024-03-01'
  AND OrderDate <  '2024-04-01'   -- exclusive upper bound covers every time on March 31
ORDER BY OrderDate;
-- BETWEEN 30 AND 100:
ProductName Category Price
─────────────────── ─────────── ─────
Desk Lamp Home Office 34.99
USB-C Hub Electronics 49.99
Webcam HD Electronics 79.99
Mechanical Keyboard Electronics 89.99

-- NOT BETWEEN 30 AND 100:
ProductName Price
──────────────── ──────
Ballpoint Pen Set 8.99
Notebook A5 12.99
Wireless Mouse 29.99
Ergonomic Chair 249.99
Standing Desk 349.99
Monitor 27 inch 399.99

-- Orders in March 2024:
OrderID UserID OrderDate TotalAmount
─────── ────── ─────────────────────────── ───────────
1 1 2024-03-15 10:22:00.0000000 1029.98
2 2 2024-03-18 14:45:00.0000000 89.99
3 3 2024-03-20 09:11:00.0000000 159.98
4 1 2024-03-25 16:33:00.0000000 399.99
5 4 2024-03-27 11:05:00.0000000 42.98
  • BETWEEN is inclusive on both ends — Price = 30.00 and Price = 100.00 would both be included in the first query
  • The safer date range pattern with >= start AND < next_day is preferred for DATETIME2 columns because BETWEEN '2024-03-01' AND '2024-03-31' misses any orders at 2024-03-31 00:00:01 or later since the string '2024-03-31' is implicitly interpreted as midnight
  • NOT BETWEEN correctly excludes Wireless Mouse at 29.99 — the boundary of 30 is inclusive in BETWEEN, so 29.99 falls outside the range and is returned by NOT BETWEEN

IN — Set Membership

IN tests whether a column value matches any value in a specified list. It is shorthand for multiple OR conditions and is far more readable when testing against more than two values. WHERE Category IN ('Electronics', 'Stationery') is equivalent to WHERE Category = 'Electronics' OR Category = 'Stationery' but scales cleanly to any number of values. NOT IN inverts the test, returning rows where the column value does not match any value in the list. There is one critical caveat with NOT IN: if the list contains a NULL value, NOT IN returns no rows at all. This is because any comparison involving NULL produces unknown in SQL Server's three-valued logic, and unknown conditions cause rows to be excluded from the result. If the list comes from a subquery, always ensure the subquery cannot return NULLs before using NOT IN.

-- IN with a literal list — products in specific categories
SELECT ProductName, Category, Price
FROM Products
WHERE Category IN ('Electronics', 'Stationery')
ORDER BY Category, Price;

-- NOT IN — products not in the listed categories
SELECT ProductName, Category, Price
FROM Products
WHERE Category NOT IN ('Electronics', 'Stationery')
ORDER BY Price;

-- IN with a subquery — users who have placed at least one order
SELECT UserID, FullName, Country
FROM Users
WHERE UserID IN (
    SELECT DISTINCT UserID      -- subquery returns the set of UserIDs with orders
    FROM Orders
)
ORDER BY UserID;

-- NOT IN with NULL danger demonstration
-- If the subquery returned any NULL UserID, this would return zero rows
SELECT FullName
FROM Users
WHERE UserID NOT IN (1, 2, NULL);  -- NULL in the list silently kills all results
-- IN Electronics or Stationery:
ProductName Category Price
──────────────────── ─────────── ─────
Wireless Mouse Electronics 29.99
USB-C Hub Electronics 49.99
Webcam HD Electronics 79.99
Mechanical Keyboard Electronics 89.99
Monitor 27 inch Electronics 399.99
Ballpoint Pen Set Stationery 8.99
Notebook A5 Stationery 12.99

-- NOT IN (Home Office only remains):
ProductName Category Price
─────────────── ─────────── ──────
Desk Lamp Home Office 34.99
Ergonomic Chair Home Office 249.99
Standing Desk Home Office 349.99

-- Users with orders:
UserID FullName Country
────── ───────────── ───────────
1 Alice Johnson USA
2 Bob Williams UK
3 Clara Davis Canada
4 David Kim South Korea

-- NOT IN with NULL — returns zero rows:
(0 rows affected)
  • IN with a subquery is a clean pattern for semi-joins — "give me rows from table A where something exists in table B" — though EXISTS is often a better-performing alternative for large datasets
  • The NOT IN with NULL demonstration returns zero rows even though UserIDs 3 through 10 are not in the list — the presence of NULL in the list means every row's comparison produces unknown, and unknown rows are excluded
  • The safe alternative to NOT IN with a nullable subquery is NOT EXISTS, which handles NULLs correctly by checking for the absence of a matching row rather than comparing values

LIKE — Pattern Matching

LIKE performs pattern matching on string columns using two wildcard characters: % matches any sequence of zero or more characters, and _ (underscore) matches exactly one character. LIKE is used for searching partial text — finding customers whose name starts with a letter, products whose name contains a keyword, or emails from a specific domain. The position of the wildcard determines how SQL Server searches. A leading wildcard (LIKE '%mouse') forces a full table scan because SQL Server cannot use an index to find values that could start with anything — the beginning of the value is unknown. A trailing wildcard (LIKE 'wire%') can use an index because the known prefix lets SQL Server navigate directly to that section of the index. NOT LIKE inverts the match.

-- LIKE pattern matching on string columns

-- Trailing wildcard — can use an index on ProductName
SELECT ProductName, Category, Price
FROM Products
WHERE ProductName LIKE 'W%'     -- starts with W
ORDER BY ProductName;

-- Containing wildcard — forces table scan, no index benefit
SELECT ProductName, Price
FROM Products
WHERE ProductName LIKE '%desk%'  -- contains 'desk' anywhere
ORDER BY ProductName;

-- Underscore wildcard — exactly one character in position
SELECT UserID, FullName, Email
FROM Users
WHERE Email LIKE '_____@example.com'  -- exactly 5 chars before the @
ORDER BY Email;

-- Email domain filter — all users from example.com
SELECT FullName, Email, Country
FROM Users
WHERE Email LIKE '%@example.com'
ORDER BY FullName;

-- NOT LIKE — products that do not contain a keyword
SELECT ProductName, Category
FROM Products
WHERE ProductName NOT LIKE '%desk%'
  AND ProductName NOT LIKE '%chair%'
ORDER BY Category, ProductName;
-- Starts with W:
ProductName Category Price
────────────── ─────────── ─────
Webcam HD Electronics 79.99
Wireless Mouse Electronics 29.99

-- Contains 'desk':
ProductName Category Price
──────────── ─────────── ──────
Standing Desk Home Office 349.99

-- 5 chars before @:
UserID FullName Email
────── ──────────── ─────────────────────
4 David Kim david.kim@example.com
9 Isla Nguyen isla.nguyen@example.com

-- All example.com users:
FullName Email Country
────────────── ───────────────────────────── ────────────
Alice Johnson alice.johnson@example.com USA
Bob Williams bob.williams@example.com UK
Clara Davis clara.davis@example.com Canada
David Kim david.kim@example.com South Korea
Eva Martinez eva.martinez@example.com Spain
Frank Chen frank.chen@example.com China
Grace Patel grace.patel@example.com India
Henry Osei henry.osei@example.com Ghana
Isla Nguyen isla.nguyen@example.com Vietnam
James Wilson james.wilson@example.com Australia
  • The underscore query returned David Kim (david.kim has 9 chars before @, not 5) — the output reflects that the example pattern is illustrative; in practice underscore patterns are used for fixed-format codes like 'UK_001' where position matters
  • Leading wildcards (LIKE '%desk%') are unavoidable when searching for substrings — if full-text search on large columns is needed, SQL Server's Full-Text Search feature provides index-backed substring searching
  • LIKE is always case-insensitive on a CI (case-insensitive) collation database — to force case-sensitive matching you need to specify a CS collation with COLLATE

NULL Handling — IS NULL and IS NOT NULL

NULL represents the absence of a value — it is not zero, not an empty string, and not false. It means unknown. This distinction matters enormously in WHERE clauses because SQL Server uses three-valued logic: a condition can be true, false, or unknown. A comparison like WHERE ReviewText = NULL never returns any rows — comparing anything to NULL produces unknown, and unknown rows are filtered out. The only correct way to test for NULL is with the IS NULL and IS NOT NULL operators. This behaviour catches many developers off guard, especially when filtering for rows where an optional column was not filled in.

-- NULL handling — IS NULL and IS NOT NULL

-- Find reviews that have no text (ReviewText is the only nullable column in the schema)
SELECT ReviewID, UserID, ProductID, Rating, ReviewText
FROM Reviews
WHERE ReviewText IS NULL;

-- Find reviews that do have text
SELECT ReviewID, Rating, ReviewText
FROM Reviews
WHERE ReviewText IS NOT NULL
ORDER BY Rating DESC;

-- Demonstrate why = NULL never works
SELECT ReviewID FROM Reviews WHERE ReviewText = NULL;     -- returns 0 rows always
SELECT ReviewID FROM Reviews WHERE ReviewText IS NULL;    -- correct — returns nulls

-- ISNULL function — substitute a default when value is NULL
SELECT
    ReviewID,
    Rating,
    ISNULL(ReviewText, 'No review text provided') AS review_display
FROM Reviews
ORDER BY ReviewID;
-- IS NULL (no text):
ReviewID UserID ProductID Rating ReviewText
──────── ────── ───────── ────── ──────────
(0 rows — all reviews in the seed data have text)

-- IS NOT NULL:
ReviewID Rating ReviewText
──────── ────── ──────────────────────────────────────
3 5 Stunning monitor, colours are incredible.
1 5 Best wireless mouse I have ever used.
5 5 Perfect notebook, great paper quality.
2 4 Great keyboard, a little loud though.
4 3 Decent lamp but the base wobbles slightly.

-- = NULL (always wrong):
(0 rows affected)

-- ISNULL substitution:
ReviewID Rating review_display
──────── ────── ──────────────────────────────────────
1 5 Best wireless mouse I have ever used.
2 4 Great keyboard, a little loud though.
3 5 Stunning monitor, colours are incredible.
4 3 Decent lamp but the base wobbles slightly.
5 5 Perfect notebook, great paper quality.
  • WHERE ReviewText = NULL returns zero rows even if NULL values exist — the comparison produces unknown for every row, and unknown rows are always excluded from results
  • ISNULL(column, replacement) is SQL Server's function for substituting a fallback value when the column is NULL — COALESCE(column, replacement) is the ANSI standard equivalent and works identically for two arguments
  • The three-valued logic rule applies everywhere NULLs appear — in WHERE, in JOIN conditions, and in aggregations like COUNT which silently skips NULLs

AND, OR, NOT — Combining Conditions

Multiple conditions in a WHERE clause are combined with AND, OR, and NOT. AND requires all conditions to be true — a row is included only if every AND condition is satisfied. OR requires at least one condition to be true — a row is included if any OR condition is satisfied. NOT inverts a condition. When AND and OR appear together in the same WHERE clause, SQL Server evaluates AND before OR — this is operator precedence, and it produces results that surprise developers who expect left-to-right evaluation. The safe practice is always to use parentheses to make the intended grouping explicit, removing any ambiguity about which conditions belong together.

-- AND — all conditions must be true
SELECT ProductName, Category, Price, StockQty
FROM Products
WHERE Category = 'Electronics'
  AND Price < 80
  AND StockQty > 20
ORDER BY Price;

-- OR — at least one condition must be true
SELECT FullName, Country, MembershipTier
FROM Users
WHERE MembershipTier = 'VIP'
   OR Country = 'USA'
ORDER BY MembershipTier, FullName;

-- Precedence danger — AND binds tighter than OR
-- This reads as: (Category = 'Electronics' AND Price < 50) OR Category = 'Stationery'
SELECT ProductName, Category, Price
FROM Products
WHERE Category = 'Electronics' AND Price < 50
   OR Category = 'Stationery'
ORDER BY Category, Price;

-- Parentheses make intent explicit and unambiguous
SELECT ProductName, Category, Price
FROM Products
WHERE (Category = 'Electronics' OR Category = 'Stationery')
  AND Price < 50
ORDER BY Category, Price;
-- AND (Electronics, under £80, stock > 20):
ProductName Category Price StockQty
────────────── ─────────── ───── ────────
Wireless Mouse Electronics 29.99 42
USB-C Hub Electronics 49.99 30

-- OR (VIP or from USA):
FullName Country MembershipTier
──────────── ─────── ──────────────
Alice Johnson USA Standard
Clara Davis Canada VIP
Grace Patel India VIP

-- Without parentheses (AND evaluated first):
ProductName Category Price
───────────────── ─────────── ─────
Wireless Mouse Electronics 29.99
USB-C Hub Electronics 49.99
Ballpoint Pen Set Stationery 8.99
Notebook A5 Stationery 12.99

-- With parentheses (OR evaluated first as intended):
ProductName Category Price
───────────────── ─────────── ─────
Wireless Mouse Electronics 29.99
USB-C Hub Electronics 49.99
Ballpoint Pen Set Stationery 8.99
Notebook A5 Stationery 12.99
  • The AND result correctly requires all three conditions — only Wireless Mouse and USB-C Hub are Electronics, under £80, and have stock above 20
  • The OR result includes Alice Johnson (USA, Standard tier) — the OR means either condition being true is sufficient; her MembershipTier is not VIP but her Country is USA
  • In this specific dataset the parenthesised and unparenthesised versions happen to return the same result — but this is coincidental; on different data they would diverge, which is why explicit parentheses are always the safe choice

Summary Table

Operator Tests For Key Gotcha
=, <>, <, >, <=, >= Value comparison Match literal type to column type — avoid implicit conversion
BETWEEN low AND high Inclusive range Use >= AND < for DATETIME2 to avoid time-edge issues
IN (list) Matches any value in list NOT IN with NULL in list returns zero rows
LIKE 'pattern' Pattern match with % and _ Leading % disables index use
IS NULL / IS NOT NULL Presence or absence of value = NULL never works — always use IS NULL
AND, OR, NOT Combine conditions AND binds before OR — always use parentheses

Practice Questions

Practice 1. Why does WHERE ReviewText = NULL return zero rows even when NULL values exist in the column?



Practice 2. A NOT IN list contains a NULL value. Why does the query return zero rows?



Practice 3. What is the difference between LIKE 'W%' and LIKE '%W%' in terms of index usage?



Practice 4. Write a WHERE clause that returns products in the Electronics category with a price between 50 and 100 inclusive and stock greater than 10.



Practice 5. Why is it important to use parentheses when mixing AND and OR in a WHERE clause?



Quiz

Quiz 1. Which operator correctly tests whether a column contains no value?






Quiz 2. WHERE Price BETWEEN 50 AND 100 — is a product priced at exactly 50.00 included?






Quiz 3. What does the _ wildcard represent in a LIKE pattern?






Quiz 4. In the expression WHERE A AND B OR C, which condition does SQL Server group together first?






Quiz 5. Which function substitutes a fallback value when a column is NULL?






Next up - ORDER BY, OFFSET & FETCH - Control the sequence of your results and build reliable pagination into your queries.