DataBase Design Lesson 25 – Handling NULLs in Design | Dataplexa
Normalization · Lesson 25

Handling NULLs in Design

Learn when to allow NULL values, when to prevent them, and how NULL handling decisions impact BookVault's data integrity and query performance.

Understanding NULL Values

NULL represents unknown or missing data. Not zero. Not empty string. Literally unknown. When BookVault's customer table has a NULL phone number, the database says "we don't know if this customer has a phone number."

This creates unique challenges. NULL doesn't equal anything — not even another NULL. Two customers with NULL phone numbers aren't considered to have the same phone number. They both have unknown phone numbers.

NULL Behavior

• NULL = NULL returns FALSE
• NULL + 5 returns NULL
• COUNT() excludes NULLs
• WHERE phone = NULL finds nothing

NULL Solutions

• Use IS NULL, not = NULL
• COALESCE() provides defaults
• NOT NULL constraints prevent them
• Design around optional data

Why does this matter? Because NULL handling decisions made during database design impact every query written later. Poor NULL design creates bugs that surface months after launch.

When to Allow NULLs

Some data is genuinely optional. BookVault customers might not provide their middle name, phone number, or shipping preferences initially. These fields should allow NULL because forcing fake data creates worse problems than missing data.

-- BookVault customers table with strategic NULL allowances
-- Required fields: NOT NULL enforced
-- Optional fields: NULL allowed for genuine optionality
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    last_name VARCHAR(50) NOT NULL,
    email VARCHAR(255) NOT NULL UNIQUE,
    phone VARCHAR(20),           -- NULL allowed: truly optional
    middle_name VARCHAR(50),     -- NULL allowed: not everyone has one
    birth_date DATE,             -- NULL allowed: sensitive personal data
    created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
Table created successfully.

What just happened?

We created a customers table where first_name, last_name, and email are required but phone, middle_name, and birth_date can be NULL. This reflects real business needs — customers must provide names and email for account creation, but other details are optional.

Valid NULL Use Cases

Personal Information

Phone numbers, birth dates, middle names — genuinely optional personal details

Future Data

Shipping dates for unprocessed orders, completion dates for ongoing tasks

Consider BookVault's orders table. Orders get created with NULL shipped_date because shipping hasn't happened yet. NULL represents a future state, not missing information.

When to Prevent NULLs

Critical business data should never be NULL. BookVault books must have titles, prices, and authors. Orders must have dates and customer IDs. Missing this data breaks fundamental business logic.

-- BookVault books table with strict NOT NULL enforcement
-- Every field represents critical business data
CREATE TABLE books (
    book_id INT PRIMARY KEY,
    title VARCHAR(255) NOT NULL,         -- Books must have titles
    author_id INT NOT NULL,              -- Books must have authors
    category_id INT NOT NULL,            -- Books must have categories
    price DECIMAL(10,2) NOT NULL,       -- Books must have prices
    stock_qty INT NOT NULL DEFAULT 0,   -- Stock quantity known, default to 0
    published_year INT NOT NULL,        -- Publication year required
    isbn VARCHAR(13) NOT NULL UNIQUE    -- ISBN required and unique
);
Table created successfully.

What just happened?

Every column in books is NOT NULL because every piece of information is essential for business operations. Notice stock_qty has DEFAULT 0 — we always know stock levels, even if it's zero. Try this: attempt to insert a book with NULL title and watch the database reject it.

Critical NOT NULL Fields

Field Type Examples Why NOT NULL
Primary Keys customer_id, book_id, order_id Database requirement for row identification
Business Identifiers email, ISBN, order_date Core business operations depend on these
Financial Data price, total_amount, quantity Financial calculations require known values
Status Fields order_status, created_at System logic depends on known states

Common Mistake: Over-restricting NULLs

Making every field NOT NULL forces applications to insert fake data like "Unknown" or "N/A". This creates dirty data that's harder to identify and clean than proper NULLs. Allow NULL for genuinely optional information.

Default Values vs NULLs

Sometimes you want to provide a default value instead of allowing NULL. This works when you can make a reasonable assumption about missing data.

✓ Default Values

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT NOT NULL,
    order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    status VARCHAR(20) DEFAULT 'pending',
    total_amount DECIMAL(10,2) NOT NULL
);

Good for: current timestamps, status flags, boolean fields

NULL Allowed

CREATE TABLE customers (
    customer_id INT PRIMARY KEY,
    first_name VARCHAR(50) NOT NULL,
    phone VARCHAR(20),
    birth_date DATE,
    preferences TEXT
);

Good for: personal details, future data, truly optional fields

BookVault uses defaults for order status (starts as 'pending') and creation timestamps (uses current time). But customer phone numbers remain NULL-able because not every customer wants to share their phone number.

Querying with NULLs

NULL values behave differently in queries. Standard comparison operators don't work. WHERE phone = NULL returns no results, even for rows where phone is actually NULL.

-- BookVault: Find customers with no phone number
-- WRONG: This finds nobody, even customers with NULL phones
SELECT customer_id, first_name, last_name 
FROM customers 
WHERE phone = NULL;
(0 rows returned)
-- BookVault: Find customers with no phone number
-- CORRECT: Use IS NULL to check for NULL values
SELECT customer_id, first_name, last_name 
FROM customers 
WHERE phone IS NULL;
customer_id | first_name | last_name
------------|------------|----------
         15 | Sarah      | Johnson
         23 | Mike       | Chen
         31 | Emma       | Davis

What just happened?

The first query returned zero rows because NULL doesn't equal anything, including NULL. The second query used IS NULL which correctly identifies NULL values. Always use IS NULL and IS NOT NULL when checking for NULL values.

Handling NULLs in Calculations

-- BookVault: Calculate average book prices, handling NULL prices properly
-- COALESCE provides fallback values for NULL data
SELECT 
    title,
    price,
    COALESCE(price, 0.00) AS safe_price,
    CASE 
        WHEN price IS NULL THEN 'Price TBD'
        ELSE CONCAT('$', price)
    END AS price_display
FROM books
WHERE category_id = 1;
title                    | price | safe_price | price_display
-------------------------|-------|------------|-------------
The Great Gatsby         | 12.99 |      12.99 | $12.99
To Kill a Mockingbird    |  NULL |       0.00 | Price TBD
1984                     | 14.50 |      14.50 | $14.50

Data Insight

Netflix uses NULL for movie ratings that haven't been calculated yet, but defaults to 0 for view counts. Amazon allows NULL product descriptions but requires prices. Each NULL decision impacts millions of queries daily.

Design Patterns for NULLs

Successful databases follow consistent patterns for NULL handling. BookVault adopts these patterns across all tables for predictable behavior.

Required Pattern

Business-critical fields: NOT NULL with meaningful defaults where appropriate

Optional Pattern

Personal/preference fields: Allow NULL, use COALESCE in queries for display

Future Pattern

Process fields: NULL until event occurs (shipped_date, completed_at)

-- BookVault: Complete table showing all three NULL patterns
CREATE TABLE order_items (
    order_item_id INT PRIMARY KEY,
    
    -- Required Pattern: Business-critical data
    order_id INT NOT NULL,
    book_id INT NOT NULL,
    quantity INT NOT NULL DEFAULT 1,
    unit_price DECIMAL(10,2) NOT NULL,
    
    -- Optional Pattern: Enhancement data
    gift_message TEXT,
    special_instructions TEXT,
    
    -- Future Pattern: Process completion data
    packed_at TIMESTAMP,
    shipped_at TIMESTAMP,
    delivered_at TIMESTAMP
);
Table created successfully.

Pro Tip: Document your NULL handling patterns in database comments. Future developers (including yourself) will thank you when they understand why some fields allow NULL and others don't. Consistency in NULL handling prevents bugs and improves code maintainability.

Quiz

1. BookVault's query `SELECT * FROM customers WHERE phone = NULL` returns zero rows even though some customers have NULL phone numbers. What's the correct fix?


2. Which field in BookVault's books table should definitely be NOT NULL?


3. BookVault's orders table has a status field that tracks order progress. What's the best approach for handling this field?


Up Next

Referential Integrity

Learn how foreign key constraints ensure data relationships remain valid and prevent orphaned records in your database.