Database Design
Third Normal Form (3NF)
Eliminate transitive dependencies from your BookVault database by moving indirectly dependent data into separate tables
What is Third Normal Form?
Third Normal Form (3NF) removes transitive dependencies from your tables. A transitive dependency exists when column A depends on column B, and column B depends on column C. That creates an indirect dependency between A and C.
Think about it this way: if you know a customer's city, you automatically know their country. The customer depends on city, and city depends on country. Customer data now has a transitive dependency on country data.
A table is in 3NF when:
It's already in Second Normal Form (2NF)
Every non-key column depends directly on the primary key (no transitive dependencies)
Spotify handles this perfectly. They don't store artist nationality in every song record. Song depends on artist_id, artist_id determines nationality. That's a transitive dependency they avoid by keeping artist details separate.
Identifying Transitive Dependencies
Transitive dependencies hide in plain sight. Look at this BookVault customer table that violates 3NF:
Before — Violates 3NF
| customer_id | name | city | country |
|---|---|---|---|
| 1 | Sarah Chen | Toronto | Canada |
| 2 | Mike Johnson | Toronto | Canada |
| 3 | Emma Wilson | London | UK |
After — Follows 3NF
| customer_id | name | city_id |
|---|---|---|
| 1 | Sarah Chen | 1 |
| 2 | Mike Johnson | 1 |
| city_id | city_name | country |
|---|---|---|
| 1 | Toronto | Canada |
| 2 | London | UK |
The transitive dependency chain looked like this:
customer_id → city → country
Country doesn't depend directly on customer_id. It depends on city, which depends on customer_id. Breaking this chain eliminates data redundancy and update anomalies.
Common Mistake
Don't confuse functional dependencies with transitive ones. order_id → customer_id is functional. order_id → customer_id → customer_city is transitive.
BookVault 3NF Transformation
Our BookVault database has a classic 3NF violation. Books store author nationality directly, creating this dependency:
Here's the problematic books table:
-- BookVault books table violating 3NF
-- Contains transitive dependency: book_id → author_id → nationality
SELECT book_id, title, author_name, nationality, price
FROM books_unnormalized
WHERE category_id = 1;
book_id | title | author_name | nationality | price --------|--------------------------|----------------|-------------|------- 1 | The Great Gatsby | F. Scott | American | 12.99 2 | This Side of Paradise | F. Scott | American | 11.49 3 | Pride and Prejudice | Jane Austen | British | 9.99 4 | Sense and Sensibility | Jane Austen | British | 10.49
What just happened?
Notice how "American" and "British" repeat for each author. That's the transitive dependency creating redundancy. Try this: count how many times you'd need to update "American" if F. Scott Fitzgerald moved to France.
Time to normalize this into 3NF. We need to extract author data into its own table:
-- Create normalized BookVault authors table
-- Eliminates transitive dependency by separating author attributes
CREATE TABLE authors (
author_id INT PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
nationality VARCHAR(30),
birth_year INT
);
-- Insert author data (no more redundancy)
INSERT INTO authors (author_id, first_name, last_name, nationality, birth_year) VALUES
(1, 'F. Scott', 'Fitzgerald', 'American', 1896),
(2, 'Jane', 'Austen', 'British', 1775),
(3, 'George', 'Orwell', 'British', 1903);
Table 'authors' created successfully. 3 rows inserted into authors table.
-- Create 3NF compliant books table
-- Only stores direct dependencies on book_id
CREATE TABLE books (
book_id INT PRIMARY KEY,
title VARCHAR(200) NOT NULL,
author_id INT NOT NULL,
category_id INT NOT NULL,
price DECIMAL(10,2) NOT NULL,
stock_qty INT DEFAULT 0,
published_year INT,
FOREIGN KEY (author_id) REFERENCES authors(author_id)
);
Table 'books' created successfully. Foreign key constraint added on author_id.
What just happened?
We broke the transitive dependency. Now books only store author_id, not author nationality. The foreign key ensures data integrity between tables. Try this: query both tables with a JOIN to get complete book information.
Benefits of 3NF
Third Normal Form delivers concrete improvements to your database. Storage efficiency improves because you eliminate repeated data. Update consistency becomes automatic because each fact exists in exactly one place.
Data Integrity
Update an author's nationality once in the authors table. Every book by that author reflects the change instantly.
Storage Efficiency
Store "American" once instead of repeating it for every F. Scott Fitzgerald book in your catalog.
Query Performance
Smaller tables mean faster scans. Index on author_id once instead of indexing repeated nationality strings.
Maintenance
Add new author attributes (birth date, biography) without touching the books table structure.
Data Insight
Amazon's product catalog follows strict 3NF principles. Product details, brand information, and category hierarchies live in separate tables. This design supports over 350 million products without redundancy issues.
When 3NF Goes Wrong
Not every dependency should be normalized away. Some "transitive" relationships are actually business rules that belong together.
Calculated Fields
Order totals depend on item prices and quantities. That's technically transitive, but splitting it would be insane:
-- DON'T normalize calculated fields like this
-- BookVault order totals should stay in orders table
SELECT
o.order_id,
o.customer_id,
SUM(oi.quantity * oi.unit_price) as total_amount
FROM orders o
JOIN order_items oi ON o.order_id = oi.order_id
GROUP BY o.order_id, o.customer_id;
order_id | customer_id | total_amount ---------|-------------|------------- 1 | 1 | 45.97 2 | 2 | 23.48 3 | 1 | 67.22
Denormalization for Performance
Netflix deliberately denormalizes user viewing data. They store movie titles alongside view records, violating 3NF. Why? Reading 100 million view records with JOINs would crush performance.
Over-Normalization Warning
Don't split every possible dependency. Customer names could technically be first_name and last_name tables, but that's ridiculous. Use common sense alongside normalization rules.
3NF in Action: BookVault Reviews
Here's a realistic 3NF example from BookVault's review system. We need to store customer reviews without transitive dependencies:
-- BookVault reviews in 3NF
-- Each table stores only directly dependent attributes
CREATE TABLE reviews (
review_id INT PRIMARY KEY,
book_id INT NOT NULL,
customer_id INT NOT NULL,
rating INT NOT NULL CHECK (rating BETWEEN 1 AND 5),
review_text TEXT,
review_date DATE NOT NULL,
FOREIGN KEY (book_id) REFERENCES books(book_id),
FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
-- Get complete review information with JOINs
SELECT
r.review_id,
b.title as book_title,
c.first_name || ' ' || c.last_name as customer_name,
r.rating,
r.review_date
FROM reviews r
JOIN books b ON r.book_id = b.book_id
JOIN customers c ON r.customer_id = c.customer_id
WHERE r.rating >= 4
ORDER BY r.review_date DESC;
review_id | book_title | customer_name | rating | review_date ----------|---------------------|---------------|--------|------------- 15 | The Great Gatsby | Sarah Chen | 5 | 2024-03-15 14 | Pride and Prejudice | Emma Wilson | 4 | 2024-03-14 13 | 1984 | Mike Johnson | 5 | 2024-03-13
What just happened?
The reviews table stores only review-specific data. Book titles and customer names come from their respective tables via foreign keys. No transitive dependencies, no data duplication. Try this: add a new book or customer and see how reviews automatically link to complete information.
This design prevents several problems. Change a book title? It updates everywhere automatically. Delete a customer? Foreign key constraints prevent orphaned reviews. Clean, consistent, efficient.
Pro tip: Use table aliases consistently in your JOINs. reviews r is cleaner than repeating the full table name everywhere.
Quiz
1. In BookVault's books table, why does storing author nationality violate 3NF?
2. A BookVault customers table stores: customer_id, name, email, city, state. You notice city determines state (Boston → Massachusetts). How do you fix this 3NF violation?
3. BookVault's orders table stores total_amount, which can be calculated from order_items (quantity × unit_price). This creates a transitive dependency. What should you do?
Up Next
Boyce–Codd Normal Form (BCNF)
Discover the stricter normalization form that handles complex functional dependencies and composite keys that 3NF sometimes misses.