DataBase Design Lesson 14 – Many-to-Many Relationships | Dataplexa
Database Design · Lesson 14

Many-to-Many Relationships

Break down complex many-to-many relationships using junction tables and implement them properly in your BookVault database with authors who write multiple books and books with multiple authors.

Understanding Many-to-Many Relationships

Here's the problem. Books can have multiple authors. Authors can write multiple books. Your simple one-to-many relationship breaks down completely.

Think about "Good Omens" by Terry Pratchett and Neil Gaiman. Two authors, one book. Now think about Stephen King — hundreds of books, one primary author. But what about books he co-wrote with Peter Straub? This is where most database designs fall apart.

A many-to-many relationship exists when multiple records in one table relate to multiple records in another table. You see this everywhere — Netflix shows have multiple genres, Spotify songs belong to multiple playlists, Amazon products fit multiple categories.

Why One-to-Many Fails Here

If you put author_id in the books table, you can only store one author per book. If you put book_id in the authors table, each author can only write one book. Neither works for co-authored books.

The Junction Table Solution

The solution is elegant. Create a third table that sits between your two main tables. This junction table (also called a bridge table or associative table) contains foreign keys pointing to both related tables.

AUTHORS
Attributes
PK author_id
first_name
last_name
nationality
MANY
MANY
BOOK_AUTHORS
Attributes
PK book_author_id
FK book_id
FK author_id
author_role
MANY
MANY
BOOKS
Attributes
PK book_id
title
category_id
price
stock_qty

Notice how the book_authors table contains both foreign keys. Each row represents one relationship between one book and one author. This is how you store many-to-many relationships in relational databases.

Creating the Junction Table

-- Create the book_authors junction table
-- Links books to authors in many-to-many relationship
CREATE TABLE book_authors (
    book_author_id INT PRIMARY KEY AUTO_INCREMENT,
    book_id INT NOT NULL,
    author_id INT NOT NULL,
    author_role VARCHAR(50) DEFAULT 'Author',
    FOREIGN KEY (book_id) REFERENCES books(book_id),
    FOREIGN KEY (author_id) REFERENCES authors(author_id),
    UNIQUE(book_id, author_id)
);
Table created successfully.

Key constraints added:
- book_id references books table
- author_id references authors table  
- Unique constraint prevents duplicate book-author pairs

What just happened?

The UNIQUE constraint on (book_id, author_id) prevents duplicate relationships. The author_role field stores whether someone is the main author, co-author, or editor. Try this: add a book with multiple authors using INSERT statements.

Populating Many-to-Many Data

Now you can properly handle co-authored books. Start with your authors and books, then create the relationships.

-- Add authors to BookVault
INSERT INTO authors (first_name, last_name, nationality) VALUES
('Neil', 'Gaiman', 'British'),
('Terry', 'Pratchett', 'British'),
('Stephen', 'King', 'American'),
('Peter', 'Straub', 'American');

-- Add books that have multiple authors
INSERT INTO books (title, category_id, price, stock_qty, published_year) VALUES
('Good Omens', 1, 14.99, 25, 1990),
('The Talisman', 1, 16.99, 18, 1984),
('Black House', 1, 15.99, 12, 2001);
4 rows inserted into authors table.
3 rows inserted into books table.

Authors added:
- Neil Gaiman (ID: 1)
- Terry Pratchett (ID: 2)  
- Stephen King (ID: 3)
- Peter Straub (ID: 4)

Now create the relationships. Each book-author combination gets its own row in the junction table.

-- Link books to their authors via junction table
-- Good Omens has two co-authors
INSERT INTO book_authors (book_id, author_id, author_role) VALUES
(1, 1, 'Co-Author'),  -- Good Omens -> Neil Gaiman
(1, 2, 'Co-Author'),  -- Good Omens -> Terry Pratchett
(2, 3, 'Co-Author'),  -- The Talisman -> Stephen King
(2, 4, 'Co-Author'),  -- The Talisman -> Peter Straub
(3, 3, 'Co-Author'),  -- Black House -> Stephen King
(3, 4, 'Co-Author');  -- Black House -> Peter Straub
6 rows inserted into book_authors table.

Relationships created:
- Good Omens: 2 authors (Gaiman, Pratchett)
- The Talisman: 2 authors (King, Straub)
- Black House: 2 authors (King, Straub)

What just happened?

Each INSERT creates one book-author relationship. Good Omens appears twice in the junction table — once linked to Gaiman, once to Pratchett. This is how you represent the many-to-many relationship. Try this: query all books by Stephen King using a JOIN.

Querying Many-to-Many Relationships

The real power shows up when you query. You can find all books by an author, all authors for a book, or books shared between authors. This requires JOINs across three tables.

Find All Books by an Author

-- Find all books written by Stephen King
-- Requires joining through the junction table
SELECT 
    b.title,
    b.price,
    ba.author_role
FROM books b
JOIN book_authors ba ON b.book_id = ba.book_id
JOIN authors a ON ba.author_id = a.author_id
WHERE a.first_name = 'Stephen' AND a.last_name = 'King'
ORDER BY b.title;
title          | price  | author_role
Black House    | 15.99  | Co-Author
The Talisman   | 16.99  | Co-Author

2 rows returned.

Find All Authors for a Book

-- Find all authors who wrote Good Omens
-- Same pattern, different WHERE clause
SELECT 
    a.first_name,
    a.last_name,
    ba.author_role
FROM authors a
JOIN book_authors ba ON a.author_id = ba.author_id
JOIN books b ON ba.book_id = b.book_id
WHERE b.title = 'Good Omens'
ORDER BY a.last_name;
first_name | last_name | author_role
Neil       | Gaiman    | Co-Author
Terry      | Pratchett | Co-Author

2 rows returned.

Data Insight

Junction tables typically make up 15-30% of tables in mature database schemas. Amazon's product catalog likely has dozens of many-to-many relationships — products to categories, products to tags, products to related items.

Common Many-to-Many Patterns

Every industry has standard many-to-many relationships. Recognize these patterns and you'll design better databases faster.

E-commerce

Products ↔ Categories
Orders ↔ Products
Customers ↔ Wishlists

Social Media

Users ↔ Groups
Posts ↔ Tags
Users ↔ Followers

Education

Students ↔ Courses
Teachers ↔ Subjects
Courses ↔ Prerequisites

Healthcare

Patients ↔ Treatments
Doctors ↔ Specialties
Symptoms ↔ Conditions

BookVault itself has several natural many-to-many relationships beyond books and authors. Books belong to multiple categories (fiction AND bestseller). Customers have multiple addresses. Orders contain multiple books.

Extending BookVault's Many-to-Many Design

-- Books can belong to multiple categories
-- Fantasy AND Bestseller AND Award Winner
CREATE TABLE book_categories (
    book_category_id INT PRIMARY KEY AUTO_INCREMENT,
    book_id INT NOT NULL,
    category_id INT NOT NULL,
    is_primary BOOLEAN DEFAULT FALSE,
    FOREIGN KEY (book_id) REFERENCES books(book_id),
    FOREIGN KEY (category_id) REFERENCES categories(category_id),
    UNIQUE(book_id, category_id)
);
Table created successfully.

Additional features:
- is_primary field marks the main category
- Unique constraint prevents duplicate assignments
- Supports books in multiple categories

Common Mistake: Redundant Foreign Keys

Don't put author_id directly in the books table when you have a junction table. Remove it completely. The junction table is now the single source of truth for book-author relationships.

Junction Table Best Practices

Junction tables seem simple but have subtle design decisions that affect performance and maintainability. Get these wrong and you'll spend months fixing them later.

Naming Conventions

Use clear, consistent names. book_authors immediately tells you this connects books to authors. Avoid generic names like relationships or links.

Primary Key Strategy

Two options exist. Use a surrogate key (like book_author_id) for simpler application code. Or use a composite primary key combining both foreign keys for slightly better performance. Surrogate keys win for maintainability.

Additional Attributes

Junction tables can store relationship-specific data. In book_authors, add author_role, contribution_percentage, or created_at. This data belongs on the relationship, not on either main entity.

Pro tip: Index both foreign key columns separately, plus create a composite index on both together. This speeds up queries from both directions — finding books by author and authors by book.

Performance Considerations

Many-to-many queries require JOINs across three tables minimum. This gets expensive fast with large datasets. Spotify probably has billions of rows in their playlist-song junction tables.

-- Add indexes to speed up junction table queries
-- Index foreign keys for faster JOINs
CREATE INDEX idx_book_authors_book_id ON book_authors(book_id);
CREATE INDEX idx_book_authors_author_id ON book_authors(author_id);

-- Composite index for unique constraint and dual lookups  
CREATE INDEX idx_book_authors_composite ON book_authors(book_id, author_id);
3 indexes created successfully.

Performance improvements:
- Faster book → author lookups
- Faster author → book lookups  
- Faster duplicate detection

What just happened?

These indexes dramatically speed up JOIN operations. The composite index serves double duty — enforcing uniqueness and accelerating lookups in both directions. Try this: run EXPLAIN on your many-to-many queries to see the index usage.

Consider denormalization for read-heavy workloads. If you constantly need author names with books, store the primary author name directly in the books table while keeping the junction table for complex queries. This violates normal form but solves real performance problems.

Quiz

1. BookVault wants to track which customers have wishlisted which books. Each customer can wishlist many books, and each book can be wishlisted by many customers. What's the correct approach?


2. In the book_authors junction table, what does the UNIQUE constraint on (book_id, author_id) accomplish?


3. To find all authors who wrote books priced above $15 in BookVault, which tables must you JOIN and in what order?


Up Next

Weak Entities

Master entities that can't exist independently and learn how to model dependent relationships like order items that only exist within orders.