Database Design
Types of Databases
Explore the different database architectures and learn which one fits your specific use case, from simple file storage to complex distributed systems.
Database selection drives everything else in your system architecture
The database landscape splits into several distinct categories. Each solves specific problems. Pick wrong, and you'll spend months migrating later — trust me, I've been there.
Relational Databases (RDBMS)
The workhorses of the database world. Relational databases store data in tables with predefined relationships. Think of them as super-powered spreadsheets that never break.
MySQL, PostgreSQL, SQL Server, and Oracle dominate here. They use SQL (Structured Query Language) for everything. Your banking app? Relational database. Your company's payroll system? Definitely relational.
Why BookVault uses MySQL
Our online bookstore needs strict data integrity. When a customer orders a book, we cannot afford to lose that transaction or oversell inventory. MySQL's ACID compliance guarantees every order is processed correctly.
-- BookVault customer order with perfect consistency
-- This either completes fully or fails completely
BEGIN TRANSACTION;
UPDATE books
SET stock_qty = stock_qty - 2
WHERE book_id = 15 AND stock_qty >= 2;
INSERT INTO orders (customer_id, total_amount, status)
VALUES (1001, 47.98, 'confirmed');
INSERT INTO order_items (order_id, book_id, quantity, unit_price)
VALUES (LAST_INSERT_ID(), 15, 2, 23.99);
COMMIT;
NoSQL Databases
When relational databases hit their limits, NoSQL databases step in. They sacrifice some consistency for massive scale and flexibility. Netflix uses Cassandra to handle billions of viewing records. Facebook built their entire social graph on custom NoSQL solutions.
Document Databases
Store data as documents (usually JSON). MongoDB leads this space. Perfect when your data structure varies wildly or changes frequently.
// BookVault product catalog in MongoDB
// Each book can have completely different attributes
{
"_id": "507f1f77bcf86cd799439011",
"title": "Database Design Mastery",
"authors": ["Sarah Chen", "Mike Rodriguez"],
"isbn": "978-0123456789",
"price": 29.99,
"categories": ["Technology", "Databases", "Programming"],
"reviews": [
{
"user": "john_dev",
"rating": 5,
"comment": "Best database book ever!",
"date": "2024-01-15"
}
],
"metadata": {
"page_count": 420,
"language": "English",
"publisher": "TechBooks Press",
"formats": ["hardcover", "ebook", "audiobook"]
}
}
Key-Value Stores
The simplest NoSQL type. Just keys pointing to values. Redis dominates caching, while DynamoDB handles massive web applications.
-- BookVault session storage in Redis
-- Lightning-fast user session management
SET user:session:abc123 "{
\"customer_id\": 1001,
\"cart_items\": [15, 23, 67],
\"login_time\": \"2024-01-15T10:30:00Z\",
\"preferences\": {\"currency\": \"USD\", \"language\": \"en\"}
}" EX 3600
-- Retrieve session instantly
GET user:session:abc123
Graph Databases
Specialized for connected data. Neo4j excels at social networks, recommendation engines, and fraud detection. When relationships matter more than the data itself.
**Best for:** Social networks, recommendation systems, supply chain trackingRelational (RDBMS)
Perfect data consistency, mature tooling, SQL expertise, ACID transactions
NoSQL
Massive scale, flexible schemas, eventual consistency, horizontal scaling
NewSQL Databases
The best of both worlds attempt. NewSQL databases promise relational consistency with NoSQL scale. Google's Spanner, CockroachDB, and VoltDB lead here.
They're complex and expensive but solve real problems. Uber uses CockroachDB for their global trip data. When you need both ACID guarantees and planet-scale distribution.
Complexity Warning
NewSQL databases require serious database expertise to operate. Start with proven relational or NoSQL solutions unless you have specific scaling requirements that demand distributed ACID transactions.
Specialized Database Types
Time-Series Databases
Built for time-stamped data. InfluxDB and TimescaleDB excel at IoT sensor data, financial market data, and application monitoring.
Data Insight
BookVault processes 50,000 page views per hour. A time-series database could store this click-stream data using 90% less storage than a traditional relational database.
Search Databases
Elasticsearch and Solr power full-text search. Every major e-commerce site uses them for product search functionality.
// BookVault search index in Elasticsearch
// Find books matching "database design" with filters
GET /books/_search
{
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "database design",
"fields": ["title^2", "description", "author"]
}
}
],
"filter": [
{"range": {"price": {"lte": 50}}},
{"term": {"category": "technology"}}
]
}
}
}
In-Memory Databases
Store everything in RAM for maximum speed. Redis and Memcached are the champions here. Perfect for caching and session storage.
How to Choose the Right Type
Wrong database choice kills projects. I've seen teams spend six months migrating because they picked MongoDB for financial transactions or MySQL for analytics.
| Use Case | Best Database Type | Example |
|---|---|---|
| E-commerce transactions | Relational (MySQL/PostgreSQL) | BookVault orders |
| Content management | Document (MongoDB) | Blog posts, product catalogs |
| Real-time caching | Key-value (Redis) | User sessions, shopping carts |
| Search functionality | Search engine (Elasticsearch) | Product search, log analysis |
| Social connections | Graph (Neo4j) | Friend networks, recommendations |
| IoT sensor data | Time-series (InfluxDB) | Website analytics, metrics |
Decision Framework
Ask these questions in order:
1. **Do you need ACID transactions?** → Relational database 2. **Is your data highly connected?** → Graph database 3. **Do you need full-text search?** → Search database + primary database 4. **Is it time-stamped sensor/metric data?** → Time-series database 5. **Do you need massive scale with flexible schema?** → NoSQL database 6. **Everything else?** → Start with PostgreSQLMost applications need multiple database types. BookVault uses MySQL for transactions, Redis for caching, and Elasticsearch for search. This is called polyglot persistence — using the right tool for each job.
Cloud vs Self-Managed
The final choice: run it yourself or pay someone else to do it. Cloud databases cost more per query but save thousands of hours in maintenance.
**Managed Cloud Options:** - **Amazon RDS** - MySQL, PostgreSQL, and others - **MongoDB Atlas** - Fully managed MongoDB - **Google Cloud SQL** - Relational databases with automatic scaling - **Azure Cosmos DB** - Multi-model NoSQL service **Self-Managed Benefits:** - Complete control over configuration - No vendor lock-in - Lower cost at scale - Custom optimization possible **Cloud Benefits:** - Automatic backups and updates - Built-in monitoring and alerting - Instant scaling - 99.9% uptime guaranteesHonestly, unless you have a dedicated database team, start with managed cloud services. You'll ship features faster and sleep better at night.
Migration Reality
Switching database types later is painful and expensive. Pinterest spent 18 months migrating from MySQL to HBase. Plan for 3x your current scale and choose accordingly.
Quiz
1. BookVault's payment system requires perfect data consistency - no lost orders or double charges. Which database type should they choose and why?
2. BookVault wants to store book reviews where each review can have different fields - some have ratings, others have detailed comments, some include reading progress. What database type fits best?
3. A large e-commerce application needs ACID transactions for orders, fast caching for user sessions, and powerful search for products. What's the best architecture approach?
Up Next
Data vs Information
Learn the crucial distinction between raw data and meaningful information, and how proper database design transforms one into the other.