System Design 101 - Part 4: Caching
2025-11-08 • 5 min read
Caching: Your Secret Speed Weapon
Week 1, Part 4 of 5
← Part 3: Load Balancing | Part 4 (You are here) | Part 5: Database Fundamentals →
The Library Problem
Imagine you're a librarian.
Scenario 1: No system
- Student asks for "Harry Potter"
- You walk to the back room
- Search through 10,000 books
- Find it after 10 minutes
- Next student asks for same book
- You do it all over again
Scenario 2: Front-desk system
- Keep popular books on the front desk
- Student asks for "Harry Potter"
- Grab it from desk in 5 seconds
- Next 20 students? Same 5 seconds
What Is Caching?
Simple definition: Storing copies of data in a fast-access spot so you don't have to fetch it from the slow source every time.
Real example:
- Without cache: Every page load queries database (slow)
- With cache: First load queries database, next 1,000 loads use cached copy (fast)
Why Caching Changes Everything
The Speed Difference
Without caching:
- Database query: 50-200ms
- Load 100 user profiles: 100 × 50ms = 5,000ms (5 seconds)
- Users leave because it's slow
With caching:
- Cache lookup: 1-5ms
- Load 100 user profiles: 100 × 1ms = 100ms
- 50x faster
The Cost Difference
Real story: A startup was spending $50,000/month on database servers because every request hit the database.
They added Redis (a caching tool). Database load dropped 85%. New bill: $8,000/month.
Same users. Same features. Just added caching.
The Four Layers of Caching
Think of caching like storing snacks. Some you keep in your pocket (super close), some in your bag, some in your car, some at home.
1. Browser Cache (Closest to User)
Location: User's computer
What to store: Images, CSS, JavaScript files
How long: Hours to days
Example: When you visit a website twice, the logo loads instantly the second time—it's cached in your browser.
How it works:
First visit: Download logo.png (2 seconds)
Second visit: Use saved logo.png (instant)
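If you control the server, you opt into browser caching with a single response header. Here's a minimal sketch using Flask (Flask is my choice for illustration; any web framework can set the same Cache-Control header):

```python
from flask import Flask, send_file

app = Flask(__name__)

@app.route("/logo.png")
def logo():
    resp = send_file("logo.png")
    # Tell the browser it may reuse its saved copy for 24 hours.
    resp.headers["Cache-Control"] = "public, max-age=86400"
    return resp
```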
2. CDN (Content Delivery Network)
Location: Servers around the world, close to users
What to store: Images, videos, static files, sometimes API responses
How long: Minutes to hours
Example: Netflix stores popular shows on CDN servers in your city. That's why they stream instantly—no need to fetch from California every time.
Popular CDNs: Cloudflare, AWS CloudFront, Fastly
3. Application Cache (Redis/Memcached)
Location: Fast memory storage, separate from your main database
What to store: Database query results, user sessions, frequently accessed data
How long: Seconds to hours
This is the most important one for developers.
Example: User profile data, product listings, API responses
4. Database Cache
Location: Inside the database itself
What to store: Query results
How long: Automatic
You don't control this much—the database does it automatically.
Caching Strategies (The "How")
Strategy 1: Cache-Aside (Most Common)
How it works:
- App checks cache first: "Do you have user profile for John?"
- If yes → Return it (fast)
- If no → Get from database, save to cache, return it
User requests data
↓
Check cache
↓
Found? → Return (1ms)
Not found? → Get from database (50ms) → Save to cache → Return
Pros:
- Only caches data that's actually used
- If cache dies, app still works (just slower)
Cons:
- First request is always slow (cache miss)
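Here's a minimal cache-aside sketch in Python with Redis. `get_user_from_db` is a hypothetical stand-in for your real database query:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_user_from_db(user_id: int) -> dict:
    # Hypothetical stand-in for your real (slow) database query
    return {"id": user_id, "name": "John"}

def get_user_profile(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = r.get(key)                  # 1. ask the cache first
    if cached is not None:
        return json.loads(cached)        # cache hit: ~1ms
    profile = get_user_from_db(user_id)  # 2. cache miss: ~50ms
    r.set(key, json.dumps(profile), ex=600)  # 3. save for next time (10 min TTL)
    return profile
```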
Strategy 2: Write-Through
How it works: Every time you save data, you save it to both cache AND database at the same time.
The flow:
User updates profile
↓
Save to cache (fast)
↓
Save to database (slower)
↓
Both done → Success
Pros:
- Cache is always fresh (never stale data)
- No data loss risk
Cons:
- Writes are slower (doing two things)
- Caches stuff that might never be read
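A minimal write-through sketch, again with Redis. `save_user_to_db` is a hypothetical placeholder for your real database write:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def save_user_to_db(user_id: int, profile: dict) -> None:
    ...  # hypothetical stand-in for your real database write

def update_user_profile(user_id: int, profile: dict) -> None:
    key = f"user:{user_id}"
    r.set(key, json.dumps(profile))    # write to cache
    save_user_to_db(user_id, profile)  # and to the database; success only when both finish
```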
Strategy 3: Write-Behind
How it works: Save to cache immediately, save to database later (in background).
The flow:
User posts comment
↓
Save to cache (instant) → Return success to user
↓
(Later) Save to database in background
Pros:
- Super fast writes
- Can batch multiple writes together
Cons:
- If cache crashes before writing to database, data is lost
- More complex
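A toy write-behind sketch using an in-process queue. Real systems use durable queues with batching and retries; `save_comment_to_db` is hypothetical:

```python
import json
import queue
import threading
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)
pending = queue.Queue()

def save_comment_to_db(comment_id: str, comment: dict) -> None:
    ...  # hypothetical stand-in for your real database write

def post_comment(comment_id: str, comment: dict) -> None:
    r.set(f"comment:{comment_id}", json.dumps(comment))  # instant: user sees success now
    pending.put((comment_id, comment))                   # database write deferred

def flush_worker() -> None:
    while True:
        comment_id, comment = pending.get()
        save_comment_to_db(comment_id, comment)  # real systems batch and retry here

threading.Thread(target=flush_worker, daemon=True).start()
```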
Strategy 4: Refresh-Ahead
How it works: Predict what users will want and load it into cache before they ask.
Example: News website loads top 10 articles into cache every 5 minutes, even if nobody's reading them yet.
Pros:
- Users always get cached (fast) responses
Cons:
- Wastes cache space if predictions are wrong
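A minimal refresh-ahead sketch for the news-site example. `fetch_top_articles_from_db` is a hypothetical placeholder, and the loop would run in a background worker:

```python
import json
import time
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def fetch_top_articles_from_db(limit: int) -> list[dict]:
    return []  # hypothetical stand-in for your real "top articles" query

def refresh_loop() -> None:
    while True:
        articles = fetch_top_articles_from_db(limit=10)
        # TTL slightly longer than the refresh interval, as a safety margin
        r.set("articles:top10", json.dumps(articles), ex=420)
        time.sleep(300)  # refresh every 5 minutes, before anyone asks
```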
The Hardest Problem: Cache Invalidation
The famous quote (usually attributed to Phil Karlton): "There are only two hard things in Computer Science: cache invalidation and naming things."
The problem: When do you delete cached data?
Time-Based (TTL = Time To Live)
How it works: Set an expiration time.
Save user profile to cache for 10 minutes
After 10 minutes → Delete from cache
Next request → Fetch fresh from database
Simple but not perfect:
- User updates profile at minute 9
- Old cached version shows for 1 more minute
- Users see stale data
Good for: data that changes slowly (product catalogs)
Rules of thumb:
- Short TTL for critical data (5 minutes)
- Long TTL for stable data (1 hour)
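In Redis, a TTL is just one extra argument on the write. A tiny sketch (the key and value are invented for illustration):

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Cache the profile for 10 minutes; Redis deletes the key automatically.
r.set("user:42", json.dumps({"name": "John"}), ex=600)
print(r.ttl("user:42"))  # seconds left before expiry, e.g. 600
```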
Event-Based
How it works: Delete cache when the actual data changes.
User updates profile
↓
Update database
↓
Delete cached profile
↓
Next request → Fetch fresh data → Cache it
More accurate but more complex:
- Need to track all cache keys
- Need to trigger deletions everywhere
Good for:
- Critical data that must be accurate
- User-facing data that changes often
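A minimal event-based invalidation sketch; `save_user_to_db` is a hypothetical placeholder for your real database write:

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def save_user_to_db(user_id: int, profile: dict) -> None:
    ...  # hypothetical stand-in for your real database write

def update_user_profile(user_id: int, profile: dict) -> None:
    save_user_to_db(user_id, profile)  # 1. update the source of truth
    r.delete(f"user:{user_id}")        # 2. drop the stale cached copy
    # the next read misses, fetches fresh data, and re-caches it
```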
Hybrid (Best Practice)
Combine both:
- Set TTL as safety net (10 minutes)
- Also delete cache when data changes
- If deletion fails, TTL will clean it up
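Putting the two together, a minimal hybrid sketch (both database helpers are hypothetical placeholders):

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_user_from_db(user_id: int) -> dict:
    return {"id": user_id}  # hypothetical database read

def save_user_to_db(user_id: int, profile: dict) -> None:
    ...  # hypothetical database write

def read_profile(user_id: int) -> dict:
    cached = r.get(f"user:{user_id}")
    if cached is not None:
        return json.loads(cached)
    profile = get_user_from_db(user_id)
    r.set(f"user:{user_id}", json.dumps(profile), ex=600)  # safety-net TTL
    return profile

def write_profile(user_id: int, profile: dict) -> None:
    save_user_to_db(user_id, profile)
    r.delete(f"user:{user_id}")  # if this delete ever fails, the TTL cleans up
```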
Real-World Example: E-commerce Site
Product listings:
- Strategy: Cache-aside
- TTL: 1 hour
- Why: Products don't change often, lots of people view same products
Shopping cart:
- Strategy: Write-through
- TTL: Session duration
- Why: Must be accurate, users update frequently
Homepage:
- Strategy: Refresh-ahead
- TTL: 5 minutes
- Why: Everyone sees same homepage, predict it'll be requested
Search results:
- Strategy: Cache-aside
- TTL: 15 minutes
- Why: Same searches happen repeatedly
Real Story: Discord's Message Caching
The problem: Millions of people sending billions of messages. Can't query database for every message.
The solution:
- Recent messages (last hour): Cached in memory
- Older messages: Cached in Redis
- Ancient messages: Database only
The result:
- 95% of message requests served from cache
- Sub-second message loading
- Database only handles 5% of requests
Common Caching Mistakes
Mistake 1: Caching Everything
Problem: Cache fills up with junk, important stuff gets pushed out.
Fix: Only cache frequently accessed data. If it's requested once a day, don't cache it.
Mistake 2: TTL Too Long
Problem: Users see outdated information for hours.
Fix: Start with short TTL (5-10 minutes), increase if needed.
Mistake 3: Not Handling Cache Failures
Problem: Cache server dies, entire app crashes.
Fix: Always have fallback to database. Slow is better than broken.
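A minimal fallback sketch: catch the cache's connection error and go straight to the database (`get_user_from_db` is hypothetical):

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def get_user_from_db(user_id: int) -> dict:
    return {"id": user_id}  # hypothetical database read

def get_user_profile(user_id: int) -> dict:
    try:
        cached = r.get(f"user:{user_id}")
        if cached is not None:
            return json.loads(cached)
    except redis.exceptions.ConnectionError:
        pass  # cache is down: fall through to the database
    return get_user_from_db(user_id)  # slow is better than broken
```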
Mistake 4: Forgetting to Warm the Cache
Problem: After deployment, first users get slow experience (cache is empty).
Fix: Pre-load popular items into cache before taking traffic.
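A minimal warming sketch to run at deploy time, before the server takes traffic (`fetch_popular_products_from_db` is hypothetical):

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def fetch_popular_products_from_db(limit: int) -> list[dict]:
    return []  # hypothetical "most viewed products" query

def warm_cache() -> None:
    # Pre-load the top products so the first real users get cache hits.
    for product in fetch_popular_products_from_db(limit=100):
        r.set(f"product:{product['id']}", json.dumps(product), ex=3600)
```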
Quick Tools Guide
Redis (Most Popular)
- Stores data in memory (super fast)
- Can save to disk (won't lose data on restart)
- Supports complex data types
- Easy to use
Memcached
- Simpler than Redis
- Slightly faster for simple use cases
- No disk persistence
- Good for pure caching
Your Challenge
Look at any app you use:
- What data do they probably cache?
- How long do they cache it for?
- What happens when you update your profile—does it show instantly?
- Try this: Clear your browser cache and reload. Notice the difference?
Key Takeaways
- Caching can make your app 10-50x faster
- In a healthy setup, most requests (often around 80%) hit the cache and only a small fraction reach the database
- Four layers: Browser, CDN, Application (Redis), Database
- Cache-aside is the most common strategy
- Set TTL as safety net, delete cache when data changes
- Redis is the most popular caching tool
- Always handle cache failures gracefully
- Don't cache everything—only frequently accessed data
Next up: Part 5: Database Fundamentals →
You've built a fast, scalable system. Now let's make sure your data is stored reliably and efficiently.
Written by Amika Deshapriya. Making system design simple, one story at a time.
Connect: LinkedIn | GitHub | Newsletter