Caching is one of the most powerful techniques for improving performance. By storing the results of expensive operations, we avoid repeating work and dramatically reduce latency.
But caching isn't magic. Used incorrectly, it creates subtle, hard-to-debug, and potentially serious problems. This article explores the benefits of caching, its pitfalls, and how to use it effectively.
Caching is like salt in food: the right amount improves everything; too much ruins it.
Why Cache Works
The principle of locality
Most systems exhibit predictable patterns:
- Temporal locality: recently accessed data tends to be accessed again soon
- Spatial locality: data near what was just accessed tends to be accessed next
If 80% of requests touch 20% of the data (the Pareto principle), caching that 20% has a huge impact.
Latency reduction
Without cache: App → Database → Response
10ms + 50ms = 60ms
With cache: App → Cache hit → Response
10ms + 1ms = 11ms
An improvement of 5x or more is common.
Load reduction
Each cache hit is a request that doesn't go to:
- Database
- External API
- Heavy processing
Cache Levels
1. CPU cache (L1, L2, L3)
Managed by the hardware. You don't control it directly, but you can write cache-friendly code.
2. Application cache (in-memory)
```javascript
const cache = new Map();

function getUser(id) {
  if (cache.has(id)) return cache.get(id); // cache hit
  const user = fetchFromDB(id);            // cache miss: load from the source
  cache.set(id, user);
  return user;
}
```
Pros: extremely fast, simple
Cons: not shared between instances, lost on restart
3. Distributed cache (Redis, Memcached)
Shared between application instances.
Pros: shared, persistent (with configuration), scalable
Cons: network latency, more complex
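A minimal sketch of reading through Redis with the ioredis client (the key format, the 5-minute TTL, and getUserFromDB are illustrative assumptions):
```javascript
const Redis = require('ioredis');
const redis = new Redis(); // defaults to localhost:6379

async function getUser(id) {
  const key = `user:${id}`;

  const cached = await redis.get(key);
  if (cached) return JSON.parse(cached); // hit: deserialize and return

  const user = await getUserFromDB(id);                  // hypothetical loader
  await redis.set(key, JSON.stringify(user), 'EX', 300); // expire in 5 minutes
  return user;
}
```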
4. CDN cache
For static content and cacheable pages.
Pros: geographically distributed, reduces origin load
Cons: complex invalidation
5. Database cache
Query cache, buffer pool, etc. Managed by the database.
Cache Pitfalls
1. Cache stampede (thundering herd)
Problem: when an entry expires, multiple simultaneous requests try to recalculate it.
Cache expires
↓
100 requests arrive
↓
All go to database
↓
Database overloaded
Solutions:
- Locking (only one request recalculates, the others wait; see the sketch after this list)
- Early refresh (before expiration)
- Probabilistic early expiration
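A minimal in-process version of the locking approach, sometimes called "single flight": concurrent requests for the same expired key await one shared promise instead of all hitting the database. It assumes the cache Map from earlier and an async fetchFromDB.
```javascript
const pending = new Map(); // key -> in-flight promise

async function getWithLock(key) {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;

  // Someone else is already recomputing this key: wait for their result.
  if (pending.has(key)) return pending.get(key);

  const promise = (async () => {
    try {
      const value = await fetchFromDB(key);
      cache.set(key, value);
      return value;
    } finally {
      pending.delete(key); // allow future refreshes
    }
  })();

  pending.set(key, promise);
  return promise;
}
```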
2. Stale data
Problem: cache shows data that has already changed at the source.
Solutions:
- TTL appropriate to use case
- Explicit invalidation on writes (see the sketch after this list)
- Cache-aside pattern with invalidation
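A sketch of explicit invalidation on writes: the write goes to the source of truth and the cached entry is dropped, so the next read repopulates it with fresh data (updateUserInDB and the key format are illustrative):
```javascript
async function updateUser(id, changes) {
  const user = await updateUserInDB(id, changes); // write to the source of truth
  cache.delete(`user:${id}`);                     // drop the stale entry
  return user;
}
```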
3. Inconsistent cache between instances
Problem: each server keeps its own local cache, so they drift out of sync.
Server A: user.name = "John"
Server B: user.name = "Mary" (updated)
Solutions:
- Use distributed cache
- Short TTL for volatile data
- Pub/sub for invalidation
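A sketch of pub/sub invalidation with ioredis: each instance keeps a local Map, subscribes to an invalidation channel, and drops the entry when any instance publishes a change (the channel name and saveToDB are illustrative):
```javascript
const Redis = require('ioredis');
const pub = new Redis();
const sub = new Redis(); // a dedicated connection for subscribing

const localCache = new Map();

sub.subscribe('cache-invalidate');
sub.on('message', (channel, key) => {
  localCache.delete(key); // another instance changed this entry
});

async function updateUserName(id, name) {
  await saveToDB(id, name);                            // hypothetical persistence call
  localCache.delete(`user:${id}`);                     // invalidate locally
  await pub.publish('cache-invalidate', `user:${id}`); // tell the other instances
}
```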
4. Memory pressure
Problem: cache grows until it consumes all memory.
Solutions:
- Limit cache size
- Eviction policy (LRU, LFU); see the LRU sketch after this list
- Monitor memory usage
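A minimal LRU sketch that bounds the cache size, relying on Map's insertion order (a simplification, not a production-ready implementation):
```javascript
class LRUCache {
  constructor(maxSize) {
    this.maxSize = maxSize;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key);     // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxSize) {
      // evict the least recently used entry (first key in insertion order)
      this.map.delete(this.map.keys().next().value);
    }
  }
}
```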
5. Caching data that shouldn't be cached
Problem: incorrectly caching sensitive or user-specific data.
Risks:
- Data leakage between users
- GDPR/privacy issues
- Incorrect behavior
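A common form of this mistake is caching a per-user response under a key shared by everyone. A sketch of the fix is to scope the key to the user (the key format and renderDashboard are illustrative):
```javascript
// Wrong: every user shares the same key, so one user's dashboard
// can be served to another.
// cache.set('dashboard', renderDashboard(currentUser));

// Better: include the user id in the key.
function getDashboard(user) {
  const key = `dashboard:user:${user.id}`;
  let page = cache.get(key);
  if (!page) {
    page = renderDashboard(user); // hypothetical expensive render
    cache.set(key, page);
  }
  return page;
}
```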
6. Cold start
Problem: after a deploy or restart, the cache is empty, so every request misses.
Solutions:
- Proactive warm-up (see the sketch after this list)
- Graceful degradation
- Rolling deploys
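A sketch of a proactive warm-up that preloads the hottest keys before the instance starts taking traffic (getTopProductIds and getProduct are hypothetical loaders, and the numbers are illustrative):
```javascript
async function warmUpCache() {
  const ids = await getTopProductIds(100); // e.g. the 100 most-viewed products
  for (const id of ids) {
    const product = await getProduct(id);
    cache.set(`product:${id}`, product);
  }
}

// Run during startup, before marking the instance as healthy:
// await warmUpCache();
```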
Cache Strategies
Cache-Aside (Lazy Loading)
```javascript
function getData(key) {
  let data = cache.get(key);
  if (!data) {
    // miss: load from the source and populate the cache
    data = database.get(key);
    cache.set(key, data, ttl);
  }
  return data;
}
```
Pros: simple, data cached on demand
Cons: first request always slow
Write-Through
```javascript
function saveData(key, data) {
  database.save(key, data);  // write to the source of truth...
  cache.set(key, data, ttl); // ...and update the cache in the same operation
}
```
Pros: cache always updated
Cons: slower writes
Write-Behind (Write-Back)
```javascript
function saveData(key, data) {
  cache.set(key, data);
  // Persisted asynchronously later
  queue.push({ key, data });
}
```
Pros: very fast writes
Cons: risk of data loss, complex
Refresh-Ahead
Cache is automatically updated before expiration.
Pros: avoids cache misses, consistent latency
Cons: may update data not being accessed
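A minimal refresh-ahead sketch: each entry stores its expiry, and a read that lands close to expiration triggers a background refresh while still serving the current value (loadFromSource, the TTL, and the refresh window are illustrative):
```javascript
const store = new Map();
const TTL_MS = 60_000;
const REFRESH_WINDOW_MS = 10_000; // refresh if less than 10s of TTL remains

async function getWithRefreshAhead(key) {
  const entry = store.get(key);
  const now = Date.now();

  if (entry && entry.expiresAt > now) {
    if (entry.expiresAt - now < REFRESH_WINDOW_MS && !entry.refreshing) {
      entry.refreshing = true;
      // Refresh in the background; keep serving the current value meanwhile.
      loadFromSource(key)
        .then((value) => store.set(key, { value, expiresAt: Date.now() + TTL_MS }))
        .catch(() => { entry.refreshing = false; });
    }
    return entry.value;
  }

  // Miss or expired: load synchronously.
  const value = await loadFromSource(key);
  store.set(key, { value, expiresAt: now + TTL_MS });
  return value;
}
```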
Essential Metrics
Hit rate
Hit rate = cache hits / (cache hits + cache misses)
- Good: > 90%
- Great: > 95%
- Poor: < 80% (reconsider the strategy)
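A sketch of how these numbers can be tracked with simple counters around the cache-aside read shown later (cache, database, and ttl as in that example):
```javascript
let hits = 0;
let misses = 0;

function getDataWithMetrics(key) {
  const cached = cache.get(key);
  if (cached !== undefined) {
    hits++;
    return cached;
  }
  misses++;
  const data = database.get(key);
  cache.set(key, data, ttl);
  return data;
}

function hitRate() {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}
```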
Cache latency
A cache that is too slow may not be worth using. Measure:
- Read time
- Write time
- Network latency (distributed)
Eviction rate
A high eviction rate indicates the cache is too small or the TTL is too long.
Best Practices
1. Define appropriate TTL
| Data type | Suggested TTL |
|---|---|
| Configuration | 5-15 minutes |
| Catalog data | 1-5 minutes |
| User session | 30 minutes |
| Calculated data | Depends on change frequency |
2. Use structured keys
```
user:123:profile
product:456:inventory
search:hash(query)
```
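A small sketch of helpers for building these keys consistently; the search key hashes the normalized query so long queries stay bounded (key formats are illustrative):
```javascript
const crypto = require('crypto');

const userProfileKey = (id) => `user:${id}:profile`;
const productInventoryKey = (id) => `product:${id}:inventory`;

function searchKey(query) {
  const hash = crypto.createHash('sha1').update(query.trim().toLowerCase()).digest('hex');
  return `search:${hash}`;
}
```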
3. Serialize efficiently
JSON is readable but comparatively slow and verbose. For high-volume caches, consider:
- MessagePack
- Protocol Buffers
- Avro
4. Always monitor
- Hit rate by cache type
- Latency
- Memory usage
- Evictions
5. Have a fallback
```javascript
try {
  return cache.get(key);
} catch (e) {
  // Cache unavailable: go straight to the source
  return database.get(key);
}
```
When NOT to Use Cache
- Data that changes on every request
- Very large objects (serialization cost)
- When the expected hit rate is very low
- When strong consistency is mandatory
- When the source is already fast enough
Conclusion
Caching is a powerful tool, but it requires care:
- Understand your access patterns before caching
- Choose the right strategy for each case
- Configure appropriate TTL — neither too short nor too long
- Prepare for failures — unavailable cache shouldn't bring down the system
- Monitor continuously — hit rate, latency, memory
Used correctly, caching can transform a slow system into a fast one. Used incorrectly, it creates subtle bugs and scaling problems.
Caching is a trade-off: you trade consistency for speed. Make sure the trade is worth it.