Caching Strategies
Caching improves latency and reduces pressure on primary storage by keeping hot data closer to the application. Good interview answers cover where the cache sits, how invalidation works, and how stampedes are prevented.
Why Cache
Read-heavy systems often spend most of their time waiting on the database. A cache returns common results faster and absorbs repeated reads. The effect compounds at scale: raising the hit rate from 90% to 99% cuts the reads that reach the database by a factor of ten.
Common Patterns
- Cache aside: the application checks the cache first, fetches from the database on a miss, then populates the cache for future reads
- Write through: every write goes through the cache to the database synchronously, keeping them in sync at the cost of write latency
- Write back: writes go to the cache first and are flushed to the database asynchronously, improving write speed but risking data loss on crash
- Refresh ahead: the cache proactively reloads entries before they expire, reducing cold reads for predictably popular data
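As a concrete sketch, the cache-aside read path might look like this in Python. The `InMemoryCache` class and the `fetch_from_db` callback are illustrative stand-ins for a real cache client (such as Redis) and your database layer:

```python
import json

class InMemoryCache:
    """Minimal stand-in for a real cache client; TTL is accepted but ignored."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value, ttl_seconds=None):
        self._store[key] = value

def get_user(user_id, cache, fetch_from_db, ttl_seconds=300):
    key = f"user:{user_id}"
    cached = cache.get(key)            # 1. check the cache first
    if cached is not None:
        return json.loads(cached)
    user = fetch_from_db(user_id)      # 2. miss: fall through to the database
    if user is not None:
        # 3. populate the cache so the next read is a hit
        cache.set(key, json.dumps(user), ttl_seconds)
    return user
```

The same read path works unchanged against a networked cache; only the client object changes.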
Where to Cache
- Database query results
- Rendered HTML fragments
- Session tokens and user profiles
- Expensive computation results
- Third-party API responses
- Configuration and feature flags
Common Problems
- Stale data: cached values diverge from the source of truth after writes
- Cache stampede: many requests miss simultaneously and all hit the database at once
- Hot keys: a single key receives disproportionate traffic and becomes a bottleneck
- Memory pressure and eviction churn: cache fills up and evicts entries faster than they can be reused
Eviction Policies
- LRU (Least Recently Used): evicts the entry that was accessed least recently, good for temporal locality
- LFU (Least Frequently Used): evicts the entry accessed fewest times, good for skewed access patterns
- TTL-based: entries expire after a fixed duration regardless of access pattern
- FIFO: evicts the oldest inserted entry, simple but often suboptimal
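An LRU cache is small enough to sketch directly. This version uses Python's `collections.OrderedDict`, which preserves insertion order and can move a key to the end on access:

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache: evicts the least recently accessed entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)         # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry
```

Production caches add TTLs, size-aware eviction, and thread safety on top of this core idea.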
Mitigations
- TTL with jitter: randomize expiry times to prevent mass simultaneous expiration
- Request coalescing: collapse concurrent misses for the same key into a single database fetch
- Distributed locks for hot misses: only one request populates the cache while others wait
- Background refresh: reload cache entries asynchronously before they expire to avoid cold hits
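Two of these mitigations are compact enough to sketch: TTL jitter is one line, and request coalescing (often called single-flight) can be built from a lock plus a per-key event. This is an illustrative in-process version; across multiple servers you would need a distributed lock or a library instead:

```python
import random
import threading

def ttl_with_jitter(base_ttl, jitter_fraction=0.1):
    """Randomize expiry so entries written together do not all expire together."""
    jitter = base_ttl * jitter_fraction
    return base_ttl + random.uniform(-jitter, jitter)

class SingleFlight:
    """Collapse concurrent fetches of the same key into a single call."""
    def __init__(self):
        self._guard = threading.Lock()
        self._inflight = {}  # key -> (done_event, result_holder)

    def do(self, key, fetch):
        with self._guard:
            entry = self._inflight.get(key)
            if entry is None:
                entry = (threading.Event(), {})
                self._inflight[key] = entry
                leader = True
            else:
                leader = False
        done, holder = entry
        if leader:
            try:
                holder["value"] = fetch()   # only the leader hits the database
            finally:
                with self._guard:
                    del self._inflight[key]
                done.set()
        else:
            done.wait()                     # followers wait for the leader's result
        return holder["value"]
```

Wrapping the database fetch in `SingleFlight.do` means a stampede of N concurrent misses produces one database query instead of N.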
Cache Invalidation Strategies
Cache invalidation is one of the hardest problems in distributed systems. Common approaches include time-based expiry with short TTLs, event-driven invalidation triggered by writes, and versioned cache keys that change whenever the underlying data changes. Each makes a different trade-off between consistency, implementation complexity, and cache efficiency.
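The versioned-key approach is worth sketching because it sidesteps deletes entirely: writers bump a per-entity version counter, readers build keys from the current version, and stale entries simply become unreachable and age out via TTL. The `KVStore` class below is a hypothetical in-memory stand-in for any store with get/set/incr operations:

```python
class KVStore:
    """Minimal in-memory stand-in for a cache with get/set/incr."""
    def __init__(self):
        self._d = {}

    def get(self, k):
        return self._d.get(k)

    def set(self, k, v):
        self._d[k] = v

    def incr(self, k):
        self._d[k] = int(self._d.get(k, 0)) + 1

def versioned_key(store, entity, entity_id):
    """Build the cache key for the *current* version of an entity."""
    version = store.get(f"v:{entity}:{entity_id}") or 0
    return f"{entity}:{entity_id}:v{version}"

def invalidate(store, entity, entity_id):
    """Bump the version; all previously cached keys become unreachable."""
    store.incr(f"v:{entity}:{entity_id}")
```

The cost is an extra version lookup per read and orphaned entries waiting for TTL expiry; the benefit is that invalidation is a single atomic increment.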
Distributed Caching
A single cache node caps capacity at one machine's memory and is itself a single point of failure. Distributed caches like Redis Cluster and Memcached shard data across nodes using consistent hashing, which spreads hot key load, increases total memory capacity, and remaps only a small fraction of keys when nodes join or leave.
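A toy consistent-hash ring shows why node changes are cheap. Each node is placed on the ring many times ("virtual nodes") to even out the distribution; a key maps to the first node clockwise from its hash. All class and node names here are illustrative:

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring with virtual nodes. Adding a node only
    remaps the keys that now fall in its ring segments."""
    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add(node, vnodes)

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def add(self, node, vnodes=100):
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def node_for(self, key):
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, ""))
        if idx == len(self._ring):
            idx = 0                  # wrap around the ring
        return self._ring[idx][1]
```

With naive modulo hashing (`hash(key) % num_nodes`), adding one node remaps almost every key; with the ring, only roughly `1/num_nodes` of keys move.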
Cache Warm-Up
A cold cache after a deployment or restart sends all traffic to the database simultaneously. Warm-up strategies include pre-loading known hot keys before switching traffic, using a read-through pattern that fills the cache organically, or maintaining a persistent cache that survives restarts.
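The pre-loading variant can be as simple as iterating a list of known hot keys before the node starts taking traffic. In this sketch, `hot_keys` might come from the previous instance's access statistics or a popularity table; the callables are hypothetical stand-ins:

```python
def warm_cache(cache_set, fetch_from_db, hot_keys, ttl_seconds=300):
    """Preload hot keys into the cache; returns how many were loaded."""
    loaded = 0
    for key in hot_keys:
        value = fetch_from_db(key)        # read from the source of truth
        if value is not None:
            cache_set(key, value, ttl_seconds)
            loaded += 1
    return loaded
```

In a deployment pipeline this runs after the new instance boots and before the load balancer sends it traffic, so its first real requests are mostly hits.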
Caching and Consistency
Caching inherently introduces a window of staleness between the source of truth and cached copies. Systems that require strong consistency must either skip caching for critical reads, use write-through patterns, or invalidate the cache on every write. Choosing a cache strategy is always a trade-off between read performance and data freshness.
Interview Tip
Do not just say 'use Redis'. Explain what data gets cached and what invalidates it. Also discuss the eviction policy you would choose and why, and whether your system can tolerate stale reads or requires strong consistency. That framing shows you understand caching as a design decision, not just a tool choice.