Caching Strategies
Caching improves latency and reduces pressure on primary storage by keeping hot data closer to the application. Good interview answers cover where the cache sits, how invalidation works, and how stampedes are prevented.
Why Cache
Read-heavy systems often spend most of their time waiting on the database. A cache returns common results faster and absorbs repeated reads. The effect compounds at scale: raising the hit rate from 90% to 99% cuts the reads that reach the database by a factor of ten.
Common Patterns
- Cache aside: the application checks the cache first, fetches from the database on a miss, then populates the cache for future reads
- Write through: every write goes through the cache to the database synchronously, keeping them in sync at the cost of write latency
- Write back: writes go to the cache first and are flushed to the database asynchronously, improving write speed but risking data loss on crash
- Refresh ahead: the cache proactively reloads entries before they expire, reducing cold reads for predictably popular data
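As a concrete sketch, the cache-aside read path might look like this in Python. The `InMemoryCache` class and the `fetch_from_db` callback are illustrative stand-ins for a real cache client (such as Redis) and your database layer:

```python
import json

class InMemoryCache:
    """Minimal stand-in for a real cache client; TTL is accepted but ignored."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value, ttl_seconds=None):
        self._store[key] = value

def get_user(user_id, cache, fetch_from_db, ttl_seconds=300):
    key = f"user:{user_id}"
    cached = cache.get(key)            # 1. check the cache first
    if cached is not None:
        return json.loads(cached)
    user = fetch_from_db(user_id)      # 2. miss: fall through to the database
    if user is not None:
        # 3. populate the cache so the next read is a hit
        cache.set(key, json.dumps(user), ttl_seconds)
    return user
```

The same read path works unchanged against a networked cache; only the client object changes.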
Where to Cache
- Database query results
- Rendered HTML fragments
- Session tokens and user profiles
- Expensive computation results
- Third-party API responses
- Configuration and feature flags
Common Problems
- Stale data: cached values diverge from the source of truth after writes
- Cache stampede: many requests miss simultaneously and all hit the database at once
- Hot keys: a single key receives disproportionate traffic and becomes a bottleneck
- Memory pressure and eviction churn: cache fills up and evicts entries faster than they can be reused
Eviction Policies
- LRU (Least Recently Used): evicts the entry that was accessed least recently, good for temporal locality
- LFU (Least Frequently Used): evicts the entry accessed fewest times, good for skewed access patterns
- TTL-based: entries expire after a fixed duration regardless of access pattern
- FIFO: evicts the oldest inserted entry, simple but often suboptimal
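An LRU cache is small enough to sketch directly. This version uses Python's `collections.OrderedDict`, which preserves insertion order and can move a key to the end on access:

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU cache: evicts the least recently accessed entry."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)         # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry
```

Production caches add TTLs, size-aware eviction, and thread safety on top of this core idea.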
Mitigations
- TTL with jitter: randomize expiry times to prevent mass simultaneous expiration
- Request coalescing: collapse concurrent misses for the same key into a single database fetch
- Distributed locks for hot misses: only one request populates the cache while others wait
- Background refresh: reload cache entries asynchronously before they expire to avoid cold hits
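Two of these mitigations are compact enough to sketch: TTL jitter is one line, and request coalescing (often called single-flight) can be built from a lock plus a per-key event. This is an illustrative in-process version; across multiple servers you would need a distributed lock or a library instead:

```python
import random
import threading

def ttl_with_jitter(base_ttl, jitter_fraction=0.1):
    """Randomize expiry so entries written together do not all expire together."""
    jitter = base_ttl * jitter_fraction
    return base_ttl + random.uniform(-jitter, jitter)

class SingleFlight:
    """Collapse concurrent fetches of the same key into a single call."""
    def __init__(self):
        self._guard = threading.Lock()
        self._inflight = {}  # key -> (done_event, result_holder)

    def do(self, key, fetch):
        with self._guard:
            entry = self._inflight.get(key)
            if entry is None:
                entry = (threading.Event(), {})
                self._inflight[key] = entry
                leader = True
            else:
                leader = False
        done, holder = entry
        if leader:
            try:
                holder["value"] = fetch()   # only the leader hits the database
            finally:
                with self._guard:
                    del self._inflight[key]
                done.set()
        else:
            done.wait()                     # followers wait for the leader's result
        return holder["value"]
```

Wrapping the database fetch in `SingleFlight.do` means a stampede of N concurrent misses produces one database query instead of N.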
Cache Invalidation Strategies
Cache invalidation is one of the hardest problems in distributed systems. Common approaches include time-based expiry with short TTLs, event-driven invalidation triggered by writes, and versioned cache keys that change whenever the underlying data changes. Each makes a different trade-off between consistency, implementation complexity, and cache efficiency.
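The versioned-key approach is worth sketching because it sidesteps deletes entirely: writers bump a per-entity version counter, readers build keys from the current version, and stale entries simply become unreachable and age out via TTL. The `KVStore` class below is a hypothetical in-memory stand-in for any store with get/set/incr operations:

```python
class KVStore:
    """Minimal in-memory stand-in for a cache with get/set/incr."""
    def __init__(self):
        self._d = {}

    def get(self, k):
        return self._d.get(k)

    def set(self, k, v):
        self._d[k] = v

    def incr(self, k):
        self._d[k] = int(self._d.get(k, 0)) + 1

def versioned_key(store, entity, entity_id):
    """Build the cache key for the *current* version of an entity."""
    version = store.get(f"v:{entity}:{entity_id}") or 0
    return f"{entity}:{entity_id}:v{version}"

def invalidate(store, entity, entity_id):
    """Bump the version; all previously cached keys become unreachable."""
    store.incr(f"v:{entity}:{entity_id}")
```

The cost is an extra version lookup per read and orphaned entries waiting for TTL expiry; the benefit is that invalidation is a single atomic increment.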
Distributed Caching
A single cache node caps capacity at one machine's memory and is itself a single point of failure. Distributed caches like Redis Cluster and Memcached shard data across nodes using consistent hashing, which spreads hot key load, increases total memory capacity, and remaps only a small fraction of keys when nodes join or leave.
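A toy consistent-hash ring shows why node changes are cheap. Each node is placed on the ring many times ("virtual nodes") to even out the distribution; a key maps to the first node clockwise from its hash. All class and node names here are illustrative:

```python
import bisect
import hashlib

class HashRing:
    """Consistent-hash ring with virtual nodes. Adding a node only
    remaps the keys that now fall in its ring segments."""
    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add(node, vnodes)

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def add(self, node, vnodes=100):
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def node_for(self, key):
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, ""))
        if idx == len(self._ring):
            idx = 0                  # wrap around the ring
        return self._ring[idx][1]
```

With naive modulo hashing (`hash(key) % num_nodes`), adding one node remaps almost every key; with the ring, only roughly `1/num_nodes` of keys move.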
Cache Warm-Up
A cold cache after a deployment or restart sends all traffic to the database simultaneously. Warm-up strategies include pre-loading known hot keys before switching traffic, using a read-through pattern that fills the cache organically, or maintaining a persistent cache that survives restarts.
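The pre-loading variant can be as simple as iterating a list of known hot keys before the node starts taking traffic. In this sketch, `hot_keys` might come from the previous instance's access statistics or a popularity table; the callables are hypothetical stand-ins:

```python
def warm_cache(cache_set, fetch_from_db, hot_keys, ttl_seconds=300):
    """Preload hot keys into the cache; returns how many were loaded."""
    loaded = 0
    for key in hot_keys:
        value = fetch_from_db(key)        # read from the source of truth
        if value is not None:
            cache_set(key, value, ttl_seconds)
            loaded += 1
    return loaded
```

In a deployment pipeline this runs after the new instance boots and before the load balancer sends it traffic, so its first real requests are mostly hits.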
Caching and Consistency
Caching inherently introduces a window of staleness between the source of truth and cached copies. Systems that require strong consistency must either skip caching for critical reads, use write-through patterns, or invalidate the cache on every write. Choosing a cache strategy is always a trade-off between read performance and data freshness.
Interview Tip
Do not just say 'use Redis'. Explain what data gets cached and what invalidates it. Also discuss the eviction policy you would choose and why, and whether your system can tolerate stale reads or requires strong consistency. That framing shows you understand caching as a design decision, not just a tool choice.