Three-Layer Cache
A cache rarely sits in only one place; it usually stacks across several layers. Closer to the client is faster; closer to the DB is more accurate.
1. L1 — In-memory cache
The layer aimed at the shortest response time. It can be an LRU map inside the same process, or a separate process such as Redis or Memcached.
| Tool | Origin |
|---|---|
| Memcached | 2003, Brad Fitzpatrick (LiveJournal) |
| Redis | 2009, Salvatore Sanfilippo |
| Hazelcast / Infinispan | JVM distributed in-memory |
| Caffeine | JVM library (Ben Manes) |
The hallmark is volatility. The default assumption is that data disappears on restart or node replacement (Redis's RDB and AOF can preserve some).
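As a concrete picture of the "LRU map inside the same process", here is a minimal TypeScript sketch (the class and its names are mine, not any library's); it leans on the fact that a JavaScript `Map` iterates in insertion order:

```ts
// Minimal in-process LRU cache. A Map iterates in insertion order,
// so deleting and re-inserting a key marks it as most recently used.
class LruCache<K, V> {
  private map = new Map<K, V>();
  constructor(private capacity: number) {}

  get(key: K): V | undefined {
    const value = this.map.get(key);
    if (value === undefined) return undefined;
    this.map.delete(key); // refresh recency
    this.map.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // evict the least recently used entry (first in iteration order)
      const oldest = this.map.keys().next().value as K;
      this.map.delete(oldest);
    }
  }
}
```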
2. L2 — Persistent cache
Still a cache, but one where losing the contents is awkward. PostgreSQL cache tables, materialized views, or object storage like S3 sit here.
Representative patterns:
- Load external API responses into a PostgreSQL table and manage expiry with a TTL column.
- Freeze expensive aggregate query results into a materialized view and `REFRESH` it periodically.
- Place static assets (images, PDFs) in object storage and expose them through a CDN.
When L1 dies, L2 acts as a fallback. The flow "L1 empty → query L2 → fill L1 with the result" (cache-aside) is common.
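A sketch of that fallback flow, with `queryL2` as a hypothetical stand-in for the persistent tier:

```ts
// Two-tier read: try L1 (in-memory), fall back to L2
// (e.g. a PostgreSQL cache table), and warm L1 on the way back.
const l1 = new Map<string, string>();

async function queryL2(key: string): Promise<string | undefined> {
  // stand-in for something like
  // "SELECT value FROM cache_table WHERE key = $1 AND expires_at > now()"
  return undefined;
}

async function read(key: string): Promise<string | undefined> {
  const hit = l1.get(key);
  if (hit !== undefined) return hit;             // L1 hit
  const fromL2 = await queryL2(key);             // L1 miss -> query L2
  if (fromL2 !== undefined) l1.set(key, fromL2); // fill L1 with the result
  return fromL2;                                 // still empty -> caller goes to origin
}
```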
3. L3 — Framework cache
Cache provided by web frameworks at the route or fetch level.
- Next.js `unstable_cache` and the `next: { revalidate }` option of `fetch` — cache data-fetch results aligned to the route lifecycle (sketched after this list).
- Next.js Full Route Cache — keeps render results of static and dynamic routes (behavior varies by version and configuration).
- HTTP cache headers — `Cache-Control`, `ETag`, `Last-Modified`, interpreted by clients, intermediate proxies, and CDNs.
- CDN cache — Cloudflare, Fastly, Akamai store responses at the edge.
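As a sketch of the first item (the URL and the 60-second window are illustrative):

```ts
// Next.js server code: cache this fetch result and revalidate it
// at most every 60 seconds. (App Router extension of fetch; exact
// behavior depends on the Next.js version.)
const res = await fetch("https://api.example.com/products", {
  next: { revalidate: 60 },
});
const products = await res.json();

// The plain-HTTP analogue, set on whatever serves the response:
//   Cache-Control: public, max-age=60
//   ETag: "abc123"
```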
The hallmark of this layer is that it works with almost no code changes. The flip side is debugging difficulty: knowledge of which response sits in which cache, and when it gets invalidated, ends up scattered across layers.
4. Cache-aside (lazy loading)
The most common pattern.
```
read:
    v = cache.get(k)
    if v is null:
        v = db.query(k)
        cache.set(k, v, ttl)
    return v

write:
    db.update(k, v)
    cache.delete(k)   # or set(k, v)
```
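The same pattern in runnable TypeScript, tracking TTL as an expiry timestamp; the `db` object is a hypothetical stand-in for a real client:

```ts
type Entry<V> = { value: V; expiresAt: number };
const cache = new Map<string, Entry<string>>();

// Hypothetical origin; imagine a real DB client here.
const db = {
  query: async (k: string) => `value-for-${k}`,
  update: async (_k: string, _v: string) => {},
};

async function readAside(key: string, ttlMs: number): Promise<string> {
  const entry = cache.get(key);
  if (entry && entry.expiresAt > Date.now()) return entry.value; // hit
  const value = await db.query(key);                             // miss -> origin
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}

async function writeAside(key: string, value: string): Promise<void> {
  await db.update(key, value);
  cache.delete(key); // invalidate; the next read repopulates
}
```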
The advantage is simplicity; the limits are the extra origin load on every cache miss and the consistency gap between the DB write and the cache invalidation.
5. Write-through, write-behind, refresh-ahead
Write-through updates both cache and DB on writes.
```
write:
    cache.set(k, v)
    db.update(k, v)
```
Write consistency improves, but every update also lands in the cache, so a frequently updated key adds cache load too.
Write-behind writes to the cache only and lets an asynchronous worker push changes to the DB. It suits high-throughput paths, but there is a data-loss risk if the cache dies before the flush.
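A minimal write-behind sketch (`db` is again a hypothetical stand-in): writes land in the cache plus a dirty buffer, and a periodic worker drains the buffer to the DB. The loss risk is visible in the code: whatever sits in the buffer when the process dies never reaches the DB.

```ts
// Hypothetical DB client; imagine a real async driver here.
const db = { update: async (_k: string, _v: string): Promise<void> => {} };

const store = new Map<string, string>(); // the cache itself
const dirty = new Map<string, string>(); // writes waiting to be flushed

function writeBehind(key: string, value: string): void {
  store.set(key, value); // the cache is the write target
  dirty.set(key, value); // the DB write is deferred
}

// Flush worker: push buffered writes to the DB asynchronously.
// Anything still in `dirty` when the process dies is lost.
setInterval(async () => {
  for (const [key, value] of dirty) {
    await db.update(key, value);
    dirty.delete(key);
  }
}, 1000);
```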
Refresh-ahead refreshes items in the background as TTL nears expiry. It cuts down cache misses on user requests. Caffeine on the JVM side supports it directly.
6. TTL and stale-while-revalidate
TTL is the promise of "how stale this value can be." Too short weakens cache value; too long delivers stale results. Decide based on the data's change cadence and user tolerance.
The stale-while-revalidate pattern returns an expired value and pulls the refresh in the background. HTTP standardizes it as `Cache-Control: stale-while-revalidate=...`.
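An application-side sketch of the same idea, with `fetchFresh` standing in for the origin call: serve the stale value immediately and refresh in the background.

```ts
type Entry = { value: string; expiresAt: number };
const swrCache = new Map<string, Entry>();
const refreshing = new Set<string>();

async function fetchFresh(key: string): Promise<string> {
  return `fresh-${key}`; // hypothetical origin call
}

async function getSwr(key: string, ttlMs: number): Promise<string> {
  const entry = swrCache.get(key);
  if (entry) {
    if (entry.expiresAt <= Date.now() && !refreshing.has(key)) {
      refreshing.add(key); // expired: answer stale, refresh in background
      fetchFresh(key)
        .then(v => swrCache.set(key, { value: v, expiresAt: Date.now() + ttlMs }))
        .finally(() => refreshing.delete(key));
    }
    return entry.value; // stale or fresh, answer immediately
  }
  const value = await fetchFresh(key); // cold start: must wait once
  swrCache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}
```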
7. Cache stampede
When the TTL of a hot key expires, the many concurrent requests for it all miss at once, hit the origin together, and load spikes. Mitigations usually bundle one or two of the following.
- Add a small random jitter to TTL.
- Use a distributed lock (Redis `SET NX`) so only one request refreshes the origin.
- Early refresh — start the refresh before TTL expires.
- Request coalescing — merge concurrent requests for the same key into one (see the sketch after this list).
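A sketch combining the jitter and coalescing items (`loadOrigin` is hypothetical): concurrent misses share one in-flight promise, and the TTL gets a small random jitter so hot keys do not expire in lockstep.

```ts
const inFlight = new Map<string, Promise<string>>();
const values = new Map<string, { value: string; expiresAt: number }>();

async function loadOrigin(key: string): Promise<string> {
  return `origin-${key}`; // hypothetical expensive call
}

async function getCoalesced(key: string, ttlMs: number): Promise<string> {
  const hit = values.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.value;

  const pending = inFlight.get(key);
  if (pending) return pending; // piggyback on the request already running

  const promise = loadOrigin(key)
    .then(value => {
      const jitter = Math.random() * 0.1 * ttlMs; // up to +10% of the TTL
      values.set(key, { value, expiresAt: Date.now() + ttlMs + jitter });
      return value;
    })
    .finally(() => inFlight.delete(key));
  inFlight.set(key, promise);
  return promise;
}
```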
8. Key naming conventions
- `<service>:<entity>:<id>` — for example, `users:profile:42`.
- Put a version in the prefix to invalidate everything at once: `v3:users:profile:42`.
- Environment prefix: `prod:`, `staging:`.
Long keys carry memory overhead, which is worth considering on in-memory stores like Redis.
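A trivial helper keeps the convention consistent (all names illustrative):

```ts
const ENV = "prod";   // environment prefix
const VERSION = "v3"; // bump to invalidate everything at once

function cacheKey(service: string, entity: string, id: string | number): string {
  return `${ENV}:${VERSION}:${service}:${entity}:${id}`;
}

cacheKey("users", "profile", 42); // "prod:v3:users:profile:42"
```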
9. Serialization and monitoring
- JSON — human-readable, but serialization cost is high.
- MessagePack, CBOR — binary, smaller.
- Protobuf, Avro — schema-based, multilingual.
If we serialize and deserialize cache responses frequently, the format choice shows up in response time.
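A rough illustration, assuming the `@msgpack/msgpack` package (actual sizes depend on the payload):

```ts
import { encode, decode } from "@msgpack/msgpack";

const payload = { id: 42, name: "profile", tags: ["a", "b"], active: true };

const asJson = new TextEncoder().encode(JSON.stringify(payload));
const asMsgpack = encode(payload); // Uint8Array

console.log(asJson.byteLength, asMsgpack.byteLength); // msgpack is usually smaller

const roundTrip = decode(asMsgpack); // back to a plain object
```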
Monitoring items:
- Hit rate — too low signals key design or TTL issues.
- Memory usage, key count, eviction count.
- Origin (DB, external API) call count — measures the cache impact.
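Hit rate is cheap to measure in-process; a minimal sketch with hand-rolled counters:

```ts
const counters = new Map<string, string>();
let hits = 0;
let misses = 0;

function instrumentedGet(key: string): string | undefined {
  const value = counters.get(key);
  if (value === undefined) misses += 1;
  else hits += 1;
  return value;
}

function hitRate(): number {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total; // watch for this drifting low
}
```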
10. Common pitfalls
Partial invalidation — caching one user's info in 10 places makes invalidation hard. Key conventions and tags are needed from the start.
Caching null — decide whether to cache "no result." If we do not, every request goes to the DB; if we do, newly created items may not appear.
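One common compromise: cache the "no result" too, but with a much shorter TTL so newly created items surface quickly. A sketch with an assumed sentinel value:

```ts
const NULL_SENTINEL = "__none__"; // assumed marker for "no result"
const negCache = new Map<string, { value: string; expiresAt: number }>();

async function findUser(id: string): Promise<string | null> {
  return null; // hypothetical DB lookup that found nothing
}

async function getUser(id: string): Promise<string | null> {
  const hit = negCache.get(id);
  if (hit && hit.expiresAt > Date.now()) {
    return hit.value === NULL_SENTINEL ? null : hit.value;
  }
  const user = await findUser(id);
  negCache.set(id, {
    value: user ?? NULL_SENTINEL,
    // short TTL for misses (30s) vs. normal TTL for hits (10min)
    expiresAt: Date.now() + (user === null ? 30_000 : 600_000),
  });
  return user;
}
```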
Infinite TTL — persistent data lingers in cache and conflicts with new code. Explicit expiry or a version prefix is safer.
Read-after-write — the client cannot see its own change. Write-through or client-side read-your-writes handling is needed.
L3 build-time caches — Next.js's static cache is tied to build time or revalidate intervals. It can clash with a CMS's instant-publish requirement.
Closing thoughts
The more cache we add, the harder invalidation gets. Starting in places where "who creates this cache and who clears it" is clear is safer. Phil Karlton's joke ("There are only two hard things in Computer Science: cache invalidation and naming things") gets quoted for a reason.
Next
- redis-roles
- data-pipeline
References: HTTP Caching (MDN), RFC 5861 — stale-while-revalidate, Redis caching patterns, Next.js Caching, Caffeine GitHub.