Field Note 21 current
Cache only immutable objects; treat caches as tech debt
TL;DR
Use caches sparingly, and — especially for external caches like Redis in the data path — only for immutable, addressed objects: things identified by a key that names one exact, never-changing value (a content hash, a versioned blob, an immutable artifact). Do not cache mutable database results. That’s where cache-invalidation bugs, stale reads, and cross-request inconsistency come from, and it quietly trades correctness for performance — the wrong direction (ZFN-2). For mutable data, use a projection (a maintained read model), not a cached copy you hope to invalidate.
A cache in the data path is best read as a smell: a patch over a datastore that can’t serve an access pattern. It’s reached for early and reflexively, and it’s conspicuously rare in the data path of mature systems, which have addressed the access pattern properly instead. Treat each cache as tech debt with a plan to remove it.
Context
Caching mutable query results runs straight into the genuinely hard problem (“there are only two hard things… cache invalidation and naming things”): the moment the underlying data can change, your cached copy can be wrong, and knowing exactly when to invalidate it — across every write path, every race, every concurrent reader — is a problem you almost never solve completely. So you get staleness, read-your-writes violations, thundering herds on expiry, and inconsistency between two users looking at “the same” data. These bugs are intermittent, hard to reproduce, and erode trust.
Immutable, addressed objects sidestep all of it: if the key identifies a value that can never change, there is no invalidation problem, because a cached entry can only ever be right or absent. That’s why CDNs and content-addressed stores work so well, and why this is the only shape of caching that is unambiguously safe.
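As a minimal sketch of the "right or absent" property, assume the key is the SHA-256 digest of the value. The class name `ContentAddressedCache` and its methods are hypothetical, chosen for illustration only, and the in-process dict stands in for whatever store actually holds the bytes.

```python
import hashlib
from typing import Dict, Optional


class ContentAddressedCache:
    """Hypothetical cache keyed by the SHA-256 of the value it stores."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def put(self, blob: bytes) -> str:
        # The key is derived from the content, so one key names one value forever.
        key = hashlib.sha256(blob).hexdigest()
        # Re-inserting identical content is idempotent: same key, same bytes.
        self._store[key] = blob
        return key

    def get(self, key: str) -> Optional[bytes]:
        blob = self._store.get(key)
        if blob is None:
            return None  # absent: the caller fetches from the origin
        # A hit can be verified against its own key, so it is never "stale".
        if hashlib.sha256(blob).hexdigest() != key:
            raise ValueError("corrupt cache entry")
        return blob
```

There is no `invalidate` method because nothing could ever call it: eviction for space is the only reason an entry goes away, and a miss is always safe.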
The deeper issue is what a data-path cache usually means: your store can’t serve this read pattern fast enough, and rather than fix that, you’ve bolted on a faster, less-correct copy. That’s a patch. Mature systems tend not to have caches in the hot path — not because they don’t care about latency, but because they solved the access pattern at the source (purpose-built read models, in-memory state, the right storage shape) so the cache became unnecessary. Reaching for a cache should prompt the question: what access pattern is my datastore failing to serve, and how do I serve it correctly?
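To make "purpose-built read model" concrete, here is a small sketch of a projection that is updated on every write rather than recomputed and cached. The domain (orders and per-customer totals), the class name `OrderTotalsProjection`, and the event method names are all assumptions made for illustration; they do not come from this note.

```python
from typing import Dict, Tuple


class OrderTotalsProjection:
    """Hypothetical read model: per-customer order totals, maintained on write."""

    def __init__(self) -> None:
        # order_id -> (customer_id, amount_cents), kept so cancellations can be undone
        self._orders: Dict[str, Tuple[str, int]] = {}
        self._totals: Dict[str, int] = {}

    def on_order_placed(self, order_id: str, customer_id: str, amount_cents: int) -> None:
        if order_id in self._orders:
            return  # duplicate delivery of the same event: ignore
        self._orders[order_id] = (customer_id, amount_cents)
        self._totals[customer_id] = self._totals.get(customer_id, 0) + amount_cents

    def on_order_cancelled(self, order_id: str) -> None:
        entry = self._orders.pop(order_id, None)
        if entry is None:
            return  # unknown or already cancelled
        customer_id, amount_cents = entry
        self._totals[customer_id] -= amount_cents

    def total_for(self, customer_id: str) -> int:
        # The read is a lookup; there is no cached query result to bust.
        return self._totals.get(customer_id, 0)
```

The read path never asks "is this still valid?", because every write that could change the answer has already updated it.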
Recommendation
Cache immutable addressed objects freely; for everything else, fix the access pattern instead of caching mutable results. Concretely, by data shape:
- Immutable / content-addressed objects — caching is fine and good (this is what CDNs do). The key names one value forever; no invalidation, no correctness risk.
- Stable objects you’d be tempted to cache — don’t put them in a shared external cache; load them into local memory and partition the work across services so each instance owns and holds its slice (ZFN-15). Memory you own beats a network hop to Redis, and there’s no shared-cache coherence problem.
- Lookup state, config, or slow-rollout data — store it as a state file / snapshot and fetch it periodically, holding it in memory and failing static on the last good copy (ZFN-17, ZFN-16). Slow-changing data wants a versioned snapshot, not a cache.
- Changing data — use the database directly. If it can’t meet the performance need, don’t paper over it with a cache: take an atomic snapshot and stream the relevant changes from the WAL (logical replication / CDC) into an in-memory, continuously-updated view. That gives you in-memory speed with correctness, because the log keeps it current — the principled version of “I need it in memory.”
- Expensive mutable query results — build a projection / materialized read model that’s maintained as the source changes, rather than caching the query output and guessing when to bust it.
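The "changing data" option above can be sketched as an in-memory view that starts from a snapshot and then applies ordered change events. This is an illustration under stated assumptions, not a CDC client: the `Change` record, its `lsn` field (a monotonically increasing log position), and the `LogFedView` class are hypothetical, and a real system would receive these events from its database's replication stream.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Change:
    lsn: int                     # assumed: monotonically increasing log position
    op: str                      # "upsert" or "delete"
    key: str
    value: Optional[str] = None  # required for "upsert", unused for "delete"


class LogFedView:
    """Hypothetical in-memory view: snapshot first, then changes in log order."""

    def __init__(self, snapshot: Dict[str, str], snapshot_lsn: int) -> None:
        self._rows = dict(snapshot)
        self._applied_lsn = snapshot_lsn  # everything up to here is in the snapshot

    @property
    def applied_lsn(self) -> int:
        return self._applied_lsn

    def apply(self, change: Change) -> None:
        if change.lsn <= self._applied_lsn:
            return  # already reflected (snapshot overlap or a replayed event)
        if change.op == "upsert":
            if change.value is None:
                raise ValueError("upsert requires a value")
            self._rows[change.key] = change.value
        elif change.op == "delete":
            self._rows.pop(change.key, None)
        else:
            raise ValueError(f"unknown op: {change.op}")
        self._applied_lsn = change.lsn

    def get(self, key: str) -> Optional[str]:
        return self._rows.get(key)
```

Tracking the applied log position is what separates this from a cache: the view can say exactly which point in the log it reflects, and replaying events after a reconnect is harmless.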
If you must cache mutable data anyway, make the trade explicit: bound staleness, scope it as tightly as possible, and accept (in writing, at the use site) that you’ve chosen performance over correctness for that path — so it’s a known, revisited decision, not an invisible one.
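One way to make that trade visible at the use site is to require the staleness bound as a constructor argument, so nobody can create the cache without stating it. The sketch below assumes a simple in-process cache; `BoundedStalenessCache`, `get_or_load`, and the injectable `clock` parameter are hypothetical names, not an existing library API.

```python
import time
from typing import Callable, Dict, Generic, Tuple, TypeVar

V = TypeVar("V")


class BoundedStalenessCache(Generic[V]):
    """Hypothetical cache for mutable data with an explicit staleness bound."""

    def __init__(
        self,
        max_staleness_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        # No default for max_staleness_s: the caller must choose the bound.
        self._max_staleness_s = max_staleness_s
        self._clock = clock
        self._entries: Dict[str, Tuple[float, V]] = {}

    def get_or_load(self, key: str, load: Callable[[], V]) -> V:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] <= self._max_staleness_s:
            return hit[1]  # may be up to max_staleness_s out of date
        value = load()
        self._entries[key] = (now, value)
        return value
```

A call site might then read `BoundedStalenessCache(max_staleness_s=5.0)` next to a comment recording that this path accepts reads up to five seconds old, which keeps the decision reviewable. The sketch does not deduplicate concurrent loads, so it does nothing about thundering herds on expiry.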
Consequences
Easier:
- The hardest correctness bugs — stale reads, invalidation races, inconsistent views — simply don’t exist for immutable caching, and are designed out (not patched) for everything else.
- Addressing the access pattern at the source (projections, in-memory state fed by the log, partitioning) yields speed and correctness, and removes a moving part (the external cache) from the data path.
- “Why is this data wrong sometimes?” stops being a recurring incident.
Harder:
- Projections and WAL-fed in-memory views are more to build than slapping a cache in front of a query — real work, paid to get correctness.
- You have to actually diagnose the access pattern rather than reaching for the reflexive fix, which is slower up front.
- Memory-resident state caps the dataset at what fits in memory and adds a warm-up/rebuild concern; partitioning has its own coordination cost.
- You have to tell the two cases apart: genuinely immutable caching is great and you shouldn’t avoid it out of dogma; the discipline is about mutable data.
References
- ZFN-2 — caching mutable results trades correctness for performance, against the priority order.
- ZFN-17 — snapshot static/lookup data and hold it in memory instead of caching it.
- ZFN-16 — fail-static on a cached snapshot is fine; a per-request cache of mutable state is not.
- ZFN-15 — partition and hold stable data in local memory rather than sharing an external cache.
- ZFN-20 — a data-path cache is the canonical “easy, not simple” patch and a smell to design out.
Changelog
- 2026-06-12: First published as a Field Note.