Field Note 20 · current

The simplest-looking system is often the most complex to live with

By: Theo Zourzouvillys
Published: June 12, 2026
Tags: architecture, data, design, philosophy

TL;DR

Don’t confuse easy to stand up with simple. The classic starter stack — a backend service, Postgres, and a Redis cache — looks simpler than a more deliberately-architected design, and it is: simpler to start. But the hard parts of the problem (consistency, invalidation, concurrency, partial failure, ordering, multi-tenancy) don’t go away because you didn’t build for them — they’re still there, now surfacing as customer-visible bugs and data inconsistency. A design with more moving parts that actually addresses those cases has fewer surprises, and in data architecture it very often turns out to be the simpler system to live with over time. Pay the essential complexity on purpose, up front, rather than accreting it as patches later.

This is not a license to over-engineer. It’s the narrower claim that when you know the hard cases exist, the design that handles them — even if it has more components — usually beats the one that pretends they don’t.

Context

“Simple” is two different things, and conflating them is the trap:

Easy: familiar, fast to assemble, few boxes on the diagram on day one.
Simple: few entanglements; few edge cases and failure modes you have to hold in your head to reason about correctness. (Rich Hickey’s Simple Made Easy draws exactly this line.)

The backend-plus-Postgres-plus-cache version is easy. It is not necessarily simple, because the genuinely hard parts of a stateful product are still present, just unhandled:

The cache and the database disagree, and there’s no correct invalidation, so reads go stale and users see inconsistency (ZFN-21).
Concurrent writes race because nothing was designed for it; retries double-apply because nothing is idempotent (ZFN-19).
A partial failure leaves two stores out of sync with no path back to consistency.
It can’t be sharded later because everything assumed one database (ZFN-15).
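To make the retry case in the list above concrete, here is a minimal sketch of a mutation that double-applies on retry next to one guarded by an idempotency key. The `NaiveLedger`, `IdempotentLedger`, and `send_with_retry` names and the `req-42` key are hypothetical, chosen for illustration, and the in-memory dictionary stands in for what would need to be durable storage.

```python
from dataclasses import dataclass, field


@dataclass
class NaiveLedger:
    """Applies every request it receives, so a retried request is applied twice."""
    balance: int = 0

    def credit(self, amount: int) -> int:
        self.balance += amount
        return self.balance


@dataclass
class IdempotentLedger:
    """Remembers the result for each idempotency key and replays it on retry."""
    balance: int = 0
    _results: dict[str, int] = field(default_factory=dict)

    def credit(self, idempotency_key: str, amount: int) -> int:
        # A key we have already seen means this is a retry: return the stored
        # result instead of applying the mutation again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.balance += amount
        self._results[idempotency_key] = self.balance
        return self.balance


def send_with_retry(call, attempts: int = 2):
    """Simulate a client whose first response is lost, so it sends again."""
    result = None
    for _ in range(attempts):
        result = call()
    return result


naive = NaiveLedger()
send_with_retry(lambda: naive.credit(100))

safe = IdempotentLedger()
send_with_retry(lambda: safe.credit("req-42", 100))

print(naive.balance)  # 200: the retry was applied a second time
print(safe.balance)   # 100: the retry was recognised and replayed
```

In a real service the key lookup and the mutation would have to commit atomically, for example in the same database transaction; the sketch leaves that out to keep the contrast visible.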

So the “simple” system spends its second year accreting special cases — a lock here, a reconciliation cron there, a cache-busting hack, a nightly repair job. That’s accidental complexity: the worst kind, because no one chose it and no one understands all of it. The system that looked complex up front — a real event/state model, projections, partitioning, idempotency, a control/data-plane split — paid its complexity as essential complexity, deliberately, in places you can name and reason about. Over the life of the system, fewer surprises wins.
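One way to picture the event/state model and projections named above is a read model that is only ever derived from an ordered log. The sketch below is an illustration under assumptions, not a prescribed design: the `Event`, `BalanceProjection`, and `rebuild` names are hypothetical, and a Python list stands in for a durable log.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    seq: int      # position in the log; defines the ordering
    tenant: str
    account: str
    delta: int


class BalanceProjection:
    """A read model derived only from the event log, never written directly."""

    def __init__(self) -> None:
        self.balances: dict[tuple[str, str], int] = defaultdict(int)
        self.last_seq = 0

    def apply(self, event: Event) -> None:
        # Skip anything at or below the last applied position, so a duplicate
        # delivery after a crash or retry has no effect.
        if event.seq <= self.last_seq:
            return
        self.balances[(event.tenant, event.account)] += event.delta
        self.last_seq = event.seq


def rebuild(log: list[Event]) -> BalanceProjection:
    """The path back to consistency: discard the projection and replay the log."""
    projection = BalanceProjection()
    for event in sorted(log, key=lambda e: e.seq):
        projection.apply(event)
    return projection


log = [
    Event(1, "acme", "main", 100),
    Event(2, "acme", "main", -30),
    Event(3, "globex", "main", 50),
]

live = BalanceProjection()
for e in log:
    live.apply(e)
live.apply(log[1])  # duplicate delivery is ignored

assert live.balances == rebuild(log).balances
print(live.balances[("acme", "main")])  # 70
```

The design choice worth noticing is that there is nothing to invalidate: a projection that drifts is thrown away and rebuilt from the log, which is the named, reason-about-able place where the ordering and partial-failure complexity was paid for.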

Recommendation

Choose the design that minimises total, whole-life complexity, not day-one part count — and in stateful/data-heavy systems, that’s frequently the more deliberate design.

Distinguish essential from accidental complexity. Essential complexity is inherent to the problem (you have multiple tenants; writes race; failures are partial) — it must be handled somewhere. The only choice is whether you handle it deliberately or accrete it as patches. Accidental complexity is the patches. Spend on the first to avoid the second.
Don’t price the edge cases at zero. When estimating “the simple option,” include the cost of the consistency bugs, the data-repair work, the migration you’ve foreclosed, and the on-call. Those are real and they’re usually what makes the easy option expensive.
Still apply YAGNI to the speculative. This is about complexity you know is essential, not about building for imagined futures. If a hard case genuinely won’t exist, don’t design for it. The judgement is “essential vs. speculative,” not “complex is always better.”
Get the data model and contracts right early, because those are the expensive things to change later (ZFN-1, ZFN-14, ZFN-15, ZFN-17). A clean data architecture is what lets the implementation behind it stay genuinely simple.
When you have already taken the easy path, plan to dig out. Quarantine the patched part behind an interface and replace it (ZFN-22, ZFN-23) rather than patching the patch.
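The last item in the list above, quarantining a patched part behind an interface and then replacing it, might look roughly like the following. This is a sketch under assumptions: `ProfileStore`, `LegacyProfileAdapter`, `PartitionedProfileStore`, and `greeting` are hypothetical names, and dictionaries stand in for the real storage on both sides.

```python
from typing import Protocol


class ProfileStore(Protocol):
    """The contract the rest of the system depends on."""

    def display_name(self, tenant: str, user_id: str) -> str: ...


class LegacyProfileAdapter:
    """Wraps the patched implementation without changing it."""

    def __init__(self, legacy_rows: dict[str, str]) -> None:
        self._rows = legacy_rows  # stand-in for the old cache-plus-database path

    def display_name(self, tenant: str, user_id: str) -> str:
        # The legacy path keys rows as "tenant:user"; that quirk stays hidden here.
        return self._rows[f"{tenant}:{user_id}"]


class PartitionedProfileStore:
    """A replacement built behind the same contract, keyed by tenant first."""

    def __init__(self) -> None:
        self._by_tenant: dict[str, dict[str, str]] = {}

    def put(self, tenant: str, user_id: str, name: str) -> None:
        self._by_tenant.setdefault(tenant, {})[user_id] = name

    def display_name(self, tenant: str, user_id: str) -> str:
        return self._by_tenant[tenant][user_id]


def greeting(store: ProfileStore, tenant: str, user_id: str) -> str:
    # Callers see only the contract, so the implementation can be swapped.
    return f"Hello, {store.display_name(tenant, user_id)}"


legacy = LegacyProfileAdapter({"acme:u1": "Ada"})
replacement = PartitionedProfileStore()
replacement.put("acme", "u1", "Ada")

print(greeting(legacy, "acme", "u1"))       # Hello, Ada
print(greeting(replacement, "acme", "u1"))  # Hello, Ada
```

Because `greeting` depends only on the contract, the legacy adapter can be retired once the replacement is trusted, without touching any caller.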

Consequences

Easier:

Fewer correctness incidents and less data-inconsistency firefighting, because the hard cases were designed for instead of discovered in production.
The system stays easy to reason about and extend; new work composes with a real model instead of threading through special cases.
You’re not foreclosing scale, sharding, or evolution by baking in a too-simple assumption.

Harder:

More to build and understand up front, and a real risk of talking yourself into over-engineering — the essential-vs-speculative call takes judgement and honesty.
It’s a harder sell early (“why is this so involved?”) precisely because the cost it avoids is invisible until later — you’re trading a visible up-front cost for an invisible avoided one.
Some genuinely simple problems are well served by the easy stack; this note is about stateful, correctness-sensitive data architecture, not every CRUD app.

References

ZFN-21 (Cache only immutable objects; treat caches as tech debt): the canonical “easy but not simple” patch, reaching for a cache instead of fixing the access pattern.
ZFN-2 (Engineering priority ordering): the easy option often quietly trades correctness for performance/convenience.
ZFN-15 (Partition customer data by tenant from day one), ZFN-17 (Separate configuration, state, and ephemeral data), ZFN-16 (Separate the data plane from the control plane): the deliberate data-architecture choices that pay off over the system’s life.
ZFN-22 (Quarantine bad architecture behind an interface, then replace it) / ZFN-23 (Rewriting an implementation is fine; refactoring isn’t always the answer): how to dig out when you took the easy path.
Rich Hickey, Simple Made Easy (Strange Loop 2011): simple (un-entangled) vs easy (familiar/quick).

Changelog

2026-06-12: First published as a Field Note.