Theo Zourzouvillys

Field Note 1 current

Keep engineering decision records

By Theo Zourzouvillys
Tags: process, meta

TL;DR

Record significant engineering decisions as short, version-controlled markdown files that live with the code, each one stating the context, the decision, and the consequences. Write one when you’re locking in a contract where two teams meet, committing to a directional principle, making a hard-to-reverse technology choice, or codifying a convention other code is expected to follow. Give each a stable number and cite it (DR-12, or whatever prefix you pick) in code comments, PRs, and commits, so a decision is referenced rather than re-argued every quarter. Don’t ship a change that contradicts a live record — amend it in place or write a new one that supersedes it. Every record opens with a one-paragraph summary so a reader can act on it without reading the whole thing.

(This site’s Field Notes are one personal instance of exactly this practice.)

Context

Once an engineering group grows past the point where everyone is aware of everything, coordination stops happening by osmosis. Teams start owning their own services and moving at their own speed — which is the goal — but they still meet at well-defined seams: the RPC contracts between services, the events one team emits and another consumes, the shared conventions everyone is expected to follow, the operational bar every service must meet. Those seams are exactly where misalignment compounds fastest and is most expensive to unwind.

Three problems show up if there’s no durable record of what was decided at those seams:

  1. Loss of context. Six months after a decision, no one remembers why the obvious-looking alternative was rejected, so the next engineer reopens the same debate from scratch.
  2. Drift. Without an authoritative record of the agreement, each team interprets it slightly differently, and the divergence surfaces at integration time, in production, or in an incident.
  3. Onboarding cost. New engineers — and LLM coding agents — have no canonical place to learn what’s been agreed, so they reverse-engineer intent from code and copy whatever the nearest service does, which may itself be wrong.

LLM coding assistants are a permanent part of how software gets built now, and they are excellent at applying rules they can find and terrible at inferring rules from absence. A greppable, stable, machine-readable record of decisions is increasingly critical infrastructure, not nice-to-have documentation.

The usual alternatives don’t hold up. Wiki/Notion pages aren’t version-controlled, aren’t co-located with code, and rot. Long-form design docs capture the discussion but bury the outcome inside thirty pages. Chat plus tribal memory stops working the moment the group is larger than one room.

The practice

Record significant engineering decisions as short, focused markdown files that live in the repository alongside the code. Each one:

  • Is a small file with front matter and a body that follows Context / Decision / Consequences — the shape Michael Nygard popularized for Architecture Decision Records. The decisions worth recording are broader than “architecture” (tooling, process, vendor choices, deprecations, conventions), so I treat them as engineering decisions, not just architectural ones.
  • Begins with a one-paragraph summary stating the decision in plain language and what it obliges a reader to do. A reader should be able to act on the record after the summary alone; Context and Consequences exist for the why.
  • Gets a stable numeric id that never changes and is never reused. The id is the durable identifier; the filename slug is for humans.
  • Has a machine-readable status: usually current (the decision in effect), with deprecated and superseded for end-of-life.
  • Is immutable in spirit. Typos and clarifications are amended in place with a changelog entry. A change to the decision itself is a new record that supersedes the old one — the old one stays on the record.
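
To make the shape above concrete, here is a minimal sketch of one record and a reader for it. The field names (id, title, status, supersedes), the `---` delimiters, and the rule that the first body paragraph is the summary are assumptions made for illustration, not a schema this note prescribes. A real setup would more likely use a YAML library than this hand-rolled `key: value` parser.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record file. Field names and content are illustrative only.
EXAMPLE_RECORD = """\
---
id: 12
title: Inter-service calls use the agreed transport
status: current
supersedes: 7
---
All inter-service calls go over the agreed transport. Do not add ad hoc HTTP clients.

Context
Why the decision was needed.

Decision
What was decided.

Consequences
What gets easier, what gets harder.
"""


@dataclass(frozen=True)
class Record:
    id: int
    title: str
    status: str
    supersedes: Optional[int]
    summary: str


def parse_record(text: str) -> Record:
    """Split '---' delimited front matter from the body and read key: value pairs."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("record must start with front matter")
    try:
        end = lines.index("---", 1)
    except ValueError:
        raise ValueError("front matter is not closed") from None
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed front matter line: {line!r}")
        meta[key.strip()] = value.strip()
    # Assumption: the summary is the first paragraph of the body.
    body = "\n".join(lines[end + 1:]).strip()
    summary = body.split("\n\n", 1)[0].strip()
    supersedes = meta.get("supersedes")
    return Record(
        id=int(meta["id"]),
        title=meta["title"],
        status=meta["status"],
        supersedes=int(supersedes) if supersedes else None,
        summary=summary,
    )
```

Calling `parse_record(EXAMPLE_RECORD)` yields a `Record` whose `summary` is enough to act on, which is the property the one-paragraph summary is meant to guarantee.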

When to write one

Write one when any of these is true:

  • It establishes a contract at a seam between teams — an RPC interface, an event schema, a shared library, a vendor everyone depends on, an operational standard, a security boundary that crosses teams. These are the highest-value records.
  • It is directional guidance — a principle, value, or tiebreaker that informs many future decisions without being a specific technical choice. The classic shape is a lexicographic priority ordering (see ZFN-2). Directional records let independent teams reach consistent decisions on their own, and can include explicit carve-outs: narrow exceptions allowed when stated conditions hold, with a requirement that each use site document, in the code or config itself, why the carve-out applies (see ZFN-3 for that pattern).
  • The decision is hard to reverse (database, wire protocol, auth model, vendor).
  • A reasonable engineer would later ask “why did we do it this way?” and the answer isn’t obvious from the code.
  • You are deliberately rejecting an attractive alternative — the rejection is the valuable artifact.
  • The decision establishes a convention other code is expected to follow.
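
The "lexicographic priority ordering" mentioned under directional guidance can be read as a tuple comparison: a lower priority only breaks ties among options that are equal on every higher one. The sketch below illustrates that reading. The three priority names and the numeric scores are placeholders and are not taken from ZFN-2.

```python
from dataclasses import dataclass

# Hypothetical priority order, highest first. A real directional record
# would name its own priorities; these are placeholders.
PRIORITIES = ("safety", "simplicity", "speed")


@dataclass
class Option:
    name: str
    scores: dict  # priority name -> score, higher is better


def rank_key(option: Option) -> tuple:
    # Python compares tuples element by element, so a later priority
    # matters only when all earlier priorities tie.
    return tuple(option.scores[p] for p in PRIORITIES)


def choose(options: list) -> Option:
    """Pick the option that wins under the lexicographic ordering."""
    return max(options, key=rank_key)
```

With this ordering, an option that is far faster still loses to one that is slightly simpler, as long as both tie on safety. That is what lets independent teams reach the same answer without a meeting.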

Do not write one for bug fixes, routine refactors, implementation details inside an already-decided design, personal style preferences (use a linter), or things still being figured out (those belong in a PR or a design doc).

What gives a record its authority

A record doesn’t need a top-down mandate to be legitimate. Either bar is enough: it’s ordained (handed down where the requirement is non-negotiable, such as security, legal, or a leadership call), or it rests on rough consensus, or at least an honest pursuit of it, among the people the decision affects. What’s not enough is an unsettled opinion that is neither mandated nor backed by a genuine attempt at consensus; that belongs in a PR or design doc. A record should say plainly which bar it rests on.

Referencing

Cite records by a short, greppable id in code comments, commit messages, and PR descriptions:

    // DR-2: inter-service calls go over the agreed transport; don't add an HTTP client here.

    git grep -nE 'DR-[0-9]+'
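
The same search can run as a check rather than a manual grep. The sketch below pulls cited ids out of a piece of text and reports any that do not correspond to a known record. It uses roughly the same pattern as the `git grep` above, with word boundaries added so that a string like `ADR-5` is not mistaken for `DR-5`. The `DR-` prefix and the function names are assumptions made for illustration.

```python
import re

# Roughly the git grep pattern, plus word boundaries. The DR- prefix is
# whatever prefix you picked; it is hard-coded here only for illustration.
CITATION = re.compile(r"\bDR-([0-9]+)\b")


def cited_ids(text: str) -> set[int]:
    """Return every record id cited in the text."""
    return {int(m.group(1)) for m in CITATION.finditer(text)}


def unknown_citations(text: str, known_ids: set[int]) -> set[int]:
    """Return cited ids that do not correspond to any known record."""
    return cited_ids(text) - known_ids
```

Run over a diff or a source tree, `unknown_citations` catches typos in ids before they become dead references.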

In other records, link supersession relations explicitly so the lineage of a changed position is always traceable.
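
Explicit supersession links make lineage something a tool can walk. As a minimal sketch, assume the links have been collected into a mapping from an old record id to the id that replaces it. The mapping and the function name are hypothetical. Following the chain gives the record currently in effect.

```python
def live_successor(record_id: int, superseded_by: dict[int, int]) -> int:
    """Follow supersession links until reaching a record nothing supersedes."""
    seen = {record_id}
    current = record_id
    while current in superseded_by:
        current = superseded_by[current]
        if current in seen:
            # A cycle means the links are inconsistent; fail loudly rather than loop.
            raise ValueError(f"supersession cycle at record {current}")
        seen.add(current)
    return current
```

For example, with links `{3: 7, 7: 12}`, a comment citing record 3 resolves to record 12, so a reader or a tool can jump straight to the live position while the older records stay on file.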

Consequences

Easier:

  • One place to look for “why is it this way?”
  • Reviews cite a record instead of re-arguing the same point every quarter.
  • Onboarding has a finite, ordered reading list.
  • LLM agents get a queryable knowledge base for intent, and a current status is a strong, declarative signal they can refuse to contradict.
  • Independent teams move without re-negotiating the seams in every standup.

Harder:

  • Writing a record is overhead. I accept the cost because losing context is more expensive.
  • You have to decide what counts as worth recording; the “when to write one” list is a starting heuristic, not a rule.
  • Records that go stale are worse than none — they confidently mislead. The amendment and supersession discipline only works if you actually use it.

New obligations:

  • A change that contradicts a live record must either amend it in the same change or introduce a superseding one. Reviewers enforce this.
  • A build step that validates front matter is worth having — better to block a publish than to ship a malformed record.
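
A validating build step can be small. The sketch below checks already-parsed front matter (one dict per record) for required fields, a known status, and ids that are never reused, and returns a non-zero exit code on any problem. The field names, including `superseded_by`, are assumptions. The three status values are the ones named earlier in this note.

```python
import sys

REQUIRED = ("id", "title", "status")  # assumed field names
STATUSES = {"current", "deprecated", "superseded"}


def validate(records: list[dict]) -> list[str]:
    """Return a list of human-readable problems; empty means all records pass."""
    errors = []
    seen_ids = set()
    for index, meta in enumerate(records):
        label = f"record {index}"
        for field in REQUIRED:
            if field not in meta:
                errors.append(f"{label}: missing {field}")
        if "status" in meta and meta["status"] not in STATUSES:
            errors.append(f"{label}: unknown status {meta['status']!r}")
        record_id = meta.get("id")
        if record_id is not None:
            if record_id in seen_ids:
                errors.append(f"{label}: id {record_id} is reused")
            seen_ids.add(record_id)
        # Assumption: a superseded record must point at its replacement.
        if meta.get("status") == "superseded" and "superseded_by" not in meta:
            errors.append(f"{label}: superseded without superseded_by")
    return errors


def main(records: list[dict]) -> int:
    """Print problems to stderr and return an exit code suitable for CI."""
    errors = validate(records)
    for error in errors:
        print(error, file=sys.stderr)
    return 1 if errors else 0
```

Wired into the build, `sys.exit(main(records))` blocks the publish when a record is malformed.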

Changelog

  • 2026-06-12: First published as a Field Note.