Theo Zourzouvillys

Field Note 35 current

Reference secrets in config; dereference, refresh, and re-fetch

By
Theo Zourzouvillys
Published
Tags
security, infra, config, reliability

TL;DR

Don’t store secret values in configuration — not in env vars, config files, deploy manifests, or baked into an image. Store a reference to the secret: a path/URI/key in a managed secret store (AWS SSM Parameter Store / Secrets Manager, GCP Secret Manager, Vault). The application dereferences that reference at runtime — resolving it to the value using its own keyless workload identity (ZFN-9) — and holds it in memory, never in the config surface.
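As a rough illustration of the difference between a value and a reference, the Python sketch below keeps only a pointer in config and swaps it for the value at runtime. The `secretref://` scheme, the `resolve_config` helper, and the `fetch` callable are all hypothetical names chosen for this example; `fetch` stands in for whatever client your secret store provides, called under the workload's own identity.

```python
from typing import Callable, Dict, Mapping

REF_PREFIX = "secretref://"  # hypothetical scheme marking a config value as a reference


def resolve_config(config: Mapping[str, str], fetch: Callable[[str], str]) -> Dict[str, str]:
    """Return a copy of config with every reference replaced by its secret value.

    `fetch` takes the path in the store and returns the value. The original
    config mapping is left untouched, so it still contains only references.
    """
    resolved = {}
    for key, value in config.items():
        if value.startswith(REF_PREFIX):
            resolved[key] = fetch(value[len(REF_PREFIX):])
        else:
            resolved[key] = value
    return resolved


# What gets committed and deployed: a pointer, not a value.
config = {
    "db_host": "db.internal",
    "db_password": "secretref://prod/orders/db-password",
}

# Stand-in for the secret store, for illustration only.
fake_store = {"prod/orders/db-password": "s3cr3t"}

# The resolved values exist only in this in-memory dict.
runtime = resolve_config(config, fake_store.__getitem__)
```

The point of the split is that `config` can be committed or logged as-is, while `runtime` should never leave process memory.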

Two behaviours make this actually work instead of just relocating the problem:

  • Refresh on a signal or expiry. The cached value has a TTL, or you subscribe to a rotation/version-change event, so a rotated secret is picked up without a redeploy or restart.
  • Re-fetch on auth failure. When a credential is rejected (a 401/403/invalid-credential from the thing it authenticates to), invalidate the cached value and re-resolve it. Rotation becomes self-healing: the app recovers on its own when a secret is rotated out from under it.
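The two behaviours above can share one small in-memory holder. This is a minimal sketch, not a definitive implementation: the `CachedSecret` class and its `fetch` and `clock` parameters are hypothetical, and it is not thread-safe as written.

```python
import time
from typing import Callable, Optional


class CachedSecret:
    """Holds one resolved secret in memory and re-reads it when it goes stale."""

    def __init__(self, fetch: Callable[[], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch      # reads the current value from the secret store
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so tests can control time
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached value, re-resolving it if missing or expired."""
        if self._value is None or self._clock() >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = self._clock() + self._ttl
        return self._value

    def invalidate(self) -> None:
        """Drop the cached value; call on a rotation event or an auth failure."""
        self._value = None
```

Expiry covers the proactive path; `invalidate()` is the hook for both a rotation notification and a rejected credential, so the next `get()` goes back to the store.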

Context

Putting the secret value straight into config is the path of least resistance, and it quietly creates several problems at once:

  • Sprawl and exposure. The same secret ends up copied into env vars, CI, images, deploy configs, developer machines, and logs. Every copy is a place it can leak, and you can no longer say where it all is.
  • Rotation requires a redeploy. If the value lives in config, rotating it means changing config and redeploying every consumer — so in practice it doesn’t happen, and secrets get old and over-shared. The thing that should be routine becomes a project.
  • No clean revocation. You can’t reliably pull a secret back once it’s scattered across deploy-time copies; you just hope nothing kept it.
  • It breaks the moment a secret rotates. A consumer pinned to a stale config value keeps presenting a credential that’s been rotated away, and fails — with no path to recover except a human noticing and redeploying.

The fix is the same indirection that makes other things sane: store a reference, resolve it late, and treat the secret store as the single source of truth. The value lives in one governed place; config just points at it.

Recommendation

Config holds a pointer; the app resolves it at runtime and keeps it fresh.

  • Reference, don’t embed. Config carries the secret’s path/identifier in the store plus the access grant — never the value. A reference is not a secret; it’s safe to commit, log, and pass around.
  • Dereference at runtime with workload identity. The app reads the value from the secret store as it starts and as needed, authenticating with its own federated, keyless identity (ZFN-9) — so the only credential the app holds is the one that fetches the others. Hold the resolved value in memory, out of the config surface, and never log it.
  • Refresh on a signal or expiry. Don’t resolve once and pin forever. Give the cached value a TTL, and/or subscribe to the store’s rotation/version-change notification, and re-read on that trigger. This treats the secret as control-plane state pushed to the data plane (ZFN-16, ZFN-17): cache it, refresh it, and fail static on the last good value if the store is briefly unreachable.
  • Re-fetch on auth failure. Treat an authentication rejection from a downstream as a signal that the secret may have rotated: drop the cached value, re-resolve it, and retry the operation. Bound this: re-fetch and retry only a limited number of times, with backoff, so a genuinely bad credential doesn’t become a hot loop (see ZFN-13). This makes rotation self-healing: rotate in the store, and consumers converge on the new value without anyone redeploying.
  • Use a real secret store, not a homegrown one. Reach for the platform’s managed secret store rather than inventing your own (ZFN-30); it gives you versioning, rotation, access control, and audit for free.
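The bounded re-fetch-and-retry described in the list above might look something like the following Python sketch. Everything named here is an assumption for illustration: `AuthRejected` is a hypothetical exception your own operation would raise on a 401/403, and `get_secret` / `invalidate` are whatever your cache exposes.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class AuthRejected(Exception):
    """Hypothetical: raised by the caller's operation on a 401/403 from downstream."""


def call_with_refetch(
    get_secret: Callable[[], str],
    invalidate: Callable[[], None],
    operation: Callable[[str], T],
    max_refetches: int = 2,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation(secret); on an auth rejection, re-resolve and retry, bounded."""
    if max_refetches < 0:
        raise ValueError("max_refetches must be >= 0")
    for attempt in range(max_refetches + 1):
        try:
            return operation(get_secret())
        except AuthRejected:
            if attempt == max_refetches:
                raise  # credential looks genuinely bad; surface it instead of looping
            invalidate()                        # drop the possibly rotated-away value
            sleep(base_delay * (2 ** attempt))  # exponential backoff before retrying
    raise AssertionError("unreachable")  # the loop always returns or raises
```

The retry budget is deliberately small: one re-fetch is usually enough to pick up a rotation, and anything beyond a couple of attempts suggests the credential itself is wrong rather than stale.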

Consequences

Easier:

  • Rotation becomes routine and safe: rotate in one place, and consumers refresh (proactively on the signal/TTL, or reactively on auth failure) with no redeploy.
  • Secrets stop sprawling across env/images/CI/logs — there’s one governed source and a reference everywhere else.
  • Revocation actually works, and a rotated-away secret self-heals instead of paging someone at 2am.
  • The app holds only its keyless identity; the standing-secret blast radius shrinks to the store.

Harder:

  • A runtime dependency on the secret store, at least at startup and on each refresh; you must handle it being briefly unreachable (cache + fail-static on last-known-good) and bound the re-fetch-on-failure loop.
  • Caching introduces staleness windows; the refresh signal and the auth-failure re-fetch are what keep them bounded, and both need building.
  • A little more moving machinery than SECRET=... in an env var — paid once, for rotation and exposure you’d otherwise never get right.
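The fail-static behaviour mentioned in the first item above is small to build. A minimal Python sketch, assuming a hypothetical `StoreUnavailable` exception that your `fetch` callable raises when the store cannot be reached:

```python
from typing import Callable, Optional


class StoreUnavailable(Exception):
    """Hypothetical: raised by `fetch` when the secret store cannot be reached."""


class FailStaticSecret:
    """Refreshes from the store, but keeps serving the last good value on an outage."""

    def __init__(self, fetch: Callable[[], str]):
        self._fetch = fetch
        self._last_good: Optional[str] = None

    def refresh(self) -> str:
        """Try to re-read the secret; fall back to the last good value if the store is down."""
        try:
            self._last_good = self._fetch()
        except StoreUnavailable:
            if self._last_good is None:
                raise  # nothing to fall back on, e.g. the very first read at startup
        return self._last_good
```

Note the asymmetry: an outage during refresh is absorbed, but an outage at first read still fails, which is the startup dependency this section calls out.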

References

  • ZFN-9 — fetch secrets at runtime by path using federated, keyless workload identity; this note is that pattern generalized to all secrets.
  • ZFN-16 / ZFN-17 — a secret is control-plane state: resolve, cache, refresh, fail static.
  • ZFN-13 — bound the re-fetch-and-retry on auth failure so it can’t become a hot loop.
  • ZFN-30 — use a managed secret store, don’t roll your own.

Changelog

  • 2026-06-12: First published as a Field Note.