---
id: 35
title: "Reference secrets in config; dereference, refresh, and re-fetch"
status: current
date: 2026-06-12
authors:
  - "Theo Zourzouvillys"
tags: [security, infra, config, reliability]
summary: "Don't put secret values in config — store a reference (a path in a secret store) and dereference it at runtime via your workload identity. Refresh on a signal or expiry so rotation needs no redeploy; re-fetch on auth failure so a rotated secret self-heals."
supersedes: null
superseded_by: null
aliases: []
---

## TL;DR

Don't store **secret values** in configuration — not in env vars, config files, deploy manifests, or
baked into an image. Store a **reference** to the secret: a path/URI/key in a managed secret store
(AWS SSM Parameter Store / Secrets Manager, GCP Secret Manager, Vault). The application **dereferences**
that reference **at runtime** — resolving it to the value using its own keyless workload identity
([ZFN-9](/zfn/9-no-long-lived-cloud-keys/)) — and holds it in memory, never in the config surface.

Two behaviours make this actually work instead of just relocating the problem:

- **Refresh on a signal or expiry.** The cached value has a TTL, or you subscribe to a
  rotation/version-change event, so a rotated secret is picked up **without a redeploy or restart**.
- **Re-fetch on auth failure.** When a credential is rejected (a `401`/`403` or invalid-credential
  error from the service it authenticates to), invalidate the cached value and **re-resolve it**.
  Rotation becomes **self-healing**: the app recovers on its own when a secret is rotated out from
  under it.
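
As a rough illustration of the reference/dereference split, here is a minimal Python sketch. The `store://path` reference format, the `SecretRef` type, and the `fetchers` registry are assumptions made for this example and are not part of any secret store's API; a real fetcher would call the store's client using the workload's identity.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class SecretRef:
    """A pointer to a secret. Safe to commit and log, unlike the value."""
    store: str  # hypothetical scheme naming the backing store, e.g. "vault"
    path: str   # path/key of the secret inside that store


def parse_ref(uri: str) -> SecretRef:
    """Parse an assumed 'store://path' reference taken from config."""
    store, sep, path = uri.partition("://")
    if not sep or not store or not path:
        raise ValueError(f"not a secret reference: {uri!r}")
    return SecretRef(store=store, path=path)


# A fetcher takes a path and returns the secret value. In a real service it
# would wrap the store's client; here it is a hypothetical plug-in point.
Fetcher = Callable[[str], str]


def dereference(ref: SecretRef, fetchers: Dict[str, Fetcher]) -> str:
    """Resolve a reference to its value at runtime; keep the result in memory only."""
    if ref.store not in fetchers:
        raise LookupError(f"no fetcher registered for store {ref.store!r}")
    return fetchers[ref.store](ref.path)
```

Config would then carry only a string such as `vault://apps/billing/db-password` (an illustrative path), and the value appears only inside the process after `dereference` runs.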

## Context

Putting the secret value straight into config is the path of least resistance, and it quietly creates
several problems at once:

- **Sprawl and exposure.** The same secret ends up copied into env vars, CI, images, deploy configs,
  developer machines, and logs. Every copy is a place it can leak, and you can no longer say where it
  all is.
- **Rotation requires a redeploy.** If the value lives in config, rotating it means changing config and
  redeploying *every* consumer — so in practice it doesn't happen, and secrets get old and
  over-shared. The thing that should be routine becomes a project.
- **No clean revocation.** You can't reliably pull a secret back once it's scattered across
  deploy-time copies; you just hope nothing kept it.
- **It breaks the moment a secret rotates.** A consumer pinned to a stale config value keeps presenting
  a credential that's been rotated away, and fails — with no path to recover except a human noticing
  and redeploying.

The fix is the same indirection that makes other things sane: store a *reference*, resolve it *late*,
and treat the secret store as the single source of truth. The value lives in one governed place; config
just points at it.

## Recommendation

**Config holds a pointer; the app resolves it at runtime and keeps it fresh.**

- **Reference, don't embed.** Config carries the secret's **path/identifier** in the store plus the
  access grant — never the value. A reference is not a secret; it's safe to commit, log, and pass
  around.
- **Dereference at runtime with workload identity.** The app reads the value from the secret store *as
  it starts and as needed*, authenticating with its own federated, keyless identity
  ([ZFN-9](/zfn/9-no-long-lived-cloud-keys/)) — so the only credential the app holds is the one that
  fetches the others. Hold the resolved value in memory, out of the config surface, and never log it.
- **Refresh on a signal or expiry.** Don't resolve once and pin forever. Give the cached value a TTL,
  and/or subscribe to the store's rotation/version-change notification, and re-read on that trigger.
  Treat the secret like control-plane state pushed to the data plane
  ([ZFN-16](/zfn/16-separate-data-plane-control-plane/), [ZFN-17](/zfn/17-separate-config-state-ephemeral/)):
  cache it, refresh it, and fail static on the last good value if the store is briefly unreachable.
- **Re-fetch on auth failure.** Treat an authentication rejection from a downstream as a signal that
  the secret may have rotated: drop the cached value, re-resolve it, and retry the operation. Bound
  the loop: re-fetch and retry only a limited number of times, with backoff, so a genuinely bad
  credential doesn't become a hot loop (see [ZFN-13](/zfn/13-load-shedding-and-flow-control/)). This
  makes rotation self-healing: rotate in the store, and consumers converge on the new value without
  anyone redeploying.
- **Use a real secret store, not a homegrown one.** Reach for the platform's managed secret store
  rather than inventing your own ([ZFN-30](/zfn/30-use-standards-dont-reinvent/)); it gives you
  versioning, rotation, access control, and audit for free.
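
The refresh and fail-static behaviour described above can be sketched as a small in-memory cache. This is a minimal sketch under stated assumptions: `CachedSecret`, its `fetch` callable, and the injectable `clock` are hypothetical names for this example, and catching every exception from the fetch is a simplification a real implementation would likely narrow.

```python
import time
from typing import Callable, Optional


class CachedSecret:
    """Holds a resolved secret in memory and re-reads it when the TTL expires."""

    def __init__(
        self,
        fetch: Callable[[], str],          # hypothetical: resolves the reference
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[str] = None
        self._expires_at = float("-inf")   # nothing cached yet

    def get(self) -> str:
        # Serve from memory while the cached value is still fresh.
        if self._value is not None and self._clock() < self._expires_at:
            return self._value
        try:
            fresh = self._fetch()
        except Exception:
            # Fail static: if the store is unreachable, keep serving the last
            # known good value. With no value at all, there is nothing to serve.
            if self._value is None:
                raise
            return self._value
        self._value = fresh
        self._expires_at = self._clock() + self._ttl
        return fresh

    def invalidate(self) -> None:
        """Force the next get() to re-resolve, e.g. on a rotation event
        or after a downstream rejects the credential."""
        self._expires_at = float("-inf")
```

A rotation notification handler, or a caller that has just seen an authentication rejection, would call `invalidate()` so the next `get()` goes back to the store. Keeping the old value around after invalidation is a deliberate choice here so that fail-static still works if the re-fetch fails.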

## Consequences

**Easier:**

- Rotation becomes routine and safe: rotate in one place, and consumers refresh (proactively on the
  signal/TTL, or reactively on auth failure) with no redeploy.
- Secrets stop sprawling across env/images/CI/logs — there's one governed source and a reference
  everywhere else.
- Revocation actually works, and a rotated-away secret self-heals instead of paging someone at 2am.
- The app holds only its keyless identity; the standing-secret blast radius shrinks to the store.

**Harder:**

- A runtime dependency on the secret store, at least at startup and on each refresh; you must handle
  it being briefly unreachable (cache + fail-static on last-known-good) and bound the
  re-fetch-on-failure loop.
- Caching introduces staleness windows; the refresh signal and the auth-failure re-fetch are what keep
  them bounded, and both need building.
- A little more moving machinery than `SECRET=...` in an env var. The cost is paid once, in exchange
  for rotation and exposure control you'd otherwise never get right.
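
Bounding the re-fetch-on-failure loop might look like the following sketch. The `AuthRejected` exception, the `call_with_refetch` helper, and the default limits are assumptions for illustration; in practice the caller's client code would map a `401`/`403` or invalid-credential error onto whatever exception type the service uses.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class AuthRejected(Exception):
    """Hypothetical error: the downstream rejected the presented credential."""


def call_with_refetch(
    get_secret: Callable[[], str],       # returns the currently cached value
    invalidate: Callable[[], None],      # drops the cache so the next get re-resolves
    operation: Callable[[str], T],       # the call that presents the credential
    max_refetches: int = 2,
    base_delay_seconds: float = 0.2,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation; on auth rejection, re-resolve the secret and retry, bounded."""
    refetches = 0
    while True:
        try:
            return operation(get_secret())
        except AuthRejected:
            if refetches >= max_refetches:
                # Still rejected after re-fetching: the credential is genuinely
                # bad, so surface the error instead of spinning.
                raise
            invalidate()
            # Exponential backoff between attempts: 0.2s, 0.4s, ...
            sleep(base_delay_seconds * (2 ** refetches))
            refetches += 1
```

Only authentication rejections trigger a re-fetch in this sketch; other failures propagate unchanged, so an unrelated outage does not hammer the secret store.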

## References

- [ZFN-9](/zfn/9-no-long-lived-cloud-keys/) — fetch secrets at runtime by path using federated, keyless
  workload identity; this note is that pattern generalized to all secrets.
- [ZFN-16](/zfn/16-separate-data-plane-control-plane/) / [ZFN-17](/zfn/17-separate-config-state-ephemeral/)
  — a secret is control-plane state: resolve, cache, refresh, fail static.
- [ZFN-13](/zfn/13-load-shedding-and-flow-control/) — bound the re-fetch-and-retry on auth failure so it
  can't become a hot loop.
- [ZFN-30](/zfn/30-use-standards-dont-reinvent/) — use a managed secret store, don't roll your own.

## Changelog

- **2026-06-12**: First published as a Field Note.
