Field Note 5 · current

Make workload identity a platform-owned service

By: Theo Zourzouvillys
Published: June 12, 2026
Tags: security, auth, infra, platform

TL;DR

How one service proves its identity to another is a problem every service in a decomposed system has, and it should be solved once, as platform infrastructure, not reimplemented per team and not owned by whichever product happened to build it first. The shape I recommend: a small security-token service (STS), plus a shared library, that mints short-lived identity tokens for workloads. Any service uses the same library to obtain its own token and to verify a peer’s.

Start simple; don’t let “we don’t have PKI” block you. A perfectly good first step is tokens signed with a shared symmetric key (HMAC) the issuer and verifiers hold — far better than per-service schemes or long-lived static secrets, and adoptable in an afternoon. The better end-state is asymmetric signing, where only the issuer can mint and verifiers hold just a public key; and ideally a signing key rooted in a KMS/HSM, with verification by anchoring to that root (no public key-distribution endpoint — no JWKS — to operate). Pick the rung you can run well today and climb later; the important move is having one shared mechanism, not having PKI on day one.

New peer-authentication code consumes the platform module; introducing a new per-service identity scheme is a deliberate, reviewed exception, not a local choice.

Context

Authentication and authorization decide whether a workload may act. They don’t, on their own, give you a good answer to how a workload proves which workload it is to its peers — and once a system is decomposed into many services, every one of them needs that answer. Left unsolved at the platform level, it gets solved many times: each team picks its own peer-identity mechanism, the schemes interoperate poorly, they rotate differently, they fail differently, and you end up with an archipelago of auth code that is impossible to audit as one thing.

A few forces make a shared mechanism the right call:

More than one service needs it. A capability the whole platform authenticates with should not be owned by one product, and should not be copy-pasted into every consumer.
One mechanism beats many. Convergence on a single, well-understood mechanism means one thing to audit, one rotation story, one verifier, and one set of cross-language ergonomics.
Verification shouldn’t require a key-distribution service. If every verifier has to fetch and cache a rotating public-key set from an endpoint, that endpoint becomes critical infrastructure with its own availability and trust problems. Anchoring trust to a root you already hold avoids it.

Recommendation

Treat workload identity as platform infrastructure with a single shared contract. Concretely:

A platform-owned module (and a service where a boundary needs it). The minting, certificate management, verification, and token-shape logic live in one library, owned by the platform or security function — not by a product team. An STS service form exists for callers that can’t or shouldn’t hold signing material directly (it issues delegated, short-lived credentials); the in-process library covers the rest. Both are the same capability.
Short-lived tokens, signed by a key the platform controls. Every rung issues short-lived tokens; how they’re signed is a progression, not a prerequisite:
- Shared key (HMAC): a fine first step. The issuer and verifiers hold a shared symmetric secret. Simple, no certificate machinery, and already a large improvement. Its weakness is that every verifier can also mint (it holds the signing secret), so a verifier compromise is a minting compromise; manage and rotate the shared key accordingly.
- Asymmetric: the better step. Only the issuer holds the private key; verifiers hold the public key and can verify but not mint. A verifier compromise no longer lets an attacker forge identities.
- KMS/HSM-rooted, CA-anchored: the ideal. A locally held, frequently rotated leaf key signs the tokens; its certificate chains to a CA whose private key is held in a KMS/HSM and is never extractable. The token carries the chain (e.g. an x5c header), and a verifier validates by anchoring to the CA it already trusts. There is no JWKS endpoint to publish or keep available; verifiers need only the CA certificate (or the KMS public key).
Choose the highest rung you can operate well now; the shared contract should let you raise it later without every consumer changing how it calls the library.
Producers and verifiers use the shared code, not reimplementations. A service that needs a token shape or claim the contract doesn’t offer proposes a change to the contract; it does not fork its own.
The contract is a seam. Once many services mint and verify against it, the token shape, claims, and roles are hard to change. Evolve it additively and review changes the way you’d review any cross-team interface — guard the module with code owners.
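As a rough illustration of the lowest rung behind a stable seam, here is a minimal sketch using only the Python standard library. It assumes a JWT-style compact token (the note does not prescribe a format), and the class names `HmacIssuer` and `HmacVerifier`, the claim names `sub`, `aud`, `iat`, `exp`, and the 300-second lifetime are all hypothetical choices for the example, not the platform module's actual contract.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


class HmacIssuer:
    """Shared-key rung: the issuer signs with a symmetric secret (hypothetical API)."""

    def __init__(self, secret: bytes, ttl_seconds: int = 300):
        self._secret = secret
        self._ttl = ttl_seconds

    def mint(self, workload: str, audience: str, now: Optional[float] = None) -> str:
        issued = int(time.time() if now is None else now)
        header = {"alg": "HS256", "typ": "JWT"}
        claims = {"sub": workload, "aud": audience,
                  "iat": issued, "exp": issued + self._ttl}
        signing_input = ".".join(
            _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
            for part in (header, claims)
        )
        sig = hmac.new(self._secret, signing_input.encode("ascii"),
                       hashlib.sha256).digest()
        return signing_input + "." + _b64url(sig)


class HmacVerifier:
    """Holds the same secret, so it could also mint: the rung's known weakness."""

    def __init__(self, secret: bytes):
        self._secret = secret

    def verify(self, token: str, audience: str, now: Optional[float] = None) -> dict:
        try:
            header_b64, claims_b64, sig_b64 = token.split(".")
            header = json.loads(_b64url_decode(header_b64))
            sig = _b64url_decode(sig_b64)
        except ValueError as exc:
            raise ValueError("malformed token") from exc
        # Pin the algorithm; never let the token choose how it is verified.
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            raise ValueError("unexpected alg")
        expected = hmac.new(self._secret,
                            f"{header_b64}.{claims_b64}".encode("ascii"),
                            hashlib.sha256).digest()
        if not hmac.compare_digest(sig, expected):
            raise ValueError("bad signature")
        claims = json.loads(_b64url_decode(claims_b64))
        if not isinstance(claims, dict) or claims.get("aud") != audience:
            raise ValueError("wrong audience")
        current = time.time() if now is None else now
        exp = claims.get("exp")
        if not isinstance(exp, (int, float)) or current >= exp:
            raise ValueError("expired")
        # Unknown claims are returned untouched, so additive changes do not break callers.
        return claims
```

A caller only ever sees `mint(workload, audience)` and `verify(token, audience)`. Under that assumption, moving to an asymmetric or CA-anchored rung would mean swapping the classes' internals while consumers keep calling the same two methods.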

Pair it with sender-constraint. Identity tokens are still bearer tokens unless you bind them to a holder key, so a stolen token is replayable. Combine this with proof-of-possession binding (ZFN-6) so theft of a token alone isn’t enough, or sign the request/message itself (ZFN-7).
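To make the binding concrete: in DPoP (RFC 9449) the token carries a `cnf` claim whose `jkt` member is the SHA-256 JWK thumbprint (RFC 7638) of the holder's public key. The sketch below covers only that one check, namely whether the key a caller presents is the key the token names. The function names are hypothetical, and verifying the proof's signature over the request needs an asymmetric crypto library and is deliberately left out.

```python
import base64
import hashlib
import hmac
import json

# Required JWK members per key type, as used for the RFC 7638 thumbprint.
_REQUIRED_MEMBERS = {
    "EC": ("crv", "kty", "x", "y"),
    "RSA": ("e", "kty", "n"),
    "OKP": ("crv", "kty", "x"),
}


def jwk_thumbprint(jwk: dict) -> str:
    """SHA-256 JWK thumbprint: required members only, sorted, no whitespace."""
    members = _REQUIRED_MEMBERS[jwk["kty"]]
    canonical = json.dumps({name: jwk[name] for name in members},
                           separators=(",", ":"), sort_keys=True)
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def is_bound_to(claims: dict, presented_jwk: dict) -> bool:
    """True if the token's cnf.jkt names the presented public key."""
    cnf = claims.get("cnf")
    if not isinstance(cnf, dict) or not isinstance(cnf.get("jkt"), str):
        return False  # an unbound token is treated as not sender-constrained
    return hmac.compare_digest(cnf["jkt"].encode("utf-8"),
                               jwk_thumbprint(presented_jwk).encode("utf-8"))
```

A verifier would run this after validating the identity token itself and alongside the proof-signature check. A stolen token presented with any other key fails the comparison.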

Scope. This note is about workload (service) identity: one service proving which service it is to another. It’s complementary to how a workload authenticates to a cloud provider (ZFN-9); a single service typically uses both.

Consequences

Easier:

One workload-identity mechanism across the platform — one rotation story, one verifier, one audit surface — instead of per-team schemes.
New services get authenticated identity by consuming a module, not by building crypto.
Verification needs no key-distribution service; at the top rung, trust is anchored to a KMS-held root with no JWKS endpoint to operate, and you can start far simpler with a shared key.

Harder:

Whoever owns the module takes on a capability the whole system depends on — with the on-call, versioning, and cross-language maintenance burden that implies. This is a real organizational shift, not just a code move.
The contract becomes load-bearing; evolution must be additive and reviewed as a seam.
Consumers that hold signing or private-key material inherit a blast-radius obligation: bound key residency, handle rotation/eviction carefully, never log it.
Picking one mechanism forecloses others (mTLS-only, SPIFFE) for the cases they might have fit better. That cost is accepted on purpose: a single well-supported path beats a best-fit-per-case patchwork.

New obligations:

New peer-authentication code uses the platform module; a new per-service scheme requires a reviewed exception, not a local decision.
Contract changes (claims, roles, trust distribution) are code-owner-guarded and additive where possible.
Consumers holding key material document and bound their custody (residency, rotation, eviction, no-logging).
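One way a consumer might make its custody obligations mechanical rather than aspirational is a small wrapper around held key bytes. The class name `HeldKey` and its methods are hypothetical, and the zeroing is best effort only: Python gives no guarantee that no other copy of the bytes exists in memory.

```python
import time
from typing import Optional


class HeldKey:
    """Wraps key bytes: redacted when logged, unusable after a bounded residency."""

    def __init__(self, material: bytes, max_age_seconds: float,
                 now: Optional[float] = None):
        self._material = bytearray(material)
        start = time.monotonic() if now is None else now
        self._expires_at = start + max_age_seconds

    def __repr__(self) -> str:
        # Also used by str() and f-strings, so accidental logging stays safe.
        return "HeldKey(<redacted>)"

    def evict(self) -> None:
        """Overwrite and drop the bytes (best effort; copies may still exist)."""
        for i in range(len(self._material)):
            self._material[i] = 0
        self._material = bytearray()

    def use(self, now: Optional[float] = None) -> bytes:
        """Return the key bytes, or raise once the residency bound has passed."""
        current = time.monotonic() if now is None else now
        if current >= self._expires_at:
            self.evict()
        if not self._material:
            raise RuntimeError("key evicted; fetch a fresh one")
        # Returns a copy; callers should use it immediately and not store it.
        return bytes(self._material)
```

The point of the sketch is that residency, eviction, and no-logging become properties of one type that can be reviewed once, instead of conventions each call site has to remember.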

References

ZFN-6, "Bind tokens to a key: sender-constrained tokens (DPoP)". Bind these identity tokens to a holder key (DPoP, RFC 9449) so a stolen one is useless.
ZFN-7, "Sign the message, not just the session (HTTP Message Signatures)". Signing the request (and ideally response) itself (RFC 9421), an alternative or complement to bearer identity tokens.
ZFN-9, "No long-lived cloud keys; workloads authenticate by federated identity". Authenticating to a cloud provider (the other half of a service’s identity story).

Changelog

2026-06-12: First published as a Field Note.