Field Note 11 current
Route outbound HTTP through an isolated egress proxy
TL;DR
Don’t let application compute open arbitrary outbound connections directly. Any service that makes
outbound HTTP — fetching user-supplied URLs, delivering webhooks, calling third-party APIs, link
previews, importing remote files, SSO/OIDC metadata fetches — is an SSRF surface: an
attacker-influenced URL turns your app into a deputy that can reach internal services, the
cloud metadata endpoint (169.254.169.254), and localhost admin ports. Route all such
egress through a proxy that runs on separate compute which physically cannot reach anything
internal. The proxy’s network — not the application’s URL-validation code — is the enforcement
boundary: even a fully compromised app behind it can only reach the public internet.
Build it in tiers, simplest first: (1) a SOCKS proxy on an isolated subnet whose ACLs allow only routes to the public internet and deny anything aimed at internal/trusted ranges; (2) a dedicated VPC (no peering inward) with a load balancer fronting an autoscaled SOCKS fleet; (3) a gRPC egress service you call to make the request with metadata (which tenant, what purpose), so egress becomes a governed, per-tenant capability — trust chains, allowed TLS versions, header policy, audit.
Context
A surprising amount of application code makes outbound HTTP, and much of it is driven by input you don’t fully control: a customer configures a webhook URL, pastes a link to preview, points an importer at a remote file, registers an OIDC issuer, supplies an avatar URL. The moment a destination is influenced from outside, you have Server-Side Request Forgery: the attacker doesn’t reach the target directly — they make your server reach it, from inside your network, with your network’s trust.
What that buys an attacker is exactly the things your perimeter was supposed to protect:
- The cloud metadata endpoint (169.254.169.254, fd00:ec2::254). On a misconfigured host this hands out IAM credentials. This is the classic SSRF-to-cloud-takeover chain, and the reason ZFN-9 matters: federated, short-lived identity plus IMDSv2 shrink the prize, but blocking the route removes it.
- Internal services (admin endpoints, databases, other tenants’ service instances, the service mesh), none of which expect to be reached from the open internet but will happily answer a request that originates inside the VPC.
- localhost: debug servers, sidecars, and management ports bound to 127.0.0.1.
Per-call URL validation in application code is necessary but not sufficient: it’s defeated by DNS rebinding (the name resolves to a public IP at validation time and an internal IP at connect time), by redirects to an internal host, by IPv6 and odd encodings, and simply by the next engineer who adds an outbound call and forgets to validate. You cannot make “every outbound call site validates correctly, forever” a reliable property. You can make “application compute has no network route to internal services” a reliable property — by topology, the way ZFN-4 isolates recovery tooling. That’s the move.
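To make the rebinding failure concrete, here is a minimal sketch of the validate-then-connect race. It is an illustration under stated assumptions, not production code: `RebindingResolver`, `validate_then_fetch`, and the hostname `attacker.example` are hypothetical names, and the resolver simply replays scripted answers in place of a real attacker-controlled DNS server.

```python
import ipaddress


def is_public(ip: str) -> bool:
    """True only for globally routable addresses (rejects RFC1918, loopback, link-local)."""
    addr = ipaddress.ip_address(ip)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so it is judged as the IPv4 address it reaches.
    mapped = getattr(addr, "ipv4_mapped", None)
    if mapped is not None:
        addr = mapped
    return addr.is_global


class RebindingResolver:
    """Hypothetical stand-in for attacker-controlled DNS: a new answer on each lookup."""

    def __init__(self, answers):
        self._answers = list(answers)
        self.lookups = 0

    def resolve(self, hostname: str) -> str:
        # Replay the scripted answers in order; repeat the last one afterwards.
        answer = self._answers[min(self.lookups, len(self._answers) - 1)]
        self.lookups += 1
        return answer


def validate_then_fetch(hostname: str, resolver: RebindingResolver) -> str:
    """The fragile pattern: validate one lookup, then let the client look up again."""
    if not is_public(resolver.resolve(hostname)):  # lookup 1: validation
        raise ValueError("blocked: non-public address")
    return resolver.resolve(hostname)  # lookup 2: the address actually connected to


# First answer is public (passes validation), second is the metadata endpoint.
dns = RebindingResolver(["93.184.216.34", "169.254.169.254"])
connected_to = validate_then_fetch("attacker.example", dns)
print("validated OK, but connected to:", connected_to)
```

The check itself is correct; the flaw is that the address checked is not the address used. Fixing that at every call site is the property the paragraph above argues you cannot rely on.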
Recommendation
Send all outbound HTTP from application compute through a dedicated egress proxy that lives on compute with no path to internal services. The proxy makes the actual outbound connection; the application only ever talks to the proxy. Implement it at the tier you can operate well, and climb:
- Tier 1: SOCKS on an isolated subnet. Run a SOCKS proxy in a subnet whose network ACLs / route tables permit egress only to public-internet routes and deny anything destined for internal or trusted ranges: RFC1918 (10/8, 172.16/12, 192.168/16), your VPC CIDRs, link-local (169.254.0.0/16, including the metadata IP), and 127/8. Also deny any traffic from trusted IP ranges that would be relayed back inward. The subnet has no route to your services, so a compromised proxy (or app) can’t pivot. Simple, and already a large improvement.
- Tier 2: a dedicated egress VPC. Put the SOCKS fleet in its own VPC with no peering, no transit gateway, no route back to your application or data VPCs. An internal load balancer (an NLB; SOCKS is raw TCP, which an ALB does not carry) fronts an autoscaled proxy fleet; the only way in is the LB endpoint and the only way out is the internet gateway. Isolation at the VPC boundary is stronger and far easier to reason about and audit than per-subnet ACLs.
- Tier 3: a gRPC (or HTTP) egress service. Instead of (or alongside) raw SOCKS, expose a first-party service that performs the outbound request on the caller’s behalf and accepts metadata with each call: which tenant/customer it’s for, the purpose, the caller identity. Because it’s a real service, not a dumb socket, it can enforce policy that a SOCKS proxy can’t:
  - Per-tenant configuration: custom CA trust chains / pinned certs, allowed TLS versions and ciphers, required or forbidden headers, destination allow/deny lists, per-tenant rate limits and timeouts, response-size caps.
  - SSRF hardening in one place: resolve DNS once and connect to the resolved IP (re-checking it’s public) to defeat rebinding; re-validate every redirect hop; restrict schemes to https/http; strip hop-by-hop and internal headers.
  - Audit: every outbound call is attributable to a tenant and purpose, logged centrally. Egress becomes a governed capability, not ambient socket access.
This is a platform-owned service in the sense of ZFN-5: one team owns the egress contract, everyone consumes it, and security policy lives at the seam.
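The SSRF hardening steps listed under Tier 3 can be sketched roughly as follows. This is a minimal illustration, not the service's actual implementation: `pin_destination`, `strip_headers`, `fake_dns`, the `x-internal-` header prefix, and the `*.example` hostnames are all assumptions made for the example. The resolver is injectable so the sketch runs without network access; a real service would default to `socket.getaddrinfo`.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}
# Headers commonly treated as hop-by-hop; they describe one connection, not the request.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}
INTERNAL_PREFIX = "x-internal-"  # hypothetical naming convention for internal headers


def is_public(ip: str) -> bool:
    """True only for globally routable addresses; unparseable input counts as unsafe."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False
    mapped = getattr(addr, "ipv4_mapped", None)  # unwrap ::ffff:a.b.c.d
    if mapped is not None:
        addr = mapped
    return addr.is_global


def pin_destination(url: str, resolve=socket.getaddrinfo):
    """Resolve the host once and return (ip, port, hostname) to connect to.

    The caller connects to the returned IP and sends hostname for SNI/Host,
    so there is no second lookup for a rebinding attacker to answer.
    """
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    answers = resolve(parts.hostname, port, type=socket.SOCK_STREAM)
    ips = [answer[4][0] for answer in answers]
    # Reject if ANY answer is non-public, not just the first one.
    if not ips or not all(is_public(ip) for ip in ips):
        raise ValueError(f"{parts.hostname!r} resolves to a non-public address")
    return ips[0], port, parts.hostname


def strip_headers(headers: dict) -> dict:
    """Drop hop-by-hop and internal headers before the request leaves."""
    return {
        name: value for name, value in headers.items()
        if name.lower() not in HOP_BY_HOP
        and not name.lower().startswith(INTERNAL_PREFIX)
    }


def fake_dns(host, port, type=0):
    """Scripted stand-in for DNS so the sketch runs offline (hypothetical data)."""
    table = {
        "ok.example": ["93.184.216.34"],
        "mixed.example": ["93.184.216.34", "10.0.0.5"],
        "meta.example": ["169.254.169.254"],
    }
    return [(socket.AF_INET, type, socket.IPPROTO_TCP, "", (ip, port))
            for ip in table[host]]


print(pin_destination("https://ok.example/hook", resolve=fake_dns))
for url in ("https://mixed.example/", "http://meta.example/latest", "file:///etc/passwd"):
    try:
        pin_destination(url, resolve=fake_dns)
    except ValueError as err:
        print("rejected:", err)
```

Redirects would be handled by disabling automatic following in the HTTP client and passing each `Location` value back through `pin_destination` before connecting. Note that this check is defense in depth: the enforcement boundary remains the proxy's network, which has no route inward even if this code has a bug.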
Scope. This is about untrusted or third-party-bound egress from application compute. Purely internal service-to-service traffic isn’t egress and is governed by ZFN-3. A small set of well-known first-party destinations may be reached directly if that path is itself locked down — but the default for anything internet-bound, and anything whose destination is influenced by input, is the proxy. Make the proxy the path of least resistance so engineers don’t route around it.
Consequences
Easier:
- SSRF stops being catastrophic. The worst an attacker gets from a forged request is “reach a public address” — they can’t pivot to internal services or steal instance credentials, because the compute making the call has no route inward.
- The security property is topological and centrally enforced, not “every call site validated correctly.” New outbound code inherits the protection by construction.
- At Tier 3, egress is per-tenant configurable and fully audited — answering “who did we call, for which customer, and why?” becomes trivial, and tenant-specific TLS/cert requirements have a home.
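As an illustration of what “per-tenant configurable and fully audited” could look like at Tier 3, the sketch below models the metadata carried with each call and a policy lookup that records every decision. The type names (`EgressRequest`, `TenantPolicy`), the field names, and the tenant and host values are hypothetical; the note does not prescribe a schema, and a real service would define this in its gRPC contract.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class TenantPolicy:
    """Hypothetical per-tenant egress policy."""
    allowed_hosts: frozenset          # exact hostnames this tenant may reach
    min_tls: str = "TLSv1.2"          # lowest TLS version accepted for this tenant
    max_response_bytes: int = 1_000_000
    timeout_s: float = 10.0


@dataclass(frozen=True)
class EgressRequest:
    """Metadata sent with every outbound call (hypothetical field names)."""
    tenant: str
    purpose: str
    caller: str
    url: str


def authorize(request: EgressRequest, policies: dict, audit_log: list) -> TenantPolicy:
    """Look up the tenant's policy, record the decision, and fail closed."""
    policy = policies.get(request.tenant)
    host = urlsplit(request.url).hostname
    allowed = policy is not None and host in policy.allowed_hosts
    # Every decision is logged, allowed or not, so the audit trail is complete.
    audit_log.append({
        "tenant": request.tenant,
        "purpose": request.purpose,
        "caller": request.caller,
        "host": host,
        "allowed": allowed,
    })
    if not allowed:
        raise PermissionError(f"egress to {host!r} denied for tenant {request.tenant!r}")
    return policy


policies = {"acme": TenantPolicy(allowed_hosts=frozenset({"hooks.acme.example"}))}
audit_log = []

ok = EgressRequest("acme", "webhook-delivery", "billing-svc", "https://hooks.acme.example/in")
print(authorize(ok, policies, audit_log).min_tls)

try:
    bad = EgressRequest("acme", "link-preview", "web", "https://elsewhere.example/")
    authorize(bad, policies, audit_log)
except PermissionError as err:
    print(err)

# "Who did we call, for which customer, and why?" is a filter over the log.
print([(r["tenant"], r["purpose"], r["host"]) for r in audit_log if r["allowed"]])
```

An unknown tenant is denied rather than given a default policy, which keeps the service fail-closed when configuration is missing.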
Harder:
- Real infrastructure to run: an isolated subnet/VPC, a proxy fleet, a load balancer, and (Tier 3) a service with its own availability and on-call. Egress is now a dependency on the request path.
- Every HTTP client in every service must be pointed at the proxy — SDKs, libraries, and tools that don’t honor proxy settings need handling, and “someone added a client that bypasses the proxy” is a failure mode to guard against (egress-deny by default at the network layer makes the bypass fail rather than silently leak).
- Latency and a potential bottleneck on the egress path; the fleet must scale and degrade sensibly.
- Streaming, websockets, and long-lived connections need explicit support through the proxy.
New obligations:
- Application subnets/VPCs deny direct internet egress at the network layer, so the proxy is the only way out and a bypass fails closed rather than leaking.
- The proxy network is kept provably unable to reach internal ranges; any change that adds a route inward (peering, transit gateway, a shared subnet) is a blocking security review.
- New outbound-HTTP code uses the egress path; introducing a direct-egress client is a reviewed exception, documented at the call site.
- At Tier 3, tenant egress policy (trust chains, TLS, allow-lists) is owned and audited like any other security configuration.
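The “no route inward” obligation lends itself to an automated check that gates the security review. The sketch below is one possible shape, under loud assumptions: the route records use a simplified, hypothetical `{"destination", "target"}` format rather than any real cloud API output, the function name `inward_routes` is invented for the example, and the `pcx-` / `tgw-` prefixes are the AWS ID prefixes for VPC peering connections and transit gateways.

```python
import ipaddress

# Ranges the egress proxy network must never be able to reach.
INTERNAL_RANGES = [ipaddress.ip_network(cidr) for cidr in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC1918
    "169.254.0.0/16",                                  # link-local, incl. metadata IP
    "127.0.0.0/8",                                     # loopback
)]
# AWS resource ID prefixes: VPC peering connection, transit gateway.
INWARD_TARGET_PREFIXES = ("pcx-", "tgw-")


def inward_routes(routes):
    """Return (destination, target, reason) for every route that needs review."""
    findings = []
    for route in routes:
        target = route["target"]
        if target == "local":
            continue  # the proxy VPC's own CIDR; it contains no internal services
        dest = ipaddress.ip_network(route["destination"])
        if target.startswith(INWARD_TARGET_PREFIXES):
            findings.append((route["destination"], target,
                             "peering or transit gateway target"))
        elif dest.prefixlen > 0 and any(
            dest.version == net.version and dest.overlaps(net)
            for net in INTERNAL_RANGES
        ):
            # The default route (prefix length 0) is exempt: it is the way out.
            findings.append((route["destination"], target,
                             "destination overlaps an internal range"))
    return findings


clean = [
    {"destination": "10.50.0.0/16", "target": "local"},
    {"destination": "0.0.0.0/0", "target": "igw-0abc"},
]
risky = clean + [
    {"destination": "10.20.0.0/16", "target": "pcx-0123"},
    {"destination": "172.16.5.0/24", "target": "eni-0456"},
]

print(inward_routes(clean))
for finding in inward_routes(risky):
    print("needs security review:", finding)
```

A check like this would run against the proxy network's route tables in CI or on a schedule; any finding blocks the change until reviewed. It complements, and does not replace, the network ACLs themselves.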
References
- ZFN-9 — SSRF to the metadata endpoint is the classic credential-theft chain; federated identity shrinks the prize, isolating egress removes the route.
- ZFN-10 — the resource-ownership sibling of this note; both are “untrusted input drives a privileged action,” one at the network layer, one at the resource layer.
- ZFN-5 — the Tier-3 egress service as platform-owned capability with a governed seam.
- ZFN-4 — the same “isolate by topology, not by code” reasoning, applied to recovery tooling.
- OWASP: Server-Side Request Forgery and the SSRF Prevention Cheat Sheet.
- AWS IMDSv2 — defense in depth for the metadata endpoint behind the egress isolation.
Changelog
- 2026-06-12: First published as a Field Note.