---
id: 11
title: "Route outbound HTTP through an isolated egress proxy"
status: current
date: 2026-06-12
authors:
  - "Theo Zourzouvillys"
tags: [security, infra, network, ssrf]
summary: "Application compute shouldn't make arbitrary outbound HTTP — it's an SSRF pivot to internal services and the cloud metadata endpoint. Route all egress through a proxy (SOCKS, or a gRPC egress service) on isolated compute with no route inward. The proxy's network is the boundary."
supersedes: null
superseded_by: null
aliases: []
---

## TL;DR

Don't let application compute open arbitrary outbound connections directly. Any service that makes
outbound HTTP — fetching user-supplied URLs, delivering webhooks, calling third-party APIs, link
previews, importing remote files, SSO/OIDC metadata fetches — is an **SSRF** surface: an
attacker-influenced URL turns your app into a deputy that can reach **internal services**, the
**cloud metadata endpoint** (`169.254.169.254`), and localhost admin ports. Route **all** such
egress through a **proxy that runs on separate compute with no network route to anything
internal.** The proxy's network — not the application's URL-validation code — is the enforcement
boundary: even a fully compromised app behind it can only reach the public internet.

Build it in tiers, simplest first: (1) a **SOCKS proxy on an isolated subnet** whose ACLs allow only
routes to the public internet and deny anything aimed at internal/trusted ranges; (2) a **dedicated
VPC** (no peering inward) with a load balancer fronting an autoscaled SOCKS fleet; (3) a **gRPC
egress service** you call to make the request *with metadata* (which tenant, what purpose), so egress
becomes a governed, per-tenant capability — trust chains, allowed TLS versions, header policy, audit.

## Context

A surprising amount of application code makes outbound HTTP, and much of it is driven by input you
don't fully control: a customer configures a webhook URL, pastes a link to preview, points an
importer at a remote file, registers an OIDC issuer, supplies an avatar URL. The moment a destination
is influenced from outside, you have **Server-Side Request Forgery**: the attacker doesn't reach the
target directly — they make *your* server reach it, from inside your network, with your network's
trust.

What that buys an attacker is exactly the things your perimeter was supposed to protect:

- **The cloud metadata endpoint** (`169.254.169.254`, `fd00:ec2::254`) — on a misconfigured host this
  hands out IAM credentials. This is the classic SSRF-to-cloud-takeover chain, and the reason
  [ZFN-9](/zfn/9-no-long-lived-cloud-keys/) matters: federated, short-lived identity plus IMDSv2
  shrink the prize, but blocking the route removes it.
- **Internal services** — admin endpoints, databases, other tenants' service instances, the service
  mesh — none of which expect to be reached from the open internet but will happily answer a request
  that originates inside the VPC.
- **localhost** — debug servers, sidecars, and management ports bound to `127.0.0.1`.

Per-call URL validation in application code is necessary but not sufficient: it's defeated by DNS
rebinding (the name resolves to a public IP at validation time and an internal IP at connect time),
by redirects to an internal host, by IPv6 and odd encodings, and simply by the next engineer who adds
an outbound call and forgets to validate. You cannot make "every outbound call site validates
correctly, forever" a reliable property. You *can* make "application compute has no network route to
internal services" a reliable property — by topology, the way [ZFN-4](/zfn/4-incident-tooling-independence/)
isolates recovery tooling. That's the move.
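
To make the "odd encodings" point concrete, here is a minimal sketch (Python, standard library only) that contrasts a string blocklist with a check that normalises the host first. The function names and the blocklist are illustrative assumptions, not code from any real service.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical string blocklist of the kind that tends to appear at call sites.
BLOCKED_HOSTS = {"169.254.169.254", "127.0.0.1", "localhost"}


def naive_is_allowed(url: str) -> bool:
    """Compare the host text against a blocklist (easy to write, easy to bypass)."""
    return urlsplit(url).hostname not in BLOCKED_HOSTS


def literal_ip(host: str):
    """Parse an IP literal, including decimal-integer IPv4 and IPv4-mapped IPv6.

    Returns None for ordinary hostnames, which still need a DNS-aware check.
    """
    try:
        ip = ipaddress.ip_address(int(host)) if host.isdigit() else ipaddress.ip_address(host)
    except ValueError:
        return None
    # Unwrap ::ffff:a.b.c.d so the embedded IPv4 address is what gets classified.
    return getattr(ip, "ipv4_mapped", None) or ip


def normalised_is_allowed(url: str) -> bool:
    """Reject any IP literal that is not globally routable."""
    ip = literal_ip(urlsplit(url).hostname or "")
    return ip is None or ip.is_global


for url in (
    "http://169.254.169.254/latest/meta-data/",
    "http://2852039166/latest/meta-data/",                # same address as a decimal integer
    "http://[::ffff:169.254.169.254]/latest/meta-data/",  # same address, IPv4-mapped IPv6
):
    print(url, naive_is_allowed(url), normalised_is_allowed(url))
```

The string blocklist only catches the first spelling. The normalised check catches all three, yet it still does nothing about DNS rebinding or redirects, which is why the argument here rests on topology rather than on a better validator.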

## Recommendation

**Send all outbound HTTP from application compute through a dedicated egress proxy that lives on
compute with no path to internal services.** The proxy makes the actual outbound connection; the
application only ever talks to the proxy. Implement it at the tier you can operate well, and climb:

- **Tier 1 — SOCKS on an isolated subnet.** Run a SOCKS proxy in a subnet whose network ACLs / route
  tables permit egress **only** to public-internet routes and **deny** anything destined for internal
  or trusted ranges — RFC1918 (`10/8`, `172.16/12`, `192.168/16`), your VPC CIDRs, link-local
  (`169.254.0.0/16`, including the metadata IP), `127/8`, and the IPv6 equivalents (`::1`, `fe80::/10`,
  `fc00::/7`, which covers `fd00:ec2::254`). Also deny relaying inward: a connection that arrives from a
  trusted range must never be forwarded back to one. The subnet has no route to your services, so a
  compromised proxy, or an app abusing it, can't pivot through it. Simple, and already a large improvement.
- **Tier 2 — a dedicated egress VPC.** Put the SOCKS fleet in its own VPC with **no peering, no
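  transit gateway, no route** back to your application or data VPCs. An internal Layer-4 load
  balancer (an NLB; SOCKS is plain TCP, which an HTTP-only ALB cannot carry) fronts an autoscaled
  proxy fleet, reachable from application VPCs only through a one-way private endpoint (for example
  AWS PrivateLink); the *only* way in is the LB endpoint and the *only* way out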
  is the internet gateway. Isolation at the VPC boundary is stronger and far easier to reason about
  and audit than per-subnet ACLs.
- **Tier 3 — a gRPC (or HTTP) egress *service*.** Instead of (or alongside) raw SOCKS, expose a
  first-party service that performs the outbound request **on the caller's behalf** and accepts
  metadata with each call: which **tenant/customer** it's for, the **purpose**, the caller identity.
  Because it's a real service, not a dumb socket, it can enforce policy that a SOCKS proxy can't:
  - **Per-tenant configuration** — custom **CA trust chains** / pinned certs, allowed **TLS versions
    and ciphers**, required or forbidden **headers**, destination **allow/deny lists**, per-tenant
    rate limits and timeouts, response-size caps.
  - **SSRF hardening in one place** — resolve DNS once and **connect to the resolved IP** (re-checking
    it's public) to defeat rebinding; re-validate every **redirect** hop; restrict schemes to
    `https`/`http`; strip hop-by-hop and internal headers.
  - **Audit** — every outbound call is attributable to a tenant and purpose, logged centrally. Egress
    becomes a governed capability, not ambient socket access.

  This is a platform-owned service in the sense of [ZFN-5](/zfn/5-platform-workload-identity-service/):
  one team owns the egress contract, everyone consumes it, and security policy lives at the seam.
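
The Tier 1 deny ranges and the Tier 3 "resolve once, connect to the resolved IP" rule can be sketched together. This is a minimal illustration under stated assumptions: `is_public`, `resolve_and_pin`, and the injectable `resolver` parameter are hypothetical names, the IPv6 ranges are an addition of this sketch, and a real deployment would append its own VPC CIDRs to the deny list.

```python
import ipaddress
import socket

# Deny ranges from Tier 1, plus IPv6 equivalents (an assumption of this sketch).
DENIED_NETWORKS = [ipaddress.ip_network(cidr) for cidr in (
    "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",  # RFC 1918
    "169.254.0.0/16",                                  # link-local, incl. metadata IP
    "127.0.0.0/8",                                     # loopback
    "::1/128", "fe80::/10", "fc00::/7",                # IPv6 loopback, link-local, ULA
)]


def is_public(ip_text: str) -> bool:
    """True only if the address is outside every denied range and globally routable."""
    ip = ipaddress.ip_address(ip_text)
    ip = getattr(ip, "ipv4_mapped", None) or ip  # unwrap ::ffff:a.b.c.d
    if any(ip in net for net in DENIED_NETWORKS):
        return False
    return ip.is_global


def resolve_and_pin(host: str, port: int, resolver=socket.getaddrinfo) -> str:
    """Resolve once, reject if any answer is non-public, and return the IP to connect to."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    if not addresses or not all(is_public(a) for a in addresses):
        raise PermissionError(f"{host} resolves to a non-public address")
    return addresses[0]
```

The caller would then open the socket to the returned IP (still sending the original hostname for SNI and the `Host` header) instead of resolving the name a second time, and would run every redirect target through the same function before following it.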

**Scope.** This is about untrusted or third-party-bound egress from application compute. Purely
internal service-to-service traffic isn't egress and is governed by
[ZFN-3](/zfn/3-default-encrypt-internal-traffic/). A small set of well-known first-party destinations
*may* be reached directly if that path is itself locked down — but the default for anything
internet-bound, and anything whose destination is influenced by input, is the proxy. Make the proxy
the path of least resistance so engineers don't route around it.
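
One way to make the proxy the easy path is a single shared helper that every HTTP client imports. The sketch below is an assumption-laden example: the `EGRESS_PROXY_URL` variable, the `egress-proxy.internal` hostname, and the helper name are all hypothetical. It insists on the `socks5h://` scheme, which asks the proxy (not the application) to resolve DNS.

```python
import os

# Hypothetical default; a real deployment would use its own proxy endpoint.
DEFAULT_EGRESS_PROXY = "socks5h://egress-proxy.internal:1080"


def egress_proxies() -> dict:
    """Return a proxies mapping that sends both HTTP and HTTPS through the egress proxy."""
    proxy = os.environ.get("EGRESS_PROXY_URL", DEFAULT_EGRESS_PROXY)
    if not proxy.startswith("socks5h://"):
        raise ValueError("expected socks5h:// so that DNS is resolved on the proxy side")
    return {"http": proxy, "https": proxy}


# Example use with the third-party requests library (needs the requests[socks] extra):
#   requests.get(url, proxies=egress_proxies(), timeout=10)
```

Keeping the scheme check in one place means a misconfigured service fails loudly at startup rather than quietly resolving attacker-controlled names on application compute.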

## Consequences

**Easier:**

- SSRF stops being catastrophic. The worst an attacker gets from a forged request is "reach a public
  address" — they can't pivot to internal services or steal instance credentials, because the compute
  making the call has no route inward.
- The security property is **topological and centrally enforced**, not "every call site validated
  correctly." New outbound code inherits the protection by construction.
- At Tier 3, egress is per-tenant configurable and fully audited — answering "who did we call, for
  which customer, and why?" becomes trivial, and tenant-specific TLS/cert requirements have a home.
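
As a rough illustration of what "attributable to a tenant and purpose" could mean at Tier 3, the sketch below models a request carrying metadata and a per-tenant policy that yields an audit record. All type names, fields, and decision strings are hypothetical; a real egress service would define its own contract (for example in a gRPC schema).

```python
import json
from dataclasses import asdict, dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class TenantPolicy:
    """Hypothetical per-tenant egress policy; only a few of the possible knobs."""
    allowed_schemes: frozenset = frozenset({"https"})
    denied_hosts: frozenset = frozenset()
    max_response_bytes: int = 5_000_000


@dataclass(frozen=True)
class EgressRequest:
    """Metadata the caller must supply with every outbound call."""
    tenant: str
    purpose: str
    caller: str
    url: str


def decide(request: EgressRequest, policy: TenantPolicy) -> dict:
    """Apply the tenant policy and return a record suitable for central audit logging."""
    parts = urlsplit(request.url)
    if parts.scheme not in policy.allowed_schemes:
        decision = "deny:scheme"
    elif (parts.hostname or "") in policy.denied_hosts:
        decision = "deny:host"
    else:
        decision = "allow"
    return {**asdict(request), "host": parts.hostname, "decision": decision}


record = decide(
    EgressRequest("tenant-42", "webhook-delivery", "billing-svc", "https://hooks.example.com/x"),
    TenantPolicy(),
)
print(json.dumps(record, sort_keys=True))
```

Because the decision and the metadata travel in the same record, the audit question reduces to a log query over `tenant`, `purpose`, and `host`.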

**Harder:**

- Real infrastructure to run: an isolated subnet/VPC, a proxy fleet, a load balancer, and (Tier 3) a
  service with its own availability and on-call. Egress is now a dependency on the request path.
- Every HTTP client in every service must be pointed at the proxy — SDKs, libraries, and tools that
  don't honor proxy settings need handling, and "someone added a client that bypasses the proxy" is a
  failure mode to guard against (egress-deny by default at the network layer makes the bypass *fail*
  rather than silently leak).
- Latency and a potential bottleneck on the egress path; the fleet must scale and degrade sensibly.
- Streaming, websockets, and long-lived connections need explicit support through the proxy.

**New obligations:**

- Application subnets/VPCs **deny direct internet egress** at the network layer, so the proxy is the
  only way out and a bypass fails closed rather than leaking.
- The proxy network is kept provably unable to reach internal ranges; any change that adds a route
  inward (peering, transit gateway, a shared subnet) is a blocking security review.
- New outbound-HTTP code uses the egress path; introducing a direct-egress client is a reviewed
  exception, documented at the call site.
- At Tier 3, tenant egress policy (trust chains, TLS, allow-lists) is owned and audited like any other
  security configuration.
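
The "provably unable to reach internal ranges" obligation lends itself to an automated check in CI or a scheduled audit. The sketch below assumes the proxy network's route table has already been exported as `(destination, target)` pairs; the input shape, the allowed target prefixes, and the function name are assumptions of this example rather than any cloud provider's API.

```python
import ipaddress

# Internal ranges the proxy network must never route to (extend with your own CIDRs).
INTERNAL_RANGES = [ipaddress.ip_network(c) for c in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

# Targets considered safe in this sketch: the VPC-local route and an internet gateway.
ALLOWED_TARGET_PREFIXES = ("local", "igw-")


def inward_routes(routes):
    """Return the routes that would give the proxy network a path inward."""
    flagged = []
    for destination, target in routes:
        net = ipaddress.ip_network(destination)
        bad_target = not target.startswith(ALLOWED_TARGET_PREFIXES)
        points_inward = (
            target != "local"
            and net.prefixlen > 0  # the default route is judged by its target only
            and any(net.version == r.version and net.overlaps(r) for r in INTERNAL_RANGES)
        )
        if bad_target or points_inward:
            flagged.append((destination, target))
    return flagged


routes = [
    ("10.50.0.0/16", "local"),     # the proxy VPC's own CIDR
    ("0.0.0.0/0", "igw-0abc"),     # default route to the internet gateway
    ("10.0.0.0/8", "pcx-0123"),    # a peering route that should never exist
]
print(inward_routes(routes))
```

A non-empty result would be the trigger for the blocking security review described above.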

## References

- [ZFN-9](/zfn/9-no-long-lived-cloud-keys/) — SSRF to the metadata endpoint is the classic
  credential-theft chain; federated identity shrinks the prize, isolating egress removes the route.
- [ZFN-10](/zfn/10-verify-resource-owner/) — the resource-ownership sibling of this note; both are
  "untrusted input drives a privileged action," one at the network layer, one at the resource layer.
- [ZFN-5](/zfn/5-platform-workload-identity-service/) — the Tier-3 egress service as platform-owned
  capability with a governed seam.
- [ZFN-4](/zfn/4-incident-tooling-independence/) — the same "isolate by topology, not by code"
  reasoning, applied to recovery tooling.
- [OWASP: Server-Side Request Forgery](https://owasp.org/www-community/attacks/Server_Side_Request_Forgery)
  and the [SSRF Prevention Cheat Sheet](https://cheatsheetseries.owasp.org/cheatsheets/Server_Side_Request_Forgery_Prevention_Cheat_Sheet.html).
- [AWS IMDSv2](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html) — defense in depth for the metadata endpoint behind the egress isolation.

## Changelog

- **2026-06-12**: First published as a Field Note.
