Cloudflare tunnel
We expose the router over the internet using cloudflared running as
a Kubernetes Deployment. The cluster has no public ingress: all
inbound traffic to the router arrives through an outbound TLS
connection that cloudflared opens to Cloudflare's edge.
Why a tunnel instead of a LoadBalancer
- The cluster does not need a public IP, an Ingress controller, or a cloud LB. One fewer attack surface.
- DDoS, WAF, and bot management run at Cloudflare's edge before the request ever reaches our pods.
- Tunnel credentials can be revoked without touching DNS or firewall rules.
Trade-off: requests carry a small additional hop. In practice this adds ~5–15 ms median; acceptable for an LLM endpoint where p50 inference dominates.
One-time setup (operator)
We run cloudflared in token mode (a "remotely-managed" tunnel): the tunnel
and its ingress rules live in the Cloudflare dashboard, and a single token
encodes the tunnel identity. There is no local credentials JSON and no
config.yaml ConfigMap.
In the Cloudflare dashboard (Zero Trust → Networks → Tunnels):
- Create a tunnel named split-brain; copy the install token (the base64 string shown after cloudflared service install …).
- Add Public Hostnames mapping each external hostname to its in-cluster service, e.g.:
  - split-brain-router.<zone> → http://<release>-router:8080
  - split-brain-ui.<zone> → http://<release>-ui:8080
The token is supplied to the chart as global.secrets.cloudflared.tunnelToken
(in a gitignored values-*.secrets.yaml); the umbrella renders it into the
cloudflared-credentials Secret under the key TUNNEL_TOKEN, which the
Deployment reads as the TUNNEL_TOKEN env. See helm.md § Secrets.
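For orientation, here is a minimal sketch of the Secret shape described above, built as a plain dictionary. Only the Secret name (cloudflared-credentials) and key (TUNNEL_TOKEN) come from this document; the function names, the namespace parameter, and the use of stringData are assumptions for illustration, and the umbrella chart's real template may differ.

```python
import json


def render_tunnel_secret(tunnel_token: str, namespace: str = "default") -> dict:
    """Build a Secret manifest holding the tunnel token (illustrative only)."""
    if not tunnel_token.strip():
        raise ValueError("tunnel token must not be empty")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "cloudflared-credentials", "namespace": namespace},
        "type": "Opaque",
        # stringData takes the raw value; the API server stores it base64-encoded.
        "stringData": {"TUNNEL_TOKEN": tunnel_token},
    }


def to_json(manifest: dict) -> str:
    """Serialize the manifest; kubectl accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

The sketch only shows which name and key the Deployment expects. In normal operation the chart renders this Secret; nothing here should be applied by hand.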
Ingress rules
Ingress rules are not in the cluster — cloudflared runs with just
tunnel --metrics 0.0.0.0:2000 run and fetches its hostname→service mapping
from Cloudflare's control plane at runtime (the dashboard's Public Hostnames).
To change what's exposed, edit the tunnel in the dashboard; no redeploy.
Two hostnames are exposed today — the router and the UI — each mapped to its in-cluster Service. The router carries the OpenAI/Anthropic API; the UI is the operator console (gated by Google sign-in, see google-auth.md).
Access policy (recommended)
Put a Cloudflare Access application in front of split-brain-router.<zone>
that requires a service token. Clients send both:
- CF-Access-Client-Id / CF-Access-Client-Secret headers (edge).
- Authorization: Bearer <key> header (router).
This gives us defense in depth: edge revocation if a token leaks, router-side bearer check independent of Cloudflare.
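A client that satisfies both layers might build its request roughly as follows. This is a sketch using only the Python standard library; the function name and its parameters are hypothetical, and the URL path is left to the caller because this document does not specify the router's routes.

```python
import urllib.request


def build_router_request(
    url: str,
    access_client_id: str,
    access_client_secret: str,
    router_key: str,
) -> urllib.request.Request:
    """Build a GET request carrying both the edge and the router credentials."""
    if not all((access_client_id, access_client_secret, router_key)):
        raise ValueError("all three credentials are required")
    headers = {
        # Checked by Cloudflare Access at the edge.
        "CF-Access-Client-Id": access_client_id,
        "CF-Access-Client-Secret": access_client_secret,
        # Checked by the router, independently of Cloudflare.
        "Authorization": f"Bearer {router_key}",
    }
    # urllib normalizes header capitalization; HTTP header names are
    # case-insensitive, so this does not affect either check.
    return urllib.request.Request(url, headers=headers, method="GET")
```

Passing the returned object to urllib.request.urlopen sends the request. Keeping the two credentials as separate arguments mirrors the two independent revocation paths.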
For service-to-service callers that cannot easily send extra headers, we can scope an Access bypass policy by IP range. Avoid this for human-facing clients.
Replicas and health
We run two cloudflared replicas so a pod restart does not drop the tunnel. Cloudflare maintains independent connections from each replica to the edge and load-balances across them automatically.
The --metrics 0.0.0.0:2000 flag exposes /metrics (Prometheus) and
/ready (used by Kubernetes probes).
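When debugging by hand (for example through a port-forward to port 2000), the same readiness endpoint can be polled with a few lines of Python. This is a sketch, not what the probes run: the function name, the default base URL, and the injectable opener argument are assumptions added for illustration and testing.

```python
import urllib.error
import urllib.request


def is_tunnel_ready(
    base_url: str = "http://127.0.0.1:2000",
    timeout: float = 2.0,
    opener=urllib.request.urlopen,
) -> bool:
    """Return True if GET <base_url>/ready answers with HTTP 200."""
    try:
        with opener(f"{base_url}/ready", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, timeouts, and non-2xx responses
        # (urlopen raises HTTPError, a URLError subclass, for those).
        return False
```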
Hazard: the tunnel token defines the origin, not the cluster. Any cloudflared that starts with this tunnel's credentials becomes a live replica, and Cloudflare load-balances the public hostnames across all of them. If the chart is ever deployed to a second cluster with the same cloudflared-credentials secret, that cluster starts serving (or failing) real production traffic; see the 2026-06-04 wrong-cluster incident in deploy.md. The deploy-time cluster guard exists to prevent exactly this. Be deliberate about where the tunnel secret is installed.
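To make the idea of a cluster guard concrete, here is a hypothetical sketch that refuses to proceed unless the active kubeconfig context is on an allow-list. It is not the guard described in deploy.md, whose implementation is not shown here; the function names and the context names in the usage note are assumptions. The only external command used is kubectl config current-context.

```python
import subprocess


def current_kube_context() -> str:
    """Return the active kubeconfig context name, as reported by kubectl."""
    result = subprocess.run(
        ["kubectl", "config", "current-context"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def require_expected_context(current: str, allowed: frozenset) -> None:
    """Raise before any deploy step if the context is not explicitly allowed."""
    if current not in allowed:
        raise RuntimeError(
            f"refusing to install tunnel secret: context {current!r} "
            f"is not one of {sorted(allowed)}"
        )
```

A deploy script could call require_expected_context(current_kube_context(), frozenset({"prod-cluster"})) as its first step, where prod-cluster stands in for whatever the real context is called.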
Egress requirements
cloudflared needs egress to region1.argotunnel.com:7844 and
region2.argotunnel.com:7844 (TCP). Both are covered by allowing
*.argotunnel.com:7844 and *.cloudflare.com:7844 in the
NetworkPolicy. No inbound ports.
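If the tunnel fails to connect after a NetworkPolicy change, a quick TCP reachability check from inside a pod can narrow it down. The sketch below tries a plain TCP connect to the two endpoints named above. It does not perform a TLS handshake or speak the tunnel protocol, so a success only shows that the port is reachable. The function names are hypothetical.

```python
import socket

# Endpoints taken from the egress requirements above.
EDGE_ENDPOINTS = (
    ("region1.argotunnel.com", 7844),
    ("region2.argotunnel.com", 7844),
)


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # DNS failures, refusals, and timeouts are all OSError subclasses.
        return False


def check_egress(endpoints=EDGE_ENDPOINTS, timeout: float = 3.0) -> dict:
    """Map each 'host:port' string to whether a TCP connect succeeded."""
    return {f"{host}:{port}": can_connect(host, port, timeout) for host, port in endpoints}
```

Calling check_egress() with no arguments returns one entry per edge endpoint; a False value points at DNS or the NetworkPolicy rather than at the tunnel token.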
Failure modes
| Failure | Result | Recovery |
|---|---|---|
| Single cloudflared pod crash | Other replica carries traffic. | Kubernetes restarts the pod. |
| Both replicas down | Hostname returns 530 at the edge. | Kubernetes restarts the pods. |
| Cloudflare edge outage | Hostname unreachable. | Wait or fail over to a secondary tunnel in another zone (out of scope v1). |
| Credentials revoked / rotated | Tunnel fails to start. | Recreate Secret, restart Deployment. |
What lives in this repo
- The cloudflared subchart under charts/split-brain/charts/cloudflared/: Deployment, NetworkPolicy, PodDisruptionBudget, ServiceAccount. No ConfigMap (token mode) and no Dockerfile: the chart pulls the upstream cloudflare/cloudflared image pinned to a specific tag (image.tag in the subchart values), not a wrapper image we build.
What is not in the repo: the tunnel token, the DNS records, the Public Hostname mappings, or any Access policies. Those are operator-managed in the Cloudflare dashboard.