Deploying — and the cluster targeting guard
./split-brain deploy runs helm upgrade --install for the umbrella
chart, then rolls the app deployments so a same-tag image rebuild is
re-pulled (the dev flow pins :v0.0.1 + pullPolicy: Always). destroy
uninstalls the release and, by default, deletes the data PVC.
The usual flows:
./split-brain deploy # helm upgrade + roll the dev release
./split-brain deploy --build --remote # build images on $BLACKWELL_HOST, push, deploy, roll
./split-brain deploy dev --dry-run # render + server-validate, no changes
Cluster targeting guard
helm and kubectl fall back to the ambient kubectl current-context
when no context is given. If KUBECONFIG is unset/empty (or the current
context was changed elsewhere), that can be a completely different
cluster — silently.
To make that impossible, deploy and destroy resolve and verify a
kube-context before any cluster call:
- Resolve the target context, in order:
--context <name>→<ENV>_KUBE_CONTEXT(e.g.DEV_KUBE_CONTEXT) →KUBE_CONTEXT. The CLI sources a gitignored.envfor these, so the guard works even in non-direnv shells (cron, CI, an agent's non-interactive shell). - Verify the context exists in the active
KUBECONFIG. If not, the command aborts with instructions — it never falls through to the ambient current-context. - Target explicitly: every
helm/kubectlcall is passed--kube-context <name>, so the resolved context is authoritative.
Configure it once. Either is enough; both is belt-and-suspenders:
.env(sourced by the CLI, all shells) — copy.env.example:
bash
DEV_KUBE_CONTEXT=split-brain
# PROD_KUBE_CONTEXT=...
.envrc(direnv, interactive shells) — already setsKUBECONFIG; it now also exportsDEV_KUBE_CONTEXT. See.envrc.example.
The context named here must be visible in your KUBECONFIG. For the Civo
dev cluster, its kubeconfig holds a context literally named split-brain:
export KUBECONFIG=$HOME/keys/civo-split-brain-kubeconfig
Override per-invocation with --context:
./split-brain deploy prod --context split-brain-prod
Why the guard exists (2026-06-04 incident)
A ./split-brain deploy ran in a shell where direnv had not loaded
(KUBECONFIG empty), so helm/kubectl used the ambient current-context
— a different Civo cluster (solitary-surf-89820118). The full stack
helm-installed there, including a cloudflared replica. Because that
replica used the same tunnel token as production, Cloudflare
registered it as a second origin and load-balanced
*.scalarxlm.com traffic across both clusters — half of it hitting
the wrong cluster and failing with connection refused. A live,
intermittent production outage caused purely by targeting the wrong
cluster.
Recovery was helm uninstall + namespace delete on the wrong cluster,
which deregistered the rogue tunnel replica and restored all traffic to
the correct cluster.
Two compounding lessons, both now mitigated:
- Never rely on the ambient context for a mutating command. → the guard above (fail-closed: abort, don't guess).
- A shared tunnel token means any cluster running
cloudflaredwith it becomes a live origin. Be deliberate about where that secret is installed. See cloudflare-tunnel.md.