Kubernetes deployment

All split-brain workloads run in a single namespace, split-brain. ScalarLM is not deployed by this chart — it is a separately deployed service (its own Helm chart, possibly its own cluster); we only configure SCALARLM_BASE_URL and a credential to reach it. No GPU node pool is required for split-brain itself.

Namespace and RBAC

Namespace: split-brain.
ServiceAccount per component: router-sa, classifier-sa, ui-sa, cloudflared-sa. None has cluster-wide permissions, and v1 needs no Roles.
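As a rough sketch, one of these ServiceAccounts could look like the manifest below. The automountServiceAccountToken setting is an assumption, not something this document specifies; it is a common choice when a pod has no Roles bound and never calls the Kubernetes API.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: router-sa
  namespace: split-brain
# Assumption: with no Roles bound, the pod has no use for an API token,
# so the token mount is switched off.
automountServiceAccountToken: false
```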

Workloads

Component	Kind	Replicas (base / dev)	Resources
router	Deployment	3 / 1	250m / 512Mi requests, 1 / 1Gi limits
classifier	Deployment	2 / 1	500m / 1Gi requests, 2 / 2Gi limits
ui	Deployment	2 / 1	250m / 512Mi requests, 1 / 1Gi limits
cloudflared	Deployment	2 / 2	100m / 128Mi requests, 500m / 256Mi limits

All four run on CPU nodes — no GPU scheduling on our side. The dev overlay pins router/classifier/ui to 1 replica because the Civo PVC is RWO (all PVC-mounting pods land on one node); the UI must also stay at 1 as the single writer of its PVC files.
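For reference, the router row of the table would translate into a container resources block roughly like this (a fragment of the pod template, not a complete manifest; the same shape applies to the other rows with their own numbers):

```yaml
# Fragment: goes under spec.template.spec.containers[] in the router Deployment.
resources:
  requests:
    cpu: 250m
    memory: 512Mi
  limits:
    cpu: "1"       # quoted so YAML keeps it a string quantity
    memory: 1Gi
```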

HPA

Only the router subchart ships an hpa.yaml (gated on autoscaling.enabled, CPU-target based) — and it's disabled in dev (fixed replicaCount: 1). classifier and ui have no HPA. There is no Prometheus / prometheus-adapter custom-metric autoscaling in this build.
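A CPU-target HPA of the kind described might render as follows when autoscaling.enabled is true. This is a minimal sketch: the Deployment name, maxReplicas, and the utilization target are illustrative assumptions, not values taken from the chart.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: router
  namespace: split-brain
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: router              # assumption: the rendered name may carry a release prefix
  minReplicas: 3              # matches the base replica count in the Workloads table
  maxReplicas: 6              # hypothetical ceiling
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # hypothetical target
```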

PodDisruptionBudgets

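Each subchart ships a PDB, but they are disabled in dev: a PDB with minAvailable: 2 can never be satisfied at replicaCount: 1 and would block every voluntary eviction, such as a node drain. cloudflared keeps its PDB (minAvailable: 1), since it still runs 2 replicas in dev.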
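The cloudflared budget could be expressed roughly as below. The selector label is a hypothetical placeholder; the real chart may label its pods differently.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cloudflared
  namespace: split-brain
spec:
  minAvailable: 1             # with 2 replicas, one pod may be evicted at a time
  selector:
    matchLabels:
      app.kubernetes.io/name: cloudflared   # hypothetical label
```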

Services

Name	Type	Port	Backed by
router	ClusterIP	8080	router pods
classifier	ClusterIP	8080	classifier pods
ui	ClusterIP	8080	ui pods

No LoadBalancer and no Ingress. The router and UI are reachable from outside the cluster only through cloudflared tunnels (see cloudflare-tunnel.md).
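A sketch of one of these Services, assuming the container listens on the same port the Service exposes and that pods carry an app.kubernetes.io/name label (both are assumptions):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: router
  namespace: split-brain
spec:
  type: ClusterIP             # in-cluster only; external access goes through cloudflared
  selector:
    app.kubernetes.io/name: router   # hypothetical label
  ports:
    - name: http
      port: 8080
      targetPort: 8080        # assumption: container port equals Service port
```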

Secrets

Up to four Secrets live in the namespace (two of them conditional, per the table below), created by the Helm chart from values by default (global.secrets.create: true):

Secret name	Keys	Consumed by	Required?
`anthropic-api-key`	`ANTHROPIC_API_KEY`	router	yes
`cloudflared-credentials`	`TUNNEL_TOKEN`	cloudflared	yes
`scalarlm-credentials`	`SCALARLM_API_KEY`	router	only when `router.config.scalarlmBaseUrl` is set
`google-oauth`	`clientSecret`, `sessionSecret`	ui	only when ui Google sign-in is enabled

The Helm chart materializes these from global.secrets.* values supplied at install time (typically via a gitignored values-*.secrets.yaml file). See helm.md § Secrets for the values schema and the "keep secret material out of git" convention.

Operators who manage secrets out of band (sealed-secrets, external-secrets-operator, vault-injector) set global.secrets.create: false; the chart then skips the Secret resources, and the operators must themselves create Secrets with the names and keys listed above.
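With global.secrets.create: false, whatever tool is in use ultimately has to produce an object equivalent to the following. The name and key come from the table above; the value is a placeholder, and a real setup would not commit this manifest to git.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: anthropic-api-key
  namespace: split-brain
type: Opaque
stringData:
  ANTHROPIC_API_KEY: "<redacted>"   # placeholder, supplied by the secret manager
```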

All Secrets are consumed via envFrom or volume mounts; we do not bake credentials into images.
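As an illustration of the envFrom path, the router container might pull in its Secrets like this. The image reference is hypothetical, and marking scalarlm-credentials as optional is an assumed way to model a Secret that exists only when router.config.scalarlmBaseUrl is set.

```yaml
# Fragment: goes under spec.template.spec in the router Deployment.
containers:
  - name: router
    image: registry.example.invalid/split-brain/router:TAG   # hypothetical image
    envFrom:
      - secretRef:
          name: anthropic-api-key
      - secretRef:
          name: scalarlm-credentials
          optional: true    # assumption: tolerate absence when ScalarLM is not configured
```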

Router-client bearer tokens (sbk_*) are not Kubernetes Secrets — they live as sha256(token) files on the shared PVC, issued at runtime by the UI (see ui.md).

ConfigMaps

router-config — non-secret env: classifier threshold + endpoint, in-flight cap, seed Anthropic model, ScalarLM URL/model, cache flags.
classifier-config — model kind, MiniLM head path + version, keyword list path.
ui-config — classifier base URL, Cloudflare Access team domain + audience, and Google sign-in settings (client id, redirect URL, allowed domains/emails, session length).
cloudflared has no ConfigMap — it runs in token mode (ingress rules in the Cloudflare dashboard).

Persistent storage

One PVC.

split-brain-data — access mode from global.storage.accessMode (default ReadWriteMany; the dev overlay uses ReadWriteOnce), sized for traffic + audit retention. Mounted at /var/split-brain/ in every pod that needs durable state. Holds:

audit/ — append-only per-pod request audit log written by the router (RW), read by the ui Request explorer (RO).
tokens/ — one JSON file per active router API token; written by ui (create/revoke) and router (last-used flush).
bootstrap/ — proprietary docs staged in the UI before training.
heads/ — the trained classifier head the UI writes; the classifier loads it on /reload.
labels/ — operator-curated training labels (labels.jsonl).
limits/ — per-user token limits (limits.json); ui writes, router reads.
settings/ — runtime settings (settings.json, e.g. the default Claude model); ui writes, router reads.
usage/ — per-pod daily token usage ledgers written by the router.

With ReadWriteMany (NFS, CephFS, EFS, Filestore, Azure Files) the PVC-mounting components can scale horizontally. The current Civo cluster only offers ReadWriteOnce block storage (civo-volume), so the dev overlay runs them at a single replica on one node. The chart does not pick a class; the operator sets global.storage.storageClassName. No S3/object store is used; all durable state is on the PVC.
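A minimal sketch of the claim in its default ReadWriteMany form. The storage class name and the size are hypothetical; the dev overlay would instead use ReadWriteOnce with the civo-volume class.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: split-brain-data
  namespace: split-brain
spec:
  accessModes:
    - ReadWriteMany                       # from global.storage.accessMode
  storageClassName: example-rwx-class     # hypothetical; set via global.storage.storageClassName
  resources:
    requests:
      storage: 20Gi                       # hypothetical size
```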

The classifier's base model is small and ships inside the container image, so it needs no PVC of its own; only the trained head is read from heads/ on split-brain-data.

Health probes

Component	Liveness	Readiness	Initial delay
router	`GET /healthz`	`GET /readyz`	5s
classifier	`GET /healthz`	`GET /readyz`	5s
ui	`GET /healthz`	`GET /readyz`	5s
cloudflared	`GET /ready`	`GET /ready`	5s
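For the three HTTP services, the probe rows above correspond to a container fragment along these lines. The probe port is assumed to match the Service port; any period, timeout, or failure-threshold tuning is left at Kubernetes defaults here.

```yaml
# Fragment: goes under spec.template.spec.containers[] for router, classifier, or ui.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080              # assumption: same port the Service targets
  initialDelaySeconds: 5
readinessProbe:
  httpGet:
    path: /readyz
    port: 8080
  initialDelaySeconds: 5
```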

Network policies

Default deny in split-brain. Explicit allows:

router → classifier:8080, the external ScalarLM URL, and egress to api.anthropic.com:443.
classifier → nothing (no egress beyond DNS).
ui → classifier:8080 (the probe view and the /reload after training).
cloudflared → router:8080, ui:8080, and egress to Cloudflare edge (*.cloudflare.com:7844).
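The default-deny rule and the classifier's allows could be sketched with standard NetworkPolicy objects as below. Pod labels are hypothetical. Note that the standard NetworkPolicy API selects peers by pod, namespace, or IP block, not by hostname, so the hostname-based egress rules for router and cloudflared would need IP ranges or a CNI-specific policy type; which approach this build takes is not covered here.

```yaml
# Deny all ingress and egress for every pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: split-brain
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# classifier: accept traffic from router and ui on 8080; egress limited to DNS.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: classifier
  namespace: split-brain
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: classifier      # hypothetical label
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: router  # hypothetical label
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: ui      # hypothetical label
      ports:
        - protocol: TCP
          port: 8080
  egress:
    # Port 53 to any destination; add a "to" selector to pin it to the cluster DNS pods.
    - ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```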

Rollouts

All deployments: RollingUpdate, maxSurge 1, maxUnavailable 0.
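In Deployment terms this is the strategy block below. With maxUnavailable at 0, a new pod has to pass its readiness probe before an old one is taken down, so capacity never drops below the configured replica count during a rollout.

```yaml
# Fragment: goes under spec in each Deployment.
strategy:
  type: RollingUpdate
  rollingUpdate:
    maxSurge: 1          # at most one extra pod during the rollout
    maxUnavailable: 0    # never remove an old pod before its replacement is Ready
```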

Layout

Workloads are deployed via a Helm umbrella chart with one subchart per component. See helm.md for the chart structure, values schema, and release process. This document specifies what each workload looks like in the cluster; the chart is how we get it there.