split-brain

Kubernetes deployment

All split-brain workloads run in a single namespace, split-brain. ScalarLM is not deployed by this chart — it is a separately deployed service (its own Helm chart, possibly its own cluster); we only configure SCALARLM_BASE_URL and a credential to reach it. No GPU node pool is required for split-brain itself.

Namespace and RBAC

  • Namespace: split-brain.
  • ServiceAccount per component: router-sa, classifier-sa, ui-sa, cloudflared-sa. No cluster-wide permissions; no Roles needed by v1.
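
As a minimal sketch of the layout above, the namespace and one of the per-component ServiceAccounts could be declared as follows. This is illustrative and not the chart's actual template output; the other three accounts would follow the same shape.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: split-brain
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: router-sa
  namespace: split-brain
# No Role, ClusterRole, or binding accompanies this account in v1.
```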

Workloads

| Component | Kind | Replicas (base / dev) | Resources (CPU / memory) |
| --- | --- | --- | --- |
| router | Deployment | 3 / 1 | 250m / 512Mi requests, 1 / 1Gi limits |
| classifier | Deployment | 2 / 1 | 500m / 1Gi requests, 2 / 2Gi limits |
| ui | Deployment | 2 / 1 | 250m / 512Mi requests, 1 / 1Gi limits |
| cloudflared | Deployment | 2 / 2 | 100m / 128Mi requests, 500m / 256Mi limits |

All four run on CPU nodes — no GPU scheduling on our side. The dev overlay pins router/classifier/ui to 1 replica because the Civo PVC is RWO (all PVC-mounting pods land on one node); the UI must also stay at 1 as the single writer of its PVC files.
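
To make the router's replica count and resources concrete, a Deployment along these lines would carry them. This is a sketch under stated assumptions: the image reference, the label key, and the container name are hypothetical and do not come from the chart.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: router
  namespace: split-brain
spec:
  replicas: 3                          # base value; the dev overlay uses 1
  selector:
    matchLabels:
      app.kubernetes.io/name: router   # assumed label
  template:
    metadata:
      labels:
        app.kubernetes.io/name: router
    spec:
      serviceAccountName: router-sa
      containers:
        - name: router
          image: registry.example.com/split-brain/router:0.0.0   # hypothetical image
          resources:
            requests:
              cpu: 250m
              memory: 512Mi
            limits:
              cpu: "1"
              memory: 1Gi
```

No nodeSelector, toleration, or GPU resource request appears, which matches the CPU-only scheduling described above.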

HPA

Only the router subchart ships an hpa.yaml (gated on autoscaling.enabled, CPU-target based) — and it's disabled in dev (fixed replicaCount: 1). classifier and ui have no HPA. There is no Prometheus / prometheus-adapter custom-metric autoscaling in this build.
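
For orientation, a CPU-target HPA for the router might render roughly as below when autoscaling.enabled is true. The replica bounds and the utilization target are assumptions chosen for illustration; the real values live in the router subchart.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: router
  namespace: split-brain
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: router
  minReplicas: 3                  # assumed: matches the base replica count
  maxReplicas: 6                  # assumed upper bound
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # assumed target
```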

PodDisruptionBudgets

PDBs exist per subchart but are disabled in dev (a PDB minAvailable: 2 conflicts with replicaCount: 1). cloudflared keeps a PDB (minAvailable: 1).
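
A sketch of the cloudflared PDB, assuming the pods carry an app.kubernetes.io/name label (the label is an assumption, not taken from the chart):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: cloudflared
  namespace: split-brain
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: cloudflared   # assumed label
```

With two cloudflared replicas, this permits one voluntary eviction at a time while keeping a tunnel connector running.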

Services

| Name | Type | Port | Backed by |
| --- | --- | --- | --- |
| router | ClusterIP | 8080 | router pods |
| classifier | ClusterIP | 8080 | classifier pods |
| ui | ClusterIP | 8080 | ui pods |

No LoadBalancer and no Ingress. The router and UI are reachable from outside the cluster only through cloudflared tunnels (see cloudflare-tunnel.md).
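
A ClusterIP Service for the router could look like the following sketch. The selector label and the targetPort are assumptions; only the Service name, type, and port come from the table above.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: router
  namespace: split-brain
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/name: router   # assumed label
  ports:
    - name: http
      port: 8080
      targetPort: 8080               # assumed container port
```

Inside the cluster this resolves as router.split-brain.svc.cluster.local:8080, assuming the default cluster domain.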

Secrets

Up to four Secrets live in the namespace, created by the Helm chart from values by default (global.secrets.create: true); two are always required and two are conditional:

| Secret name | Keys | Consumed by | Required? |
| --- | --- | --- | --- |
| anthropic-api-key | ANTHROPIC_API_KEY | router | yes |
| cloudflared-credentials | TUNNEL_TOKEN | cloudflared | yes |
| scalarlm-credentials | SCALARLM_API_KEY | router | only when router.config.scalarlmBaseUrl is set |
| google-oauth | clientSecret, sessionSecret | ui | only when ui Google sign-in is enabled |

The Helm chart materializes these from global.secrets.* values supplied at install time (typically via a gitignored values-*.secrets.yaml file). See helm.md § Secrets for the values schema and the "keep secret material out of git" convention.

Operators who manage secrets out of band (sealed-secrets, external-secrets-operator, vault-injector) set global.secrets.create: false; the chart skips the Secret resources and the operator creates Secrets with the names above themselves.

All Secrets are consumed via envFrom or volume mounts; we do not bake credentials into images.
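
For operators creating Secrets out of band, the sketch below shows the shape of one Secret and how a container might consume it through envFrom. The Secret name and key come from the table above; the container name and the placeholder value are illustrative.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: anthropic-api-key
  namespace: split-brain
type: Opaque
stringData:
  ANTHROPIC_API_KEY: "<redacted>"   # placeholder; never commit a real key
---
# Fragment of the router pod template (not a complete manifest).
containers:
  - name: router                    # assumed container name
    envFrom:
      - secretRef:
          name: anthropic-api-key   # exposes ANTHROPIC_API_KEY as an env var
```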

Router-client bearer tokens (sbk_*) are not Kubernetes Secrets — they live as sha256(token) files on the shared PVC, issued at runtime by the UI (see ui.md).

ConfigMaps

  • router-config — non-secret env: classifier threshold + endpoint, in-flight cap, seed Anthropic model, ScalarLM URL/model, cache flags.
  • classifier-config — model kind, MiniLM head path + version, keyword list path.
  • ui-config — classifier base URL, Cloudflare Access team domain + audience, and Google sign-in settings (client id, redirect URL, allowed domains/emails, session length).
  • cloudflared has no ConfigMap — it runs in token mode (ingress rules in the Cloudflare dashboard).
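
As an illustration only, router-config might carry the ScalarLM endpoint as shown below. SCALARLM_BASE_URL is the one variable name this document mentions; the URL is a placeholder, and the remaining keys are omitted because their names are not given here.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: router-config
  namespace: split-brain
data:
  SCALARLM_BASE_URL: "https://scalarlm.example.com"   # placeholder URL
  # Other non-secret settings (classifier threshold and endpoint, in-flight cap,
  # seed Anthropic model, cache flags) would appear here under the chart's own key names.
```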

Persistent storage

One PVC.

split-brain-data — access mode from global.storage.accessMode (default ReadWriteMany; the dev overlay uses ReadWriteOnce), sized for traffic + audit retention. Mounted at /var/split-brain/ in every pod that needs durable state. Holds:

  • audit/ — append-only per-pod request audit log written by the router (RW), read by the ui Request explorer (RO).
  • tokens/ — one JSON file per active router API token; written by ui (create/revoke) and router (last-used flush).
  • bootstrap/ — proprietary docs staged in the UI before training.
  • heads/ — the trained classifier head the UI writes; the classifier loads it on /reload.
  • labels/ — operator-curated training labels (labels.jsonl).
  • limits/ — per-user token limits (limits.json); ui writes, router reads.
  • settings/ — runtime settings (settings.json, e.g. the default Claude model); ui writes, router reads.
  • usage/ — per-pod daily token usage ledgers written by the router.

With ReadWriteMany (NFS, CephFS, EFS, Filestore, Azure Files) the PVC-mounting components can scale horizontally. The current Civo cluster only offers ReadWriteOnce block storage (civo-volume), so the dev overlay runs them at a single replica on one node. The chart does not pick a class; the operator sets global.storage.storageClassName. No S3/object store is used; all durable state is on the PVC.
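
The claim and the dev overlay's storage settings can be sketched as follows. The requested size and the example-rwx class name are assumptions for illustration; the access modes, the claim name, and civo-volume come from the text above.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: split-brain-data
  namespace: split-brain
spec:
  accessModes:
    - ReadWriteMany                 # from global.storage.accessMode (default)
  storageClassName: example-rwx     # hypothetical; set via global.storage.storageClassName
  resources:
    requests:
      storage: 50Gi                 # assumed size
---
# Dev overlay values fragment for the Civo cluster.
global:
  storage:
    accessMode: ReadWriteOnce
    storageClassName: civo-volume
```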

The classifier model is small and ships inside the container image, so it needs no dedicated PVC; the trained head it loads on /reload comes from heads/ on the shared PVC.

Health probes

| Component | Liveness | Readiness | Initial delay |
| --- | --- | --- | --- |
| router | GET /healthz | GET /readyz | 5s |
| classifier | GET /healthz | GET /readyz | 5s |
| ui | GET /healthz | GET /readyz | 5s |
| cloudflared | GET /ready | GET /ready | 5s |
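
In a pod template, the router's probes would take roughly this form. The probe port is assumed to be the same 8080 the Service exposes, and the container name is illustrative; the paths and the delay come from the table above.

```yaml
# Fragment of the router pod template (not a complete manifest).
containers:
  - name: router                # assumed container name
    livenessProbe:
      httpGet:
        path: /healthz
        port: 8080              # assumed probe port
      initialDelaySeconds: 5
    readinessProbe:
      httpGet:
        path: /readyz
        port: 8080              # assumed probe port
      initialDelaySeconds: 5
```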

Network policies

Default deny in split-brain. Explicit allows:

  • router → classifier:8080, the external ScalarLM URL, and egress to api.anthropic.com:443.
  • classifier → nothing (no egress beyond DNS).
  • ui → classifier:8080 (the probe view and the /reload after training).
  • cloudflared → router:8080, ui:8080, and egress to Cloudflare edge (*.cloudflare.com:7844).
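
The default-deny rule and the classifier's allow rule could be written as in this sketch. The pod labels and policy names are assumptions; the real policies may be structured differently.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: split-brain
spec:
  podSelector: {}               # selects every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: classifier-allow
  namespace: split-brain
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: classifier   # assumed label
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: router
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: ui
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - ports:                    # DNS only; no "to" clause, so any destination on port 53
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Because egress is also denied by default, the router and ui each need a matching egress rule toward classifier:8080. The standard NetworkPolicy API matches pods, namespaces, and IP blocks rather than hostnames, so allows such as api.anthropic.com:443 need either CIDR ranges or a CNI-specific FQDN policy.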

Rollouts

  • All deployments: RollingUpdate, maxSurge 1, maxUnavailable 0.
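
Expressed as a Deployment spec fragment, that rollout strategy is:

```yaml
# Fragment of a Deployment spec (not a complete manifest).
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod during a rollout
      maxUnavailable: 0    # never drop below the desired replica count
```

With maxUnavailable set to 0, an old pod is removed only after its replacement becomes Ready.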

Layout

Workloads are deployed via a Helm umbrella chart with one subchart per component. See helm.md for the chart structure, values schema, and release process. This document specifies what each workload looks like in the cluster; the chart is how we get it there.