Using split-brain from a client

split-brain is a drop-in endpoint that speaks both the OpenAI Chat Completions API (/v1/chat/completions) and the Anthropic Messages API (/v1/messages). Point any compatible client at the router, give it a split-brain token, and your traffic is routed per the novelty classifier: general/confident prompts can go to Claude, novel/uncertain prompts stay on the private ScalarLM backend (the IP invariant).

Router base URL: https://split-brain-router.scalarxlm.com
Auth: Authorization: Bearer sbk_… — a split-brain token you mint in the UI. The router only accepts bearer tokens (not x-api-key).

1. Get a token

Open the UI (Tokens page), create a token, and copy it once — it's shown only at creation (format sbk_ + 40 chars). Revoke and re-mint from the same page if it leaks.

export SPLIT_BRAIN_TOKEN=sbk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

2. Point Claude Code at it

Claude Code speaks the Anthropic Messages API natively, and the router serves it (/v1/messages, streaming + tools + thinking + prompt caching). Two environment variables do it:

export ANTHROPIC_BASE_URL=https://split-brain-router.scalarxlm.com
export ANTHROPIC_AUTH_TOKEN=$SPLIT_BRAIN_TOKEN   # → Authorization: Bearer …
claude

Use ANTHROPIC_AUTH_TOKEN, not ANTHROPIC_API_KEY: the former sends Authorization: Bearer, which is what the router authenticates; ANTHROPIC_API_KEY sends x-api-key, which the router ignores → 401.

To make it durable, add the two vars to your Claude Code settings instead of exporting per shell — ~/.claude/settings.json:

{
  "env": {
    "ANTHROPIC_BASE_URL": "https://split-brain-router.scalarxlm.com",
    "ANTHROPIC_AUTH_TOKEN": "sbk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
  }
}

Notes:

Model selection is automatic. Claude Code sends model IDs like claude-sonnet-4-6 / claude-opus-4-8; the router treats any name that isn't literally claude or scalarlm as router-auto and decides the backend per request via the classifier. When a request routes to Claude, the Anthropic ingress honors the exact claude-* model the client asked for (it does not flatten everything to one model). You don't need to set a model. (Gateway discovery — /v1/models — advertises router-auto.)
count_tokens never leaves the cluster. Claude Code calls /v1/messages/count_tokens constantly for context management; the router answers it locally and never forwards your content to Anthropic.
Force a backend for a session by sending the model claude or scalarlm (ANTHROPIC_MODEL=scalarlm) — scalarlm skips the classifier entirely and pins to the private backend.

Agentic IP caveat (read this)

Today the router classifies the last user message to decide novelty. That's sound for chat, but an agent like Claude Code spreads your codebase across the system prompt, tool definitions, and tool results — so a benign "now add a test" can classify general → Claude and carry proprietary context with it. Whole-payload classification for the agentic path is designed but not yet implemented — see anthropic-ingress.md. Until it lands, pin sensitive sessions to ScalarLM (ANTHROPIC_MODEL=scalarlm).

3. OpenAI-compatible clients

Any OpenAI SDK or tool works by changing the base URL and key:

from openai import OpenAI

client = OpenAI(
    base_url="https://split-brain-router.scalarxlm.com/v1",
    api_key="sbk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)
resp = client.chat.completions.create(
    model="router-auto",            # or "claude" / "scalarlm" to force
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(resp.choices[0].message.content)

The OpenAI SDK sends the key as Authorization: Bearer, so it authenticates the same way.

4. curl (smoke test)

# OpenAI shape
curl https://split-brain-router.scalarxlm.com/v1/chat/completions \
  -H "Authorization: Bearer $SPLIT_BRAIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model":"router-auto","messages":[{"role":"user","content":"hello"}]}'

# Anthropic shape
curl https://split-brain-router.scalarxlm.com/v1/messages \
  -H "Authorization: Bearer $SPLIT_BRAIN_TOKEN" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"router-auto","max_tokens":64,"messages":[{"role":"user","content":"hello"}]}'

Each response carries Split-Brain-* headers (request id, chosen backend, routing decision) so you can see where a request went. The same requests show up in the UI Request explorer, where you can also label prompts to improve the classifier.

Which backend did I hit?

Response headers (Split-Brain-Backend, Split-Brain-Decision) and the Request explorer both report it. A request routes to Claude only when the classifier is confident the prompt is general; novel and uncertain prompts route to ScalarLM, and if the classifier or ScalarLM is down the router fails closed (5xx) rather than leak to Claude.

Cloudflare Access

The router is exposed through a Cloudflare tunnel. For programmatic clients the router hostname must be reachable without interactive SSO (which gates the browser UI). If your deployment puts a Cloudflare Access policy on the router host, issue an Access service token and send it alongside the bearer token — e.g. for Claude Code via ANTHROPIC_CUSTOM_HEADERS:

export ANTHROPIC_CUSTOM_HEADERS="CF-Access-Client-Id: <id>
CF-Access-Client-Secret: <secret>"

See cloudflare-tunnel.md for the tunnel + Access setup.