Using split-brain from a client
split-brain is a drop-in endpoint that speaks both the OpenAI Chat
Completions API (/v1/chat/completions) and the Anthropic Messages
API (/v1/messages). Point any compatible client at the router, give it a
split-brain token, and the novelty classifier routes your traffic:
prompts it is confident are general can go to Claude, while novel or
uncertain prompts stay on the private ScalarLM backend (the IP invariant).
- Router base URL: https://split-brain-router.scalarxlm.com
- Auth: Authorization: Bearer sbk_… (a split-brain token you mint in the UI). The router only accepts bearer tokens (not x-api-key).
1. Get a token
Open the UI (Tokens page), create a token, and copy it once — it's
shown only at creation (format sbk_ + 40 chars). Revoke and re-mint from
the same page if it leaks.
export SPLIT_BRAIN_TOKEN=sbk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
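Before wiring the token into a client, a quick shape check can catch a truncated paste. The sketch below is only a local sanity check, not the router's validation: the function name is hypothetical, and it checks prefix and length only, since the allowed character set for the 40 characters is not stated here.

```python
TOKEN_PREFIX = "sbk_"
TOKEN_BODY_LENGTH = 40  # "sbk_ + 40 chars", per the format described above


def looks_like_split_brain_token(token: str) -> bool:
    """Return True if token is the sbk_ prefix followed by exactly 40 characters.

    Assumption: only prefix and length are checked, because the character
    set of the token body is not specified in this document.
    """
    if not token.startswith(TOKEN_PREFIX):
        return False
    return len(token) - len(TOKEN_PREFIX) == TOKEN_BODY_LENGTH
```

A passing check says nothing about whether the token is still valid; only the router can confirm that it has not been revoked.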
2. Point Claude Code at it
Claude Code speaks the Anthropic Messages API natively, and the router
serves it (/v1/messages, streaming + tools + thinking + prompt caching).
Two environment variables do it:
export ANTHROPIC_BASE_URL=https://split-brain-router.scalarxlm.com
export ANTHROPIC_AUTH_TOKEN=$SPLIT_BRAIN_TOKEN # → Authorization: Bearer …
claude
Use ANTHROPIC_AUTH_TOKEN, not ANTHROPIC_API_KEY: the former sends
Authorization: Bearer, which is what the router authenticates;
ANTHROPIC_API_KEY sends x-api-key, which the router ignores, so the
request fails with 401.
To make it durable, add the two vars to your Claude Code settings instead
of exporting per shell — ~/.claude/settings.json:
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://split-brain-router.scalarxlm.com",
    "ANTHROPIC_AUTH_TOKEN": "sbk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
  }
}
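If the settings file already holds other keys, the two variables can be merged in rather than pasted over the whole file. This is a minimal sketch with hypothetical helper names; it assumes the file, when present, contains a valid JSON object.

```python
import json
from pathlib import Path


def merge_router_env(settings: dict, base_url: str, token: str) -> dict:
    """Return a copy of settings with the two router variables set under "env".

    Other top-level keys and other "env" entries are preserved.
    """
    merged = dict(settings)
    env = dict(merged.get("env", {}))
    env["ANTHROPIC_BASE_URL"] = base_url
    env["ANTHROPIC_AUTH_TOKEN"] = token
    merged["env"] = env
    return merged


def update_settings_file(path: Path, base_url: str, token: str) -> None:
    """Read the settings file if it exists, merge the variables, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    updated = merge_router_env(current, base_url, token)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

For example, update_settings_file(Path.home() / ".claude" / "settings.json", "https://split-brain-router.scalarxlm.com", token) would apply the configuration shown above while leaving unrelated settings alone.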
Notes:
- Model selection is automatic. Claude Code sends model IDs like claude-sonnet-4-6 / claude-opus-4-8; the router treats any name that isn't literally claude or scalarlm as router-auto and decides the backend per request via the classifier. When a request routes to Claude, the Anthropic ingress honors the exact claude-* model the client asked for (it does not flatten everything to one model). You don't need to set a model. (Gateway discovery, /v1/models, advertises router-auto.)
- count_tokens never leaves the cluster. Claude Code calls /v1/messages/count_tokens constantly for context management; the router answers it locally and never forwards your content to Anthropic.
- Force a backend for a session by sending the model claude or scalarlm (ANTHROPIC_MODEL=scalarlm). scalarlm skips the classifier entirely and pins to the private backend.
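The model-name rule in the notes above can be pictured as a small lookup. This is an illustrative sketch, not the router's code: the function name is hypothetical, and it assumes the comparison is exact and case-sensitive, which is one reading of "literally".

```python
FORCED_BACKENDS = {"claude", "scalarlm"}


def resolve_model_name(model: str) -> str:
    """Map a client-supplied model name to "claude", "scalarlm", or "router-auto".

    Assumption: only the exact strings "claude" and "scalarlm" force a
    backend; every other name, including claude-* model IDs, is treated
    as router-auto and left to the classifier.
    """
    return model if model in FORCED_BACKENDS else "router-auto"
```

Under this reading, a Claude Code session that sends claude-sonnet-4-6 is still classified per request; only a session with ANTHROPIC_MODEL=scalarlm is pinned.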
Agentic IP caveat (read this)
Today the router classifies the last user message to decide novelty.
That's sound for chat, but an agent like Claude Code spreads your codebase
across the system prompt, tool definitions, and tool results — so a benign
"now add a test" can classify general → Claude and carry proprietary
context with it. Whole-payload classification for the agentic path is
designed but not yet implemented — see
anthropic-ingress.md. Until it lands, pin sensitive
sessions to ScalarLM (ANTHROPIC_MODEL=scalarlm).
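To make the gap concrete, the sketch below compares the text a last-user-message classifier would see with the text the whole payload carries. It is an illustration only: the helper names and the sample payload are invented, and it does not reproduce how the router extracts text.

```python
def _text_of(content) -> str:
    """Flatten Anthropic-style content (a string or a list of blocks) to text.

    Only text and tool_result blocks are read; tool_use inputs are skipped.
    """
    if isinstance(content, str):
        return content
    parts = []
    for block in content:
        if block.get("type") == "text":
            parts.append(block.get("text", ""))
        elif block.get("type") == "tool_result":
            parts.append(_text_of(block.get("content", "")))
    return "\n".join(parts)


def last_user_text(payload: dict) -> str:
    """Text of the most recent user message, which is what gets classified today."""
    for message in reversed(payload.get("messages", [])):
        if message.get("role") == "user":
            return _text_of(message["content"])
    return ""


def whole_payload_text(payload: dict) -> str:
    """Text from the system prompt, tool descriptions, and every message."""
    parts = [_text_of(payload.get("system", ""))]
    parts += [tool.get("description", "") for tool in payload.get("tools", [])]
    parts += [_text_of(m["content"]) for m in payload.get("messages", [])]
    return "\n".join(p for p in parts if p)


# Hypothetical agent transcript: the proprietary code sits in a tool result.
payload = {
    "system": "You are a coding agent working in a private repository.",
    "messages": [
        {"role": "user", "content": "open pricing.py"},
        {"role": "assistant", "content": [
            {"type": "tool_use", "id": "t1", "name": "read_file",
             "input": {"path": "pricing.py"}},
        ]},
        {"role": "user", "content": [
            {"type": "tool_result", "tool_use_id": "t1",
             "content": "def secret_margin(): ..."},
        ]},
        {"role": "assistant", "content": "I have read the file."},
        {"role": "user", "content": "now add a test"},
    ],
}
```

Here last_user_text(payload) is just "now add a test", while whole_payload_text(payload) still contains secret_margin, which is the context that would travel with the request if it routed to Claude.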
3. OpenAI-compatible clients
Any OpenAI SDK or tool works by changing the base URL and key:
from openai import OpenAI
client = OpenAI(
    base_url="https://split-brain-router.scalarxlm.com/v1",
    api_key="sbk_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
)
resp = client.chat.completions.create(
    model="router-auto",  # or "claude" / "scalarlm" to force
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(resp.choices[0].message.content)
The OpenAI SDK sends the key as Authorization: Bearer, so it
authenticates the same way.
4. curl (smoke test)
# OpenAI shape
curl https://split-brain-router.scalarxlm.com/v1/chat/completions \
  -H "Authorization: Bearer $SPLIT_BRAIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model":"router-auto","messages":[{"role":"user","content":"hello"}]}'
# Anthropic shape
curl https://split-brain-router.scalarxlm.com/v1/messages \
  -H "Authorization: Bearer $SPLIT_BRAIN_TOKEN" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{"model":"router-auto","max_tokens":64,"messages":[{"role":"user","content":"hello"}]}'
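Where curl is not available, the same two smoke-test requests can be built with the Python standard library. This is a sketch under the same assumptions as the curl commands: the helper names are hypothetical, and the bodies simply mirror the payloads above.

```python
import json
import urllib.request

ROUTER = "https://split-brain-router.scalarxlm.com"


def build_request(path, body, token, extra_headers=None):
    """Build a POST request that authenticates with a bearer token."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    headers.update(extra_headers or {})
    return urllib.request.Request(
        ROUTER + path,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def openai_request(token):
    """OpenAI shape: /v1/chat/completions."""
    body = {"model": "router-auto",
            "messages": [{"role": "user", "content": "hello"}]}
    return build_request("/v1/chat/completions", body, token)


def anthropic_request(token):
    """Anthropic shape: /v1/messages, with the version header and max_tokens."""
    body = {"model": "router-auto", "max_tokens": 64,
            "messages": [{"role": "user", "content": "hello"}]}
    return build_request("/v1/messages", body, token,
                         {"anthropic-version": "2023-06-01"})


def send(request):
    """Send the request and return (status, headers, parsed JSON body)."""
    with urllib.request.urlopen(request) as resp:
        return resp.status, dict(resp.headers.items()), json.loads(resp.read())
```

Calling send(anthropic_request(os.environ["SPLIT_BRAIN_TOKEN"])) (after importing os) performs the second smoke test; note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses rather than returning them.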
Each response carries Split-Brain-* headers (request id, chosen backend,
routing decision) so you can see where a request went. The same requests
show up in the UI Request explorer, where you can also
label prompts to improve the classifier.
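When scripting against the router, the routing headers can be pulled out of a response with a small filter. The function below is a hypothetical helper; it assumes only that the header names start with Split-Brain-, as described above, and matches them case-insensitively because HTTP header names are not case-sensitive.

```python
def split_brain_headers(headers) -> dict:
    """Collect headers whose names start with "Split-Brain-" (case-insensitive).

    headers is anything with an items() method yielding (name, value) pairs,
    such as a dict or the headers object of an HTTP response.
    """
    prefix = "split-brain-"
    return {
        name: value
        for name, value in headers.items()
        if name.lower().startswith(prefix)
    }
```

With the curl commands above, adding -i prints the same headers without any code.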
Which backend did I hit?
Response headers (Split-Brain-Backend, Split-Brain-Decision) and the
Request explorer both report it. A request routes to Claude only
when the classifier is confident the prompt is general; novel and uncertain
prompts route to ScalarLM, and if the classifier or ScalarLM is down
the router fails closed (5xx) rather than leak to Claude.
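The routing rule reads roughly like the decision function below. It is a sketch of the stated behavior, not the router's implementation: the names are hypothetical, the classifier output is reduced to a label plus a confidence flag, and it assumes a ScalarLM outage only matters for requests that would have gone to ScalarLM.

```python
from typing import Optional


class RouterUnavailable(Exception):
    """Stands in for the 5xx the router returns when it fails closed."""


def choose_backend(label: Optional[str], confident: bool, scalarlm_up: bool) -> str:
    """Pick a backend for one request.

    label is "general" or "novel"; None means the classifier is unavailable.
    """
    if label is None:
        # No classification available: fail closed instead of guessing.
        raise RouterUnavailable("classifier unavailable")
    if label == "general" and confident:
        return "claude"
    # Novel or uncertain prompts must stay private; never fall back to Claude.
    if not scalarlm_up:
        raise RouterUnavailable("ScalarLM unavailable")
    return "scalarlm"
```

The point of the shape is that every error path raises: there is no branch in which a novel or uncertain prompt ends up on Claude.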
Cloudflare Access
The router is exposed through a Cloudflare tunnel. For programmatic clients
the router hostname must be reachable without interactive SSO (which
gates the browser UI). If your deployment puts a Cloudflare Access policy on
the router host, issue an Access service token and send it alongside the
bearer token — e.g. for Claude Code via ANTHROPIC_CUSTOM_HEADERS:
export ANTHROPIC_CUSTOM_HEADERS="CF-Access-Client-Id: <id>
CF-Access-Client-Secret: <secret>"
See cloudflare-tunnel.md for the tunnel + Access setup.
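For Python clients, the two Access headers can be kept in one place and rendered in whichever form a client expects. This is a minimal sketch with hypothetical helper names; the header names are the ones shown above, and the id and secret remain placeholders.

```python
def access_headers(client_id: str, client_secret: str) -> dict:
    """Return the two Cloudflare Access service-token headers."""
    return {
        "CF-Access-Client-Id": client_id,
        "CF-Access-Client-Secret": client_secret,
    }


def as_custom_headers_env(headers: dict) -> str:
    """Format headers as newline-separated "Name: value" lines.

    This is the shape used above for ANTHROPIC_CUSTOM_HEADERS.
    """
    return "\n".join(f"{name}: {value}" for name, value in headers.items())
```

The same dictionary can likely be passed to the OpenAI SDK as OpenAI(..., default_headers=access_headers(id, secret)), so the Access headers ride along with the bearer token on every request.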