Per-token routing override
Every router API token carries a routing mode that decides how that
token's requests are routed, independent of the request's model field:

| Mode | Behavior | Classifier | IP veto |
|---|---|---|---|
| tier-auto (default) | Like auto, plus on the general branch the router picks the model tier by difficulty/stuck (fast → balanced → deep). See model-tiering.md. | runs | enforced |
| auto | The novelty classifier decides: general → Claude (the client's model honored), novel/uncertain → ScalarLM. No tiering. | runs | enforced |
| scalarlm | Everything goes to the private ScalarLM backend. | skipped | n/a |
| claude | Deliberate manual override: everything goes to Claude. | skipped | bypassed |

tier-auto and auto differ only on the general branch: auto honors the
client's requested claude-* model; tier-auto lets the router choose the
model (overriding the client's pin) by task difficulty and stuck-ness. Both
run the classifier and uphold the IP invariant identically.
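
The four modes can be summarized as data. The following is a minimal sketch, not the router's actual code: `RoutingMode`, `ModeBehavior`, and `MODE_BEHAVIOR` are hypothetical names chosen for illustration, and the values simply restate the table above.

```python
from dataclasses import dataclass
from enum import Enum


class RoutingMode(str, Enum):
    """Hypothetical enum of the four per-token routing modes."""
    TIER_AUTO = "tier-auto"
    AUTO = "auto"
    SCALARLM = "scalarlm"
    CLAUDE = "claude"


@dataclass(frozen=True)
class ModeBehavior:
    runs_classifier: bool  # does the novelty classifier run?
    ip_veto: str           # "enforced", "bypassed", or "n/a"
    tiering: bool          # does the router pick the model tier on the general branch?


# Values mirror the mode table; the structure itself is an assumption.
MODE_BEHAVIOR = {
    RoutingMode.TIER_AUTO: ModeBehavior(True, "enforced", True),
    RoutingMode.AUTO: ModeBehavior(True, "enforced", False),
    RoutingMode.SCALARLM: ModeBehavior(False, "n/a", False),
    RoutingMode.CLAUDE: ModeBehavior(False, "bypassed", False),
}
```

Seen this way, tier-auto and auto differ only in the `tiering` flag, and claude is the only mode whose veto is bypassed.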
claude mode is an IP bypass — by design
Both auto and tier-auto uphold the IP invariant: novel/uncertain prompts never
reach Claude. A request that forces Claude via the model field is still vetoed
(403) when the classifier isn't confidently "general".
claude mode is different: it is a per-token, owner-chosen override that
sends all of that token's traffic to Claude — including proprietary
content — skipping both the classifier and the IP veto. It exists so a user
who knows their work is non-proprietary (or who accepts the tradeoff) can opt
that token out of routing entirely.
The token owner sets this themselves on the Tokens page (a confirmed, deliberate action). The UI labels such tokens IP bypass and warns that proprietary content will reach Anthropic.
How it resolves
router.app._resolve_routing(token, payload):
- If the token's mode is claude or scalarlm, that wins: the request's model field is ignored.
- If the token's mode is tier-auto or auto, the request's model is honored (claude/scalarlm force a backend, with claude still IP-vetoed; anything else is router-auto).

_resolve_routing also returns a tier_auto flag; when set, the general branch runs the tier selector instead of sending the client's pinned model.
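
The resolution rules above can be sketched as a small function. This is an illustration under assumptions, not the real `router.app._resolve_routing`: the `Resolution` tuple, the function signature (which takes the mode string directly rather than a token object), and the exact-match test on the model name are all hypothetical.

```python
from typing import NamedTuple


class Resolution(NamedTuple):
    """Hypothetical return shape; the real function's return type is not documented here."""
    backend: str       # "claude", "scalarlm", or "router-auto"
    bypass_veto: bool  # True only for claude-mode tokens
    tier_auto: bool    # True when the general branch should run the tier selector


def resolve_routing(token_mode: str, payload: dict) -> Resolution:
    # Token-level overrides win; the request's model field is ignored.
    if token_mode == "claude":
        return Resolution("claude", bypass_veto=True, tier_auto=False)
    if token_mode == "scalarlm":
        return Resolution("scalarlm", bypass_veto=False, tier_auto=False)

    # tier-auto / auto: honor the request's model field.
    # Assumption: only the literal names "claude" and "scalarlm" force a
    # backend; every other value is treated as router-auto.
    model = str(payload.get("model", ""))
    backend = model if model in ("claude", "scalarlm") else "router-auto"

    # A Claude request forced via the model field keeps the IP veto.
    return Resolution(backend, bypass_veto=False, tier_auto=(token_mode == "tier-auto"))
```

For example, `resolve_routing("claude", {"model": "scalarlm"})` resolves to the Claude backend with the veto bypassed, while `resolve_routing("auto", {"model": "claude"})` resolves to Claude with the veto still in force.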
claude mode passes bypass_veto=True to policy.decide, which returns a
forced-Claude decision without running or consulting the classifier. The
audit record shows routing_decision=forced, chosen_backend=claude, and no
classifier label.
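
A rough sketch of how the bypass and the veto might interact inside a decision function follows. It is not the real `policy.decide`: the signature, the `IPVeto` exception, the `classifier_label` key, the `"classified"` decision value, and the choice to skip the classifier for a forced ScalarLM request are assumptions made for illustration. Only `routing_decision=forced`, `chosen_backend=claude`, and the 403 veto come from the text above.

```python
from typing import Callable


class IPVeto(Exception):
    """Hypothetical exception; the router would map it to HTTP 403."""
    status_code = 403


def decide(backend: str, prompt: str,
           classify: Callable[[str], str],
           bypass_veto: bool = False) -> dict:
    if bypass_veto:
        # claude mode: forced decision, classifier never runs.
        return {"routing_decision": "forced", "chosen_backend": "claude",
                "classifier_label": None}
    if backend == "scalarlm":
        # Assumption: nothing reaches Claude, so no classification is needed.
        return {"routing_decision": "forced", "chosen_backend": "scalarlm",
                "classifier_label": None}

    label = classify(prompt)  # e.g. "general", "novel", or "uncertain"
    if backend == "claude":
        # Claude forced via the model field is still subject to the IP veto.
        if label != "general":
            raise IPVeto(f"classifier label {label!r} is not 'general'")
        return {"routing_decision": "forced", "chosen_backend": "claude",
                "classifier_label": label}

    # router-auto: general goes to Claude, novel/uncertain to ScalarLM.
    chosen = "claude" if label == "general" else "scalarlm"
    return {"routing_decision": "classified", "chosen_backend": chosen,
            "classifier_label": label}
```

The point of the sketch is the ordering: the bypass check comes first, so a claude-mode token produces an audit record without a classifier label because the classifier is never invoked.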
Storage
The mode is a routing_mode field on each token JSON
(<token_dir>/tok_*.json), read by both the router (tokens.TokenRecord) and
the UI. Missing/invalid values default to tier-auto, so a token without
an explicit mode gets difficulty tiering. (To keep the client's own model
selection and skip tiering, set the token to auto.)
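
The defaulting rule can be illustrated with a short reader. This is a sketch under assumptions: `read_routing_mode`, `VALID_MODES`, and `DEFAULT_MODE` are hypothetical names, and the real `tokens.TokenRecord` may parse the file differently. Only the `routing_mode` field name and the tier-auto default come from the text above.

```python
import json
from pathlib import Path

VALID_MODES = {"tier-auto", "auto", "scalarlm", "claude"}
DEFAULT_MODE = "tier-auto"


def read_routing_mode(token_path: Path) -> str:
    """Return the token's routing mode, falling back to tier-auto."""
    # Assumption: the token file holds a single JSON object.
    data = json.loads(token_path.read_text(encoding="utf-8"))
    mode = data.get("routing_mode")
    # Missing, non-string, or unrecognized values all get the default.
    if isinstance(mode, str) and mode in VALID_MODES:
        return mode
    return DEFAULT_MODE
```

A token file containing `{"routing_mode": "auto"}` yields auto, while one with no `routing_mode` key, or with an unrecognized value, yields tier-auto.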