Per-token routing override

Every router API token carries a routing mode that decides how that token's requests are routed, independent of the request's model field:

Mode	Behavior	Classifier	IP veto
`tier-auto` (default)	Like `auto`, plus on the general branch the router picks the model tier by difficulty/stuck (fast → balanced → deep). See model-tiering.md.	runs	enforced
`auto`	The novelty classifier decides: general → Claude (the client's model honored), novel/uncertain → ScalarLM. No tiering.	runs	enforced
`scalarlm`	Everything goes to the private ScalarLM backend.	skipped	n/a
`claude`	Deliberate manual override — everything goes to Claude.	skipped	bypassed

tier-auto and auto differ only on the general branch: auto honors the client's requested claude-* model; tier-auto lets the router choose the model (overriding the client's pin) by task difficulty and stuck-ness. Both run the classifier and uphold the IP invariant identically.

`claude` mode is an IP bypass — by design

auto upholds the IP invariant: novel/uncertain prompts never reach Claude. A request that forces Claude via the model field is still vetoed (403) when the classifier isn't confidently "general".

claude mode is different: it is a per-token, owner-chosen override that sends all of that token's traffic to Claude — including proprietary content — skipping both the classifier and the IP veto. It exists so a user who knows their work is non-proprietary (or who accepts the tradeoff) can opt that token out of routing entirely.

The token owner sets this themselves on the Tokens page (a confirmed, deliberate action). The UI labels such tokens IP bypass and warns that proprietary content will reach Anthropic.

How it resolves

router.app._resolve_routing(token, payload):

If the token's mode is claude or scalarlm, that wins — the request's model field is ignored.
If the token's mode is tier-auto or auto, the request's model is honored (claude/scalarlm force a backend — with claude still IP-vetoed — anything else is router-auto). _resolve_routing also returns a tier_auto flag; when set, the general branch runs the tier selector instead of sending the client's pinned model.

claude mode passes bypass_veto=True to policy.decide, which returns a forced-Claude decision without running or consulting the classifier. The audit record shows routing_decision=forced, chosen_backend=claude, and no classifier label.

Storage

The mode is a routing_mode field on each token JSON (<token_dir>/tok_*.json), read by both the router (tokens.TokenRecord) and the UI. Missing/invalid values default to tier-auto, so a token without an explicit mode gets difficulty tiering. (To keep the client's own model selection and skip tiering, set the token to auto.)

Per-token routing override

claude mode is an IP bypass — by design

How it resolves

Storage

`claude` mode is an IP bypass — by design