The router intercepts every AI request and decides: run locally (free) or send to frontier (paid). Here's exactly how it makes that decision.
The router runs three checks in order. First match wins.
Signal matching → Frontier
Scans for high-complexity keywords: architect, design system, security audit, trade-off, compare and contrast, multi-tenant, compliance. If any match → route to frontier.
Signal matching → Local
Scans for routine coding keywords: write a function, fix this, create a, implement, write sql, write a test, dockerfile, bash script. If any match → route to local Brewmode model.
Length fallback
If no keyword matched: prompts under 500 chars go local (most short prompts are routine); prompts of 500 chars or more go frontier (longer prompts usually need deeper reasoning). Separately, any prompt over 1,000 chars always goes frontier, even if a local keyword matched in steps 1-2.
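The three checks above can be sketched as a single function. This is a minimal illustration, not the shipped implementation: the keyword lists are copied from the description, and the return shape (`route` plus `reason`) is an assumption modeled on the response fields shown later.

```typescript
// Keyword lists taken verbatim from the routing rules above.
const FRONTIER_SIGNALS = [
  "architect", "design system", "security audit", "trade-off",
  "compare and contrast", "multi-tenant", "compliance",
];
const LOCAL_SIGNALS = [
  "write a function", "fix this", "create a", "implement",
  "write sql", "write a test", "dockerfile", "bash script",
];

type Route = "frontier" | "local";

function classifyComplexity(prompt: string): { route: Route; reason: string } {
  const p = prompt.toLowerCase();
  // Hard override: very long prompts always go frontier, even if a
  // local keyword would have matched.
  if (prompt.length > 1000) return { route: "frontier", reason: "length>1000" };
  // Check 1: frontier signals. First match wins.
  if (FRONTIER_SIGNALS.some((s) => p.includes(s)))
    return { route: "frontier", reason: "frontier-signal" };
  // Check 2: local signals.
  if (LOCAL_SIGNALS.some((s) => p.includes(s)))
    return { route: "local", reason: "local-signal" };
  // Check 3: length fallback.
  return prompt.length < 500
    ? { route: "local", reason: "short-prompt" }
    : { route: "frontier", reason: "long-prompt" };
}
```

Checking the override before the keyword scans is what makes "over 1,000 chars always goes frontier" hold unconditionally.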
Your app → POST /api/router { prompt, max_tokens }
↓
classifyComplexity(prompt)
├─ frontier signals? → Claude Sonnet 4 via OpenRouter
├─ local signals? → Brewmode Qwen3-8B on Modal (free)
└─ length fallback → under 500 chars local, 500+ chars frontier
↓
Response: { text, route, reason, model, time_ms, tokens, cost }
Pass model: "local" or model: "frontier" to bypass the classifier and force a specific route. Default is model: "auto".
$10,000/mo
10 devs sending 100% of requests to frontier at $0.015-$0.06 per 1K tokens
$1,000-2,000/mo
80-90% of requests handled by Brewmode at $0/token. Only the remaining 10-20% hits the paid API.
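The back-of-envelope math behind those cards, using the figures quoted above (the 80-90% local share and the $10,000/mo all-frontier baseline):

```typescript
// Annual savings for a team whose all-frontier bill is frontierMonthly,
// with localShare of requests routed to the free local model.
function annualSavings(frontierMonthly: number, localShare: number): number {
  const paidMonthly = frontierMonthly * (1 - localShare); // only frontier calls cost money
  return (frontierMonthly - paidMonthly) * 12;
}

console.log(annualSavings(10_000, 0.8)); // 96000  ($10k/mo -> $2k/mo paid)
console.log(annualSavings(10_000, 0.9)); // 108000 ($10k/mo -> $1k/mo paid)
```

So at the quoted routing rates, a 10-dev team lands roughly in the $96k-$108k/yr savings range.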
Run the demo to see routing decisions in real time.
Requests: 0
Routed Local: 0%
Frontier Calls: 0
Cost Saved: $0.0000
Annual Savings (10 devs): $0