How AI Routing Works | GreatRouter

Reading guide: Pipeline → Score → Costs → Control → Reliability. Each section explains one stage of automatic routing.

Overview

One API call. The right model. Every time.

Send any prompt — text, images, videos, audio, or music — to a single endpoint. GreatRouter detects what you are trying to do, finds the best model from 1,516 options, and adds under 5ms of routing overhead on most requests.

<5ms routing · 1,516 models · Auto-fallback · Full cost transparency

The Pipeline

Four steps to the perfect model

01 Classify

Explicit task? → Keyword match? → LLM classify

Fast path: pass task directly (<1ms). Medium path: regex keyword match on your prompt (~2ms). Slow path: Gemma 4 26B for ambiguous inputs (~80ms, fallback to Llama 3.2 3B).
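The three classification paths can be sketched as a simple cascade. This is a minimal illustration, not GreatRouter's implementation: the keyword patterns, the `classify` function, the `llm_classify` callback, and the final `"text"` default are all assumptions made for the example.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical keyword table; the real patterns are not published on this page.
KEYWORD_PATTERNS = {
    "image": re.compile(r"\b(image|photo|picture|illustration)\b", re.IGNORECASE),
    "video": re.compile(r"\b(video|clip|animation)\b", re.IGNORECASE),
    "audio": re.compile(r"\b(audio|speech|voice)\b", re.IGNORECASE),
}


def classify(
    prompt: str,
    task: Optional[str] = None,
    llm_classify: Optional[Callable[[str], str]] = None,
) -> Tuple[str, str]:
    """Return (task, path), where path names the tier that produced the answer."""
    # Fast path: the caller passed an explicit task, so nothing is inferred.
    if task:
        return task, "explicit"
    # Medium path: the first regex keyword match wins.
    for name, pattern in KEYWORD_PATTERNS.items():
        if pattern.search(prompt):
            return name, "keyword"
    # Slow path: defer to an LLM classifier (a stub supplied by the caller here).
    if llm_classify is not None:
        return llm_classify(prompt), "llm"
    # Assumed default when no classifier is available.
    return "text", "default"
```

Passing `task` explicitly is the cheapest option because it skips both the regex scan and the LLM call.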

02 Score

base = w_price×(1−price) + w_quality×quality + tier_bonus + intelligence + pref − health

Two-axis scoring: price and quality are blended according to the optimization setting (price-optimized, balanced, or output-optimized). The intelligence bonus matches the prompt's complexity tier to model quality and feedback history.
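The formula above translates directly into code. In the sketch below, the weight pairs in `WEIGHTS` are hypothetical (the page does not publish the real values), and `price` and `quality` are assumed to be normalized to the range 0 to 1.

```python
from dataclasses import dataclass

# Hypothetical (w_price, w_quality) pairs per optimization mode.
WEIGHTS = {
    "price-optimized": (0.7, 0.3),
    "balanced": (0.5, 0.5),
    "output-optimized": (0.3, 0.7),
}


@dataclass
class Candidate:
    model: str
    price: float                # normalized: 1.0 is the most expensive
    quality: float              # normalized: 1.0 is the best
    tier_bonus: float = 0.0
    intelligence: float = 0.0
    pref: float = 0.0
    health: float = 0.0         # penalty, subtracted from the score


def score(c: Candidate, optimization: str = "balanced") -> float:
    """base = w_price*(1-price) + w_quality*quality + bonuses - health."""
    w_price, w_quality = WEIGHTS[optimization]
    return (
        w_price * (1 - c.price)
        + w_quality * c.quality
        + c.tier_bonus
        + c.intelligence
        + c.pref
        - c.health
    )
```

With these assumed weights, a cheap mid-quality model outranks a premium one under price-optimized, and the order flips under output-optimized.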

03 Select

Example candidates: fl-2-dev, sd-3.5, dall-e-3, recraftv3

Apply the budget_dollars filter, deduplicate overlap groups, and pick the top-scoring model. The response returns the winner plus a full price comparison (cheapest / best-value / best-quality).
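A rough sketch of the select step follows. The `Scored` record, the `overlap_group` key, and the definition of best value as quality per dollar are assumptions for illustration; the page does not specify how overlap groups or best value are computed.

```python
from typing import Dict, List, NamedTuple, Optional


class Scored(NamedTuple):
    model: str
    score: float
    cost: float          # estimated USD for this request, assumed > 0
    quality: float
    overlap_group: str   # hypothetical key shared by variants of one model


def select(ranked: List[Scored], budget_dollars: Optional[float] = None) -> dict:
    # 1. Budget filter: drop anything above the hard ceiling.
    pool = [m for m in ranked if budget_dollars is None or m.cost <= budget_dollars]
    if not pool:
        raise ValueError("no model fits within budget_dollars")

    # 2. Deduplicate: keep only the highest-scoring model per overlap group.
    best: Dict[str, Scored] = {}
    for m in sorted(pool, key=lambda m: m.score, reverse=True):
        best.setdefault(m.overlap_group, m)
    finalists = list(best.values())  # insertion order is descending score

    # 3. Pick the top and report the price comparison alongside it.
    return {
        "selected": finalists[0].model,
        "price_comparison": {
            "cheapest": min(finalists, key=lambda m: m.cost).model,
            "best_value": max(finalists, key=lambda m: m.quality / m.cost).model,
            "best_quality": max(finalists, key=lambda m: m.quality).model,
        },
    }
```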

04 Proxy

Routes the request to the selected model through our inference network, using either the OpenAI-compatible /v1/chat/completions endpoint or native auto-route JSON. If the request fails, the router automatically retries with the next-ranked model.

Before & After

Without GreatRouter

1 Pick a model manually

2 Check if it supports your task

3 Compare pricing across providers

4 Manage separate API keys

5 Handle failures yourself

6 Track spend per provider

With GreatRouter

1 POST /v1/auto/route

2 Get result + model ID + cost

Scoring

How a model is selected

Example: "Generate a photorealistic image of a mountain lake" with optimization: "balanced".

#1 flux-2-dev

Price score 0.35

Quality score 0.80

Intelligence bonus +0.12

Total 0.63

$0.025/image · premium

#2 dall-e-3

Price score 0.20

Quality score 0.75

Provider boost +0.10

Total 0.58

$0.040/image · flagship

#5 stable-diffusion-3.5

Price score 0.45

Quality score 0.60

Health penalty −0.50

Total 0.28

$0.018/image · standard · degraded

Costs

Savings & Transparency

Every response includes a price_comparison object. You see what you paid, what you could have paid, and what the premium option costs.

Price comparison for this request

Cheapest google/gemini-2.5-flash $0.00015

Selected openai/gpt-5-mini $0.00120

Best quality anthropic/claude-4-sonnet $0.00800

Saved 85% vs. most expensive option
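The 85% figure follows from the two prices shown: the selected model costs $0.00120 against $0.00800 for the most expensive option. A small sketch of the arithmetic, where the dictionary layout and field names are assumptions rather than the documented shape of price_comparison:

```python
def savings_percent(selected_cost: float, reference_cost: float) -> float:
    """Percentage saved by paying selected_cost instead of reference_cost."""
    return (1 - selected_cost / reference_cost) * 100


# Values copied from the comparison above; key names are illustrative only.
price_comparison = {
    "cheapest": {"model": "google/gemini-2.5-flash", "cost": 0.00015},
    "selected": {"model": "openai/gpt-5-mini", "cost": 0.00120},
    "best_quality": {"model": "anthropic/claude-4-sonnet", "cost": 0.00800},
}

saved = savings_percent(
    price_comparison["selected"]["cost"],
    price_comparison["best_quality"]["cost"],
)
print(f"Saved {saved:.0f}% vs. most expensive option")
```

The same function shows the other side of the trade: the selected model costs eight times the cheapest one.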

budget_dollars

Hard ceiling in USD. Models above this are filtered out before scoring.

optimization

price-optimized favors economy/standard tiers. output-optimized favors premium/flagship. balanced blends both axes.

maxCost (tiers)

Cap by price tier: economy, standard, balanced, premium, or flagship.

/v1/auto/suggest

Preview rankings before spending. No inference, no cost.
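The two cost caps, maxCost and budget_dollars, can be pictured as filters over the candidate list. This sketch assumes the tiers are ordered as listed above, from economy up to flagship, and that each candidate carries a `tier` and an estimated `cost`. Both the field names and the helper functions are hypothetical.

```python
from typing import List, Optional

# Assumed ascending order, taken from the tier list on this page.
TIER_ORDER = ["economy", "standard", "balanced", "premium", "flagship"]


def allowed_by_tier(model_tier: str, max_cost: str) -> bool:
    """True if model_tier is at or below the max_cost tier."""
    return TIER_ORDER.index(model_tier) <= TIER_ORDER.index(max_cost)


def apply_cost_controls(
    models: List[dict],
    max_cost: Optional[str] = None,
    budget_dollars: Optional[float] = None,
) -> List[dict]:
    kept = []
    for m in models:
        # Tier cap: drop models priced above the requested tier.
        if max_cost is not None and not allowed_by_tier(m["tier"], max_cost):
            continue
        # Hard ceiling in USD.
        if budget_dollars is not None and m["cost"] > budget_dollars:
            continue
        kept.append(m)
    return kept
```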

Control

Fine-grained Control

Per-request parameters override organization defaults, which otherwise apply to every request.

Per-request

task — skip classification (text, image, video, …)

optimization — price-optimized | balanced | output-optimized

budget_dollars — max cost ceiling (USD)

maxCost — cap by tier (economy … flagship)

content_mode — generate | edit | combine

taxonomy — catalog category (translation, llm, …)

provider — narrow within taxonomy (google, meta)

catalog_family — pick best variant in a model line

model — explicit provider/model id

Organization

excluded_providers — never select these

excluded_models — skip specific IDs

preferred_providers — +0.1 score boost

default_optimization — org-wide default

default_task — fallback task type
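One way to picture how the organization settings interact with a request: exclusions remove candidates, preferred providers add 0.1 to the score, and a per-request optimization wins over the org default. The function names, the `(model_id, score)` tuple shape, and the final `"balanced"` fallback are assumptions for this sketch.

```python
from typing import List, Tuple


def resolve_optimization(request: dict, org: dict) -> str:
    """Per-request value first, then the org default, then an assumed fallback."""
    return request.get("optimization") or org.get("default_optimization") or "balanced"


def apply_org_policy(
    candidates: List[Tuple[str, float]], org: dict
) -> List[Tuple[str, float]]:
    """candidates are (model_id, score) pairs with 'provider/model' ids."""
    excluded_providers = set(org.get("excluded_providers", []))
    excluded_models = set(org.get("excluded_models", []))
    preferred = set(org.get("preferred_providers", []))

    result = []
    for model_id, score in candidates:
        provider = model_id.split("/", 1)[0]
        # Excluded providers and models are never selected.
        if provider in excluded_providers or model_id in excluded_models:
            continue
        # Preferred providers receive the +0.1 score boost.
        if provider in preferred:
            score += 0.1
        result.append((model_id, score))
    return sorted(result, key=lambda pair: pair[1], reverse=True)
```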

Reliability

Always get a result

Auto-fallback

If the top model fails during inference, the router immediately tries #2, #3, and so on. You always get a result or a clear error.
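The fallback behavior amounts to walking the ranked list until one call succeeds. A minimal sketch, in which `run_with_fallback`, `AllModelsFailed`, and the `infer` callback are hypothetical names standing in for the router's internals:

```python
from typing import Callable, Sequence, Tuple


class AllModelsFailed(Exception):
    """Raised when every ranked model failed; the message lists each error."""


def run_with_fallback(
    ranked: Sequence[str], infer: Callable[[str], str]
) -> Tuple[str, str]:
    """Try models in rank order; return (model, result) from the first success."""
    errors = []
    for model in ranked:
        try:
            return model, infer(model)
        except Exception as exc:  # any inference failure moves on to the next model
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))


def fake_infer(model: str) -> str:
    """Stand-in for a real inference call: the top-ranked model times out."""
    if model == "flux-2-dev":
        raise RuntimeError("provider timeout")
    return f"result from {model}"


model, result = run_with_fallback(["flux-2-dev", "dall-e-3"], fake_infer)
print(model, "->", result)
```

If every candidate fails, the caller gets a single exception that names each model and its error, which matches the "result or a clear error" guarantee.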

Health-aware routing

Three consecutive failures or a >50% failure rate marks a model degraded. The router applies a −0.5 score penalty until the model recovers. No manual blacklist needed.
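The degraded rule can be sketched as a small tracker per model. The rolling window of 20 calls and the minimum of 4 samples before the failure rate counts are assumptions added here so that a single early failure does not trip the 50% rule. The page does not state how the rate is measured.

```python
from collections import deque


class ModelHealth:
    """Tracks recent outcomes for one model and derives the score penalty."""

    def __init__(self, window: int = 20, min_samples: int = 4) -> None:
        self.recent = deque(maxlen=window)   # True = success, False = failure
        self.min_samples = min_samples       # assumed guard for the rate rule
        self.consecutive_failures = 0

    def record(self, success: bool) -> None:
        self.recent.append(success)
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1

    @property
    def degraded(self) -> bool:
        # Rule 1: three consecutive failures.
        if self.consecutive_failures >= 3:
            return True
        # Rule 2: more than 50% failures over the recent window.
        if len(self.recent) < self.min_samples:
            return False
        return self.recent.count(False) / len(self.recent) > 0.5

    @property
    def penalty(self) -> float:
        # Subtracted from the model's score while it is degraded.
        return 0.5 if self.degraded else 0.0
```

Because the penalty is derived from recent outcomes, a model recovers on its own once successes push the failure rate back under the threshold.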

Session tracking

Pass session_id to get contextual boosts. Repeated tasks get higher confidence. Blind spots are flagged for review.

Try it in one call

Free credits with every account. No setup, no provider keys, no guesswork.

curl https://api.greatrouterai.com/v1/auto/route \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"A photorealistic cat","task":"image","optimization":"balanced"}'
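The same call can be made from Python with only the standard library. This sketch builds the request without sending it. The `build_route_request` helper is hypothetical, and the `Content-Type: application/json` header is an assumption based on the JSON body; the response shape is not shown because this page does not document it.

```python
import json
import urllib.request

ROUTE_URL = "https://api.greatrouterai.com/v1/auto/route"


def build_route_request(api_key: str, prompt: str, **params) -> urllib.request.Request:
    """Build a POST to /v1/auto/route with the prompt and any routing parameters."""
    body = {"prompt": prompt, **params}
    return urllib.request.Request(
        ROUTE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumed, since the body is JSON
        },
        method="POST",
    )


req = build_route_request(
    "YOUR_KEY", "A photorealistic cat", task="image", optimization="balanced"
)

# To send it (requires a valid key and network access):
# with urllib.request.urlopen(req, timeout=30) as resp:
#     result = json.load(resp)
```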
