Reading guide: Pipeline → Score → Costs → Control → Reliability. Each section explains one stage of automatic routing.
One API call.
The right model.
Every time.
Send any prompt — text, images, videos, audio, or music — to a single endpoint. GreatRouter detects what you are trying to do, finds the best model from 1,516 options, and adds under 5ms of routing overhead on most requests.
Four steps to the perfect model
Fast path: pass task directly (<1ms).
Medium path: regex keyword match on your prompt (~2ms).
Slow path: Gemma 4 26B for ambiguous inputs (~80ms, fallback to Llama 3.2 3B).
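The three classification paths can be pictured as a simple cascade. The sketch below is illustrative only: the keyword patterns, the `classify` function, and the `llm_classifier` callback are hypothetical stand-ins, not the router's actual rules or model calls.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical keyword patterns; the real router's rules are not documented here.
KEYWORD_PATTERNS = {
    "image": re.compile(r"\b(image|photo|photorealistic|illustration|logo)\b", re.I),
    "video": re.compile(r"\b(video|clip|animation)\b", re.I),
    "audio": re.compile(r"\b(audio|speech|voice)\b", re.I),
    "music": re.compile(r"\b(music|song|melody|soundtrack)\b", re.I),
}


def classify(
    prompt: str,
    task: Optional[str] = None,
    llm_classifier: Optional[Callable[[str], str]] = None,
) -> Tuple[str, str]:
    """Return (task, path_used) for a prompt."""
    # Fast path: the caller already told us the task.
    if task is not None:
        return task, "fast"

    # Medium path: accept a keyword match only when exactly one task matches.
    matches = [name for name, pat in KEYWORD_PATTERNS.items() if pat.search(prompt)]
    if len(matches) == 1:
        return matches[0], "medium"

    # Slow path: defer ambiguous prompts to a classifier model (stubbed here).
    if llm_classifier is not None:
        return llm_classifier(prompt), "slow"
    return "text", "slow"  # assumed default when no classifier is supplied
```

Passing `task` explicitly is the cheapest option, which is why the fast path is checked first.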
+ tier_bonus + intelligence + pref − health
Two-axis scoring: price and quality are blended according to optimization (price-optimized, balanced, or output-optimized).
Intelligence bonus matches complexity tier to model quality and feedback history.
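As a rough illustration of two-axis scoring, the sketch below blends a price score and a quality score, then applies the additive terms named above (tier_bonus, intelligence, pref, health). The blend weights, the 0 to 1 score ranges, and the `Candidate` type are assumptions made for the example; the router's real weights are not given here.

```python
from dataclasses import dataclass

# Hypothetical (price_weight, quality_weight) pairs for each optimization mode.
BLEND = {
    "price-optimized": (0.8, 0.2),
    "balanced": (0.5, 0.5),
    "output-optimized": (0.2, 0.8),
}


@dataclass
class Candidate:
    model_id: str
    price_score: float    # assumed 0..1, higher means cheaper
    quality_score: float  # assumed 0..1, higher means better output
    tier_bonus: float = 0.0
    intelligence: float = 0.0
    pref: float = 0.0     # e.g. a preferred-provider boost
    health: float = 0.0   # penalty, subtracted from the total


def score(c: Candidate, optimization: str = "balanced") -> float:
    """Blend the two axes, add the bonuses, subtract the health penalty."""
    price_w, quality_w = BLEND[optimization]
    base = price_w * c.price_score + quality_w * c.quality_score
    return base + c.tier_bonus + c.intelligence + c.pref - c.health
```

With these made-up weights, a cheap mid-quality model outranks a premium one under price-optimized, and the order flips under output-optimized.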
Apply budget_dollars filter, deduplicate overlap groups, pick top.
Return the winner + full price comparison (cheapest / best-value / best-quality).
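A minimal sketch of this selection step follows. It assumes each candidate is a dict with `score`, `est_cost`, `quality`, and `overlap_group` fields, that an overlap group means several listings of the same underlying model, and that "best value" means quality per dollar. Those field names and definitions, and the keys of the returned comparison, are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def select(
    candidates: List[dict], budget_dollars: Optional[float] = None
) -> Tuple[str, Dict[str, str]]:
    """Filter by budget, keep the best entry per overlap group, pick the top."""
    pool = [
        c for c in candidates
        if budget_dollars is None or c["est_cost"] <= budget_dollars
    ]
    if not pool:
        raise ValueError("no model fits within budget_dollars")

    # Walk candidates from highest to lowest score; the first one seen in
    # each overlap group is that group's best, so later duplicates are dropped.
    best_per_group: Dict[str, dict] = {}
    for c in sorted(pool, key=lambda c: c["score"], reverse=True):
        best_per_group.setdefault(c["overlap_group"], c)
    survivors = list(best_per_group.values())  # still ordered by score

    winner = survivors[0]
    comparison = {
        "cheapest": min(survivors, key=lambda c: c["est_cost"])["model_id"],
        "best_value": max(
            survivors, key=lambda c: c["quality"] / max(c["est_cost"], 1e-9)
        )["model_id"],
        "best_quality": max(survivors, key=lambda c: c["quality"])["model_id"],
    }
    return winner["model_id"], comparison
```

Because the budget filter does not depend on the score, applying it before or after scoring gives the same winner.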
Routes the request to the selected model through our inference network.
OpenAI-compatible /v1/chat/completions or native auto-route JSON.
If the call fails, the router automatically retries with the next-ranked model.
How a model is selected
Example: "Generate a photorealistic image of a mountain lake" with optimization: "balanced".
Savings & Transparency
Every response includes a price_comparison object. You see what you paid, what you could have paid, and what the premium option costs.
budget_dollars: Hard ceiling in USD. Models above this are filtered out before scoring.
optimization: price-optimized favors economy/standard tiers. output-optimized favors premium/flagship. balanced blends both axes.
maxCost (tiers): Cap by price tier (economy, standard, balanced, premium, or flagship).
/v1/auto/suggest: Preview rankings before spending. No inference, no cost.
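A preview call might be assembled as in the sketch below. It assumes /v1/auto/suggest lives on the same host as the route endpoint, accepts a POST with the same JSON fields (prompt, optimization, budget_dollars), and uses the same bearer token; none of that is confirmed here, and the helper name `build_suggest_request` is made up for the example.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.greatrouterai.com"  # host taken from the curl example


def build_suggest_request(
    api_key: str,
    prompt: str,
    optimization: str = "balanced",
    budget_dollars: Optional[float] = None,
) -> urllib.request.Request:
    """Build (but do not send) a preview request for model rankings."""
    body = {"prompt": prompt, "optimization": optimization}
    if budget_dollars is not None:
        body["budget_dollars"] = budget_dollars  # hard ceiling in USD
    return urllib.request.Request(
        f"{BASE_URL}/v1/auto/suggest",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",  # assumed; the HTTP method is not stated on this page
    )
```

Sending it is a matter of passing the result to `urllib.request.urlopen`; the response shape is not specified here, so the sketch stops at building the request.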
Fine-grained Control
Per-request parameters override org defaults. Org defaults apply to every request unless you say otherwise.
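The override rule amounts to a dictionary merge in which request values win. The sketch below assumes org defaults are stored under the same keys as request parameters and that an omitted or `None` value means "use the org default"; both are simplifications for illustration.

```python
from typing import Any, Dict


def effective_params(
    org_defaults: Dict[str, Any], request_params: Dict[str, Any]
) -> Dict[str, Any]:
    """Start from org defaults, then let explicit request values override them."""
    explicit = {k: v for k, v in request_params.items() if v is not None}
    return {**org_defaults, **explicit}
```

For example, an org default of `optimization: "price-optimized"` still applies to a request that only sets `maxCost`, but not to one that sets `optimization` itself.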
task: skip classification (text, image, video, …)
optimization: price-optimized | balanced | output-optimized
budget_dollars: max cost ceiling (USD)
maxCost: cap by tier (economy … flagship)
content_mode: generate | edit | combine
taxonomy: catalog category (translation, llm, …)
provider: narrow within taxonomy (google, meta)
catalog_family: pick best variant in a model line
model: explicit provider/model id
excluded_providers: never select these
excluded_models: skip specific IDs
preferred_providers: +0.1 score boost
default_optimization: org-wide default
default_task: fallback task type
Always get a result
Auto-fallback
If the top model fails during inference, the router immediately tries #2, #3, and so on. You always get a result or a clear error.
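The fallback behavior can be sketched as a loop over the ranked list. Here `call_model` is a hypothetical callback standing in for the actual inference call, and any exception is treated as a failure; the router's real failure criteria may be narrower.

```python
from typing import Any, Callable, List, Tuple


def run_with_fallback(
    ranked_models: List[str], call_model: Callable[[str], Any]
) -> Tuple[str, Any]:
    """Try models in rank order; return the first success or raise a clear error."""
    errors = []
    for model_id in ranked_models:
        try:
            return model_id, call_model(model_id)
        except Exception as exc:  # assumed: any exception counts as a failure
            errors.append((model_id, str(exc)))
    raise RuntimeError(f"all {len(errors)} candidates failed: {errors}")
```

The caller either gets a result along with the model that produced it, or a single error listing every attempt.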
Health-aware routing
Three consecutive failures or a >50% failure rate marks a model degraded. The router applies a −0.5 score penalty until the model recovers. No manual blacklist needed.
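One way to picture the degraded check is a small tracker per model. The thresholds (three consecutive failures, a failure rate above 50%, a 0.5 penalty) come from the text above; the rolling window of 20 calls, the minimum of 4 samples before the rate rule applies, and the notion that a model recovers once neither rule holds are assumptions for this sketch.

```python
from collections import deque


class ModelHealth:
    """Tracks recent outcomes for one model and derives a score penalty."""

    def __init__(self, window: int = 20, min_samples: int = 4) -> None:
        self.recent = deque(maxlen=window)  # True = success, False = failure
        self.min_samples = min_samples      # assumed guard against tiny samples
        self.consecutive_failures = 0

    def record(self, success: bool) -> None:
        self.recent.append(success)
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1

    @property
    def degraded(self) -> bool:
        if self.consecutive_failures >= 3:
            return True
        if len(self.recent) < self.min_samples:
            return False
        failure_rate = self.recent.count(False) / len(self.recent)
        return failure_rate > 0.5

    @property
    def penalty(self) -> float:
        # Subtracted from the model's score while it is degraded.
        return 0.5 if self.degraded else 0.0
```

Since the penalty lowers the score rather than removing the model, a degraded model can still be chosen when nothing else qualifies.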
Session tracking
Pass session_id to get contextual boosts. Repeated tasks get higher confidence. Blind spots are flagged for review.
Try it in one call
Free credits with every account. No setup, no provider keys, no guesswork.
curl https://api.greatrouterai.com/v1/auto/route \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"A photorealistic cat","task":"image","optimization":"balanced"}'