Model Selection and Smart Routing

Wayland Core handles model selection at two distinct layers: static aliases and per-provider defaults set at startup, and per-request family detection that adapts the wire format to the model being called.

Shared model alias constants live in wcore-types::model_aliases. Named constants such as ANTHROPIC_SONNET, OPENAI_GPT4O, BEDROCK_SONNET, VERTEX_SONNET, and VERTEX_GEMINI_PRO are used throughout the engine so the codebase does not scatter raw model ID strings.

At the config level, each [providers.<name>] block accepts a model key, and each [profiles.<name>] block can bundle a provider and model together:

[default]
provider = "anthropic"
model = "claude-sonnet-4-6"
[profiles.fast]
provider = "groq"
model = "llama-3.3-70b-versatile"
[profiles.reasoning]
provider = "openai"
model = "o3-mini"

Activate a profile with --profile <name>, or set profile = "fast" under [default] in the config file.

wcore-providers/src/openai_compat.rs detects the OpenAI model family on every request, not at provider construction time. A single OpenAIProvider instance can therefore serve both classic gpt-4o and reasoning o3-mini turns correctly within the same session.

Three predicates cover the divergent wire shapes:

| Predicate | Families matched | Effect |
| --- | --- | --- |
| wants_max_completion_tokens(model) | o1*, o3*, gpt-5* | Sends max_completion_tokens instead of max_tokens. Sending max_tokens to these families returns a 400. |
| accepts_reasoning_effort(model) | o1*, o3*, gpt-5* | Includes reasoning_effort (low / medium / high) in the request body when set. |
| accepts_temperature(model) | All except o1*, o3* | Omits temperature for the o1/o3 series; gpt-5* still honors an explicit value. |

All three predicates do case-insensitive prefix matching on the model string. The o-series check requires the character immediately after o to be an ASCII digit, so unrelated model names such as octo-7b or ollama-llama3 are not matched.

The gpt-5* predicate matches any model whose lowercased name starts with "gpt-5".
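The matching rules above can be sketched as standalone functions. This is an illustrative reconstruction, not the code in openai_compat.rs: the helpers is_o_series, is_gpt5, and token_limit_field are hypothetical names, and the sketch assumes the o-series check accepts any ASCII digit after the leading "o" (so it would also match names such as o4-mini).

```rust
/// Hypothetical helper: true for names that start with "o" or "O" followed
/// immediately by an ASCII digit (o1, o3-mini, ...). Names such as
/// "octo-7b" or "ollama-llama3" fail the digit check.
fn is_o_series(model: &str) -> bool {
    let mut chars = model.chars();
    matches!(chars.next(), Some('o' | 'O'))
        && matches!(chars.next(), Some(c) if c.is_ascii_digit())
}

/// Hypothetical helper: case-insensitive "gpt-5" prefix match.
fn is_gpt5(model: &str) -> bool {
    model.to_ascii_lowercase().starts_with("gpt-5")
}

/// o1*, o3*, gpt-5*: the token limit goes in max_completion_tokens.
fn wants_max_completion_tokens(model: &str) -> bool {
    is_o_series(model) || is_gpt5(model)
}

/// o1*, o3*, gpt-5*: reasoning_effort may be included when set.
fn accepts_reasoning_effort(model: &str) -> bool {
    is_o_series(model) || is_gpt5(model)
}

/// Everything except the o-series accepts an explicit temperature.
fn accepts_temperature(model: &str) -> bool {
    !is_o_series(model)
}

/// Hypothetical helper: picks the JSON field name for the token limit.
fn token_limit_field(model: &str) -> &'static str {
    if wants_max_completion_tokens(model) {
        "max_completion_tokens"
    } else {
        "max_tokens"
    }
}
```

Because the predicates take only the model string, calling them per request is cheap, which is what lets one provider instance serve gpt-4o and o3-mini turns in the same session.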

wcore-providers/src/routing.rs computes a RoutingDecision from a RequestShape before each turn. The decision is stamped onto LlmRequest::routing_hint as a string label in the form "{tier}:{reason}" (for example, "premium:large_context"). This hint is surfaced in tracing spans when the ProviderChain dispatches the request.

The tier-to-model mapping is the caller’s responsibility. The router is provider-agnostic: it produces a label, it does not select a provider or model.

Rules are evaluated in order. The first match wins.

| Tier | Condition | Reason label |
| --- | --- | --- |
| Premium | requires_vision = true | requires_vision |
| Premium | input_tokens > 8000 | large_context |
| Premium | tool_call_count >= 3 | tool_heavy |
| Balanced | code_ratio >= 0.30 | code_heavy |
| Cheap | all other requests | simple |

Thresholds are held in RoutingHeuristics::default() and can be overridden in code:

let heuristics = RoutingHeuristics {
    code_threshold: 0.30,
    large_context_tokens: 8000,
    tool_heavy_calls: 3,
};
let decision = route(&shape, &heuristics);
// decision.to_hint() returns a RoutingHint wrapping e.g. "premium:tool_heavy"
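To make the first-match-wins ordering concrete, here is a minimal self-contained sketch of the rule table. It mirrors the names used above (RequestShape, RoutingHeuristics, RoutingDecision, route), but the field types, the Tier enum, and the label method are assumptions, and to_hint returns a plain String here rather than the RoutingHint wrapper.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    Premium,
    Balanced,
    Cheap,
}

impl Tier {
    /// Lowercase label used as the first half of the hint.
    fn label(self) -> &'static str {
        match self {
            Tier::Premium => "premium",
            Tier::Balanced => "balanced",
            Tier::Cheap => "cheap",
        }
    }
}

// Field types are assumptions made for this sketch.
struct RequestShape {
    requires_vision: bool,
    input_tokens: usize,
    tool_call_count: usize,
    code_ratio: f32,
}

struct RoutingHeuristics {
    code_threshold: f32,
    large_context_tokens: usize,
    tool_heavy_calls: usize,
}

impl Default for RoutingHeuristics {
    fn default() -> Self {
        Self { code_threshold: 0.30, large_context_tokens: 8000, tool_heavy_calls: 3 }
    }
}

struct RoutingDecision {
    tier: Tier,
    reason: &'static str,
}

impl RoutingDecision {
    /// Formats the decision as "{tier}:{reason}".
    fn to_hint(&self) -> String {
        format!("{}:{}", self.tier.label(), self.reason)
    }
}

/// Evaluates the rules top to bottom; the first matching rule wins.
fn route(shape: &RequestShape, h: &RoutingHeuristics) -> RoutingDecision {
    let (tier, reason) = if shape.requires_vision {
        (Tier::Premium, "requires_vision")
    } else if shape.input_tokens > h.large_context_tokens {
        (Tier::Premium, "large_context")
    } else if shape.tool_call_count >= h.tool_heavy_calls {
        (Tier::Premium, "tool_heavy")
    } else if shape.code_ratio >= h.code_threshold {
        (Tier::Balanced, "code_heavy")
    } else {
        (Tier::Cheap, "simple")
    };
    RoutingDecision { tier, reason }
}
```

Note that ordering matters: a request with 9000 input tokens and five tool calls is labeled large_context, not tool_heavy, because the context rule is checked first.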

The hint is a RoutingHint(String) on LlmRequest. The ProviderChain reads it for tracing:

routing_hint = "premium:large_context"
chain_len = 2

No dispatch logic changes based on the hint today. Tier-based model selection (routing the hint to a specific model or provider slot) is the caller’s job.
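Since mapping a tier to a model is left to the caller, one possible shape for that code is sketched below. Both function names (parse_hint, target_for_hint) are hypothetical, and the tier-to-model table is only an example that reuses model IDs appearing elsewhere on this page; nothing here is part of the engine.

```rust
/// Splits a "{tier}:{reason}" label into its two parts.
/// Returns None when the label contains no ':' separator.
fn parse_hint(hint: &str) -> Option<(&str, &str)> {
    hint.split_once(':')
}

/// Hypothetical caller-side table mapping a tier to (provider, model).
/// Unknown or malformed hints fall back to the default pair.
fn target_for_hint(hint: &str) -> (&'static str, &'static str) {
    const FALLBACK: (&str, &str) = ("anthropic", "claude-sonnet-4-6");
    match parse_hint(hint).map(|(tier, _reason)| tier) {
        Some("premium") => ("anthropic", "claude-opus-4-7"),
        Some("balanced") => ("anthropic", "claude-sonnet-4-6"),
        Some("cheap") => ("groq", "llama-3.3-70b-versatile"),
        _ => FALLBACK,
    }
}
```

Keeping the table in caller code means the router stays provider-agnostic, and a deployment can change which model backs each tier without touching the routing rules.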

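A minimal setup that defaults to claude-sonnet-4-6 and defines two opt-in profiles: reasoning, which targets o3-mini, and vision, which targets claude-opus-4-7. Profiles are selected explicitly; the routing hint does not switch between them.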

[default]
provider = "anthropic"
model = "claude-sonnet-4-6"
[profiles.reasoning]
provider = "openai"
model = "o3-mini"
# o3-mini: engine auto-sends max_completion_tokens + accepts reasoning_effort
[profiles.vision]
provider = "anthropic"
model = "claude-opus-4-7"
# for turns where requires_vision = true → Premium tier hint

Switch profiles at runtime:

wayland-core --profile reasoning "Explain the proof of Fermat's Last Theorem"
wayland-core --provider openai --model o3-mini "..."