Budget & Cost Caps
Wayland Core tracks spending at two levels and enforces limits before a charge lands, not after. All caps are opt-in: nothing is blocked by default, but every cap configured will prevent overruns atomically.
Source crates: wcore-budget, wcore-pricing.
Two enforcement models
1. ExecutionBudget - global session caps ([budget])
ExecutionBudget is an Arc<RwLock<...>> tree threaded through every tool call via ToolContext.budget. It tracks seven axes and calls is_exceeded() before launching any long work.

    # ~/.wayland-core/config.toml (or project .wayland-core.toml)
    [budget]
    max_wall_time_secs = 600      # wall clock since session start
    max_tool_runtime_secs = 120   # cumulative tool execution time
    max_processes = 8             # concurrent subprocesses at any instant
    max_agent_depth = 4           # sub-agent nesting depth
    max_tokens_in = 200000        # total input tokens
    max_tokens_out = 16384        # total output tokens
    max_cost_usd = 1.50           # total USD cost

All seven fields are optional. Omitting a field means no cap on that axis.
Check order when is_exceeded() is called:
max_wall_time → max_tool_runtime → max_processes → max_agent_depth → max_tokens_in → max_tokens_out → max_cost_usd.
The first exceeded axis is returned as the reason string; the others are not evaluated.
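The check order above can be pictured as a chain of early returns. The following is a minimal sketch, not the wcore-budget implementation: the `Caps` and `Usage` structs, the `over` helper, and the choice of a strict `>` comparison are all assumptions made for illustration, and the returned labels simply reuse the axis names from the list above.

```rust
// Illustrative stand-ins; these are NOT the wcore-budget types.
#[derive(Default)]
struct Caps {
    max_wall_time_secs: Option<u64>,
    max_tool_runtime_secs: Option<u64>,
    max_processes: Option<usize>,
    max_agent_depth: Option<usize>,
    max_tokens_in: Option<u64>,
    max_tokens_out: Option<u64>,
    max_cost_usd: Option<f64>,
}

#[derive(Default)]
struct Usage {
    wall_time_secs: u64,
    tool_runtime_secs: u64,
    processes_active: usize,
    agent_depth: usize,
    tokens_in: u64,
    tokens_out: u64,
    cost_usd: f64,
}

// True only when a cap is configured and usage is strictly above it
// (the strict comparison is an assumption).
fn over<T: PartialOrd>(cap: Option<T>, used: T) -> bool {
    matches!(cap, Some(limit) if used > limit)
}

// Returns the first exceeded axis in the documented order; later axes
// are never evaluated once one has matched.
fn first_exceeded(caps: &Caps, used: &Usage) -> Option<&'static str> {
    if over(caps.max_wall_time_secs, used.wall_time_secs) {
        return Some("max_wall_time");
    }
    if over(caps.max_tool_runtime_secs, used.tool_runtime_secs) {
        return Some("max_tool_runtime");
    }
    if over(caps.max_processes, used.processes_active) {
        return Some("max_processes");
    }
    if over(caps.max_agent_depth, used.agent_depth) {
        return Some("max_agent_depth");
    }
    if over(caps.max_tokens_in, used.tokens_in) {
        return Some("max_tokens_in");
    }
    if over(caps.max_tokens_out, used.tokens_out) {
        return Some("max_tokens_out");
    }
    if over(caps.max_cost_usd, used.cost_usd) {
        return Some("max_cost_usd");
    }
    None
}
```

With both `max_processes` and `max_cost_usd` exceeded, this sketch reports only `max_processes`, because that axis comes earlier in the order.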
Tree rollup. When a sub-agent spawns, it gets a sub_budget() child view. Counters recorded on the child roll up to all ancestors, so the root view always reflects the full session total. A stricter cap on a child does not relax the parent.
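To make the rollup concrete, here is a minimal sketch of a parent-linked counter tree. It assumes a single `tokens_in` axis and uses atomics rather than the `Arc<RwLock<...>>` layout described above; the `Node` type and its methods are hypothetical and not part of wcore-budget. Checking every ancestor's cap in `is_exceeded` is also an assumption about how "a stricter child does not relax the parent" could be realised.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Hypothetical node type for illustration only.
struct Node {
    tokens_in: AtomicU64,
    max_tokens_in: Option<u64>,
    parent: Option<Arc<Node>>,
}

impl Node {
    fn root(max_tokens_in: Option<u64>) -> Arc<Node> {
        Arc::new(Node { tokens_in: AtomicU64::new(0), max_tokens_in, parent: None })
    }

    // A child view keeps its own counter and cap plus a link to its parent.
    fn child(parent: &Arc<Node>, max_tokens_in: Option<u64>) -> Arc<Node> {
        Arc::new(Node {
            tokens_in: AtomicU64::new(0),
            max_tokens_in,
            parent: Some(Arc::clone(parent)),
        })
    }

    // Record usage on this node and on every ancestor up to the root.
    fn record_tokens_in(&self, n: u64) {
        let mut node = Some(self);
        while let Some(cur) = node {
            cur.tokens_in.fetch_add(n, Ordering::Relaxed);
            node = cur.parent.as_deref();
        }
    }

    // Exceeded if this node or any ancestor is over its own cap.
    fn is_exceeded(&self) -> bool {
        let mut node = Some(self);
        while let Some(cur) = node {
            if let Some(limit) = cur.max_tokens_in {
                if cur.tokens_in.load(Ordering::Relaxed) > limit {
                    return true;
                }
            }
            node = cur.parent.as_deref();
        }
        false
    }
}
```

In this sketch a child capped at 10 tokens trips after recording 20, while a root capped at 100 stays within budget but still sees the 20 tokens in its own total.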
RAII guards. Tools that fork subprocesses call enter_tool_run(), which increments processes_active and decrements it on drop via ToolRunGuard. Sub-agent spawns use enter_agent() / AgentDepthGuard the same way.
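The guard pattern described here is ordinary Rust RAII. A minimal sketch of the shape follows, assuming a shared atomic counter; `RunGuard` and `enter_run` are hypothetical names standing in for ToolRunGuard and enter_tool_run(), whose real signatures are not shown in this page.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

// Hypothetical guard: holds a handle to the shared "active" counter.
struct RunGuard {
    active: Arc<AtomicUsize>,
}

// Increment on entry and hand back a guard that owns the decrement.
fn enter_run(active: &Arc<AtomicUsize>) -> RunGuard {
    active.fetch_add(1, Ordering::SeqCst);
    RunGuard { active: Arc::clone(active) }
}

impl Drop for RunGuard {
    // Runs on every exit path, including early returns and panics that unwind.
    fn drop(&mut self) {
        self.active.fetch_sub(1, Ordering::SeqCst);
    }
}
```

Because the decrement lives in `Drop`, a tool that returns early with `?` cannot leak a slot in the concurrent-process count.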
2. BudgetTracker - per-session and per-user caps ([session_cap])
BudgetTracker is the M5.3 model. It is keyed by session id and (optionally) user id, and enforces three axes:
| Axis | Builder call | Behaviour |
|---|---|---|
| per_session_tokens | .per_session_tokens(n) | Blocks a charge that would push this session’s token total past n. |
| per_session_usd | .per_session_usd(d) | Blocks a charge that would push this session’s USD total past d. |
| per_user_daily_usd | .per_user_daily_usd(d) | Blocks a charge that would push this user’s UTC-day USD total past d. Resets at midnight UTC. |
Build a cap:

    let cap = BudgetCap::builder()
        .per_session_usd(0.10)
        .per_user_daily_usd(1.00)
        .build();
    let mut tracker = BudgetTracker::new(cap);

For a single-tenant run, call tracker.charge(session_id, tokens, usd).
For a multi-tenant run where one user maps to multiple sessions in a day:

    tracker.charge_for_user(session_id, user_id, tokens, usd)?;

charge_for_user checks the per-user daily cap first, then calls charge for the per-session cap. A rejected charge leaves both buckets at their prior totals - the rejected amount does not “stick”.
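One way to get the "rejected amount does not stick" behaviour is to validate every cap before writing to any bucket. The sketch below assumes that approach; `TwoBucket` and `Rejected` are hypothetical types, only the USD axis is modelled, and the midnight-UTC reset is left out. It is not the BudgetTracker implementation.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Rejected {
    PerUserDailyUsd,
    PerSessionUsd,
}

// Hypothetical tracker with one USD bucket per session and one per user.
#[derive(Default)]
struct TwoBucket {
    per_session_usd: Option<f64>,
    per_user_daily_usd: Option<f64>,
    session_usd: HashMap<String, f64>,
    user_daily_usd: HashMap<String, f64>,
}

impl TwoBucket {
    fn charge_for_user(&mut self, session: &str, user: &str, usd: f64) -> Result<(), Rejected> {
        // Compute the totals that WOULD result, without storing them yet.
        let new_user = self.user_daily_usd.get(user).copied().unwrap_or(0.0) + usd;
        let new_session = self.session_usd.get(session).copied().unwrap_or(0.0) + usd;

        // Per-user daily cap is checked first, then the per-session cap.
        if matches!(self.per_user_daily_usd, Some(cap) if new_user > cap) {
            return Err(Rejected::PerUserDailyUsd);
        }
        if matches!(self.per_session_usd, Some(cap) if new_session > cap) {
            return Err(Rejected::PerSessionUsd);
        }

        // Both checks passed: commit to both buckets together.
        self.user_daily_usd.insert(user.to_string(), new_user);
        self.session_usd.insert(session.to_string(), new_session);
        Ok(())
    }
}
```

Since nothing is written until both checks pass, a charge that clears the user cap but fails the session cap leaves the user bucket untouched as well.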
Reading totals:

    let (tokens, usd) = tracker.session_totals(session_id);
    let daily_usd = tracker.user_daily_usd(user_id);

Translating [budget] TOML into a BudgetCap
BudgetCap implements From<&BudgetConfig> so you can translate the TOML block directly:

    let cap = BudgetCap::from(&budget_config);

The mapping:
- max_tokens_in + max_tokens_out (either or both) → per_session_tokens (sum).
- max_cost_usd → per_session_usd.
- max_wall_time_secs, max_tool_runtime_secs, max_processes, max_agent_depth belong to ExecutionBudget only; no counterpart in BudgetCap.
- per_user_daily_usd has no TOML counterpart. Set it directly via the builder for multi-tenant deployments.
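The mapping rules above can be sketched as a `From` impl. The two structs below are simplified stand-ins with hypothetical names (`BudgetConfigSketch`, `BudgetCapSketch`), not the real BudgetConfig and BudgetCap; using `saturating_add` for the token sum is an assumption.

```rust
// Stand-in for the [budget] TOML block; the four execution-only fields are omitted.
#[derive(Default)]
struct BudgetConfigSketch {
    max_tokens_in: Option<u64>,
    max_tokens_out: Option<u64>,
    max_cost_usd: Option<f64>,
}

// Stand-in for the per-session / per-user cap.
#[derive(Debug, Default, PartialEq)]
struct BudgetCapSketch {
    per_session_tokens: Option<u64>,
    per_session_usd: Option<f64>,
    per_user_daily_usd: Option<f64>,
}

impl From<&BudgetConfigSketch> for BudgetCapSketch {
    fn from(cfg: &BudgetConfigSketch) -> Self {
        // Either or both token caps present: sum whatever is set.
        // Neither present: no token cap at all.
        let per_session_tokens = match (cfg.max_tokens_in, cfg.max_tokens_out) {
            (None, None) => None,
            (a, b) => Some(a.unwrap_or(0).saturating_add(b.unwrap_or(0))),
        };
        BudgetCapSketch {
            per_session_tokens,
            per_session_usd: cfg.max_cost_usd,
            // No TOML counterpart; stays unset unless configured elsewhere.
            per_user_daily_usd: None,
        }
    }
}
```

With the example values from the [budget] block (200000 input, 16384 output, 1.50 USD), this sketch yields a token cap of 216384 and a USD cap of 1.50.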
Events
Every charge attempt emits a BudgetEvent to an attached BudgetEventSink. Three variants:

    pub enum BudgetEvent {
        Charge { session_id, tokens, usd },            // accepted
        CapWarn { session_id, pct_used: f32 },         // accepted, but >= 80% of strictest cap
        CapBlock { session_id, reason: BudgetError },  // rejected
    }

CapWarn fires when the running total for a session crosses 80% of the strictest configured per-session cap (tokens or USD, whichever is closer to its limit). It fires on every accepted charge after that threshold, not just the first.
CapBlock carries a BudgetError::CapExceeded with the cap kind (per_session_tokens, per_session_usd, or per_user_daily_usd), the configured limit, and the observed total that would have resulted.
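The "strictest cap" rule for CapWarn amounts to taking the larger of the two usage ratios. A minimal sketch, assuming usage is expressed as a fraction between 0.0 and 1.0 (the actual scale and rounding of `pct_used` are not specified here) and that zero-valued caps are skipped; both function names are hypothetical:

```rust
// Fraction of the strictest per-session cap that has been used,
// or None when neither cap is configured.
fn strictest_fraction(
    tokens: u64,
    usd: f64,
    cap_tokens: Option<u64>,
    cap_usd: Option<f64>,
) -> Option<f64> {
    // Ignore zero caps to avoid dividing by zero (an assumption of this sketch).
    let by_tokens = cap_tokens.filter(|&c| c > 0).map(|c| tokens as f64 / c as f64);
    let by_usd = cap_usd.filter(|&c| c > 0.0).map(|c| usd / c);
    match (by_tokens, by_usd) {
        // Both configured: the axis closer to its limit wins.
        (Some(a), Some(b)) => Some(a.max(b)),
        (a, b) => a.or(b),
    }
}

// Warn at or above the 80% threshold described above.
fn should_warn(tokens: u64, usd: f64, cap_tokens: Option<u64>, cap_usd: Option<f64>) -> bool {
    strictest_fraction(tokens, usd, cap_tokens, cap_usd).map_or(false, |f| f >= 0.8)
}
```

A session at 50% of its token cap and 90% of its USD cap would warn on the USD axis in this sketch.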
Wiring the event sink
In production the engine connects ObservabilityBudgetEventBridge, which serializes each BudgetEvent to JSON and forwards it through the SpanSink telemetry channel:

    let bridge = Arc::new(ObservabilityBudgetEventBridge::new(span_sink.clone()));
    tracker.set_event_sink(bridge);

Sink calls happen synchronously on the charge hot path; implementations must not block.
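For a custom sink, one way to honour the "must not block" rule is to push events into a bounded channel with `try_send` and drop them when the buffer is full. The sketch below assumes events are already serialized to strings; `ChannelSink` is a hypothetical type and does not implement the real BudgetEventSink trait, whose signature is not shown on this page.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};

// Hypothetical sink: hands events to a consumer over a bounded queue.
struct ChannelSink {
    tx: SyncSender<String>,
    dropped: AtomicU64,
}

impl ChannelSink {
    // Returns the sink plus the receiving end for a background consumer.
    fn new(capacity: usize) -> (Self, Receiver<String>) {
        let (tx, rx) = sync_channel(capacity);
        (ChannelSink { tx, dropped: AtomicU64::new(0) }, rx)
    }

    // Never blocks: if the queue is full or the consumer is gone,
    // the event is counted as dropped instead of stalling the charge path.
    fn emit(&self, event_json: String) {
        if self.tx.try_send(event_json).is_err() {
            self.dropped.fetch_add(1, Ordering::Relaxed);
        }
    }
}
```

Dropping under pressure trades completeness of telemetry for a bounded charge latency; whether that trade is acceptable depends on how the events are consumed downstream.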
Pricing catalog
Cost is computed from wcore-pricing/pricing.toml, bundled at compile time. Override with WAYLAND_PRICING_PATH=/path/to/my-pricing.toml.
The catalog uses TOML sections keyed by [provider.model]:

    [anthropic.claude-opus-4-7]
    input_per_mtok_usd = 5.0
    output_per_mtok_usd = 25.0
    cache_read_per_mtok_usd = 0.5
    cache_write_per_mtok_usd = 6.25

    [openai.gpt-5]
    input_per_mtok_usd = 5.0
    output_per_mtok_usd = 15.0

cache_read_per_mtok_usd and cache_write_per_mtok_usd are optional; models that do not support prompt caching omit them.
Selected rates from the bundled catalog (Q2-2026 list prices, USD per million tokens):
| Provider | Model | Input | Output |
|---|---|---|---|
| anthropic | claude-opus-4-7 | 5.00 | 25.00 |
| anthropic | claude-sonnet-4-6 | 3.00 | 15.00 |
| anthropic | claude-haiku-4-5 | 1.00 | 5.00 |
| openai | gpt-5 | 5.00 | 15.00 |
| openai | gpt-4o | 2.50 | 10.00 |
| openai | o1 | 15.00 | 60.00 |
| gemini | gemini-2-5-pro | 1.25 | 10.00 |
| deepseek | deepseek-v4-flash | 0.14 | 0.28 |
| groq | llama-3-3-70b | 0.59 | 0.79 |
Self-hosted / local providers (ollama, vllm, lmstudio, litellm, openai-compatible) are cataloged at 0.00/0.00.
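Since rates are quoted per million tokens, the cost of a call is a weighted sum divided by 1,000,000. The following sketch shows only that arithmetic; the `Rates` struct and `cost_usd` function are hypothetical, and cache read/write tokens are ignored for brevity.

```rust
// Per-million-token rates, mirroring the two required catalog keys.
struct Rates {
    input_per_mtok_usd: f64,
    output_per_mtok_usd: f64,
}

// USD cost of one call, ignoring cache read/write tokens.
fn cost_usd(rates: &Rates, tokens_in: u64, tokens_out: u64) -> f64 {
    let micro_usd = tokens_in as f64 * rates.input_per_mtok_usd
        + tokens_out as f64 * rates.output_per_mtok_usd;
    micro_usd / 1_000_000.0
}
```

At the claude-opus-4-7 rates in the table (5.00 in, 25.00 out), 200,000 input tokens and 16,384 output tokens cost (200,000 × 5 + 16,384 × 25) / 1,000,000 = 1.4096 USD.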
Live refresh
PricingRefresher can fetch the OpenRouter /api/v1/models catalog (24-hour TTL) and diff it against the bundled data, emitting CatalogChange events (Added, Removed, PriceChanged). Auto-application is off by default; diffs are surfaced for review only. Set WAYLAND_PRICING_AUTO_REFRESH=off to skip the live fetch entirely and keep the bundled catalog static.
The on-disk cache lives at ~/.wayland/pricing-cache.json by default (WAYLAND_HOME respected).
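The diff step can be sketched as a comparison of two keyed maps. This is an illustration only: `Rate`, `Change`, and `diff` are hypothetical names, the payload of each variant is an assumption (the real CatalogChange fields are not shown here), and fetching or parsing the OpenRouter response is out of scope.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Rate {
    input: f64,
    output: f64,
}

// Mirrors the three change kinds named above; field layout is assumed.
#[derive(Debug, PartialEq)]
enum Change {
    Added(String),
    Removed(String),
    PriceChanged { model: String, old: Rate, new: Rate },
}

// Compare the bundled catalog with a live one, keyed by model id.
fn diff(bundled: &BTreeMap<String, Rate>, live: &BTreeMap<String, Rate>) -> Vec<Change> {
    let mut out = Vec::new();
    // Models in the bundled catalog: either gone, repriced, or unchanged.
    for (model, old) in bundled {
        match live.get(model) {
            None => out.push(Change::Removed(model.clone())),
            Some(new) if new != old => out.push(Change::PriceChanged {
                model: model.clone(),
                old: *old,
                new: *new,
            }),
            Some(_) => {}
        }
    }
    // Models that only exist in the live catalog.
    for model in live.keys() {
        if !bundled.contains_key(model) {
            out.push(Change::Added(model.clone()));
        }
    }
    out
}
```

The function only reports differences and never mutates either map, which matches the review-only default described above.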
Charge integrity caveat
Bedrock and Vertex pricing caveat
Summary of TOML keys
| Key | Section | Type | Default |
|---|---|---|---|
| max_wall_time_secs | [budget] | u64 | none |
| max_tool_runtime_secs | [budget] | u64 | none |
| max_processes | [budget] | usize | none |
| max_agent_depth | [budget] | usize | none |
| max_tokens_in | [budget] | u64 | none |
| max_tokens_out | [budget] | u64 | none |
| max_cost_usd | [budget] | f64 | none |
The [session_cap] axis per_user_daily_usd has no TOML field; set it programmatically via BudgetCapBuilder.
See also
- Configuration - how TOML files merge across the three config levels.
- Tracing & Telemetry - how BudgetEvent flows through the SpanSink channel.