Budget & Cost Caps

Wayland Core tracks spending at two levels and enforces each limit before a charge lands, not after. All caps are opt-in: nothing is blocked by default, but every cap you configure is enforced atomically, so a rejected charge is never partially applied.

Source crates: wcore-budget, wcore-pricing.

Two enforcement models

1. ExecutionBudget - global session caps (`[budget]`)

ExecutionBudget is an Arc<RwLock<...>> tree threaded through every tool call via ToolContext.budget. It tracks seven axes, and is_exceeded() is checked before any long-running work starts.

# ~/.wayland-core/config.toml  (or project .wayland-core.toml)
[budget]
max_wall_time_secs    = 600    # wall clock since session start
max_tool_runtime_secs = 120    # cumulative tool execution time
max_processes         = 8      # concurrent subprocesses at any instant
max_agent_depth       = 4      # sub-agent nesting depth
max_tokens_in         = 200000 # total input tokens
max_tokens_out        = 16384  # total output tokens
max_cost_usd          = 1.50   # total USD cost

All seven fields are optional. Omitting a field means no cap on that axis.

Check order when is_exceeded() is called:
max_wall_time → max_tool_runtime → max_processes → max_agent_depth → max_tokens_in → max_tokens_out → max_cost_usd.
The first exceeded axis is returned as the reason string; the others are not evaluated.
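
The short-circuit order above can be sketched as follows. This is a minimal illustration, not the wcore-budget implementation: the `Caps` and `Usage` structs and the `first_exceeded` function are hypothetical names, and treating "exceeded" as strictly greater than the cap is an assumption.

```rust
/// Hypothetical caps snapshot; field names mirror the TOML keys.
#[derive(Default)]
struct Caps {
    max_wall_time_secs: Option<u64>,
    max_tool_runtime_secs: Option<u64>,
    max_processes: Option<usize>,
    max_agent_depth: Option<usize>,
    max_tokens_in: Option<u64>,
    max_tokens_out: Option<u64>,
    max_cost_usd: Option<f64>,
}

/// Hypothetical counters for the same seven axes.
#[derive(Default)]
struct Usage {
    wall_time_secs: u64,
    tool_runtime_secs: u64,
    processes_active: usize,
    agent_depth: usize,
    tokens_in: u64,
    tokens_out: u64,
    cost_usd: f64,
}

/// Returns the first exceeded axis in the documented order, or None.
fn first_exceeded(caps: &Caps, used: &Usage) -> Option<&'static str> {
    // True only when a cap is configured and the counter is past it (assumed `>`).
    fn over<T: PartialOrd>(cap: Option<T>, used: T) -> bool {
        cap.map_or(false, |c| used > c)
    }
    if over(caps.max_wall_time_secs, used.wall_time_secs) { return Some("max_wall_time"); }
    if over(caps.max_tool_runtime_secs, used.tool_runtime_secs) { return Some("max_tool_runtime"); }
    if over(caps.max_processes, used.processes_active) { return Some("max_processes"); }
    if over(caps.max_agent_depth, used.agent_depth) { return Some("max_agent_depth"); }
    if over(caps.max_tokens_in, used.tokens_in) { return Some("max_tokens_in"); }
    if over(caps.max_tokens_out, used.tokens_out) { return Some("max_tokens_out"); }
    if over(caps.max_cost_usd, used.cost_usd) { return Some("max_cost_usd"); }
    None
}
```

With both `max_tokens_in` and `max_cost_usd` exceeded, this sketch reports only `max_tokens_in`, because it comes earlier in the order.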

Tree rollup. When a sub-agent spawns, it gets a sub_budget() child view. Counters recorded on the child roll up to all ancestors, so the root view always reflects the full session total. A child may carry a stricter cap than its parent, but setting one never relaxes the parent's caps.
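
To illustrate the rollup, here is a minimal sketch with a single counter. It is not the real data structure: the source describes an Arc<RwLock<...>> tree, while this example uses an atomic counter for brevity, and `Node`, `child`, and `record_tokens_in` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical budget node: one counter plus a link to its parent.
struct Node {
    tokens_in: AtomicU64,
    parent: Option<Arc<Node>>,
}

impl Node {
    fn root() -> Arc<Node> {
        Arc::new(Node { tokens_in: AtomicU64::new(0), parent: None })
    }

    /// Creates a child view, similar in spirit to sub_budget().
    fn child(parent: &Arc<Node>) -> Arc<Node> {
        Arc::new(Node {
            tokens_in: AtomicU64::new(0),
            parent: Some(Arc::clone(parent)),
        })
    }

    /// Records on this node, then walks up so every ancestor sees the amount.
    fn record_tokens_in(&self, n: u64) {
        let mut node = Some(self);
        while let Some(cur) = node {
            cur.tokens_in.fetch_add(n, Ordering::Relaxed);
            node = cur.parent.as_deref();
        }
    }

    fn total(&self) -> u64 {
        self.tokens_in.load(Ordering::Relaxed)
    }
}
```

Recording 10 tokens on a grandchild and 5 on its parent leaves the grandchild at 10 and both the parent and the root at 15.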

RAII guards. Tools that fork subprocesses call enter_tool_run(), which increments processes_active and decrements it on drop via ToolRunGuard. Sub-agent spawns use enter_agent() / AgentDepthGuard the same way.
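
The guard pattern can be sketched like this. `ProcessCounter` and `RunGuard` are hypothetical stand-ins for the processes_active counter and ToolRunGuard; the real types may carry more state.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the processes_active counter.
#[derive(Clone, Default)]
struct ProcessCounter(Arc<AtomicUsize>);

/// Guard that releases one slot when it goes out of scope.
struct RunGuard(Arc<AtomicUsize>);

impl ProcessCounter {
    /// Increments the counter and returns a guard that undoes it on drop.
    fn enter(&self) -> RunGuard {
        self.0.fetch_add(1, Ordering::SeqCst);
        RunGuard(Arc::clone(&self.0))
    }

    fn active(&self) -> usize {
        self.0.load(Ordering::SeqCst)
    }
}

impl Drop for RunGuard {
    fn drop(&mut self) {
        // Runs on normal scope exit, early return, and panic unwinding alike.
        self.0.fetch_sub(1, Ordering::SeqCst);
    }
}
```

Because the decrement lives in `Drop`, a tool that returns early or panics still releases its slot, which is the main reason to prefer a guard over paired increment and decrement calls.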

2. BudgetTracker - per-session and per-user caps (`[session_cap]`)

BudgetTracker is the M5.3 model. It is keyed by session id and (optionally) user id, and enforces three axes:

Axis	Builder call	Behaviour
`per_session_tokens`	`.per_session_tokens(n)`	Blocks a charge that would push this session’s token total past `n`.
`per_session_usd`	`.per_session_usd(d)`	Blocks a charge that would push this session’s USD total past `d`.
`per_user_daily_usd`	`.per_user_daily_usd(d)`	Blocks a charge that would push this user’s UTC-day USD total past `d`. Resets at midnight UTC.

Build a cap:

let cap = BudgetCap::builder()
    .per_session_usd(0.10)
    .per_user_daily_usd(1.00)
    .build();
let mut tracker = BudgetTracker::new(cap);

For a single-tenant run, call tracker.charge(session_id, tokens, usd).
For a multi-tenant run where one user maps to multiple sessions in a day:

tracker.charge_for_user(session_id, user_id, tokens, usd)?;

charge_for_user checks the per-user daily cap first, then calls charge for the per-session cap. A rejected charge leaves both buckets at their prior totals - the rejected amount does not “stick”.
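
One way to get the "rejected amount does not stick" behaviour is to validate every cap before mutating any bucket. The sketch below assumes that approach; `Tracker` and `CapError` are hypothetical, only USD is tracked, and the midnight UTC reset is omitted.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum CapError {
    PerSessionUsd,
    PerUserDailyUsd,
}

/// Hypothetical tracker with USD caps only.
#[derive(Default)]
struct Tracker {
    per_session_usd: Option<f64>,
    per_user_daily_usd: Option<f64>,
    session_usd: HashMap<String, f64>,
    user_daily_usd: HashMap<String, f64>,
}

impl Tracker {
    fn charge_for_user(&mut self, session: &str, user: &str, usd: f64) -> Result<(), CapError> {
        // Compute the would-be totals without touching the stored ones.
        let user_next = self.user_daily_usd.get(user).copied().unwrap_or(0.0) + usd;
        let session_next = self.session_usd.get(session).copied().unwrap_or(0.0) + usd;

        // Per-user daily cap first, then per-session cap, as described above.
        if self.per_user_daily_usd.map_or(false, |cap| user_next > cap) {
            return Err(CapError::PerUserDailyUsd);
        }
        if self.per_session_usd.map_or(false, |cap| session_next > cap) {
            return Err(CapError::PerSessionUsd);
        }

        // Both checks passed: commit to both buckets together.
        self.user_daily_usd.insert(user.to_string(), user_next);
        self.session_usd.insert(session.to_string(), session_next);
        Ok(())
    }
}
```

A charge that passes the user cap but fails the session cap returns an error before either map is written, so both totals stay where they were.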

Reading totals:

let (tokens, usd) = tracker.session_totals(session_id);
let daily_usd     = tracker.user_daily_usd(user_id);

Translating `[budget]` TOML into a `BudgetCap`

BudgetCap implements From<&BudgetConfig> so you can translate the TOML block directly:

let cap = BudgetCap::from(&budget_config);

The mapping:

max_tokens_in and max_tokens_out → per_session_tokens, set to the sum of whichever of the two fields are present.
max_cost_usd → per_session_usd.
max_wall_time_secs, max_tool_runtime_secs, max_processes, max_agent_depth belong to ExecutionBudget only; no counterpart in BudgetCap.
per_user_daily_usd has no TOML counterpart. Set it directly via the builder for multi-tenant deployments.
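
The mapping can be sketched as a From impl on simplified stand-in types. `ConfigSketch` and `CapSketch` are hypothetical and carry only the fields involved in the translation; the saturating add is an assumption made to avoid overflow in the example.

```rust
/// Hypothetical subset of the [budget] block relevant to the mapping.
#[derive(Default)]
struct ConfigSketch {
    max_tokens_in: Option<u64>,
    max_tokens_out: Option<u64>,
    max_cost_usd: Option<f64>,
}

/// Hypothetical stand-in for the three cap axes.
#[derive(Debug, Default, PartialEq)]
struct CapSketch {
    per_session_tokens: Option<u64>,
    per_session_usd: Option<f64>,
    per_user_daily_usd: Option<f64>,
}

impl From<&ConfigSketch> for CapSketch {
    fn from(cfg: &ConfigSketch) -> Self {
        // Sum whichever token caps are present; no token cap if both are absent.
        let per_session_tokens = match (cfg.max_tokens_in, cfg.max_tokens_out) {
            (None, None) => None,
            (a, b) => Some(a.unwrap_or(0).saturating_add(b.unwrap_or(0))),
        };
        CapSketch {
            per_session_tokens,
            per_session_usd: cfg.max_cost_usd,
            // No TOML counterpart, so the translation always leaves this unset.
            per_user_daily_usd: None,
        }
    }
}
```

With the example config above (200000 input, 16384 output, 1.50 USD), this yields a token cap of 216384 and a USD cap of 1.50.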

Events

Every charge attempt emits a BudgetEvent to an attached BudgetEventSink. Three variants:

pub enum BudgetEvent {
    Charge  { session_id, tokens, usd },          // accepted
    CapWarn { session_id, pct_used: f32 },         // accepted, but >= 80% of strictest cap
    CapBlock { session_id, reason: BudgetError },  // rejected
}

CapWarn fires once the running total for a session reaches 80% of the strictest configured per-session cap (tokens or USD, whichever is closer to its limit). It then fires on every accepted charge at or above that threshold, not just the first.
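
A possible way to compute the warning threshold is to take the larger of the two usage ratios. The functions below are hypothetical, and expressing pct_used on a 0 to 100 scale is an assumption; the source only gives its type as f32.

```rust
/// Usage of the cap closest to its limit, as a percentage (assumed 0-100 scale).
/// Returns None when no per-session cap is configured.
fn pct_used(tokens: u64, usd: f64, cap_tokens: Option<u64>, cap_usd: Option<f64>) -> Option<f32> {
    // Ignore zero caps in this sketch to avoid dividing by zero.
    let token_ratio = cap_tokens.filter(|c| *c > 0).map(|c| tokens as f64 / c as f64);
    let usd_ratio = cap_usd.filter(|c| *c > 0.0).map(|c| usd / c);

    // The strictest cap is the one with the highest usage ratio.
    let worst = match (token_ratio, usd_ratio) {
        (Some(a), Some(b)) => Some(a.max(b)),
        (a, b) => a.or(b),
    };
    worst.map(|ratio| (ratio * 100.0) as f32)
}

/// True when an accepted charge should also emit a warning.
fn should_warn(pct: Option<f32>) -> bool {
    pct.map_or(false, |p| p >= 80.0)
}
```

For a session at 500 of 1000 tokens and 0.75 of 1.00 USD, the USD axis dominates at 75%, so no warning fires yet.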

CapBlock carries a BudgetError::CapExceeded with the cap kind (per_session_tokens, per_session_usd, or per_user_daily_usd), the configured limit, and the observed total that would have resulted.

Wiring the event sink

In production the engine connects ObservabilityBudgetEventBridge, which serializes each BudgetEvent to JSON and forwards it through the SpanSink telemetry channel:

let bridge = Arc::new(ObservabilityBudgetEventBridge::new(span_sink.clone()));
tracker.set_event_sink(bridge);

Sink calls happen synchronously on the charge hot path; implementations must not block.
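
A custom sink can stay off the hot path by handing events to a channel and doing the slow work elsewhere. The sketch below assumes a trait with a single `emit` method; the real BudgetEventSink signature may differ, and `Event`, `EventSink`, and `ChannelSink` are hypothetical.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Mutex;

/// Hypothetical, reduced event type for illustration.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Charge { session_id: String, tokens: u64, usd: f64 },
}

/// Hypothetical sink trait; assumed shape, not the library's definition.
trait EventSink: Send + Sync {
    fn emit(&self, event: Event);
}

/// Forwards events to an unbounded channel drained by another thread.
struct ChannelSink {
    tx: Mutex<Sender<Event>>,
}

impl ChannelSink {
    fn new() -> (Self, Receiver<Event>) {
        let (tx, rx) = channel();
        (ChannelSink { tx: Mutex::new(tx) }, rx)
    }
}

impl EventSink for ChannelSink {
    fn emit(&self, event: Event) {
        // Sending on an unbounded channel does not wait for the receiver.
        // If the receiver is gone, the event is dropped rather than failing the charge.
        if let Ok(tx) = self.tx.lock() {
            let _ = tx.send(event);
        }
    }
}
```

The mutex is held only for the duration of the send, so contention is brief; any I/O, such as writing to disk or a network, belongs on the thread that owns the `Receiver`.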

Pricing catalog

Cost is computed from wcore-pricing/pricing.toml, bundled at compile time. Override with WAYLAND_PRICING_PATH=/path/to/my-pricing.toml.

The catalog uses TOML sections keyed by [provider.model]:

[anthropic.claude-opus-4-7]
input_per_mtok_usd        = 5.0
output_per_mtok_usd       = 25.0
cache_read_per_mtok_usd   = 0.5
cache_write_per_mtok_usd  = 6.25

[openai.gpt-5]
input_per_mtok_usd   = 5.0
output_per_mtok_usd  = 15.0

cache_read_per_mtok_usd and cache_write_per_mtok_usd are optional; models that do not support prompt caching omit them.
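
Given a catalog entry, the cost of a call is each token count divided by one million, times the matching rate. The sketch below is illustrative: `Rates`, `TokenUsage`, and `cost_usd` are hypothetical names, and charging cache tokens at 0.0 when a model omits the cache rates is an assumption.

```rust
/// Hypothetical mirror of one [provider.model] catalog entry.
struct Rates {
    input_per_mtok_usd: f64,
    output_per_mtok_usd: f64,
    cache_read_per_mtok_usd: Option<f64>,
    cache_write_per_mtok_usd: Option<f64>,
}

/// Hypothetical token counts for a single call.
struct TokenUsage {
    input: u64,
    output: u64,
    cache_read: u64,
    cache_write: u64,
}

/// USD cost of one call under the given rates.
fn cost_usd(rates: &Rates, usage: &TokenUsage) -> f64 {
    const TOKENS_PER_MTOK: f64 = 1_000_000.0;
    let per = |tokens: u64, rate: f64| tokens as f64 / TOKENS_PER_MTOK * rate;

    per(usage.input, rates.input_per_mtok_usd)
        + per(usage.output, rates.output_per_mtok_usd)
        // Assumption: a missing cache rate contributes nothing.
        + per(usage.cache_read, rates.cache_read_per_mtok_usd.unwrap_or(0.0))
        + per(usage.cache_write, rates.cache_write_per_mtok_usd.unwrap_or(0.0))
}
```

With the claude-opus-4-7 rates shown above, 1,000,000 input tokens and 200,000 output tokens cost 5.00 + 5.00 = 10.00 USD.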

Selected rates from the bundled catalog (Q2-2026 list prices, USD per million tokens):

Provider	Model	Input	Output
anthropic	claude-opus-4-7	5.00	25.00
anthropic	claude-sonnet-4-6	3.00	15.00
anthropic	claude-haiku-4-5	1.00	5.00
openai	gpt-5	5.00	15.00
openai	gpt-4o	2.50	10.00
openai	o1	15.00	60.00
gemini	gemini-2-5-pro	1.25	10.00
deepseek	deepseek-v4-flash	0.14	0.28
groq	llama-3-3-70b	0.59	0.79

Self-hosted / local providers (ollama, vllm, lmstudio, litellm, openai-compatible) are cataloged at 0.00/0.00.

Live refresh

PricingRefresher can fetch the OpenRouter /api/v1/models catalog (24-hour TTL) and diff it against the bundled data, emitting CatalogChange events (Added, Removed, PriceChanged). Auto-application is off by default; diffs are surfaced for review only. Set WAYLAND_PRICING_AUTO_REFRESH=off to skip the live fetch entirely and keep the bundled catalog static.
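
The diff step can be sketched as a comparison of two maps keyed by model name. This is an illustration only: `Change`, `Catalog`, and `diff` are hypothetical, prices are reduced to an (input, output) pair, and the actual CatalogChange payloads may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical change record modelled on the three variants named above.
#[derive(Debug, PartialEq)]
enum Change {
    Added(String),
    Removed(String),
    PriceChanged { model: String, old: (f64, f64), new: (f64, f64) },
}

/// Model name mapped to (input, output) USD per million tokens.
type Catalog = BTreeMap<String, (f64, f64)>;

/// Compares the live catalog against the bundled one without modifying either.
fn diff(bundled: &Catalog, live: &Catalog) -> Vec<Change> {
    let mut changes = Vec::new();

    // Models present upstream: either new or possibly repriced.
    for (model, new) in live {
        match bundled.get(model) {
            None => changes.push(Change::Added(model.clone())),
            Some(old) if old != new => changes.push(Change::PriceChanged {
                model: model.clone(),
                old: *old,
                new: *new,
            }),
            Some(_) => {}
        }
    }

    // Models that disappeared upstream.
    for model in bundled.keys() {
        if !live.contains_key(model) {
            changes.push(Change::Removed(model.clone()));
        }
    }
    changes
}
```

Returning the changes instead of applying them matches the review-only default described above: the caller decides whether anything is written back.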

The on-disk cache lives at ~/.wayland/pricing-cache.json by default (WAYLAND_HOME respected).

Charge integrity caveat

Bedrock and Vertex pricing caveat

Summary of TOML keys

Key	Section	Type	Default
`max_wall_time_secs`	`[budget]`	`u64`	none
`max_tool_runtime_secs`	`[budget]`	`u64`	none
`max_processes`	`[budget]`	`usize`	none
`max_agent_depth`	`[budget]`	`usize`	none
`max_tokens_in`	`[budget]`	`u64`	none
`max_tokens_out`	`[budget]`	`u64`	none
`max_cost_usd`	`[budget]`	`f64`	none

The `[session_cap]` axis per_user_daily_usd has no TOML key; set it programmatically via BudgetCapBuilder.