
Tracing & Telemetry

Wayland Core emits structured traces for every agent turn, tool call, memory operation, and budget charge. Traces are JSON values routed through a SpanSink abstraction; the sink implementation determines where they go.

Source crate: wcore-observability.


Three nested types cover one complete session:

ExecutionTrace (session-scoped aggregate)
└── TurnTrace (one per LLM turn)
    └── ToolCallTrace (one per tool call within the turn)

Every record carries source_product = "wayland-core" so trace streams from multiple sources can be separated.

ToolCallTrace records one tool invocation:

| Field | Type | Notes |
| --- | --- | --- |
| call_id | String | Opaque identifier for this invocation. |
| tool_name | String | Name of the tool (e.g. "Read", "Bash", "Grep"). |
| input | Value | Raw input JSON. |
| output_summary | String | Bounded summary, up to 4096 chars. Full output sits in session storage. |
| duration_ms | u64 | Wall time from dispatch to result. |
| bytes_in / bytes_out | u64 | Byte counts for the call. |
| error | Option<String> | Present only when the call returned an error; omitted otherwise. |
| cancelled | bool | true if the call was cancelled mid-execution (populated in W7). Default false; omitted from JSON when false. |
| partial | bool | true if the call produced a partial flush (populated in W7). Default false; omitted from JSON when false. |
| result_snippet | Option<String> | First 512 bytes of the tool result, truncated at a UTF-8 char boundary. Absent when snippet capture is disabled (see below). |
| source_product | String | Always "wayland-core". |

Result snippet capture is on by default. Disable it by setting WAYLAND_TRACE_RESULT_SNIPPETS=off (also accepts 0, false, no). When disabled, result_snippet is always null/absent.
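The 512-byte cut on result_snippet has to respect UTF-8, since slicing a Rust string in the middle of a multi-byte character panics. The following is a minimal sketch of one way to do that, not the crate's actual implementation; the helper name truncate_at_char_boundary and the constant SNIPPET_MAX_BYTES are hypothetical.

```rust
/// Snippet cap in bytes, taken from the field description above.
/// The constant name is an assumption made for this sketch.
const SNIPPET_MAX_BYTES: usize = 512;

/// Returns the longest prefix of `s` that fits in `max_bytes` and ends on a
/// UTF-8 character boundary. Hypothetical helper, not the crate's API.
fn truncate_at_char_boundary(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    // Walk back from the byte limit until we land on a boundary.
    // Index 0 is always a boundary, so the loop terminates.
    let mut end = max_bytes;
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}
```

Because the cut only ever moves backwards, the result can be a few bytes shorter than the cap but never longer.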

TurnTrace records one complete LLM call plus every tool call it triggered:

| Field | Type | Notes |
| --- | --- | --- |
| turn | usize | Turn index within the session. |
| model | String | Model id as reported by the provider. |
| provider | String | Provider name (e.g. "anthropic", "openai"). |
| input_tokens / output_tokens | u64 | Token counts for this turn. |
| cache_read / cache_write | u64 | Prompt-cache token counts. |
| cache_hit_rate | f64 | cache_read / input_tokens for this turn. 0.0 when input_tokens is zero. |
| cost_usd | f64 | USD cost for this turn, computed in W6. 0.0 in W1 traces. |
| tool_calls | Vec<ToolCallTrace> | Tool calls made during this turn, in order. |
| hook_actions | Vec<HookActionRecord> | Hook engine records (populated in W2; empty in W1). |
| source_product | String | Always "wayland-core". |
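The cache_hit_rate rule, including its zero-input guard, can be written out as a small function. This is an illustrative sketch under the definition given in the table; the function name is hypothetical.

```rust
/// cache_read / input_tokens, with 0.0 when there were no input tokens.
/// Hypothetical helper mirroring the field description, not the crate's API.
fn cache_hit_rate(cache_read: u64, input_tokens: u64) -> f64 {
    if input_tokens == 0 {
        // Avoid dividing by zero, which would yield NaN for 0/0.
        0.0
    } else {
        cache_read as f64 / input_tokens as f64
    }
}
```

For example, a turn with 1000 input tokens of which 750 were read from the prompt cache has a hit rate of 0.75.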

ExecutionTrace is the aggregate for one whole session or task:

| Field | Type | Notes |
| --- | --- | --- |
| session_id | String | |
| task_id | String | |
| task_description | Option<String> | Optional; omitted from JSON when absent. |
| turns | Vec<TurnTrace> | |
| outcome | TaskOutcome | Tagged enum: success, partial_success, failure, timeout, user_aborted, suspended. |
| total_cost_usd | f64 | Sum across all turns. |
| total_input_tokens / total_output_tokens | u64 | |
| duration_ms | u64 | |
| source_product | String | Always "wayland-core". |
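As a rough illustration of how the session-level totals relate to the per-turn fields, the sketch below folds a slice of turns into totals. The struct names are hypothetical stand-ins, and treating the token totals as plain sums is an assumption; the table above only states that total_cost_usd is a sum.

```rust
/// Per-turn numbers relevant to aggregation (hypothetical stand-in for TurnTrace).
struct TurnTotals {
    input_tokens: u64,
    output_tokens: u64,
    cost_usd: f64,
}

/// Session-level totals (hypothetical stand-in for the ExecutionTrace fields).
struct SessionTotals {
    total_input_tokens: u64,
    total_output_tokens: u64,
    total_cost_usd: f64,
}

/// Sums cost and token counts across all turns.
/// Assumption: token totals are simple sums, like the cost total.
fn aggregate(turns: &[TurnTotals]) -> SessionTotals {
    SessionTotals {
        total_input_tokens: turns.iter().map(|t| t.input_tokens).sum(),
        total_output_tokens: turns.iter().map(|t| t.output_tokens).sum(),
        total_cost_usd: turns.iter().map(|t| t.cost_usd).sum(),
    }
}
```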

MemoryOpTrace: emitted by wcore-memory around every gated memory API call. Fields: op (method name, e.g. "record_episode"), partition, tier, latency_ms, success.

EvolutionEventTrace: emitted once per scored child by the wcore-evolve GEPA loop. Fields: run_id, generation, parent_id, child_id, mutation_kind, score, retained.

WorkflowDetectionRecord: shadow-mode workflow detection record (Dynamic Workflows B4). Emitted when observability.workflow_detection_enabled is on and the per-turn WorkflowCandidate heuristic fires, without touching routing. Flat JSON; filter on "kind": "workflow_detection". The task_excerpt field is capped at 120 bytes and scrubbed for credentials before truncation (see PII scrubbing below).


SpanSink is the trait the agent targets. All sinks receive a &serde_json::Value and must not panic. Emission is synchronous but non-blocking from the agent’s perspective; any I/O or network error handling is the sink’s responsibility.

use serde_json::Value;

pub trait SpanSink: Send + Sync {
    /// Receives one trace as JSON; implementations must not panic.
    fn emit(&self, trace: &Value);
}

Three implementations are available:

InMemorySink buffers every emitted trace in an Arc<Mutex<Vec<Value>>>. It is clone-safe: all clones share the same buffer.

let sink = InMemorySink::new();
// ... run agent ...
let traces = sink.snapshot(); // Vec<Value>

Useful for tests, diagnostics, and the planned HITL-suspend trace-replay path. snapshot() returns a clone of the current buffer without clearing it.
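The clone-sharing and non-clearing snapshot behavior can be illustrated with a small generic buffer. This is a sketch of the pattern only, not the InMemorySink source: SharedBuffer is a hypothetical type, and T stands in for the JSON value so the example needs nothing beyond the standard library.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a buffering sink. Cloning copies the Arc,
/// so every clone pushes into the same underlying Vec.
#[derive(Clone)]
struct SharedBuffer<T> {
    inner: Arc<Mutex<Vec<T>>>,
}

impl<T: Clone> SharedBuffer<T> {
    fn new() -> Self {
        Self { inner: Arc::new(Mutex::new(Vec::new())) }
    }

    /// Appends one item. A poisoned lock is recovered rather than unwrapped,
    /// so emitting never panics.
    fn emit(&self, item: &T) {
        let mut guard = self.inner.lock().unwrap_or_else(|p| p.into_inner());
        guard.push(item.clone());
    }

    /// Returns a copy of the current contents and leaves the buffer intact.
    fn snapshot(&self) -> Vec<T> {
        let guard = self.inner.lock().unwrap_or_else(|p| p.into_inner());
        guard.to_vec()
    }
}
```

Handing one clone to the agent and keeping another in the test is enough to inspect everything that was emitted.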

JsonStdoutSink writes one JSON line per trace to stdout. It is useful when you run wayland-core outside --json-stream mode and want trace output on the terminal.

let sink: Arc<dyn SpanSink> = Arc::new(JsonStdoutSink);

OtlpSink exports traces to an OpenTelemetry collector over OTLP/HTTP. It sits behind the otlp Cargo feature flag and is not included in the default binary build, which keeps binary size within the §2.2 notarization budget.

Enable the feature:

# Cargo.toml
wcore-observability = { ..., features = ["otlp"] }

Construct the sink:

let sink = OtlpSink::new("http://localhost:4318/v1/traces")?;

Construction is fully fallible (Result<OtlpSink, OtlpSinkError>), so DNS, TLS, or proxy failures surface as errors rather than panics. The provider is initialized once via OnceLock; subsequent calls within the same process reuse the static provider. Each turn trace becomes one span, with the full JSON serialized as the trace.json attribute and service.name = "wayland-core".
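The combination of a fallible constructor and a process-wide OnceLock can be sketched as follows. Everything here is illustrative: Provider, its connect method, and shared_provider are hypothetical names, and the only "connection check" is a URL scheme test. The real sink's error handling may differ.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for the exporter/provider handle.
#[derive(Debug)]
struct Provider {
    endpoint: String,
}

impl Provider {
    /// Fallible constructor. In this sketch only the URL scheme is checked.
    fn connect(endpoint: &str) -> Result<Self, String> {
        if endpoint.starts_with("http://") || endpoint.starts_with("https://") {
            Ok(Self { endpoint: endpoint.to_string() })
        } else {
            Err(format!("unsupported endpoint: {endpoint}"))
        }
    }
}

/// One provider per process.
static PROVIDER: OnceLock<Provider> = OnceLock::new();

/// Builds the provider on first success and returns the same instance afterwards.
fn shared_provider(endpoint: &str) -> Result<&'static Provider, String> {
    if let Some(existing) = PROVIDER.get() {
        return Ok(existing);
    }
    // Do the fallible work outside the OnceLock so errors propagate with `?`.
    let built = Provider::connect(endpoint)?;
    // If another thread won the race, its provider is kept and `built` is dropped.
    Ok(PROVIDER.get_or_init(|| built))
}
```

Note that in this sketch a later call with a different endpoint silently gets the first provider back; whether the real sink behaves that way is not stated here.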


PiiScrubbingSink wraps any SpanSink and scrubs every trace’s serialized JSON before forwarding it. The scrubber (wcore-safety::PIIScrubber) uses a RegexSet fast-bail-out: when no pattern matches the input, it returns Cow::Borrowed (zero allocation). Only when a match is found does it allocate and replace.

let scrubbed_sink = PiiScrubbingSink::wrap(inner_sink);
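To make the borrow-or-allocate behavior concrete, here is a simplified sketch that handles only the AWS access key shape (AKIA plus 16 uppercase alphanumeric characters) with plain string search. It is a stand-in for the idea, not the RegexSet-based scrubber itself; the function names are hypothetical.

```rust
use std::borrow::Cow;

const PREFIX: &str = "AKIA";
const SUFFIX_LEN: usize = 16;

/// Byte offset of the first key-shaped token at or after `from`, if any.
fn find_key(s: &str, from: usize) -> Option<usize> {
    let mut start = from;
    while let Some(rel) = s[start..].find(PREFIX) {
        let pos = start + rel;
        let tail = &s.as_bytes()[pos + PREFIX.len()..];
        if tail.len() >= SUFFIX_LEN
            && tail[..SUFFIX_LEN]
                .iter()
                .all(|b| b.is_ascii_uppercase() || b.is_ascii_digit())
        {
            return Some(pos);
        }
        // Prefix without a valid suffix: keep searching after it.
        start = pos + PREFIX.len();
    }
    None
}

/// Redacts key-shaped tokens, borrowing the input when nothing matches.
fn scrub_aws_keys(input: &str) -> Cow<'_, str> {
    // Fast bail-out: no match anywhere, so return the input without allocating.
    let Some(mut pos) = find_key(input, 0) else {
        return Cow::Borrowed(input);
    };
    let mut out = String::with_capacity(input.len());
    let mut cursor = 0;
    loop {
        out.push_str(&input[cursor..pos]);
        out.push_str("[REDACTED:AWS_ACCESS_KEY]");
        cursor = pos + PREFIX.len() + SUFFIX_LEN;
        match find_key(input, cursor) {
            Some(next) => pos = next,
            None => break,
        }
    }
    out.push_str(&input[cursor..]);
    Cow::Owned(out)
}
```

Most traces contain no credentials, so returning Cow::Borrowed on the common path keeps the wrapper cheap.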

The scrubber recognizes the credential and PII patterns listed below (29 kinds), replacing each match with [REDACTED:<KIND>]:

| Kind | Pattern |
| --- | --- |
| AWS_ACCESS_KEY | AKIA + 16 uppercase alphanumeric chars |
| AWS_SECRET_KEY | aws...secret...= <40-char base64> (case-insensitive) |
| OPENAI_API_KEY | sk- + 32 or more alphanumeric chars |
| ANTHROPIC_API_KEY | sk-ant- + alphanumeric/dash/underscore |
| JWT | eyJ...eyJ... (three base64url segments) |
| BEARER_TOKEN | Bearer + 20 or more chars of token material |
| GITHUB_PAT | ghp_ + 20 or more alphanumeric chars |
| GITHUB_PAT_FG | github_pat_ + 20 or more alphanumeric/underscore chars |
| GITHUB_OAUTH | gho_, ghu_, ghs_, or ghr_ + 20 or more chars |
| SLACK_TOKEN | xoxb-, xoxa-, xoxp-, xoxr-, or xoxs- prefix |
| GOOGLE_API_KEY | AIza + 30 or more alphanumeric/underscore/dash chars |
| GOOGLE_OAUTH_REFRESH | 4/0 + 20 or more alphanumeric/underscore/dash chars |
| STRIPE_SECRET_KEY | sk_live_ / sk_test_ / rk_live_ / rk_test_ + 20 or more chars |
| SENDGRID_API_KEY | SG. + 20 or more chars |
| HUGGINGFACE_TOKEN | hf_ + 20 or more alphanumeric chars |
| REPLICATE_TOKEN | r8_ + 20 or more alphanumeric chars |
| NPM_TOKEN | npm_ + 30 or more alphanumeric chars |
| PYPI_TOKEN | pypi- + 20 or more alphanumeric/underscore/dash chars |
| DIGITALOCEAN_TOKEN | dop_v1_ or doo_v1_ + 20 or more chars |
| PERPLEXITY_API_KEY | pplx- + 20 or more alphanumeric chars |
| GROQ_API_KEY | gsk_ + 20 or more alphanumeric chars |
| TAVILY_API_KEY | tvly- + 20 or more alphanumeric chars |
| EXA_API_KEY | exa_ + 20 or more alphanumeric chars |
| BROWSERBASE_KEY | bb_live_ + 20 or more alphanumeric/underscore/dash chars |
| TELEGRAM_BOT_TOKEN | 8+ digit id + : + 30 or more url-safe chars |
| PRIVATE_KEY_BLOCK | -----BEGIN ... PRIVATE KEY----- PEM blocks |
| DB_CONNECTION_STRING | postgres://, mysql://, mongodb://, redis://, amqp:// with embedded credentials |
| PHONE_E164 | A leading + followed by an E.164 phone number (7–15 digits, country code 1–9) |
| DISCORD_MENTION | <@...> Discord snowflake mentions (17–20 digit ids) |

Scrubbing applies patterns in order, so a string containing multiple credential types is fully redacted in one pass.

WorkflowDetectionRecord scrubs at construction. Because the emit_trace path this record flows through does not pass through PiiScrubbingSink, WorkflowDetectionRecord::new scrubs the task string before truncating to the 120-byte excerpt. A secret straddling the truncation boundary is redacted before the cut, not after, so no fragment can survive.
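The ordering argument is easy to check with a toy example. The sketch below is not WorkflowDetectionRecord::new: scrubbed_excerpt and cap_bytes are hypothetical helpers, and the scrubber is passed in as a closure so the example stays self-contained.

```rust
/// Excerpt cap in bytes, from the record description above.
const EXCERPT_MAX_BYTES: usize = 120;

/// Longest prefix of `s` that fits in `max_bytes` without splitting a character.
fn cap_bytes(s: &str, max_bytes: usize) -> &str {
    // End offset of the last whole character that still fits in the cap.
    let end = s
        .char_indices()
        .map(|(start, ch)| start + ch.len_utf8())
        .take_while(|&end| end <= max_bytes)
        .last()
        .unwrap_or(0);
    &s[..end]
}

/// Scrub first, then cut. `scrub` stands in for the real scrubber.
fn scrubbed_excerpt(task: &str, scrub: impl Fn(&str) -> String) -> String {
    // The scrubber sees the whole secret, so it can match it even when the
    // secret would straddle the cut point.
    let scrubbed = scrub(task);
    cap_bytes(&scrubbed, EXCERPT_MAX_BYTES).to_string()
}
```

With the opposite order, the truncated string would end in a partial secret that no longer matches its pattern, and that fragment would be emitted as-is. Scrubbing first means the cut can at worst shorten a redaction marker.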


The [observability] section in config.toml controls runtime behavior:

[observability]
# Shadow workflow detection (B4). Off by default.
# Emits WorkflowDetectionRecord telemetry without changing routing.
workflow_detection_enabled = false

The online_evolution flag (also under [observability]) is shown in configuration.md but controls the GEPA evolution track rather than tracing.

All WAYLAND_* opt-out gates share the same disable vocabulary: off, 0, false, no (case-insensitive, whitespace trimmed). Any other value, including unset, keeps the gate enabled.

| Variable | Default | Effect |
| --- | --- | --- |
| WAYLAND_TRACE_RESULT_SNIPPETS | on | Set to off to suppress result_snippet on all ToolCallTrace records. |
| WAYLAND_PRICING_AUTO_REFRESH | on | Set to off to skip the live pricing fetch and keep the bundled catalog static. |
| WAYLAND_PRICING_PATH | (bundled) | Absolute path to a replacement pricing.toml to use instead of the bundled file. |
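The shared disable vocabulary for the opt-out gates (off, 0, false, no, case-insensitive and trimmed, with everything else leaving the gate on) can be sketched as a single predicate. The function names are hypothetical; this mirrors the stated rule rather than the crate's code.

```rust
/// Interprets an opt-out gate value. `None` models an unset variable.
/// Hypothetical helper mirroring the rule described above.
fn gate_enabled(raw: Option<&str>) -> bool {
    match raw {
        // Unset keeps the gate enabled.
        None => true,
        Some(value) => {
            let v = value.trim().to_ascii_lowercase();
            // Only the four disable words turn the gate off.
            !matches!(v.as_str(), "off" | "0" | "false" | "no")
        }
    }
}

/// Reads the gate from the process environment.
fn env_gate_enabled(name: &str) -> bool {
    gate_enabled(std::env::var(name).ok().as_deref())
}
```

Under this rule a typo such as WAYLAND_TRACE_RESULT_SNIPPETS=disabled leaves snippets on, so it is worth sticking to the four listed values.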

BudgetEvent records flow through the same SpanSink channel via ObservabilityBudgetEventBridge. Each event serializes to a flat JSON object with a "kind" discriminator:

{ "kind": "charge", "session_id": "...", "tokens": 1200, "usd": 0.0036 }
{ "kind": "cap_warn", "session_id": "...", "pct_used": 0.83 }
{ "kind": "cap_block", "session_id": "...", "reason": { ... } }

These records appear inline in the same JSON-Lines stream as TurnTrace and MemoryOpTrace records. Filter on "kind" to separate them.

See Budget & Cost Caps for full cap configuration and event semantics.


wcore-observability also owns the prompt-cache discipline module. mark_cache_boundaries places a MessageCacheHint::Breakpoint on the last message of an LlmRequest before each API call (on providers that support explicit breakpoints per ProviderCompat.cache_message_breakpoints()). Combined with system-prompt and tool-list markers placed by the individual provider build steps, every turn after the first benefits from a long cacheable prefix. The function is idempotent: calling it multiple times on the same request leaves at most one breakpoint at the tail.
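The idempotency property can be illustrated with simplified types. This sketch is not mark_cache_boundaries itself: Message, CacheHint, and mark_tail_breakpoint are hypothetical stand-ins, and clearing earlier breakpoints before marking the tail is an assumption about how "at most one breakpoint at the tail" is achieved.

```rust
/// Hypothetical stand-in for MessageCacheHint.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CacheHint {
    Breakpoint,
}

/// Hypothetical stand-in for one message in an LLM request.
#[derive(Debug, Clone, PartialEq)]
struct Message {
    content: String,
    cache_hint: Option<CacheHint>,
}

/// Leaves exactly one breakpoint, on the last message, if there is one.
fn mark_tail_breakpoint(messages: &mut [Message]) {
    // Assumption: earlier breakpoints are cleared so repeated calls,
    // or calls after new messages arrive, never accumulate markers.
    for message in messages.iter_mut() {
        message.cache_hint = None;
    }
    if let Some(last) = messages.last_mut() {
        last.cache_hint = Some(CacheHint::Breakpoint);
    }
}
```

Calling the function twice on the same slice produces the same result as calling it once, which is the property the paragraph above describes.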