Agent Loop and Turn Lifecycle

Every conversation turn starts and ends inside one method: AgentEngine::run(&mut self, user_input: &str, msg_id: &str) at crates/wcore-agent/src/engine.rs:2312. This page walks through each stage in order.

| Step | What runs |
| --- | --- |
| 1 | StyleDetector::observe: fingerprint user vocabulary |
| 2 | observe_user_turn: write to UserModelBackend |
| 3 | SkillRouter::choose: Thompson-sampled skill pick |
| 4 | recall_relevant_facts: inject durable memory |
| 5 | push_user_turn + WAL write: persist before any LLM call |
| 6 | Workflow detection gate: optional ForgeFlow intercept |
| loop start | Repeats until StopReason fires or max_turns is reached |
| 7 | cancel_token check: cooperative cancellation |
| 8 | max_turns / context-token check: runaway guards |
| 9 | on_turn_start hooks: pre-turn hook engine |
| 10 | PreCompact hooks: hook pass before compaction |
| 11 | run_compaction: multi-level context compaction |
| 12 | Tool-list filtering: plan mode gate |
| 13 | MCP tool curation: top-K trim |
| 14 | System prompt assembly: append plan / skill hint |
| 15 | Cache tier + routing hint: provider optimisation |
| 16 | PrePrompt hooks: hook pass before stream |
| 17 | provider.stream / retry loop: LLM call with bounded retries |
| 18 | Tool dispatch: per-category timeouts and gates |
| loop end | |
| 19 | observe_skill_router_outcome: credit or debit the pick |
| 20 | observe_auto_skill: bucketer and drafter trigger |
19observe_skill_router_outcome: credit or debit the pick
20observe_auto_skill: bucketer and drafter trigger

1. StyleDetector::observe (engine.rs:2315)

StyleDetector::observe(user_input) is called first, before any other logic. It fingerprints the user’s vocabulary so the UserModelBackend trait can adapt tone and detail level across sessions.

2. observe_user_turn

observe_user_turn(user_input) writes a per-turn preference observation via the UserModelBackend trait. It is a no-op when no backend is installed; errors are logged and swallowed, never fatal.

3. Per-turn SkillRouter pick (engine.rs:2331)

If a SkillRouter is installed and the skill catalog is non-empty, choose is called against wcore_dispatch::DecisionRouter. The pick is a Thompson Beta-sampled selection from the full catalog. The winner is stashed on the engine as current_skill_router_pick so the matching observe call at turn end credits the same arm.

When a skill is picked, a single non-binding hint line is appended to the system prompt (engine.rs:2547). Engines without a router are unaffected.
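The Thompson-sampled pick described above can be sketched as follows. This is a minimal illustration, not the wcore_dispatch implementation: the Arm struct, the XorShift64 generator, and the restriction to integer alpha/beta (which lets a Beta draw be taken as an order statistic of uniform draws) are all assumptions made so the example needs no external crates.

```rust
/// One catalog entry with its Beta posterior. Field names are hypothetical.
#[derive(Debug, Clone)]
pub struct Arm {
    pub skill: String,
    pub alpha: u32, // 1 + observed successes, must be >= 1
    pub beta: u32,  // 1 + observed failures, must be >= 1
}

/// Tiny xorshift64 generator (stand-in for a real RNG). Seed must be non-zero.
pub struct XorShift64(pub u64);

impl XorShift64 {
    /// Returns a uniform value in [0, 1).
    pub fn next_f64(&mut self) -> f64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        (x >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Draws from Beta(alpha, beta) for integer parameters: the alpha-th smallest
/// of (alpha + beta - 1) uniform draws has exactly that distribution.
fn sample_beta(alpha: u32, beta: u32, rng: &mut XorShift64) -> f64 {
    let n = (alpha + beta - 1) as usize;
    let mut draws: Vec<f64> = (0..n).map(|_| rng.next_f64()).collect();
    draws.sort_by(|a, b| a.total_cmp(b));
    draws[(alpha - 1) as usize]
}

/// Thompson sampling: draw once from every arm's posterior, keep the largest.
/// Returns None for an empty catalog, mirroring the "non-empty" precondition.
pub fn choose<'a>(catalog: &'a [Arm], rng: &mut XorShift64) -> Option<&'a Arm> {
    let mut best: Option<(&'a Arm, f64)> = None;
    for arm in catalog {
        let draw = sample_beta(arm.alpha, arm.beta, rng);
        if best.map_or(true, |(_, top)| draw > top) {
            best = Some((arm, draw));
        }
    }
    best.map(|(arm, _)| arm)
}
```

Because every arm is sampled rather than ranked by its mean, a skill with little evidence still wins occasionally, which is what lets the router keep exploring the catalog.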

4. Cross-session memory recall (engine.rs:2372)

recall_relevant_facts(user_input) queries the durable memory store for facts relevant to the current input and injects them before the first LLM call. On resumed sessions, or with NullMemory, this is a no-op. The intent is that a fresh process can answer from prior-session memory without waiting for the model to invoke session_search explicitly.

5. push_user_turn and WAL write

The user message is written to disk before any LLM call. A SIGKILL mid-turn cannot erase it. On the first message in a session, persist_first_message writes the session file and the WAL entry together. On subsequent messages, append_wal appends only. When the session is later resumed, SessionManager::load calls merge_wal to fold the WAL entries back into the message list.
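The write-ahead pattern can be illustrated with a small sketch. The record format here (one tab-separated msg_id and text per line), the function signatures, and deduplication by msg_id are assumptions chosen for the example; the engine's actual WAL layout is not described on this page.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Appends one record and forces it to disk before returning, so a crash
/// after this call cannot lose the message. Assumes single-line text; a
/// real WAL would need proper record framing.
pub fn append_wal(wal_path: &Path, msg_id: &str, text: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(wal_path)?;
    writeln!(file, "{}\t{}", msg_id, text)?;
    file.sync_all()
}

/// Folds WAL records into the loaded message list, skipping ids that the
/// session file already contains and lines that are malformed (for example
/// a torn final write).
pub fn merge_wal(messages: &mut Vec<(String, String)>, wal_path: &Path) -> io::Result<()> {
    let contents = match fs::read_to_string(wal_path) {
        Ok(text) => text,
        // No WAL on disk means there is nothing to merge.
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(()),
        Err(err) => return Err(err),
    };
    for line in contents.lines() {
        let Some((msg_id, text)) = line.split_once('\t') else {
            continue;
        };
        if !messages.iter().any(|(id, _)| id == msg_id) {
            messages.push((msg_id.to_string(), text.to_string()));
        }
    }
    Ok(())
}
```

The important property is ordering: the sync_all call completes before the provider is contacted, so durability never depends on the turn finishing.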

6. Workflow detection gate (engine.rs:2426)

Before entering the LLM loop, the engine checks whether the input looks like a ForgeFlow candidate via orchestration::intent::workflow_candidate. If all conditions are true (workflow live mode enabled, an approval manager and protocol writer are wired, and the heuristic matches), try_live_workflow runs the synthesised workflow and returns early, skipping the normal turn loop entirely.

If the user declines, or synthesis fails, or the mode is off, None is returned and execution falls through to the normal loop.

7-8. Loop entry: cooperative cancellation and runaway guards

At the top of each iteration, the engine checks cancel_token.is_cancelled(). A host (TUI, ACP server) that fires the token stops the loop cleanly at a turn boundary, avoiding mid-await cancellation. An unpaired tool_use left by an in-turn cancel is repaired on the next push_user_turn.

max_turns is an optional override, not the primary runaway guard. When set it caps the loop at that count. When None, the budget cap (from wcore-budget) and the context-token ceiling together act as backstops. The context ceiling is checked after compaction: if the estimated full request size (messages + system + tool defs) still exceeds context_window - output_reserve - emergency_buffer, the run terminates with a user-visible message.
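The guards in this section can be sketched as two small functions. The type and function names are hypothetical, an AtomicBool stands in for the real cancel token, and the budget cap from wcore-budget is left out; only the comparison against context_window - output_reserve - emergency_buffer comes directly from the description above.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(Debug, PartialEq, Eq)]
pub enum LoopGuard {
    Continue,
    Cancelled,
    MaxTurnsReached,
}

pub struct Limits {
    pub max_turns: Option<u32>, // None: rely on the other backstops
    pub context_window: u64,
    pub output_reserve: u64,
    pub emergency_buffer: u64,
}

/// Checked at the top of each iteration: cancellation first, then the
/// optional turn cap.
pub fn check_loop_entry(cancelled: &AtomicBool, turn: u32, limits: &Limits) -> LoopGuard {
    if cancelled.load(Ordering::SeqCst) {
        return LoopGuard::Cancelled;
    }
    match limits.max_turns {
        Some(cap) if turn >= cap => LoopGuard::MaxTurnsReached,
        _ => LoopGuard::Continue,
    }
}

/// Checked after compaction: does the estimated request (messages + system
/// + tool defs) still exceed the usable part of the context window?
pub fn exceeds_context_ceiling(estimated_request_tokens: u64, limits: &Limits) -> bool {
    // saturating_sub avoids underflow if the reserves exceed the window.
    let ceiling = limits
        .context_window
        .saturating_sub(limits.output_reserve)
        .saturating_sub(limits.emergency_buffer);
    estimated_request_tokens > ceiling
}
```

With a 200,000-token window, an 8,000-token output reserve, and a 2,000-token emergency buffer, the ceiling is 190,000 tokens; a request estimated at 190,001 tokens would end the run.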

9-10. Pre-turn and PreCompact hooks (engine.rs:2472, 2489)

on_turn_start hooks run at the top of each iteration. A hook that returns block halts the loop cleanly; this is the intended mechanism for “stop after condition X” operators. PreCompact hooks run immediately before the compaction pass and are otherwise non-blocking.

11. run_compaction

run_compaction runs before each API call. On the first turn last_input_tokens is zero, so neither auto-compact nor emergency compaction fires. A compaction failure (for example ContextTooLong from emergency bail) ends the session: session-end hooks fire and the session is saved before the error propagates.

12. Tool-list filtering

When plan_state.is_active is true, the tool registry is filtered to ToolCategory::Info tools only, excluding EnterPlanMode itself. Normal mode excludes ExitPlanMode. See Plan Mode for the full gate description.
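A minimal sketch of this filter follows. The Tool struct, the visible_tools function, and the sample tool names in the usage note are hypothetical, and the sketch assumes ExitPlanMode is categorised as Info so that it stays reachable while planning.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolCategory {
    Info,
    Edit,
    Exec,
    Mcp,
}

#[derive(Debug, Clone)]
pub struct Tool {
    pub name: String,
    pub category: ToolCategory,
}

/// Returns the tools offered to the model for this turn.
/// Plan mode: read-only (Info) tools, minus EnterPlanMode.
/// Normal mode: everything except ExitPlanMode.
pub fn visible_tools(registry: &[Tool], plan_active: bool) -> Vec<&Tool> {
    registry
        .iter()
        .filter(|tool| {
            if plan_active {
                tool.category == ToolCategory::Info && tool.name != "EnterPlanMode"
            } else {
                tool.name != "ExitPlanMode"
            }
        })
        .collect()
}
```

For a registry holding Read (Info), Write (Edit), Bash (Exec), EnterPlanMode, and ExitPlanMode, plan mode would expose only Read and ExitPlanMode, while normal mode would expose everything except ExitPlanMode.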

13. MCP tool curation

apply_mcp_curation trims MCP tools (named mcp__{server}__{tool}) to a top-K set based on audit-log recency. Non-MCP tools (builtins, skills, spawn, plan tools) are always kept. This is a no-op when curation is off.
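The trimming rule can be sketched like this. The signature is an assumption: a last_used map of tool name to timestamp stands in for the audit log, top_k = None models "curation off", and ties are broken by name only to keep the example deterministic.

```rust
use std::collections::HashMap;

/// Keeps every non-MCP tool and only the `top_k` most recently used MCP
/// tools. The original ordering of `tools` is preserved in the result.
pub fn apply_mcp_curation(
    tools: &[String],
    last_used: &HashMap<String, u64>,
    top_k: Option<usize>,
) -> Vec<String> {
    // Curation off: return the list unchanged.
    let Some(k) = top_k else {
        return tools.to_vec();
    };

    let mut mcp: Vec<&String> = tools.iter().filter(|n| n.starts_with("mcp__")).collect();
    // Most recent first; tools with no recorded use sort last.
    mcp.sort_by(|a, b| {
        let ta = last_used.get(*a).copied().unwrap_or(0);
        let tb = last_used.get(*b).copied().unwrap_or(0);
        tb.cmp(&ta).then_with(|| a.cmp(b))
    });
    let keep: Vec<&String> = mcp.into_iter().take(k).collect();

    tools
        .iter()
        .filter(|n| !n.starts_with("mcp__") || keep.contains(n))
        .cloned()
        .collect()
}
```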

14. System prompt assembly (engine.rs:2529)

The system prompt is assembled from the configured base, with plan-mode instructions appended when active. If the skill router picked a visible skill, one hint line is appended.

15. Cache tier and routing hint (engine.rs:2596, 2636)

pick_cache_tier promotes the request to a 1-hour Anthropic prompt cache tier once the estimated input token count clears the 1024-token minimum. Non-Anthropic providers ignore the field.

A RequestShape is computed from the actual tool call count and token estimate, passed through wcore_providers::route with default heuristics, and stamped onto the request as routing_hint. This surfaces provider routing decisions in observability without affecting correctness.

Orphaned tool_use blocks (any tool_use in history without a matching tool_result) are repaired before the request is sent. Anthropic returns a hard 400 on any orphan.
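One way to picture the repair is sketched below. The page does not say how the engine repairs an orphan, so synthesising an error tool_result directly after the orphaned tool_use is an assumption, as are the flat Block list (real history is grouped into role-tagged messages) and every name in the example.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    Text(String),
    ToolUse { id: String, name: String },
    ToolResult { tool_use_id: String, content: String, is_error: bool },
}

/// Inserts a synthetic error result after every tool_use that has no
/// matching tool_result. Returns how many orphans were repaired.
pub fn repair_orphans(history: &mut Vec<Block>) -> usize {
    // Collect the ids that already have a result.
    let answered: HashSet<String> = history
        .iter()
        .filter_map(|block| match block {
            Block::ToolResult { tool_use_id, .. } => Some(tool_use_id.clone()),
            _ => None,
        })
        .collect();

    let mut repaired = Vec::with_capacity(history.len());
    let mut fixed = 0;
    for block in history.drain(..) {
        let orphan_id = match &block {
            Block::ToolUse { id, .. } if !answered.contains(id) => Some(id.clone()),
            _ => None,
        };
        repaired.push(block);
        if let Some(id) = orphan_id {
            repaired.push(Block::ToolResult {
                tool_use_id: id,
                content: "tool call interrupted before a result was recorded".to_string(),
                is_error: true,
            });
            fixed += 1;
        }
    }
    *history = repaired;
    fixed
}
```

The function is idempotent: running it a second time finds no orphans, so it is safe to apply before every request.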

16. PrePrompt hooks

run_pre_prompt fires after the request is fully assembled and before the stream retry loop. It fires once per turn, not once per stream retry.

17. LLM stream loop with bounded retries (engine.rs:2683)

provider.stream() returns Ok(rx) after response headers arrive. The SSE body is drained from the channel. A failure after headers (connection reset, TLS drop, in-band error SSE frame) surfaces as LlmEvent::Error and is retried up to MAX_STREAM_RETRIES = 2 times with linear backoff. A truncated stream (channel closes with no Done event) is also treated as a failed attempt. Exhausting all retries ends the turn as an error verdict.
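The retry policy can be sketched synchronously as follows. This is an illustration under stated assumptions, not the engine's async code: the attempt closure stands in for opening and draining one stream, StreamOutcome is a hypothetical type, and "retried up to 2 times" is read as one initial attempt plus two retries.

```rust
use std::thread;
use std::time::Duration;

pub const MAX_STREAM_RETRIES: u32 = 2;

/// Result of draining one stream attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum StreamOutcome {
    Done(String),  // a Done event arrived; payload is the assembled text
    Error(String), // post-header failure surfaced as an error event
    Truncated,     // channel closed with no Done event
}

/// Runs `attempt` until it completes or the retry budget is exhausted.
/// `attempt` receives the zero-based attempt number.
pub fn stream_with_retries<F>(mut attempt: F, backoff_step: Duration) -> Result<String, String>
where
    F: FnMut(u32) -> StreamOutcome,
{
    let mut last_error = String::new();
    for n in 0..=MAX_STREAM_RETRIES {
        if n > 0 {
            // Linear backoff: one step before the first retry, two before the second.
            thread::sleep(backoff_step * n);
        }
        match attempt(n) {
            StreamOutcome::Done(text) => return Ok(text),
            StreamOutcome::Error(message) => last_error = message,
            StreamOutcome::Truncated => {
                last_error = "stream closed without a Done event".to_string();
            }
        }
    }
    Err(last_error)
}
```

Treating truncation the same as an explicit error matters because a silently closed channel would otherwise look like a short but successful response.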

The loop iterates until StopReason::EndTurn or StopReason::MaxTokens, at which point control falls through to tool dispatch.

18. Tool dispatch with per-category timeouts

Every ContentBlock::ToolUse in the LLM response is routed through orchestration/mod.rs. Each dispatch is wrapped in tokio::time::timeout keyed on the tool’s ToolCategory:

| Category | Timeout | Rationale |
| --- | --- | --- |
| Exec | 600s | Interactive shells and long builds legitimately need minutes. |
| Mcp | 120s | Covers MCP/network tools whose subprocess or endpoint can wedge. |
| Info | 30s | A file read should never take longer. |
| Edit | 30s | A file edit should never take longer. |

On elapse, the dispatcher fires the ToolContext.cancel token (so cooperative tools can wind down) and synthesises an error ToolResult. The agent loop continues rather than hanging.
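The timeout table and the "synthesise an error instead of hanging" behaviour can be sketched with the standard library alone. The engine wraps dispatch in tokio::time::timeout; this example swaps in a worker thread and a channel with recv_timeout so it runs without external crates, and it omits firing the ToolContext.cancel token. The function names are hypothetical.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolCategory {
    Exec,
    Mcp,
    Info,
    Edit,
}

/// Per-category limits from the table above.
pub fn timeout_for(category: ToolCategory) -> Duration {
    match category {
        ToolCategory::Exec => Duration::from_secs(600),
        ToolCategory::Mcp => Duration::from_secs(120),
        ToolCategory::Info | ToolCategory::Edit => Duration::from_secs(30),
    }
}

/// Runs `tool` on a worker thread and waits at most `limit` for its output.
/// On elapse the caller gets an error string it can turn into an error
/// tool result; the worker is left to finish on its own.
pub fn dispatch_with_timeout<F>(tool: F, limit: Duration) -> Result<String, String>
where
    F: FnOnce() -> String + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that.
        let _ = tx.send(tool());
    });
    match rx.recv_timeout(limit) {
        Ok(output) => Ok(output),
        Err(mpsc::RecvTimeoutError::Timeout) => Err(format!("tool timed out after {:?}", limit)),
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            Err("tool exited without producing a result".to_string())
        }
    }
}
```

A caller would combine the two as dispatch_with_timeout(run_tool, timeout_for(category)), where run_tool is whatever closure executes the tool.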

When the engine is configured to require approval, ToolApprovalManager issues an ApprovalRequired protocol event and a matching Suspend event (gated by capabilities.hitl_suspend). The run pauses until an ApprovalResume arrives. --auto-approve skips the approval prompts for all tool calls. --force / --yolo / --dangerously-skip-permissions bypass the tool gates entirely.

pre_tool_use hooks run before each tool call; a non-zero exit blocks the tool. post_tool_use hooks run after; they are non-blocking and errors are logged. See Hooks for configuration.

19. observe_skill_router_outcome

observe_skill_router_outcome feeds the turn’s StopReason back to the router, calling scorer.record(pick, outcome). EndTurn or ToolUse maps to TaskOutcome::Success. MaxTurns, error, or abort maps to TaskOutcome::Failure. TaskOutcome::Neutral leaves both alpha and beta unchanged.
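The credit/debit step can be sketched as a Beta posterior update. The unit increments, the ArmStats struct, and the Error and Aborted variant names are assumptions; only the Success/Failure/Neutral mapping comes from the description above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StopReason {
    EndTurn,
    ToolUse,
    MaxTurns,
    Error,   // hypothetical variant name
    Aborted, // hypothetical variant name
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskOutcome {
    Success,
    Failure,
    Neutral,
}

/// Maps how the turn ended to the outcome fed back to the router.
pub fn outcome_for(stop: StopReason) -> TaskOutcome {
    match stop {
        StopReason::EndTurn | StopReason::ToolUse => TaskOutcome::Success,
        StopReason::MaxTurns | StopReason::Error | StopReason::Aborted => TaskOutcome::Failure,
    }
}

/// Beta posterior for one skill.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ArmStats {
    pub alpha: u32,
    pub beta: u32,
}

impl ArmStats {
    /// Success credits alpha, Failure debits via beta, Neutral changes nothing.
    pub fn record(&mut self, outcome: TaskOutcome) {
        match outcome {
            TaskOutcome::Success => self.alpha += 1,
            TaskOutcome::Failure => self.beta += 1,
            TaskOutcome::Neutral => {}
        }
    }
}
```

Recording against the same arm that was stashed as current_skill_router_pick is what closes the loop: each success shifts that skill's Beta distribution toward higher draws on later turns.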

20. Auto-skill observation (bootstrap.rs:1642)

observe_auto_skill feeds the turn trajectory to the Bucketer. The bucketer groups turns by normalised task signature: the input is lowercased, tokenised on whitespace and ASCII punctuation, stopwords and short tokens dropped, top-3 tokens by length taken, sorted and joined with -. Three consecutive successes on the same signature trigger a DraftTrigger, which fires the SkillDrafter. Any failure resets the streak for that signature bucket.
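The signature normalisation and streak tracking can be sketched as below. Several details are assumptions because the page does not specify them: the stopword list, the three-character minimum token length, deduplicating tokens, breaking length ties alphabetically, and resetting the streak once a trigger fires.

```rust
use std::collections::HashMap;

// Hypothetical stopword list and length cutoff.
const STOPWORDS: &[&str] = &["the", "and", "for", "with", "this", "that", "please"];
const MIN_TOKEN_LEN: usize = 3;

/// Lowercase, split on whitespace and ASCII punctuation, drop stopwords and
/// short tokens, keep the three longest, then sort and join with '-'.
pub fn task_signature(input: &str) -> String {
    let lowered = input.to_lowercase();
    let mut tokens: Vec<&str> = lowered
        .split(|c: char| c.is_whitespace() || c.is_ascii_punctuation())
        .filter(|t| t.len() >= MIN_TOKEN_LEN && !STOPWORDS.contains(t))
        .collect();
    tokens.sort();
    tokens.dedup();
    // Longest first; the stable sort keeps alphabetical order among ties.
    tokens.sort_by(|a, b| b.len().cmp(&a.len()));
    tokens.truncate(3);
    tokens.sort();
    tokens.join("-")
}

#[derive(Default)]
pub struct Bucketer {
    streaks: HashMap<String, u32>,
}

impl Bucketer {
    /// Records one turn. Returns true when this turn completes three
    /// consecutive successes for its signature (the draft trigger).
    pub fn observe(&mut self, input: &str, success: bool) -> bool {
        let streak = self.streaks.entry(task_signature(input)).or_insert(0);
        if !success {
            *streak = 0; // any failure resets the bucket
            return false;
        }
        *streak += 1;
        if *streak >= 3 {
            *streak = 0; // assumption: start counting again after firing
            true
        } else {
            false
        }
    }
}
```

Under these assumptions, "Please refactor the authentication module!" normalises to authentication-module-refactor, so differently phrased requests for the same task land in the same bucket.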

The SkillDrafter writes a candidate skill to $WAYLAND_HOME/skills/auto/<sig>/SKILL.md and a manifest.json alongside it (auto_drafted: true, needs_review: true, score: 0.7). The skill is also inserted into evolved_prompts in SQLite so the next session’s SkillRouter hydrates it as a seed pair at boot.

The drafter is installed only when a real Db handle is present (bootstrap.rs:1663). The bucketer always runs.

When the loop exits for any reason (EndTurn, MaxTurns, error, cancel), session-end hooks fire and save_session is called. The WAL is merged on the next load.

| Key or flag | Default | Effect |
| --- | --- | --- |
| --max-turns <N> | none (budget/context backstops apply) | Cap the loop at N iterations. |
| --auto-approve | off | Skip approval prompts for all tool calls. |
| --force / --yolo | off | Bypass tool gates entirely. |
| --compaction <off\|safe\|full> | safe | Context compaction level. |
| --resume <id> | (none) | Load a previous session by ID. |
| --resume-latest | (none) | Load the most-recent session. |
| max_turns (TOML) | none | Same as --max-turns, set in config. |
| auto_approve (TOML) | false | Same as --auto-approve, set in config. |
| compact.level (TOML) | safe | Compaction level. |
| WAYLAND_HOME | ~/.wayland | Base directory for skills, logs, evolved prompts. |