Agent Loop and Turn Lifecycle

Every conversation turn starts and ends inside one method: AgentEngine::run(&mut self, user_input: &str, msg_id: &str) at crates/wcore-agent/src/engine.rs:2312. This page walks through each stage in order.

Overview

Step	What runs
1	`StyleDetector::observe`: fingerprint user vocabulary
2	`observe_user_turn`: write to UserModelBackend
3	`SkillRouter::choose`: Thompson-sampled skill pick
4	`recall_relevant_facts`: inject durable memory
5	`push_user_turn` + WAL write: persist before any LLM call
6	Workflow detection gate: optional ForgeFlow intercept
loop start	Repeats until `StopReason` fires or `max_turns` is reached
7	`cancel_token` check: cooperative cancellation
8	`max_turns` / context-token check: runaway guards
9	`on_turn_start` hooks: pre-turn hook engine
10	`PreCompact` hooks: hook pass before compaction
11	`run_compaction`: multi-level context compaction
12	Tool-list filtering: plan mode gate
13	MCP tool curation: top-K trim
14	System prompt assembly: append plan / skill hint
15	Cache tier + routing hint: provider optimisation
16	`PrePrompt` hooks: hook pass before stream
17	`provider.stream` / retry loop: LLM call with bounded retries
18	Tool dispatch: per-category timeouts and gates
loop end
19	`observe_skill_router_outcome`: credit or debit the pick
20	`observe_auto_skill`: bucketer and drafter trigger

Step-by-step

1. StyleDetector::observe (engine.rs:2315)

StyleDetector::observe(user_input) is called first, before any other logic. It fingerprints the user’s vocabulary so that a UserModelBackend implementation can adapt tone and detail level across sessions.

2. User-model write-back (engine.rs:2321)

observe_user_turn(user_input) writes a per-turn preference observation via the UserModelBackend trait. No-op when no backend is installed; errors are logged and swallowed, never fatal.

3. Per-turn SkillRouter pick (engine.rs:2331)

If a SkillRouter is installed and the skill catalog is non-empty, choose is called against wcore_dispatch::DecisionRouter. The pick is a Thompson Beta-sampled selection from the full catalog. The winner is stashed on the engine as current_skill_router_pick so the matching observe call at turn end credits the same arm.

When a skill is picked, a single non-binding hint line is appended to the system prompt (engine.rs:2547). Engines without a router are unaffected.

4. Cross-session memory recall (engine.rs:2372)

recall_relevant_facts(user_input) queries the durable memory store for facts relevant to the current input and injects them before the first LLM call. On resumed sessions, or with NullMemory, this is a no-op. The intent is that a fresh process can answer from prior-session memory without waiting for the model to invoke session_search explicitly.

5. WAL persistence (engine.rs:2376)

The user message is written to disk before any LLM call. A SIGKILL mid-turn cannot erase it. On the first message in a session, persist_first_message writes the session file and the WAL entry together. On subsequent messages, append_wal appends only. When the session is later resumed, SessionManager::load calls merge_wal to fold the WAL entries back into the message list.
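The write-before-call ordering and the later merge can be sketched as below. This is a minimal illustration, not the engine's code: the function names, the one-line-per-message `<msg_id>\t<text>` format, and de-duplication by message id are all assumptions made for the example.

```rust
use std::fs::OpenOptions;
use std::io::{self, BufRead, BufReader, Write};
use std::path::Path;

/// Append one user message to the WAL and flush it to disk before returning.
/// Hypothetical line format: "<msg_id>\t<text>" (embedded newlines are not handled).
fn append_wal_sketch(wal: &Path, msg_id: &str, text: &str) -> io::Result<()> {
    let mut f = OpenOptions::new().create(true).append(true).open(wal)?;
    writeln!(f, "{msg_id}\t{text}")?;
    // Only after this returns would the caller contact the provider.
    f.sync_all()
}

/// Fold WAL entries into an already-loaded message list,
/// skipping ids that are already present (assumed merge rule).
fn merge_wal_sketch(messages: &mut Vec<(String, String)>, wal: &Path) -> io::Result<()> {
    let file = match std::fs::File::open(wal) {
        Ok(f) => f,
        // No WAL on disk means there is nothing to merge.
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(()),
        Err(e) => return Err(e),
    };
    for line in BufReader::new(file).lines() {
        let line = line?;
        if let Some((id, text)) = line.split_once('\t') {
            if !messages.iter().any(|(seen, _)| seen == id) {
                messages.push((id.to_string(), text.to_string()));
            }
        }
    }
    Ok(())
}
```

The point of the sketch is the ordering: the append is synced to disk before the function returns, so a process killed during the provider call still leaves the message recoverable at load time.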

6. Workflow detection gate (engine.rs:2426)

Before entering the LLM loop, the engine checks whether the input looks like a ForgeFlow candidate via orchestration::intent::workflow_candidate. If all conditions are true (workflow live mode enabled, an approval manager and protocol writer are wired, and the heuristic matches), try_live_workflow runs the synthesised workflow and returns early, skipping the normal turn loop entirely.

If the user declines, or synthesis fails, or the mode is off, None is returned and execution falls through to the normal loop.

7-8. Loop entry: cooperative cancellation and runaway guards

At the top of each iteration, the engine checks cancel_token.is_cancelled(). When a host (TUI, ACP server) fires the token, the loop stops cleanly at the next turn boundary rather than being cancelled mid-await. If an in-turn cancel does leave an unpaired tool_use behind, it is repaired on the next push_user_turn.
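A minimal sketch of checking for cancellation only at the top of an iteration follows. `CancelFlag` and `run_until_cancelled` are hypothetical stand-ins built on the standard library; the engine's actual token type is not shown in this page.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the engine's cancel token.
#[derive(Clone, Default)]
struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Runs `turn` until it reports completion or the flag is set.
/// The flag is consulted only at the top of an iteration, so a turn
/// that has started always runs to its end. Returns the number of turns run.
fn run_until_cancelled(flag: &CancelFlag, mut turn: impl FnMut(usize) -> bool) -> usize {
    let mut completed = 0;
    loop {
        if flag.is_cancelled() {
            break; // clean stop at a turn boundary
        }
        let done = turn(completed);
        completed += 1;
        if done {
            break;
        }
    }
    completed
}
```

A host holding a clone of the flag can call `cancel()` at any time; the loop notices on its next pass and never interrupts work that is already in flight.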

max_turns is an optional override, not the primary runaway guard. When set, it caps the loop at that many iterations. When it is None, the budget cap (from wcore-budget) and the context-token ceiling together act as backstops. The context ceiling is checked after compaction: if the estimated full request size (messages + system + tool defs) still exceeds context_window - output_reserve - emergency_buffer, the run terminates with a user-visible message.
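The ceiling check reduces to one comparison. The sketch below assumes a hypothetical `ContextBudget` struct and token counts held as `u32`; only the three term names come from the description above.

```rust
/// Hypothetical container for the three budget terms named above.
struct ContextBudget {
    context_window: u32,
    output_reserve: u32,
    emergency_buffer: u32,
}

impl ContextBudget {
    /// context_window - output_reserve - emergency_buffer, clamped at zero.
    fn ceiling(&self) -> u32 {
        self.context_window
            .saturating_sub(self.output_reserve)
            .saturating_sub(self.emergency_buffer)
    }

    /// True when the estimated full request (messages + system + tool defs)
    /// is still larger than the ceiling after compaction.
    fn exceeds(&self, messages: u32, system: u32, tool_defs: u32) -> bool {
        let estimate = messages.saturating_add(system).saturating_add(tool_defs);
        estimate > self.ceiling()
    }
}
```

With illustrative numbers (a 200,000-token window, 8,000 reserved for output, a 2,000-token buffer) the ceiling is 190,000 tokens, so a request estimated at exactly 190,000 passes and one token more terminates the run.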

9-10. Pre-turn and PreCompact hooks (engine.rs:2472, 2489)

on_turn_start hooks run at the top of each iteration. A hook that returns block halts the loop cleanly; this is the intended mechanism for “stop after condition X” operators. PreCompact hooks run immediately before the compaction pass and are non-blocking.

11. Context compaction (engine.rs:2504)

run_compaction runs before each API call. On the first turn last_input_tokens is zero, so neither auto-compact nor emergency compaction fires. A compaction failure (for example ContextTooLong from emergency bail) ends the session: session-end hooks fire and the session is saved before the error propagates.

12. Tool-list filtering: plan mode

When plan_state.is_active is true, the tool registry is filtered to ToolCategory::Info tools only, excluding EnterPlanMode itself. Normal mode excludes ExitPlanMode. See Plan Mode for the full gate description.
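The two filtering rules can be sketched as a single predicate. The `Tool` struct is hypothetical, and the sketch assumes ExitPlanMode is itself categorised as Info so that it stays reachable while plan mode is active; the page does not state its category.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum ToolCategory {
    Info,
    Edit,
    Exec,
    Mcp,
}

/// Hypothetical registry entry: a name plus its category.
struct Tool {
    name: &'static str,
    category: ToolCategory,
}

/// Names of the tools offered to the model for this turn.
fn visible_tools(tools: &[Tool], plan_active: bool) -> Vec<&'static str> {
    tools
        .iter()
        .filter(|t| {
            if plan_active {
                // Plan mode: read-only tools, minus the tool that enters plan mode.
                t.category == ToolCategory::Info && t.name != "EnterPlanMode"
            } else {
                // Normal mode: everything except the tool that exits plan mode.
                t.name != "ExitPlanMode"
            }
        })
        .map(|t| t.name)
        .collect()
}
```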

13. MCP tool curation (engine.rs:2527)

apply_mcp_curation trims MCP tools (named mcp__{server}__{tool}) to a top-K set based on audit-log recency. Non-MCP tools (builtins, skills, spawn, plan tools) are always kept. This is a no-op when curation is off.
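A rough sketch of recency-based top-K trimming is shown below. The function names and the `(name, timestamp)` shape of the audit data are assumptions; the real audit-log format and any tie-breaking rule are not described here.

```rust
/// Most recent timestamp recorded for `name`, or 0 if it never appears.
fn recency(name: &str, last_used: &[(&str, u64)]) -> u64 {
    last_used
        .iter()
        .filter(|(n, _)| *n == name)
        .map(|(_, ts)| *ts)
        .max()
        .unwrap_or(0)
}

/// Keeps every non-MCP tool and only the `top_k` most recently used MCP tools.
/// `tools` is the registry in its original order, which the output preserves.
fn curate_mcp(tools: &[&str], last_used: &[(&str, u64)], top_k: usize) -> Vec<String> {
    let mut mcp: Vec<&str> = tools
        .iter()
        .copied()
        .filter(|t| t.starts_with("mcp__"))
        .collect();
    // Most recent first; the sort is stable, so ties keep registry order.
    mcp.sort_by_key(|t| std::cmp::Reverse(recency(t, last_used)));
    mcp.truncate(top_k);

    tools
        .iter()
        .copied()
        .filter(|t| !t.starts_with("mcp__") || mcp.contains(t))
        .map(String::from)
        .collect()
}
```

Setting `top_k` to zero drops every MCP tool while leaving builtins untouched, which shows the asymmetry: curation never removes a non-MCP tool.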

14. System prompt assembly (engine.rs:2529)

The system prompt is assembled from the configured base, with plan-mode instructions appended when active. If the skill router picked a visible skill, one hint line is appended.

15. Cache tier and routing hint (engine.rs:2596, 2636)

pick_cache_tier promotes the request to a 1-hour Anthropic prompt cache tier once the estimated input token count clears the 1024-token minimum. Non-Anthropic providers ignore the field.

A RequestShape is computed from the actual tool call count and token estimate, passed through wcore_providers::route with default heuristics, and stamped onto the request as routing_hint. This surfaces provider routing decisions in observability without affecting correctness.

Orphaned tool_use blocks (any tool_use in history without a matching tool_result) are repaired before the request is sent. Anthropic returns a hard 400 on any orphan.
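Orphan repair amounts to a set difference between tool_use ids and tool_result ids. The sketch below works on a flattened block list with a hypothetical `Block` enum and a made-up placeholder message; real histories nest blocks inside messages, and where the synthetic result must be placed is left out.

```rust
use std::collections::HashSet;

/// Hypothetical, flattened view of the conversation history.
#[derive(Debug, Clone, PartialEq)]
enum Block {
    Text(String),
    ToolUse { id: String },
    ToolResult { tool_use_id: String, is_error: bool, content: String },
}

/// Appends a synthetic error result for every tool_use that has no
/// matching tool_result. Returns how many orphans were repaired.
fn repair_orphans(history: &mut Vec<Block>) -> usize {
    let answered: HashSet<String> = history
        .iter()
        .filter_map(|b| match b {
            Block::ToolResult { tool_use_id, .. } => Some(tool_use_id.clone()),
            _ => None,
        })
        .collect();

    let orphans: Vec<String> = history
        .iter()
        .filter_map(|b| match b {
            Block::ToolUse { id } if !answered.contains(id) => Some(id.clone()),
            _ => None,
        })
        .collect();

    for id in &orphans {
        history.push(Block::ToolResult {
            tool_use_id: id.clone(),
            is_error: true,
            content: "interrupted before a result was recorded".into(),
        });
    }
    orphans.len()
}
```

Running the repair twice is harmless: the second pass finds every tool_use answered and changes nothing.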

16. PrePrompt hooks (engine.rs:2658)

run_pre_prompt fires after the request is fully assembled and before the stream retry loop. It fires once per turn, not once per stream retry.

17. LLM stream loop with bounded retries (engine.rs:2683)

provider.stream() returns Ok(rx) once response headers arrive. The SSE body is then drained from the channel. A failure after headers (connection reset, TLS drop, in-band error SSE frame) surfaces as LlmEvent::Error and is retried up to MAX_STREAM_RETRIES = 2 times with linear backoff. A truncated stream (the channel closes with no Done event) also counts as a failed attempt. Exhausting all retries ends the turn with an error verdict.
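The retry policy can be sketched synchronously as below. It assumes "retried up to 2 times" means one initial attempt plus two retries, and that linear backoff means waiting `base_delay * n` before the n-th retry; `stream_with_retries` and `base_delay` are names chosen for the example, and the real loop is asynchronous.

```rust
use std::thread::sleep;
use std::time::Duration;

const MAX_STREAM_RETRIES: u32 = 2;

/// Calls `attempt` once, then up to MAX_STREAM_RETRIES more times on failure.
/// `attempt` receives the number of retries made so far (0 on the first call).
/// The last error is returned once the retries are exhausted.
fn stream_with_retries<T, E>(
    base_delay: Duration,
    mut attempt: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut retries = 0;
    loop {
        match attempt(retries) {
            Ok(value) => return Ok(value),
            Err(e) if retries >= MAX_STREAM_RETRIES => return Err(e),
            Err(_) => {
                retries += 1;
                // Linear backoff: 1x, then 2x the base delay.
                sleep(base_delay * retries);
            }
        }
    }
}
```

In this shape a truncated stream would simply be one more way for `attempt` to return `Err`, so it consumes a retry exactly like a connection reset.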

The loop iterates until StopReason::EndTurn or StopReason::MaxTokens, at which point control falls through to tool dispatch.

18. Tool dispatch with per-category timeouts

Every ContentBlock::ToolUse in the LLM response is routed through orchestration/mod.rs. Each dispatch is wrapped in tokio::time::timeout keyed on the tool’s ToolCategory:

Category	Timeout	Rationale
`Exec`	600s	Interactive shells and long builds legitimately need minutes.
`Mcp`	120s	Covers MCP/network tools whose subprocess or endpoint can wedge.
`Info`	30s	A file read should never take longer.
`Edit`	30s	A file edit should never take longer.

On elapse, the dispatcher fires the ToolContext.cancel token (so cooperative tools can wind down) and synthesises an error ToolResult. The agent loop continues rather than hanging.
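The table maps directly onto a `match`. In the sketch below, `dispatch_timeout` and `run_with_timeout` are hypothetical names, and the wrapper swaps tokio::time::timeout for a standard-library thread and channel so the example stays dependency-free. Firing the cancel token is not modelled, so a timed-out tool keeps running in its thread.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Clone, Copy, Debug, PartialEq)]
enum ToolCategory {
    Exec,
    Mcp,
    Info,
    Edit,
}

/// Per-category limits, taken from the table above.
fn dispatch_timeout(category: ToolCategory) -> Duration {
    match category {
        ToolCategory::Exec => Duration::from_secs(600),
        ToolCategory::Mcp => Duration::from_secs(120),
        ToolCategory::Info | ToolCategory::Edit => Duration::from_secs(30),
    }
}

/// Runs `tool` on a worker thread and waits at most `limit` for its output.
/// If nothing arrives in time, an error string is synthesised instead,
/// so the caller can continue rather than hang.
fn run_with_timeout<F>(limit: Duration, tool: F) -> Result<String, String>
where
    F: FnOnce() -> String + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that.
        let _ = tx.send(tool());
    });
    rx.recv_timeout(limit)
        .map_err(|_| format!("tool produced no result within {limit:?}"))
}
```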

Approval gates

When the engine is configured to require approval, ToolApprovalManager issues an ApprovalRequired protocol event and a matching Suspend event (gated by capabilities.hitl_suspend). The run pauses until an ApprovalResume arrives. --auto-approve skips the approval prompt for every tool call. --force, --yolo, and --dangerously-skip-permissions bypass tool gates entirely.

Pre- and post-tool hooks

pre_tool_use hooks run before each tool call; a non-zero exit blocks the tool. post_tool_use hooks run after; they are non-blocking and errors are logged. See Hooks for configuration.

19. SkillRouter observe (end of turn)

observe_skill_router_outcome feeds the turn’s StopReason back to the router, calling scorer.record(pick, outcome). EndTurn or ToolUse maps to TaskOutcome::Success. MaxTurns, error, or abort maps to TaskOutcome::Failure. TaskOutcome::Neutral leaves both alpha and beta unchanged.
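The outcome mapping and its effect on one arm can be sketched as follows. `TurnEnd` and `Arm` are hypothetical types, and the unit increments are the textbook Beta-Bernoulli update, assumed here because the page only states that Neutral leaves alpha and beta unchanged.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum TaskOutcome {
    Success,
    Failure,
    Neutral,
}

/// Hypothetical summary of how a turn ended.
#[derive(Clone, Copy, Debug, PartialEq)]
enum TurnEnd {
    EndTurn,
    ToolUse,
    MaxTurns,
    Error,
    Aborted,
}

/// Mapping described above: clean stops credit the pick, everything else debits it.
fn outcome_for(end: TurnEnd) -> TaskOutcome {
    match end {
        TurnEnd::EndTurn | TurnEnd::ToolUse => TaskOutcome::Success,
        TurnEnd::MaxTurns | TurnEnd::Error | TurnEnd::Aborted => TaskOutcome::Failure,
    }
}

/// One skill's Beta(alpha, beta) posterior.
#[derive(Debug, PartialEq)]
struct Arm {
    alpha: f64,
    beta: f64,
}

impl Arm {
    /// Assumed update rule: +1 to alpha on success, +1 to beta on failure.
    fn record(&mut self, outcome: TaskOutcome) {
        match outcome {
            TaskOutcome::Success => self.alpha += 1.0,
            TaskOutcome::Failure => self.beta += 1.0,
            TaskOutcome::Neutral => {}
        }
    }

    /// Posterior mean, the arm's expected success rate.
    fn mean(&self) -> f64 {
        self.alpha / (self.alpha + self.beta)
    }
}
```

Under these assumptions, an arm starting at Beta(1, 1) has its mean rise to 2/3 after one success and return to 0.5 after a following failure.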

20. Auto-skill observation (bootstrap.rs:1642)

observe_auto_skill feeds the turn trajectory to the Bucketer. The bucketer groups turns by normalised task signature: the input is lowercased, tokenised on whitespace and ASCII punctuation, stopwords and short tokens dropped, top-3 tokens by length taken, sorted and joined with -. Three consecutive successes on the same signature trigger a DraftTrigger, which fires the SkillDrafter. Any failure resets the streak for that signature bucket.
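The signature normalisation and streak rule above can be sketched as below. The stopword list, the three-byte minimum token length, alphabetical tie-breaking among equally long tokens, and de-duplication of repeated tokens are assumptions; the page does not specify them, nor what happens to the streak after a trigger.

```rust
use std::collections::HashMap;

/// Hypothetical stopword list and short-token cutoff (in bytes).
const STOPWORDS: [&str; 8] = ["the", "and", "for", "with", "this", "that", "from", "into"];
const MIN_TOKEN_LEN: usize = 3;

/// Lowercase, split on whitespace and ASCII punctuation, drop stopwords and
/// short tokens, keep the three longest, then sort and join with '-'.
fn task_signature(input: &str) -> String {
    let lowered = input.to_lowercase();
    let mut tokens: Vec<&str> = lowered
        .split(|c: char| c.is_whitespace() || c.is_ascii_punctuation())
        .filter(|t| t.len() >= MIN_TOKEN_LEN && !STOPWORDS.iter().any(|s| s == t))
        .collect();
    // Longest first; ties broken alphabetically so the result is deterministic.
    tokens.sort_by(|a, b| b.len().cmp(&a.len()).then(a.cmp(b)));
    tokens.dedup();
    tokens.truncate(3);
    tokens.sort();
    tokens.join("-")
}

/// Tracks consecutive successes per signature.
#[derive(Default)]
struct Bucketer {
    streaks: HashMap<String, u32>,
}

impl Bucketer {
    /// Returns true when this observation is the third consecutive success
    /// for its signature. Any failure resets that signature's streak.
    fn observe(&mut self, input: &str, success: bool) -> bool {
        let streak = self.streaks.entry(task_signature(input)).or_insert(0);
        if success {
            *streak += 1;
        } else {
            *streak = 0;
        }
        *streak == 3
    }
}
```

Because the kept tokens are sorted before joining, differently phrased requests that share their three longest tokens land in the same bucket, which is what lets a streak accumulate across turns.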

The SkillDrafter writes a candidate skill to $WAYLAND_HOME/skills/auto/<sig>/SKILL.md and a manifest.json alongside it (auto_drafted: true, needs_review: true, score: 0.7). The skill is also inserted into evolved_prompts in SQLite so the next session’s SkillRouter hydrates it as a seed pair at boot.

The drafter is installed only when a real Db handle is present (bootstrap.rs:1663). The bucketer always runs.

Session end

When the loop exits for any reason (EndTurn, MaxTurns, error, cancel), session-end hooks fire and save_session is called. The WAL is merged on the next load.

Configuration reference

Key or flag	Default	Effect
`--max-turns <N>`	none (budget/context backstops apply)	Cap the loop at N iterations.
`--auto-approve`	off	Skip approval prompts for all tool calls.
`--force` / `--yolo`	off	Bypass tool gates entirely.
`--compaction <off|safe|full>`	`safe`	Context compaction level.
`--resume <id>`	(none)	Load a previous session by ID.
`--resume-latest`	(none)	Load the most-recent session.
`max_turns` (TOML)	none	Same as `--max-turns`, set in config.
`auto_approve` (TOML)	false	Same as `--auto-approve`, set in config.
`compact.level` (TOML)	`safe`	Compaction level.
`WAYLAND_HOME`	`~/.wayland`	Base directory for skills, logs, evolved prompts.