Decision Router and Thompson Sampling
wcore-dispatch is the internal routing layer that selects orchestration templates, agent personas, and skills at runtime. It is not a model router (provider selection lives in wcore-providers). Its purpose is to learn from past outcomes and prefer choices that have worked on similar tasks.
Architecture
```
wcore-dispatch (trait + scorer)
  |
  +-- TemplateRouter   picks orchestration template (Direct / Consensus / ...)
  +-- AgentRouter      picks a named agent from the AgentPack
  +-- SkillRouter      picks a catalog skill per turn (lives in wcore-skills)
```

All three routers share the same DecisionRouter trait and the same BetaScorer backend.
The DecisionRouter trait (lib.rs)
```rust
pub trait DecisionRouter<TKey, TInput> {
    fn choose(&mut self, input: TInput) -> Result<TKey, RouterError>;
    fn observe(&mut self, choice: &TKey, outcome: TaskOutcome);
}
```

choose picks the best candidate for the current input. observe feeds the outcome of a prior choice back to the scorer. The trait is not Send + Sync on its own; callers wrap implementations in a Mutex when crossing async boundaries.
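As a minimal sketch of the Mutex wrapping described above, the following defines a toy router and shares it with a worker. TaskOutcome and RouterError are hypothetical stand-ins here (the real definitions live in wcore-dispatch), FirstRouter and route_from_worker are illustrative names, and a plain OS thread stands in for an async task so the example needs only std.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-ins for the real wcore-dispatch types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TaskOutcome { Success, Failure, Neutral }

#[derive(Debug, PartialEq)]
pub enum RouterError { NoCandidates }

pub trait DecisionRouter<TKey, TInput> {
    fn choose(&mut self, input: TInput) -> Result<TKey, RouterError>;
    fn observe(&mut self, choice: &TKey, outcome: TaskOutcome);
}

/// Toy router: always returns the first candidate and counts
/// the non-neutral outcomes it has been shown.
pub struct FirstRouter { pub observed: u32 }

impl DecisionRouter<String, Vec<String>> for FirstRouter {
    fn choose(&mut self, input: Vec<String>) -> Result<String, RouterError> {
        input.into_iter().next().ok_or(RouterError::NoCandidates)
    }
    fn observe(&mut self, _choice: &String, outcome: TaskOutcome) {
        if outcome != TaskOutcome::Neutral {
            self.observed += 1;
        }
    }
}

/// Runs one choose/observe round on another thread. The Arc<Mutex<_>>
/// wrapper is what makes the &mut self methods usable across the boundary.
pub fn route_from_worker(
    router: Arc<Mutex<FirstRouter>>,
    candidates: Vec<String>,
) -> Result<String, RouterError> {
    let handle = thread::spawn(move || -> Result<String, RouterError> {
        let mut guard = router.lock().expect("router mutex poisoned");
        let pick = guard.choose(candidates)?;
        guard.observe(&pick, TaskOutcome::Success);
        Ok(pick)
    });
    handle.join().expect("worker panicked")
}
```

Holding the lock across both choose and observe keeps the pair atomic in this sketch; a real engine observes at turn end, so it would take the lock twice.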
TaskOutcome has three variants:
| Variant | Effect on scorer |
|---|---|
| Success | Increments alpha (success count) for the chosen arm. |
| Failure | Increments beta (failure count) for the chosen arm. |
| Neutral | No-op. Neither alpha nor beta changes. |
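The table can be read as a three-way match over the outcome. The sketch below assumes a Stats struct with success and failure counters; the field names and the record method are illustrative, not the crate's confirmed API.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TaskOutcome { Success, Failure, Neutral }

/// Raw observation counts for one arm (field names are assumptions).
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct Stats {
    pub success: u32,
    pub failure: u32,
}

impl Stats {
    /// Applies one outcome to the counters, mirroring the table above.
    pub fn record(&mut self, outcome: TaskOutcome) {
        match outcome {
            TaskOutcome::Success => self.success += 1,
            TaskOutcome::Failure => self.failure += 1,
            TaskOutcome::Neutral => {} // deliberate no-op
        }
    }
}
```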
BetaScorer (scorer.rs)
BetaScorer<TKey> maintains a HashMap<TKey, Stats> where each entry holds a (success, failure) pair. Picking works as follows:
- For each candidate arm, read (success, failure). A cold-start arm with no observations defaults to (0, 0).
- Compute alpha = success + 1, beta = failure + 1. A cold-start arm is therefore Beta(1, 1) = Uniform[0, 1].
- Draw one sample from Beta(alpha, beta) using rand_distr::Beta.
- Return the arm with the highest sample.
All samples are computed before the argmax comparison. Drawing inside the argmax loop would re-randomize and corrupt the comparison.
Cold-start arms receive Beta(1, 1), which gives them equal expected value to any other cold-start arm. As outcomes accumulate, the posteriors sharpen and the router converges toward the stronger arms.
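The picking procedure can be sketched without external crates. This is not the BetaScorer implementation: it swaps rand_distr::Beta for an order-statistic sampler (for positive integer parameters, the alpha-th smallest of alpha + beta - 1 uniform draws follows Beta(alpha, beta)), and it uses a small SplitMix64 generator in place of the OS-seeded RNG. The names SplitMix64, sample_beta, and pick are illustrative.

```rust
use std::collections::HashMap;

/// Small deterministic PRNG (SplitMix64), a stand-in for the real RNG.
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1) built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Samples Beta(alpha, beta) for integer alpha >= 1 and beta >= 1 as the
/// alpha-th smallest of (alpha + beta - 1) uniform draws.
pub fn sample_beta(alpha: u32, beta: u32, rng: &mut SplitMix64) -> f64 {
    let n = (alpha + beta - 1) as usize;
    let mut draws: Vec<f64> = (0..n).map(|_| rng.next_f64()).collect();
    draws.sort_by(|a, b| a.partial_cmp(b).expect("uniforms are never NaN"));
    draws[(alpha - 1) as usize]
}

/// Thompson pick: draw one sample per arm first, then take the argmax.
pub fn pick<'a>(
    arms: &[&'a str],
    stats: &HashMap<&str, (u32, u32)>,
    rng: &mut SplitMix64,
) -> Option<&'a str> {
    // Phase 1: exactly one draw per arm; unseen arms default to (0, 0).
    let samples: Vec<(&'a str, f64)> = arms
        .iter()
        .map(|&arm| {
            let (success, failure) = stats.get(arm).copied().unwrap_or((0, 0));
            (arm, sample_beta(success + 1, failure + 1, rng))
        })
        .collect();
    // Phase 2: compare the fixed samples; nothing is re-drawn here.
    samples
        .into_iter()
        .max_by(|a, b| a.1.partial_cmp(&b.1).expect("samples are never NaN"))
        .map(|(arm, _)| arm)
}
```

With two cold-start arms both samples are uniform, so either arm can win. Once one arm has many successes and the other many failures, the strong arm wins almost every draw.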
BetaScorer::new() seeds the RNG from the OS. BetaScorer::with_seed(seed) is deterministic and intended for tests only. restore hydrates the stats map from a previously persisted snapshot (used at boot to load GEPA winners).
TemplateRouter (template_router.rs)
TemplateRouter picks among five orchestration templates:
| Template | Description |
|---|---|
| Direct | Single-agent call. |
| Consensus | Parallel fan-out with majority joiner. |
| SelfCritique | Agent then critic loop. |
| Adaptive | Replan-on-result via a ReplanFn. |
| Hierarchical | Supervisor with delegated sub-graphs. |
TemplateRouter::new() includes all five arms. with_arms(arms) restricts the candidate set (for example, dropping Hierarchical on a single-host deploy). An empty arms list falls back to all five.
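The empty-list fallback can be illustrated with a small stand-in type. TemplateArms, ALL_TEMPLATES, and the arms accessor are hypothetical names for this sketch; only the five variants and the fallback rule come from the description above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Template { Direct, Consensus, SelfCritique, Adaptive, Hierarchical }

pub const ALL_TEMPLATES: [Template; 5] = [
    Template::Direct,
    Template::Consensus,
    Template::SelfCritique,
    Template::Adaptive,
    Template::Hierarchical,
];

/// Hypothetical holder for the candidate set of a template router.
pub struct TemplateArms { arms: Vec<Template> }

impl TemplateArms {
    /// Default: all five templates are candidates.
    pub fn new() -> Self {
        Self { arms: ALL_TEMPLATES.to_vec() }
    }

    /// Restricts the candidate set; an empty list falls back to all five.
    pub fn with_arms(arms: Vec<Template>) -> Self {
        if arms.is_empty() { Self::new() } else { Self { arms } }
    }

    pub fn arms(&self) -> &[Template] {
        &self.arms
    }
}
```

Falling back instead of returning an error means a misconfigured deploy still routes, at the cost of silently re-enabling arms the operator may have meant to drop.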
Manual override
Embed @@template=<name> (case-insensitive) anywhere in the input to force a specific template:

```
please use @@template=consensus for this analysis
```

The override is honored only if the named template is in the configured arm set. An unrecognised name falls back to the scorer. Accepted spellings: direct, consensus, self_critique or self-critique or selfcritique, adaptive, hierarchical.
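One way to parse such an override is sketched below. The function names are hypothetical, and two details are assumptions rather than documented behaviour: only the first marker in the input is considered, and the name ends at the first character that is not alphanumeric, an underscore, or a hyphen. A None result corresponds to falling back to the scorer.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Template { Direct, Consensus, SelfCritique, Adaptive, Hierarchical }

/// Maps an already-lowercased name to a template, covering the accepted spellings.
fn parse_name(name: &str) -> Option<Template> {
    match name {
        "direct" => Some(Template::Direct),
        "consensus" => Some(Template::Consensus),
        "self_critique" | "self-critique" | "selfcritique" => Some(Template::SelfCritique),
        "adaptive" => Some(Template::Adaptive),
        "hierarchical" => Some(Template::Hierarchical),
        _ => None,
    }
}

/// Returns the forced template, or None when the caller should use the scorer.
pub fn template_override(input: &str, arms: &[Template]) -> Option<Template> {
    const MARKER: &str = "@@template=";
    // Lowercasing the whole input makes both marker and name case-insensitive.
    let lower = input.to_ascii_lowercase();
    let start = lower.find(MARKER)? + MARKER.len();
    let name: String = lower[start..]
        .chars()
        .take_while(|c| c.is_ascii_alphanumeric() || *c == '_' || *c == '-')
        .collect();
    // Honor the override only when the template is in the configured arm set.
    parse_name(&name).filter(|t| arms.contains(t))
}
```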
Installation (bootstrap.rs:1635)
The TemplateRouter is installed at bootstrap under the label F-024. It is default-initialised (all five arms, OS-RNG seed). Before this fix, set_template_router had zero production callers and every turn fell through to the deterministic IntentClassifier. The IntentClassifier (keyword-based, in orchestration/intent.rs) remains the cold-start fallback when the router has no data for a given input.
AgentRouter (agent_router.rs)
AgentRouter picks a named agent from the AgentPack registry. The arm set defaults to all 13 bundled agents (via AgentRouter::new_with_all_agents()), or you can restrict it with with_allowlist. Names not present in the registry are silently dropped at construction time, so a stale allowlist does not break startup.
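The silent-drop behaviour amounts to intersecting the allowlist with the registry. The sketch below uses a hypothetical resolve_arms helper, and the agent names other than security-auditor are made up for illustration.

```rust
/// Keeps only the allowlisted names that exist in the registry,
/// preserving allowlist order. Unknown names are dropped, not rejected.
pub fn resolve_arms(registry: &[&str], allowlist: &[&str]) -> Vec<String> {
    allowlist
        .iter()
        .filter(|name| registry.iter().any(|known| known == *name))
        .map(|name| (*name).to_string())
        .collect()
}
```

A stale entry such as a retired agent simply vanishes from the arm set. What the real router does when every entry is stale is not covered by this sketch.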
Manual override
Embed @@agent=<name> (case-insensitive prefix, name preserved verbatim) to force a specific agent:

```
@@agent=security-auditor review this diff
```

The override is honored only if the named agent is in the current arm set. An override that names an agent outside the set falls back to the scorer.
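A sketch of the agent override follows, under two assumptions: the name runs up to the next whitespace, and it is compared to the arm set exactly as typed (which is how "preserved verbatim" is read here). agent_override is a hypothetical name.

```rust
/// Returns the forced agent name, or None when the scorer should decide.
pub fn agent_override<'a>(input: &'a str, arms: &[&str]) -> Option<&'a str> {
    const MARKER: &str = "@@agent=";
    // Only the marker is matched case-insensitively. ASCII lowercasing keeps
    // byte offsets identical, so `start` is valid in the original input too.
    let lower = input.to_ascii_lowercase();
    let start = lower.find(MARKER)? + MARKER.len();
    // Slice the original input so the name keeps its original casing.
    let rest = &input[start..];
    let end = rest.find(char::is_whitespace).unwrap_or(rest.len());
    let name = &rest[..end];
    if arms.iter().any(|arm| *arm == name) {
        Some(name)
    } else {
        None
    }
}
```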
SkillRouter (wcore-skills::router)
SkillRouter extends the same DecisionRouter / BetaScorer pattern but operates on catalog skill names rather than enum variants. It is seeded in two layers at boot.
Boot seeding (bootstrap.rs:1253-1296)
When a real Db handle is available, the bootstrap process opens a PromptStore against the evolved_prompts SQLite table and runs seed_pairs_for twice:
Layer 1: GEPA bench winners
```rust
store.seed_pairs_for(&candidate_names, "bench", 1)
```

Converts the top-scoring GEPA-evolved prompt for each skill into a seed pair: round(score × 5) is the number of simulated successes loaded into restore. A skill with a bench score of 0.9 gets 5 simulated successes before the session's first turn.
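The score-to-seed arithmetic is small enough to check by hand. The helper below is a hypothetical sketch; clamping the score to [0, 1] is an added assumption, not something the store is documented to do.

```rust
/// Converts a score into simulated successes: round(score * 5).
/// The clamp to [0, 1] is an assumption made for this sketch.
pub fn simulated_successes(score: f64) -> u32 {
    (score.clamp(0.0, 1.0) * 5.0).round() as u32
}
```

For a bench score of 0.9 the product is 4.5, which rounds to 5. The seed is capped at 5 simulated successes under the clamp assumption.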
Layer 1b: Auto-drafted skills
```rust
store.seed_pairs_for(&candidate_names, "auto_drafter", 1)
```

Same conversion, but for skills written by the SkillDrafter (scorer = "auto_drafter", score = 0.7). A score of 0.7 produces 4 simulated successes: confident but not dominant over a proven GEPA winner. This layer never overrides Layer 1, because restore only fills arms that are not already seeded.
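The fill-only behaviour of restore can be sketched with the HashMap entry API. This free function is an illustration under the assumption that arms are keyed by skill name. It is not the BetaScorer method itself, and the skill names used with it are made up.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Stats {
    pub success: u32,
    pub failure: u32,
}

/// Inserts each seed only when the arm has no entry yet, so an earlier
/// layer always wins over a later one.
pub fn restore(map: &mut HashMap<String, Stats>, pairs: Vec<(String, Stats)>) {
    for (name, stats) in pairs {
        map.entry(name).or_insert(stats);
    }
}
```

Calling it first with the bench pairs and then with the auto_drafter pairs leaves any bench-seeded arm untouched and seeds only the arms that bench did not cover.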
Layer 2: Prioritizer head-start
```rust
sk_router.seed_from_prioritizer(&candidate_names)
```

Fills any arm that neither GEPA nor the auto-drafter touched, using a usage-frequency ranking. Top-quartile skills get 3 simulated successes; the boost fades toward zero at the tail.
Per-turn operation
At the start of each run() call, the engine calls choose with a SkillRouterInput containing the user input text and the list of candidate names. The winning skill name is stashed as current_skill_router_pick. At turn end, observe_skill_router_outcome maps StopReason to TaskOutcome and records it.
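The stash-then-observe flow might look like the sketch below. The StopReason variants and their mapping to outcomes are invented for illustration (the real variants are not listed here), and TurnState, outcome_for, and finish_turn are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TaskOutcome { Success, Failure, Neutral }

// Hypothetical variants: the real StopReason enum may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StopReason { EndTurn, ToolError, UserCancelled }

/// Assumed mapping: a clean finish counts as success, an error as failure,
/// and a user cancellation says nothing about the skill.
pub fn outcome_for(stop: StopReason) -> TaskOutcome {
    match stop {
        StopReason::EndTurn => TaskOutcome::Success,
        StopReason::ToolError => TaskOutcome::Failure,
        StopReason::UserCancelled => TaskOutcome::Neutral,
    }
}

pub struct TurnState {
    pub current_skill_router_pick: Option<String>,
}

/// At turn end, takes the stashed pick (if any) and pairs it with the
/// outcome that would be handed to observe.
pub fn finish_turn(state: &mut TurnState, stop: StopReason) -> Option<(String, TaskOutcome)> {
    let pick = state.current_skill_router_pick.take()?;
    Some((pick, outcome_for(stop)))
}
```

Taking the pick with take() clears the stash, so an outcome is recorded at most once per turn.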
The hint appended to the system prompt is a single non-binding line. It is present only when the router picked a visible catalog skill. Engines without a router are byte-identical to the pre-router behaviour.
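The byte-identical guarantee follows naturally if the hint is appended only on one branch. In this sketch the function name and the hint wording are placeholders; the actual hint text is not specified here.

```rust
/// Appends a one-line, non-binding hint only when the router picked a
/// visible catalog skill. Otherwise the prompt is returned unchanged.
pub fn with_skill_hint(system_prompt: &str, pick: Option<&str>, visible: &[&str]) -> String {
    match pick {
        Some(name) if visible.iter().any(|v| *v == name) => {
            // Placeholder wording for the hint line.
            format!("{system_prompt}\nHint: the skill `{name}` may be relevant to this task.")
        }
        _ => system_prompt.to_string(),
    }
}
```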
How GEPA winners reach the router
The full path from an offline GEPA run to the per-turn router:

```
wcore-evolve binary
  → EvolveOutcome (winner retained, losers archived)
  → CuratorPort::submit → Decision::Promote
  → PromptStore::record_variant (evolved_prompts table)
      columns: skill_name, scorer="bench", score, generation, ...

Next session boot:
  → PromptStore::seed_pairs_for(candidates, "bench", 1)
  → round(score × 5) = simulated successes
  → BetaScorer::restore(pairs)
  → SkillRouter picks with informed prior
```

GEPA winners carry a scorer = "bench" tag. The seed_pairs_for query selects the single best-scoring row per skill for the given scorer, converts the score to a simulated success count, and returns (skill_name, Stats { success, failure: 0 }) pairs.
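The selection step can be sketched over an in-memory list of rows instead of SQLite. Row and this free-function form of seed_pairs_for are illustrative; the third argument of the real call is omitted because its meaning is not spelled out here, and the sorted output order is a choice made for determinism in the sketch.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a row of the evolved_prompts table.
pub struct Row {
    pub skill_name: String,
    pub scorer: String,
    pub score: f64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Stats {
    pub success: u32,
    pub failure: u32,
}

/// Picks the best-scoring row per candidate skill for one scorer and
/// converts each score into a (skill_name, Stats) seed pair.
pub fn seed_pairs_for(rows: &[Row], candidates: &[&str], scorer: &str) -> Vec<(String, Stats)> {
    let mut best: HashMap<&str, f64> = HashMap::new();
    for row in rows {
        let wanted = candidates.iter().any(|c| *c == row.skill_name);
        if row.scorer != scorer || !wanted {
            continue;
        }
        let entry = best.entry(row.skill_name.as_str()).or_insert(row.score);
        if row.score > *entry {
            *entry = row.score;
        }
    }
    let mut pairs: Vec<(String, Stats)> = best
        .into_iter()
        .map(|(name, score)| {
            let success = (score * 5.0).round() as u32;
            (name.to_string(), Stats { success, failure: 0 })
        })
        .collect();
    pairs.sort_by(|a, b| a.0.cmp(&b.0)); // deterministic order for the sketch
    pairs
}
```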
How auto-drafted skills reach the router
The in-session drafter path runs in the opposite direction: it produces a row the next session reads.

```
Session N, turn end:
  observe_auto_skill → Bucketer (N=3 streak trigger)
  → SkillDrafter::draft
  → writes $WAYLAND_HOME/skills/auto/<sig>/SKILL.md
  → PromptStore::insert scorer="auto_drafter", score=0.7

Session N+1 boot:
  → seed_pairs_for(candidates, "auto_drafter", 1)
  → round(0.7 × 5) = 4 simulated successes
  → BetaScorer::restore (only fills arms GEPA did not already seed)
```

The SkillDrafter also registers the new skill into the current session's catalog immediately (using Box::leak for process-lifetime allocation), so the drafted skill is available for the rest of session N without waiting for a restart.
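The Box::leak technique turns an owned String into a &'static str that lives until the process exits. The Catalog type below is a hypothetical stand-in for the session catalog, assuming it stores skill names as static string slices.

```rust
/// Hypothetical catalog that stores skill names as &'static str.
pub struct Catalog {
    names: Vec<&'static str>,
}

impl Catalog {
    pub fn new() -> Self {
        Self { names: Vec::new() }
    }

    /// Registers a runtime-generated name. The allocation is leaked on
    /// purpose, so the returned reference stays valid for the whole process.
    pub fn register(&mut self, name: String) -> &'static str {
        let leaked: &'static str = Box::leak(name.into_boxed_str());
        self.names.push(leaked);
        leaked
    }

    pub fn contains(&self, name: &str) -> bool {
        self.names.iter().any(|n| *n == name)
    }
}
```

The leaked memory is never freed, which is acceptable when the number of drafted skills per session is small. A long-running process that drafts continuously would want an owned or interned representation instead.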
Router internals: Stats persistence and the UNIQUE constraint
PromptStore::record_variant enforces a UNIQUE constraint on (skill_name, generation, id). Callers must generate a fresh UUID for each insertion; retrying the same row returns an error. The seed_pairs_for query reads but does not write, so boot seeding is always safe to call multiple times.
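The effect of the constraint can be modelled in memory with a set of keys. VariantStore and StoreError are hypothetical names, and the id is a plain string here because generating a UUID would need an external crate.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
pub enum StoreError {
    UniqueViolation,
}

/// In-memory model of the UNIQUE (skill_name, generation, id) constraint.
#[derive(Default)]
pub struct VariantStore {
    keys: HashSet<(String, u32, String)>,
}

impl VariantStore {
    /// Rejects a row whose full key was already recorded.
    pub fn record_variant(
        &mut self,
        skill_name: &str,
        generation: u32,
        id: &str,
    ) -> Result<(), StoreError> {
        let key = (skill_name.to_string(), generation, id.to_string());
        if self.keys.insert(key) {
            Ok(())
        } else {
            Err(StoreError::UniqueViolation)
        }
    }
}
```

Retrying with the same id fails, while retrying with a fresh id succeeds. A caller that retries on transient errors should therefore decide explicitly whether a retry means the same row or a new variant.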
Manual override syntax summary
| Router | Override syntax | Scope |
|---|---|---|
| TemplateRouter | @@template=<name> in input text | Honored only if name is in configured arm set |
| AgentRouter | @@agent=<name> in input text | Honored only if name is in configured arm set |
| SkillRouter | none; CLI flag --agent <name> selects persona only | Picker is fully automatic |