Sub-Agents (Delegate)

The Delegate tool lets the model split a task into independent pieces and run them at the same time. Each piece runs as a sub-agent with its own conversation context and the full tool set, while sharing the parent’s provider connection. When the sub-agents finish, their results are merged and returned to the parent.

The model reaches for Delegate when a task breaks cleanly into parts that do not depend on each other. Typical cases:

  • Search three files at once and summarize each.
  • Run tests and lint in parallel.
  • Read one part of the codebase while searching another.

You do not invoke Delegate yourself. You describe a task that has parallel structure, and the engine decides to fan it out.

Sub-agents are bounded so a fan-out cannot exhaust resources:

Setting                  Default  Meaning
Max parallel sub-agents  5        Upper bound on concurrent sub-agents.
Sub-agent max turns      50       Loop turns per sub-agent.
Sub-agent max tokens     4096     Response tokens per sub-agent.
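
To make the first two limits concrete, here is a minimal sketch of a bounded fan-out in Python. It is not the engine's implementation: `fan_out`, `run_sub_agent`, `step`, and the constant names are hypothetical, and only the default values (5 and 50) come from the table above. A semaphore caps how many sub-agents run at once, and a turn counter stops a sub-agent that never finishes.

```python
import asyncio

# Defaults from the table above; the constant names are hypothetical.
MAX_PARALLEL_SUB_AGENTS = 5
SUB_AGENT_MAX_TURNS = 50


async def run_sub_agent(task, step, max_turns=SUB_AGENT_MAX_TURNS):
    """Run one hypothetical sub-agent loop, stopping after max_turns."""
    state = task
    for turn in range(1, max_turns + 1):
        # `step` stands in for one loop turn; it returns (new_state, done).
        state, done = await step(state)
        if done:
            return {"task": task, "turns": turn, "result": state, "finished": True}
    # The turn budget ran out before the sub-agent reported completion.
    return {"task": task, "turns": max_turns, "result": state, "finished": False}


async def fan_out(tasks, step, max_parallel=MAX_PARALLEL_SUB_AGENTS):
    """Run every task, with at most max_parallel sub-agents active at once."""
    gate = asyncio.Semaphore(max_parallel)

    async def bounded(task):
        async with gate:  # waits here while max_parallel sub-agents are busy
            return await run_sub_agent(task, step)

    # gather returns results in the same order as `tasks`.
    return await asyncio.gather(*(bounded(t) for t in tasks))


if __name__ == "__main__":
    async def demo_step(state):
        await asyncio.sleep(0)  # placeholder for real work
        return state.upper(), True

    for item in asyncio.run(fan_out(["search a.py", "run lint"], demo_step)):
        print(item)
```

With twelve tasks and the default limit, the sixth task does not start until one of the first five releases the semaphore, so the fan-out stays bounded however many pieces the task was split into.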

Sub-agents differ from the main agent in a few ways:

  • They auto-approve their own tool calls, so there is no confirmation prompt mid-fan-out.
  • They do not save sessions of their own.
  • They run silently, with no direct output to your terminal.
  • Their results are collected and returned to the parent, which continues the conversation with the merged outcome.

This keeps the parent transcript readable: you see the parent’s reasoning and the merged result, not the interleaved output of five workers.
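
The collect-then-merge behavior can be pictured with a small Python sketch. The `SubAgentResult` record and the `merge_results` function are hypothetical names chosen for illustration, and the actual merged format is not specified here. The idea shown is that each worker hands back a result object instead of printing, and the parent sees a single combined block.

```python
from dataclasses import dataclass


@dataclass
class SubAgentResult:
    """Hypothetical record of what one sub-agent hands back to the parent."""
    task: str
    output: str
    ok: bool = True


def merge_results(results):
    """Combine per-sub-agent results into one text block for the parent."""
    sections = []
    for r in results:
        status = "ok" if r.ok else "failed"
        # One labeled section per sub-agent, in the order they were collected.
        sections.append(f"[{r.task}] ({status})\n{r.output.strip()}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    collected = [
        SubAgentResult("lint", "no issues found"),
        SubAgentResult("tests", "2 tests failed", ok=False),
    ]
    print(merge_results(collected))
```

Because nothing is printed until the merge step, output from different workers never interleaves in the transcript.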

The same sub-agent mechanism backs ForgeFlow stages. A workflow stage that runs independently of its siblings is executed as a sub-agent, and the stage results aggregate the same way Delegate results do. So the parallelism you get from Delegate in a single conversation is the same parallelism a multi-stage workflow uses under the hood.