Sub-Agents (Delegate)

The Delegate tool lets the model split a task into independent pieces and run them at the same time. Each piece runs as a sub-agent with its own conversation context and the full tool set, while sharing the parent’s provider connection. When the sub-agents finish, their results are merged back to the parent.

When the engine delegates

The model reaches for Delegate when a task breaks cleanly into parts that do not depend on each other. Typical cases:

Search three files at once and summarize each.
Run tests and lint in parallel.
Read one part of the codebase while searching another.

You do not invoke Delegate yourself. You describe a task that has parallel structure, and the engine decides to fan it out.
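As a rough illustration of what fanning out means, the sketch below runs two independent tasks concurrently with Python's asyncio and returns the results in task order. The names run_sub_agent and delegate are hypothetical stand-ins, not the engine's API, and the sleep call only simulates work that a real sub-agent would do with the model and tools.

```python
import asyncio


async def run_sub_agent(task: str) -> str:
    """Hypothetical stand-in for one sub-agent working on one piece."""
    await asyncio.sleep(0.01)  # simulate independent work
    return f"done: {task}"


async def delegate(tasks: list[str]) -> list[str]:
    """Run independent tasks concurrently; results come back in task order."""
    return await asyncio.gather(*(run_sub_agent(t) for t in tasks))


results = asyncio.run(delegate(["run tests", "run lint"]))
print(results)
```

Because neither task reads the other's output, the two can overlap freely. A task whose second step needs the first step's result would not be a good fit for this pattern.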

Limits

Sub-agents are bounded so a fan-out cannot exhaust resources:

Setting	Default	Meaning
Max parallel sub-agents	5	Upper bound on sub-agents running at the same time.
Sub-agent max turns	50	Maximum loop turns each sub-agent may take.
Sub-agent max tokens	4096	Maximum response tokens per sub-agent.
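One common way to enforce a concurrency cap like the one in the table is a semaphore. The sketch below is an assumption about how such a bound could work, not a description of the engine's internals: SubAgentLimits, its field names, and run_bounded are hypothetical, and only the parallelism limit is exercised (the turn and token limits are carried as plain values).

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class SubAgentLimits:
    # Defaults mirror the table above; the field names are hypothetical.
    max_parallel: int = 5
    max_turns: int = 50
    max_tokens: int = 4096


async def run_bounded(tasks, limits=SubAgentLimits()):
    """Run every task, never more than limits.max_parallel at once."""
    gate = asyncio.Semaphore(limits.max_parallel)
    running = 0
    peak = 0  # highest number of workers observed running together

    async def worker(task):
        nonlocal running, peak
        async with gate:  # wait here while max_parallel workers are active
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # simulate sub-agent work
            running -= 1
            return f"done: {task}"

    results = await asyncio.gather(*(worker(t) for t in tasks))
    return results, peak


results, peak = asyncio.run(run_bounded([f"task-{i}" for i in range(12)]))
print(len(results), peak)
```

With twelve tasks and the default cap, all twelve complete, but the observed peak never exceeds five: the remaining tasks wait at the semaphore until a slot frees up.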

How they behave

Sub-agents differ from the main agent in a few ways:

They auto-approve their own tool calls, so there is no confirmation prompt mid-fan-out.
They do not save sessions of their own.
They run silently, with no direct output to your terminal.
Their results are collected and returned to the parent, which continues the conversation with the merged outcome.

This keeps the parent transcript readable: you see the parent’s reasoning and the merged result, not the interleaved output of up to five workers.
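The collect-then-merge behavior described above can be pictured as follows. This is a minimal sketch under stated assumptions: the actual merge format is not documented here, and SubAgentResult and merge_results are hypothetical names. Each worker's output is held in memory rather than printed, then joined into one block for the parent.

```python
from dataclasses import dataclass


@dataclass
class SubAgentResult:
    task: str    # what the sub-agent was asked to do
    output: str  # what it produced, captured instead of printed


def merge_results(results: list[SubAgentResult]) -> str:
    """Join captured outputs into one labeled block, in task order."""
    sections = [
        f"[{i}] {r.task}\n{r.output.strip()}"
        for i, r in enumerate(results, start=1)
    ]
    return "\n\n".join(sections)


merged = merge_results([
    SubAgentResult("run tests", "12 passed\n"),
    SubAgentResult("run lint", "no issues\n"),
])
print(merged)
```

Keeping each output attached to its task label is what lets the parent present one ordered summary instead of lines from several workers mixed together.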

Relation to ForgeFlows

The same sub-agent mechanism backs ForgeFlow stages. A workflow stage that runs independently of its siblings is executed as a sub-agent, and the stage results aggregate the same way Delegate results do. So the parallelism you get from Delegate in a single conversation is the same parallelism a multi-stage workflow uses under the hood.
