Browser Tool
The Browser tool gives the model a controlled window into the web. It is intentionally narrower than a full browser-automation library: the surface is ARIA-tree-first (no arbitrary JavaScript), every outbound URL is checked by a fail-closed policy, and all HTTP traffic routes through the engine’s egress gate.
Operation surface
The op enum is locked at 18 variants (BROWSER_OP_LOCKED_VARIANT_COUNT = 18 in wcore-browser/src/op.rs). Adding a variant requires bumping that constant and a new audit pass. Ops are serialized as a tagged JSON union, for example { "kind": "navigate", "url": "https://example.com", "wait_until_loaded": true }.
| Op | Kind tag | Description |
|---|---|---|
| Navigate | navigate | Navigate to a URL. wait_until_loaded: bool (default true). |
| Snapshot | snapshot | ARIA-tree snapshot of the current page. Mints fresh @eN element refs. |
| Read | read | Readability-style markdown extraction. mode: main_content / article / raw. |
| Click | click | Click an element by its @eN ref from the most recent snapshot. |
| Fill | fill | Type text into an input field identified by ref. |
| Press | press | Press a single key by name, e.g. "Enter", "Tab", "Escape". |
| Select | select | Choose a <select> option by value. |
| Upload | upload | File upload via a file input. Path is confined to the operator’s downloads root; .. traversal and symlink escapes are rejected before the op reaches any backend. |
| Download | download | Download a URL to dest_path. Same path confinement as Upload. |
| Screenshot | screenshot | Capture the current viewport or full page. |
| GetState | get_state | Return the current URL and page title without touching the DOM. |
| WaitFor | wait_for | Wait until a CSS selector or ARIA role appears. Takes timeout_ms. |
| NetworkLog | network_log | Dump the per-session network request log. |
| Console | console | Dump the per-session browser console log. |
| NewTab | new_tab | Open a new tab, optionally at a URL. |
| CloseTab | close_tab | Close the current tab. |
| Back | back | Navigate back one entry in the tab’s history. |
| Forward | forward | Navigate forward one entry in the tab’s history. |
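To illustrate what a locked surface can look like in code, here is a minimal sketch that ties the list of kind tags to the variant-count constant, so that adding a tag without bumping the constant fails to compile. Only the kind tags and the BROWSER_OP_LOCKED_VARIANT_COUNT name come from this page; KIND_TAGS, is_known_kind and navigate_json are hypothetical helpers, and the real op.rs may be organized quite differently.

```rust
/// Locked variant count, mirroring the constant named on this page.
pub const BROWSER_OP_LOCKED_VARIANT_COUNT: usize = 18;

/// Kind tags from the table above (hypothetical constant name).
/// The array length is tied to the locked count, so adding a tag
/// without bumping the constant is a compile error.
pub const KIND_TAGS: [&str; BROWSER_OP_LOCKED_VARIANT_COUNT] = [
    "navigate", "snapshot", "read", "click", "fill", "press",
    "select", "upload", "download", "screenshot", "get_state", "wait_for",
    "network_log", "console", "new_tab", "close_tab", "back", "forward",
];

/// Hypothetical helper: true when `kind` is one of the locked tags.
pub fn is_known_kind(kind: &str) -> bool {
    KIND_TAGS.contains(&kind)
}

/// Hypothetical helper: hand-build the tagged JSON for a navigate op.
/// It does no string escaping; a real implementation would more likely
/// rely on a serialization library.
pub fn navigate_json(url: &str, wait_until_loaded: bool) -> String {
    format!(
        "{{ \"kind\": \"navigate\", \"url\": \"{}\", \"wait_until_loaded\": {} }}",
        url, wait_until_loaded
    )
}
```

With this sketch, is_known_kind("evaluate") is false, which matches the absence of an arbitrary-JavaScript op.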
No Evaluate variant
There is no arbitrary-JavaScript execution op. The omission is a deliberate locked-surface decision from design §5.16 (REV-2 audit F6 lock). The ARIA-tree-first approach means the model interacts with semantic element refs rather than DOM positions or XPath, which is more stable and avoids the attack surface that eval-style ops introduce.
Backends
Three backends are available. The engine picks one based on hints and environment, in this order:
- Browserbase (cloud) - selected when ProviderHint::Browserbase is set AND BROWSERBASE_API_KEY + BROWSERBASE_PROJECT_ID are present AND allow_cloud: true is set in the tool spec. Requires the browserbase Cargo feature.
- Chromium - selected when ProviderHint::Chromium is set. Requires the chromium Cargo feature; uses chromiumoxide CDP.
- Camoufox (default) - used when no other hint matches. Talks to a sidecar process via HTTP at localhost:9377 (configurable via WAYLAND_BROWSER_HINT=camoufox or the default ProviderHint::Auto). The sidecar wraps a privacy-hardened Firefox fork.
```toml
# config.toml - switch to Browserbase when the env keys are set
[browser]
hint = "browserbase"
allow_cloud = true
```

```sh
# Or via env (takes precedence over config)
WAYLAND_BROWSER_HINT=chromium
```

The Camoufox backend communicates via HTTP using EgressClient, so all sidecar traffic passes through the egress gate. Any 3xx redirect from the sidecar is re-checked against the URL policy before the browser follows it (the reqwest_redirect_policy() hook, which caps redirect chains at 10 hops).
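The backend selection order in this section can be sketched as a small decision function. This is an illustration under assumptions, not the engine's code: the Backend enum, the SelectionInputs struct, the select_backend function and the ProviderHint::Camoufox variant are hypothetical, and the sketch assumes that a hint whose preconditions are not met falls back to Camoufox.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProviderHint {
    Auto,
    Camoufox, // assumed variant, inferred from WAYLAND_BROWSER_HINT=camoufox
    Chromium,
    Browserbase,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Browserbase,
    Chromium,
    Camoufox,
}

/// Hypothetical bundle of everything the selection step looks at.
#[derive(Debug, Clone, Copy)]
pub struct SelectionInputs {
    pub hint: ProviderHint,
    /// BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID are both present.
    pub has_browserbase_keys: bool,
    /// allow_cloud: true is set in the tool spec.
    pub allow_cloud: bool,
    /// Built with the browserbase Cargo feature.
    pub browserbase_feature: bool,
    /// Built with the chromium Cargo feature.
    pub chromium_feature: bool,
}

/// Pick a backend in the documented order: Browserbase, Chromium, Camoufox.
pub fn select_backend(i: &SelectionInputs) -> Backend {
    if i.hint == ProviderHint::Browserbase
        && i.has_browserbase_keys
        && i.allow_cloud
        && i.browserbase_feature
    {
        return Backend::Browserbase;
    }
    if i.hint == ProviderHint::Chromium && i.chromium_feature {
        return Backend::Chromium;
    }
    // Assumption: anything else, including an unsatisfied hint, uses the default.
    Backend::Camoufox
}
```

Keeping the cloud backend behind three independent conditions means that a hint alone can never send traffic to a hosted browser.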
URL security policy (BrowserPolicy)
Every URL the model supplies - including redirect targets - is evaluated by BrowserPolicy (wcore-browser/src/policy.rs) before any network I/O happens.
Hard-blocked (always-on, no operator override)
The following are blocked unconditionally regardless of allow- or deny-list configuration:
- RFC 1918 private ranges: 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
- Loopback: 127.0.0.0/8, localhost, *.localhost, ::1
- Cloud metadata endpoint: 169.254.169.254 (shared by AWS, GCP, Azure, OpenStack)
- Link-local IPv4: 169.254.0.0/16
- IPv6 unique-local: fc00::/7
- IPv6 link-local: fe80::/10
- IPv6 multicast: ff00::/8
- IPv4-mapped IPv6 literals, e.g. ::ffff:169.254.169.254, where the embedded v4 address hits any of the above
- Legacy IPv4 encodings that bypass strict parsers: octal (0177.0.0.1), hex (0x7f.0.0.1), two-octet (127.1), and decimal-overflow (2130706433) forms - all normalized via parse_ipv4_loose() before the block check
- RFC 6598 CGN range: 100.64.0.0/10
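The list above can be approximated with the standard library's address types. The sketch below is an assumption-laden illustration rather than the contents of policy.rs: parse_ipv4_loose is named on this page, but its signature and the inet_aton-style rules used here are guesses, and is_blocked_v4, is_blocked_v6 and parse_part are hypothetical names. Hostname rules such as localhost and *.localhost are not covered.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Parse one component: hex with a 0x prefix, octal with a leading 0, else decimal.
fn parse_part(s: &str) -> Option<u32> {
    if s.is_empty() || !s.bytes().all(|b| b.is_ascii_alphanumeric()) {
        return None;
    }
    if let Some(hex) = s.strip_prefix("0x").or_else(|| s.strip_prefix("0X")) {
        u32::from_str_radix(hex, 16).ok()
    } else if s.len() > 1 && s.starts_with('0') {
        u32::from_str_radix(&s[1..], 8).ok()
    } else {
        s.parse::<u32>().ok()
    }
}

/// Loose IPv4 parsing (sketch): 1 to 4 dot-separated parts, where the last
/// part fills all remaining bytes, so "127.1" means 127.0.0.1.
pub fn parse_ipv4_loose(host: &str) -> Option<Ipv4Addr> {
    let parts = host.split('.').map(parse_part).collect::<Option<Vec<u32>>>()?;
    if parts.len() > 4 {
        return None;
    }
    let (last, lead) = parts.split_last()?;
    // Every leading part is exactly one byte.
    if lead.iter().any(|&p| p > 0xff) {
        return None;
    }
    // The last part must fit in the bytes that are left.
    let remaining_bits = 8 * (4 - lead.len()) as u32;
    if remaining_bits < 32 && *last >= (1u32 << remaining_bits) {
        return None;
    }
    let mut value = *last;
    for (i, &p) in lead.iter().enumerate() {
        value |= p << (24 - 8 * i as u32);
    }
    Some(Ipv4Addr::from(value))
}

/// True when a v4 address falls in one of the hard-blocked ranges.
pub fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    let o = ip.octets();
    ip.is_private()                              // 10/8, 172.16/12, 192.168/16
        || ip.is_loopback()                      // 127/8
        || ip.is_link_local()                    // 169.254/16, incl. 169.254.169.254
        || (o[0] == 100 && (o[1] & 0xc0) == 64)  // 100.64/10 (RFC 6598 CGN)
}

/// True when a v6 address falls in one of the hard-blocked ranges.
pub fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    // IPv4-mapped literals are judged by the embedded v4 address.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_blocked_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()                   // ::1
        || (first & 0xfe00) == 0xfc00  // fc00::/7 unique-local
        || (first & 0xffc0) == 0xfe80  // fe80::/10 link-local
        || (first & 0xff00) == 0xff00  // ff00::/8 multicast
}
```

Normalizing first and checking second is what makes the legacy encodings harmless: all four example forms in the list reduce to 127.0.0.1 before the range check runs.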
Scheme allowlist (always-on)
Only http and https pass. Everything else is refused at the gate:
```
javascript: data: blob: file: ftp: gopher: view-source: ...
```
Operator-configured lists
Three fields in BrowserPolicy are operator-configurable:
| Field | Type | Behaviour |
|---|---|---|
| allowed_origins | Vec<String> | Suffix-glob list (*.example.com). When non-empty, only matching origins pass; everything else falls through to default_action. |
| denied_origins | Vec<String> | Suffix-glob list. Always wins over the allow list. |
| default_action | PolicyAction | deny (default, fail-closed since v0.2.1) / allow / ask |
The ask action routes the URL to Suspend, which triggers an S4 HITL approval event. The operator-facing host must handle the suspend and resume the op.
```toml
# Minimal allow-only-example.com policy
[browser.policy]
default_action = "deny"
allowed_origins = ["*.example.com", "example.com"]
denied_origins = []
```

The fail-closed default (default_action = "deny") means a fresh install with no allowed_origins configured will refuse every outbound Browse request. Set default_action = "allow" when you want pass-through with only the hard-blocked ranges enforced.
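The evaluation order described in this section (scheme allowlist, then deny list, then allow list, then default_action) can be sketched as follows. The field names and PolicyAction values come from the table above; the evaluate and origin_matches functions are hypothetical, and the glob semantics are an assumption: here *.example.com matches subdomains only, which would explain why the example config lists example.com separately.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PolicyAction {
    Allow,
    Deny,
    Ask,
}

/// Suffix-glob match (assumed semantics): "*.example.com" matches any
/// subdomain of example.com but not example.com itself; a pattern
/// without "*." must match the host exactly.
pub fn origin_matches(pattern: &str, host: &str) -> bool {
    let pattern = pattern.to_ascii_lowercase();
    let host = host.to_ascii_lowercase();
    match pattern.strip_prefix("*.") {
        Some(suffix) => host
            .strip_suffix(suffix)
            .map_or(false, |rest| rest.len() > 1 && rest.ends_with('.')),
        None => host == pattern,
    }
}

pub struct BrowserPolicy {
    pub allowed_origins: Vec<String>,
    pub denied_origins: Vec<String>,
    pub default_action: PolicyAction,
}

impl BrowserPolicy {
    /// Decide what to do with an already-parsed scheme and host.
    /// The hard-blocked IP ranges are out of scope for this sketch.
    pub fn evaluate(&self, scheme: &str, host: &str) -> PolicyAction {
        // Always-on scheme allowlist.
        if scheme != "http" && scheme != "https" {
            return PolicyAction::Deny;
        }
        // The deny list always wins over the allow list.
        if self.denied_origins.iter().any(|p| origin_matches(p, host)) {
            return PolicyAction::Deny;
        }
        if self.allowed_origins.iter().any(|p| origin_matches(p, host)) {
            return PolicyAction::Allow;
        }
        // No list matched: fall through to the configured default.
        self.default_action
    }
}
```

Requiring a dot before the suffix keeps a host like evilexample.com from matching *.example.com.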
DNS rebinding (TOFU cache)
When a backend resolves a hostname to an IP address, check_resolved_host(host, ip) pins the first-seen IP for that hostname in an in-memory cache (TOFU - trust-on-first-use). Any subsequent resolution of the same hostname that returns a different IP is refused:
```
DNS rebinding refused: foo.example.com resolved to 127.0.0.1,
first-seen resolve was 203.0.113.5
```

This defends against attacks that swap a benign initial resolution for a private or metadata target after the URL policy check has already passed. The TOFU cache is per-policy instance and is cleared when the policy is dropped.
The same IP block rules that apply to URL literals also apply to resolved IPs: resolving foo.example.com to 10.0.0.1 fails even on first resolution, before any TOFU entry is written.
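A trust-on-first-use cache of this kind can be sketched with a plain HashMap. The check_resolved_host name comes from this page, but its signature, the TofuCache type, the simplified is_blocked_ip stand-in, the wording of the blocked-address error and the order of the two checks are all assumptions made for illustration.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

/// Simplified stand-in for the full hard-block rules (assumption).
fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => v4.is_loopback() || v4.is_private() || v4.is_link_local(),
        IpAddr::V6(v6) => v6.is_loopback(),
    }
}

/// In-memory first-seen pins, one map per policy instance.
#[derive(Default)]
pub struct TofuCache {
    pinned: HashMap<String, IpAddr>,
}

impl TofuCache {
    /// Accept `ip` for `host` only if it matches the first-seen address,
    /// or if the host is new and the address is not hard-blocked.
    pub fn check_resolved_host(&mut self, host: &str, ip: IpAddr) -> Result<(), String> {
        let key = host.to_ascii_lowercase();
        if let Some(&first) = self.pinned.get(&key) {
            if first != ip {
                return Err(format!(
                    "DNS rebinding refused: {host} resolved to {ip}, first-seen resolve was {first}"
                ));
            }
            return Ok(());
        }
        // First resolution: refuse blocked targets before writing any pin.
        if is_blocked_ip(ip) {
            return Err(format!("blocked address: {host} resolved to {ip}"));
        }
        self.pinned.insert(key, ip);
        Ok(())
    }
}
```

Because the pin is written only after the block check passes, a hostname whose first answer is a private address never gets a cache entry, and a later public answer for it is evaluated from scratch.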
Protocol events
When the capabilities.browser_suite flag is set on a JSON-stream session, the engine emits two typed events:
- BrowserEvent - per-completed-op trail carrying op kind, URL (when applicable), and a human-readable summary.
- BrowserPolicyDenied - emitted when a URL is blocked, carrying the blocked URL and the denial reason.