Security Model Overview

Wayland Core’s security posture is built around a fail-closed philosophy: every enforcement layer defaults to the most restrictive behavior. Operators must explicitly opt out of protections; there is no mechanism for a plugin or late code path to quietly weaken the session’s isolation.

The enforcing layers that are active today are:

OS-native sandbox: every model-driven shell command runs inside a platform-native process isolation backend
Egress gate: every outbound HTTP call passes through a single gating point, on by default
Approval modes and plan mode: destructive tools require explicit human consent before execution
Budget caps: session-level and per-user token/cost limits that hard-block at the limit

A fifth layer - the explicit-grant ACL (wcore-permissions) - exists in the codebase with full tests but is not yet wired into production call sites. See Permission Model for its design and current status.

OS-native sandbox

Every shell command dispatched by the agent runs inside an OS-native sandbox selected by default_for_platform() in crates/wcore-sandbox/src/lib.rs. The selection order is:

WAYLAND_SANDBOX=docker - Docker container (cross-platform opt-in)
Linux: Bubblewrap (bwrap); Bubblewrap + Landlock LSM; Bubblewrap + libseccomp strict filter
macOS: sandbox-exec (Seatbelt profile)
Windows: AppContainer + Job Objects

If the selected backend is unavailable and the operator has not set WAYLAND_ALLOW_NO_SANDBOX=1, the engine selects FailClosedBackend. Every execute() call on that backend returns an error:

sandbox UNAVAILABLE and unsandboxed execution is not permitted -
set WAYLAND_ALLOW_NO_SANDBOX=1 to allow unsandboxed execution.

The double-key requirement (WAYLAND_SANDBOX=none AND WAYLAND_ALLOW_NO_SANDBOX=1) prevents a stray environment variable from stripping isolation. When an unsandboxed session is running, the engine logs a rate-limited warning at most once every 60 seconds (DEGRADED_WARN_INTERVAL) so the degraded state stays visible throughout a long session.
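The selection rules above can be read as a small decision function. The following is a minimal sketch, not the code in default_for_platform(): the names Backend and select_backend are hypothetical, the environment values are passed in as arguments instead of being read from the process, and it assumes that an unavailable backend combined with WAYLAND_ALLOW_NO_SANDBOX=1 results in unsandboxed execution.

```rust
/// Hypothetical outcome of backend selection (not the crate's real types).
#[derive(Debug, PartialEq)]
enum Backend {
    Docker,
    Native,      // bwrap / sandbox-exec / AppContainer, depending on platform
    Unsandboxed, // degraded mode
    FailClosed,  // every execute() call returns an error
}

/// `sandbox` stands in for WAYLAND_SANDBOX, `allow_no_sandbox` for
/// WAYLAND_ALLOW_NO_SANDBOX, and `backend_available` for a backend probe.
fn select_backend(
    sandbox: Option<&str>,
    allow_no_sandbox: Option<&str>,
    backend_available: bool,
) -> Backend {
    let opted_out = allow_no_sandbox == Some("1");

    // Double key: WAYLAND_SANDBOX=none alone never drops isolation.
    if sandbox == Some("none") {
        return if opted_out { Backend::Unsandboxed } else { Backend::FailClosed };
    }

    if backend_available {
        return if sandbox == Some("docker") { Backend::Docker } else { Backend::Native };
    }

    // Assumption: a missing backend only degrades when the operator opted out.
    if opted_out { Backend::Unsandboxed } else { Backend::FailClosed }
}
```

In this sketch every path that ends without isolation passes through the opted_out check, which is the property the double-key requirement is meant to guarantee.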

Each backend receives a SandboxManifest that declares what the process is permitted to do:

# Conceptual manifest fields (passed programmatically, not from a TOML file)
fs_read_allow  = ["/home/user/project"]   # read-only bind mounts / Landlock rules
fs_write_allow = ["/tmp/workspace"]       # read-write mounts
network        = "deny"                   # inherit | deny | allow_hosts([...])
syscall_policy = "strict"                 # inherit | strict (Linux seccomp only)
timeout        = { secs = 30 }
max_memory_bytes = 536870912             # 512 MiB; advisory on most backends
max_cpu_secs     = 60
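
Because the manifest is passed programmatically, it may help to see the same fields as a Rust value. This is an illustrative sketch only: the field names follow the conceptual listing above, but the types, the NetworkPolicy and SyscallPolicy enums, and example_manifest are assumptions and may not match the definitions in wcore-sandbox.

```rust
use std::path::PathBuf;
use std::time::Duration;

/// Assumed shape of the network policy: inherit | deny | allow_hosts([...]).
#[derive(Debug, Clone, PartialEq)]
enum NetworkPolicy {
    Inherit,
    Deny,
    AllowHosts(Vec<String>),
}

/// Assumed shape of the syscall policy: inherit | strict.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SyscallPolicy {
    Inherit,
    Strict,
}

/// Hypothetical mirror of the conceptual manifest fields.
#[derive(Debug, Clone)]
struct SandboxManifest {
    fs_read_allow: Vec<PathBuf>,
    fs_write_allow: Vec<PathBuf>,
    network: NetworkPolicy,
    syscall_policy: SyscallPolicy,
    timeout: Duration,
    max_memory_bytes: Option<u64>,
    max_cpu_secs: Option<u64>,
}

/// Builds the same values as the conceptual listing.
fn example_manifest() -> SandboxManifest {
    SandboxManifest {
        fs_read_allow: vec![PathBuf::from("/home/user/project")],
        fs_write_allow: vec![PathBuf::from("/tmp/workspace")],
        network: NetworkPolicy::Deny,
        syscall_policy: SyscallPolicy::Strict,
        timeout: Duration::from_secs(30),
        max_memory_bytes: Some(512 * 1024 * 1024), // 512 MiB
        max_cpu_secs: Some(60),
    }
}
```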

Backends always scrub the host environment before injecting the declared env entries (env -i style). Windows AppContainer filesystem ACLs are not yet wired; the interim behavior is safe default-deny. On Linux, AllowHosts network policy is not yet supported by the bwrap backend (no DNS gate).

See the sandbox source for per-backend details.

Network egress gate

All outbound HTTP in the workspace is routed through a single EgressClient in crates/wcore-egress. A Clippy disallowed-methods lint bans reqwest::Client::new and reqwest::Client::builder everywhere else - any new code that bypasses the gate fails the build.

The gate is enforced by default. Disabling it requires both:

security.enabled = false in ~/.wayland-core/config.toml
The --i-accept-exfil-risk CLI flag at the same invocation

Disabling with only the config change is rejected: the gate stays enforced and the engine emits a tracing::warn!. The comment at crates/wcore-config/src/config.rs documents: “never a bare env var - supply-chain hazard.”
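
The two-condition rule reduces to a small truth table. The sketch below uses hypothetical names (GateState, resolve_gate) and only models the outcomes described above; how the real engine reads the config and the CLI flag is not shown here.

```rust
/// Hypothetical result of resolving the egress gate at startup.
#[derive(Debug, PartialEq)]
enum GateState {
    Enforced,
    /// Config asked to disable but the CLI flag was missing: stay enforced
    /// and surface a warning.
    EnforcedWithWarning,
    Disabled,
}

/// `security_enabled` mirrors `security.enabled` in config.toml;
/// `accept_exfil_risk` mirrors the presence of `--i-accept-exfil-risk`.
fn resolve_gate(security_enabled: bool, accept_exfil_risk: bool) -> GateState {
    match (security_enabled, accept_exfil_risk) {
        // Both keys present: the only combination that disables the gate.
        (false, true) => GateState::Disabled,
        // Config-only opt-out is rejected.
        (false, false) => GateState::EnforcedWithWarning,
        // The flag alone changes nothing.
        (true, _) => GateState::Enforced,
    }
}
```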

Classification tiers

Every request is classified before the TCP connection is opened:

Verdict	Condition	Outcome
`Allow`	Destination is allowlisted or local	Proceeds silently
`Ask`	New external destination, not exfil-shaped	Operator consent via console doorbell
`Exfil`	Exfil-class destination	Always gated, never auto-allows
(local)	Loopback, RFC 1918, CGNAT, link-local, IPv6 ULA	Always allowed
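
The address ranges in the (local) row can be checked with the standard library alone. This is a sketch under assumptions: is_local and its helpers are hypothetical names, the real classifier may resolve and normalize hosts differently, and IPv4-mapped IPv6 addresses are not handled here.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

fn is_local_v4(ip: Ipv4Addr) -> bool {
    let o = ip.octets();
    ip.is_loopback()                            // 127.0.0.0/8
        || ip.is_private()                      // RFC 1918: 10/8, 172.16/12, 192.168/16
        || ip.is_link_local()                   // 169.254.0.0/16
        || (o[0] == 100 && (o[1] & 0xC0) == 64) // CGNAT: 100.64.0.0/10
}

fn is_local_v6(ip: Ipv6Addr) -> bool {
    let first = ip.segments()[0];
    ip.is_loopback()                  // ::1
        || (first & 0xfe00) == 0xfc00 // unique local addresses: fc00::/7
        || (first & 0xffc0) == 0xfe80 // link-local: fe80::/10
}

/// True when the address falls in one of the "always allowed" local ranges.
fn is_local(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_local_v4(v4),
        IpAddr::V6(v6) => is_local_v6(v6),
    }
}
```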

Exfil classification triggers (any one is sufficient):

POST/PUT/PATCH body to a non-allowlisted external host
GET/HEAD where path + query > 96 chars OR a 24+ character base64/hex-ish token appears in path or query
Any request to a shared-platform host not exact-host-allowed
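
The three triggers above can be sketched as a single predicate. Everything here is illustrative: Request, Method, has_long_token and is_exfil_shaped are hypothetical names, the request is assumed to target an external host already, and the character set treated as "base64/hex-ish" (ASCII alphanumerics plus +, =, _ and -) is a guess that the real classifier may not share.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Method { Get, Head, Post, Put, Patch }

/// Inputs the predicate needs; the boolean fields are assumed to be
/// computed elsewhere (allowlist lookup, shared-platform table).
struct Request<'a> {
    method: Method,
    path_and_query: &'a str,
    has_body: bool,
    host_allowlisted: bool,
    shared_platform_host: bool,
    exact_host_allowed: bool,
}

impl<'a> Request<'a> {
    /// Bodyless request to an external host that is on no list.
    fn unlisted(method: Method, path_and_query: &'a str) -> Self {
        Request {
            method,
            path_and_query,
            has_body: false,
            host_allowlisted: false,
            shared_platform_host: false,
            exact_host_allowed: false,
        }
    }
}

/// True when `s` contains a run of 24 or more token-like characters.
fn has_long_token(s: &str) -> bool {
    let mut run = 0usize;
    for c in s.chars() {
        if c.is_ascii_alphanumeric() || matches!(c, '+' | '=' | '_' | '-') {
            run += 1;
            if run >= 24 {
                return true;
            }
        } else {
            run = 0;
        }
    }
    false
}

/// Any one trigger is sufficient.
fn is_exfil_shaped(req: &Request) -> bool {
    let body_to_unlisted = matches!(req.method, Method::Post | Method::Put | Method::Patch)
        && req.has_body
        && !req.host_allowlisted;
    let read_shaped = matches!(req.method, Method::Get | Method::Head)
        && (req.path_and_query.len() > 96 || has_long_token(req.path_and_query));
    let shared_unlisted = req.shared_platform_host && !req.exact_host_allowed;
    body_to_unlisted || read_shaped || shared_unlisted
}
```

A token heuristic this simple will flag long slugs as well as encoded payloads; that is consistent with a fail-closed stance, but the exact threshold behavior in wcore-egress may differ.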

Shared-platform hosts can never be apex-allowlisted. Adding amazonaws.com to security.egress_allow is silently rejected; a Bedrock endpoint must be exact-host-allowed (e.g. bedrock-runtime.us-east-1.amazonaws.com). Categories include code-sharing platforms (GitHub raw, Gist, Pastebin), object storage (S3, Azure Blob, GCS, R2), serverless/preview domains (workers.dev, vercel.app, netlify.app, ngrok.io), and OOB canary hosts (requestbin.com, webhook.site, burpcollaborator.net, interact.sh).
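
A sketch of how the apex restriction could be applied when loading security.egress_allow and when matching a host. The names validate_entry, under_shared_platform and host_allowed are hypothetical, the apex list is a small subset taken from the examples above, and the assumption that ordinary entries also match their subdomains is mine, not something the source states.

```rust
/// Partial, illustrative list of shared-platform apex domains.
const SHARED_PLATFORM_APEXES: &[&str] =
    &["amazonaws.com", "workers.dev", "vercel.app", "netlify.app", "ngrok.io"];

/// Returns the normalized entry, or None when the entry is a bare
/// shared-platform apex and must be rejected.
fn validate_entry(entry: &str) -> Option<String> {
    let e = entry.trim().to_ascii_lowercase();
    if SHARED_PLATFORM_APEXES.contains(&e.as_str()) {
        None
    } else {
        Some(e)
    }
}

/// True when `host` is a shared-platform apex or any name beneath one.
fn under_shared_platform(host: &str) -> bool {
    SHARED_PLATFORM_APEXES
        .iter()
        .any(|apex| host == *apex || host.ends_with(&format!(".{}", apex)))
}

/// `allow` holds entries that already passed `validate_entry`.
fn host_allowed(host: &str, allow: &[String]) -> bool {
    let host = host.to_ascii_lowercase();
    allow.iter().any(|entry| {
        if host == *entry {
            return true; // exact-host match always counts
        }
        // Assumed: subdomain matching applies only off shared platforms.
        !under_shared_platform(&host) && host.ends_with(&format!(".{}", entry))
    })
}
```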

Hardened client properties

Redirects disabled on streaming and tool clients - prevents credential re-attachment via a 302 redirect
Deny fires before network: a denied request never opens a TCP connection
Timeouts: connect 30 s, read 300 s, total 300 s for tool clients
SSRF blocking at Layer 0 in wcore-tools/src/url_safety.rs (separate from the egress Layer 1)

Default allowlist

crates/wcore-agent/src/egress/defaults.rs seeds the allowlist with approximately 30 LLM provider, tool backend, and package registry domains (anthropic.com, openai.com, github.com, tavily.com, crates.io, pypi.org, and others), plus the active provider host derived from config.base_url. Operator additions go in config.toml:

[security]
# enabled = true   ← default; omitting = enforcing
egress_allow = ["my-internal-tool.example.com"]

Approval modes

The engine has three session modes, selectable via --mode or the protocol SessionMode field:

Mode	Behavior
`Default`	All destructive tools (shell, file writes, etc.) require explicit human approval before each call
`AutoEdit`	File writes are auto-approved; shell commands still require approval
`Force`	All approval gates disabled; the engine dispatches without waiting

Force mode accepts the aliases yolo, dangerously_skip_permissions, and dangerously_skip_sandbox_and_permissions so foreign-agent hosts (Claude Code, Codex, Gemini) can drive the engine without code changes.
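
The mode table and the Force aliases can be sketched as a parser plus an approval predicate. This is illustrative only: ToolKind, parse_mode and needs_approval are hypothetical names, the lowercase spellings accepted for Default and AutoEdit are assumptions, and treating read-only tools as never needing approval is inferred from the word "destructive" above.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum SessionMode { Default, AutoEdit, Force }

/// Hypothetical coarse tool categories.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToolKind { ReadOnly, FileWrite, Shell }

/// Parses a mode name, including the Force aliases listed above.
fn parse_mode(s: &str) -> Option<SessionMode> {
    match s.to_ascii_lowercase().as_str() {
        "default" => Some(SessionMode::Default),
        "autoedit" => Some(SessionMode::AutoEdit),
        "force"
        | "yolo"
        | "dangerously_skip_permissions"
        | "dangerously_skip_sandbox_and_permissions" => Some(SessionMode::Force),
        _ => None,
    }
}

/// True when the engine should wait for human approval before dispatch.
fn needs_approval(mode: SessionMode, tool: ToolKind) -> bool {
    match (mode, tool) {
        (SessionMode::Force, _) => false,                       // all gates disabled
        (_, ToolKind::ReadOnly) => false,                       // not destructive
        (SessionMode::AutoEdit, ToolKind::FileWrite) => false,  // auto-approved
        _ => true,
    }
}
```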

Plan mode (EnterPlanMode / ExitPlanMode tools, or --plan) is a read-only mode. While it is active, the engine guard in engine.rs blocks every write tool from executing. The plan mode flag persists to ~/.wayland-core/plan.toml. Use plan mode to audit what an agent intends to do before allowing writes.

Budget caps

Budget caps provide a hard upper bound on what a session can consume. They are configured in two blocks:

Global execution caps

[budget]
max_wall_time_secs    = 600    # wall-clock session timeout
max_tool_runtime_secs = 120    # accumulated tool execution time
max_processes         = 8      # concurrent subprocess count
max_agent_depth       = 4      # delegation depth cap
max_tokens_in         = 200000
max_tokens_out        = 16384
max_cost_usd          = 1.50

All caps are optional. An unset cap places no limit. Child agent views propagate counters up to the root, so a parent budget sees the full session rollup. A cap set on a child view overrides the parent's cap for that child, but only to tighten it; it never relaxes the parent.

When any cap is exceeded, the engine emits a BudgetExceeded { reason, observed, limit } event on the protocol stream. This event requires no capability flag.
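
The rollup and the event shape can be sketched together. This is a simplified model, not the engine's implementation: BudgetTree, View and record_tokens_out are hypothetical names, only max_tokens_out is tracked, and the field types of BudgetExceeded are assumed.

```rust
/// Assumed field types for the event named above.
#[derive(Debug, PartialEq)]
struct BudgetExceeded {
    reason: &'static str,
    observed: u64,
    limit: u64,
}

/// One budget view; `parent` is an index into `BudgetTree::views`.
struct View {
    parent: Option<usize>,
    max_tokens_out: Option<u64>, // None = unset cap, no limit
    tokens_out: u64,
}

struct BudgetTree {
    views: Vec<View>,
}

impl BudgetTree {
    fn with_root(cap: Option<u64>) -> Self {
        BudgetTree { views: vec![View { parent: None, max_tokens_out: cap, tokens_out: 0 }] }
    }

    fn add_child(&mut self, parent: usize, cap: Option<u64>) -> usize {
        self.views.push(View { parent: Some(parent), max_tokens_out: cap, tokens_out: 0 });
        self.views.len() - 1
    }

    /// Adds `n` to the view and to every ancestor up to the root. Returns an
    /// event for the first view on that path whose cap is now exceeded.
    fn record_tokens_out(&mut self, view: usize, n: u64) -> Option<BudgetExceeded> {
        let mut event = None;
        let mut cursor = Some(view);
        while let Some(i) = cursor {
            let v = &mut self.views[i];
            v.tokens_out += n;
            if let Some(limit) = v.max_tokens_out {
                if event.is_none() && v.tokens_out > limit {
                    event = Some(BudgetExceeded {
                        reason: "max_tokens_out",
                        observed: v.tokens_out,
                        limit,
                    });
                }
            }
            cursor = v.parent;
        }
        event
    }
}
```

Because counters are added along the whole path, the root's count always reflects the full session, while a tighter child cap can trip before the root's does.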

Per-session and per-user daily caps

[session_cap]
per_session_tokens   = 500000
per_session_usd      = 2.00
per_user_daily_usd   = 10.00   # resets at midnight UTC per (user_id, UTC-day)
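
The "(user_id, UTC-day)" keying in the comment above can be sketched with integer arithmetic on Unix time. DailySpend, utc_day and the use of micro-USD integer units are assumptions made for illustration; the real tracker's storage and units are not documented here.

```rust
use std::collections::HashMap;

const SECS_PER_DAY: u64 = 86_400;

/// Day index since the Unix epoch, in UTC.
fn utc_day(unix_secs: u64) -> u64 {
    unix_secs / SECS_PER_DAY
}

/// Hypothetical per-user daily accumulator, in micro-USD.
struct DailySpend {
    by_user_day: HashMap<(String, u64), u64>,
}

impl DailySpend {
    fn new() -> Self {
        DailySpend { by_user_day: HashMap::new() }
    }

    /// Adds a charge to the user's bucket for that UTC day and returns the
    /// bucket's new total. A new day yields a new key, so the total
    /// effectively resets at midnight UTC.
    fn add(&mut self, user_id: &str, unix_secs: u64, micro_usd: u64) -> u64 {
        let key = (user_id.to_string(), utc_day(unix_secs));
        let total = self.by_user_day.entry(key).or_insert(0);
        *total += micro_usd;
        *total
    }
}
```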

BudgetTracker emits a CapWarn event at 80% of the strictest configured cap and a hard CapBlock at the limit. A blocked charge does not mutate counters.
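
A single-cap sketch of the warn-then-block behavior. It is not BudgetTracker itself: CapTracker and ChargeOutcome are hypothetical names, only one cap is modeled (so "strictest configured cap" is not shown), and it assumes that a charge landing exactly on the limit is accepted while one that would pass it is blocked.

```rust
#[derive(Debug, PartialEq)]
enum ChargeOutcome {
    Accepted,
    CapWarn,  // accepted, but usage is at or above 80% of the cap
    CapBlock, // rejected; counters are left untouched
}

/// Hypothetical tracker for one cap, in integer units (tokens or micro-USD).
struct CapTracker {
    limit: u64,
    used: u64,
}

impl CapTracker {
    fn charge(&mut self, amount: u64) -> ChargeOutcome {
        let next = self.used.saturating_add(amount);
        if next > self.limit {
            // A blocked charge does not mutate counters.
            return ChargeOutcome::CapBlock;
        }
        self.used = next;
        // used / limit >= 0.8  <=>  used * 5 >= limit * 4 (avoids floats)
        if next.saturating_mul(5) >= self.limit.saturating_mul(4) {
            ChargeOutcome::CapWarn
        } else {
            ChargeOutcome::Accepted
        }
    }
}
```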

Note: the USD value passed to BudgetTracker::charge is trusted from the caller; there is no cryptographic attestation from the provider. Cost tracking is honest for non-adversarial use but is not tamper-proof against a buggy or malicious caller.

Threat model summary

The design addresses these categories of risk:

Threat	Mitigating layer
Model-driven shell command escapes to host filesystem	OS sandbox (bwrap / sandbox-exec / AppContainer); fail-closed when backend unavailable
Agent exfiltrates secrets via HTTP	Egress gate: single chokepoint, shared-platform blocklist, exfil classifier
Model-driven redirect re-attaches credentials	Redirects disabled on EgressClient streaming and tool clients
SSRF against cloud metadata endpoints	URL safety layer (`wcore-tools`) at tool surface; egress gate at network layer
Unconstrained spend or unbounded sessions	Budget caps with CapWarn at 80% and hard CapBlock at limit
Prompt injection via shell metacharacters	`shell_command_argv` constructs commands in argv mode; Clippy bans `sh -c` with LLM-supplied strings outside that function
Secrets leaking into trace output	28-pattern PII scrubber on all trace sinks (`wcore-safety/src/pii.rs`)
Secrets leaking into debug logs	`BearerToken` Debug impl prints `"<redacted>"` for the signature field
Sandbox isolation silently dropped	`FailClosedBackend` refuses execution; double-key opt-out required; rate-limited warn in degraded sessions
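
To illustrate the argv-mode row in the table: when a command is built as a program plus an argument vector, shell metacharacters in an argument stay inside that one argument because no shell parses them. The sketch below uses std::process::Command and a hypothetical helper named argv_command; it is not the source of shell_command_argv, and it only constructs the command without running it.

```rust
use std::process::Command;

/// Builds a command in argv mode: `program` is executed directly and each
/// element of `args` is passed as exactly one argument, with no `sh -c`.
fn argv_command(program: &str, args: &[&str]) -> Command {
    let mut cmd = Command::new(program);
    cmd.args(args);
    cmd
}

/// Returns the arguments as they would be handed to the process.
fn argv_of(cmd: &Command) -> Vec<String> {
    cmd.get_args()
        .map(|a| a.to_string_lossy().into_owned())
        .collect()
}
```

With this construction, an LLM-supplied string such as "hello; rm -rf /" arrives at the program as one literal argument instead of being split into two shell commands.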

Permission Model - ACL design and current wiring status
Credential Storage - OS keychain and encrypted-file vault
Configuration - [security] and [budget] TOML blocks
Tools - tool availability gating and approval flow
Plan Mode - read-only session enforcement