Security Model

This document describes the security architecture of WebBrain — what the extension can do, what it trusts, how it handles credentials, and how it defends against prompt injection.

For vulnerability disclosure, see SECURITY.md.

Extension Privileges

Permissions

{
  "permissions": [
    "sidePanel", "activeTab", "tabs", "tabGroups", "scripting", "storage",
    "webNavigation", "debugger", "downloads", "unlimitedStorage",
    "offscreen", "privateNetworkAccess", "tabCapture",
    "clipboardWrite", "clipboardRead"
  ],
  "host_permissions": ["<all_urls>", "http://localhost/*", "http://127.0.0.1/*", "http://*/*"]
}

(This is the Chrome MV3 manifest. Firefox MV2 grants a narrower set — activeTab, tabs, tabGroups, storage, unlimitedStorage, clipboard*, <all_urls> — and has no debugger/offscreen, see Firefox Differences below.)

Permission	Risk	Mitigation
`<all_urls>`	Content script injection anywhere — the agent can read and interact with any page the user visits	The user must explicitly switch to Act mode; Ask mode is read-only. The agent never auto-activates on new tabs.
`debugger`	CDP access provides trusted events and full DOM/network control on any tab	The debugger is only attached during active agent runs and detached on completion/abort.
`downloads`	Can save files to the user's Downloads folder without prompting	Only the agent's explicit tool calls (`download_files`, `download_file`, `download_resource_from_page`, `download_social_media`, `screenshot({save:true})`) use this, and each is gated by the capability × origin permission prompt.
`offscreen`	An offscreen document can make HTTP requests immune to user CSP	Only used for localhost LLM provider proxy and tab recording. Never forwards arbitrary URLs.

Authentication

The extension runs inside the user's authenticated browser session. There is no separate "AI account" — every site the user is logged into (GitHub, Gmail, banking, internal tools) is accessible to the agent with the user's full permissions, exactly as if they were clicking themselves.

The system prompt explicitly tells the model:

"You do NOT need API tokens, OAuth flows, or 'permission to act on the user's behalf'. The browser session already has all that."

This is a feature (it makes the agent useful with zero setup) but also the most important risk: the agent can do anything the user can do in a browser.

Credential Handling

Detection

After every set_field / type_ax call, credential-fields.js checks whether the filled field is a credential input. Triggers:

<input type="password">
autocomplete="current-password" | "new-password" | "one-time-code"
Field name / id / aria-label / placeholder / label text matches SENSITIVE_NAME_RE

Strict Secret Mode

When enabled (Settings → "Strict secret handling"), the agent:

Never quotes credentials in summaries, assistant text, or tool-call arguments — even when the user explicitly asks
The done tool description is swapped for DONE_TOOL_STRICT, which adds a hard prohibition
After filling a sensitive field, CREDENTIAL_NOTE_STRICT is injected into the tool result

When disabled (the default — this is a personal-computer tool, not a third-party deployment):

The model gets soft hygiene guidance ("prefer generic phrasing unless the user asks for the value")
The user can ask to see credentials and the model will show them
The done tool description still encourages tidy summaries

Profile Auto-Fill

Users can store a short profile (name, email, throwaway password) in Settings → Profile. This text is appended to the system prompt when enabled. Warnings in the UI:

Stored in plaintext in chrome.storage.local
Sent to the LLM provider on every turn as part of the system prompt
Do not put passwords for important accounts here

Prompt Injection Defenses

The primary threat: a malicious page crafts content that, when read by the agent and fed to the LLM, causes the model to execute unintended actions.

Defense Layers

Layer	Mechanism
Untrusted-content wrapping	Page-derived tool results are wrapped in `<untrusted_page_content>` markers (`_wrapUntrusted` + `UNTRUSTED_CONTENT_TOOLS`) so the model treats them as data, not instructions. See prompt-injection-defense.md.
Capability × origin gate	Before a consequential tool runs (click/type/navigate/execute_js/network/download/…), the agent requires a `(capability, host)` grant — Allow once / Always / Deny. Language-agnostic, deterministic, human-in-the-loop (`permission-gate.js`).
Tool result cap	Individual tool results truncated at 8,000 chars (`_limitToolResult`). Injected text beyond that is silently dropped.
Ask/Act mode	In Ask mode, only read-only tools are available. The user must explicitly switch to Act for the agent to click/type/navigate.
`/allow-api`	A per-conversation `/allow-api` flag that waives the permission prompt for write-method network egress (`fetch_url`/`research_url` with POST/PUT/PATCH/DELETE). It does NOT waive GET egress or any other capability. Clears on conversation reset.
`done()` blocking	Before accepting completion, the agent probes for open dialogs/forms. If the summary claims "created"/"saved" but a modal is still open, the agent is forced to continue.
Duplicate-submit guard	Clicks on submit-like text (create/save/submit/add/post/publish/send/confirm/sign up/log in/pay/checkout/order, etc.) are blocked within a 45-second window per tab+URL (Chrome).
CLICK occlusion test	Before clicking, the resolver calls `elementFromPoint()`. If another element is visually on top, the click is refused.
Modal-scoped click	When a dialog is open, text clicks are scoped to that subtree so the agent doesn't click a dimmed background element.
Universal preamble	Every system prompt includes guidance on cookie banners and paywalls — two common injection vectors that look like benign page content.
Loop detection	Three independent detectors stop the agent if it's repeating the same action or oscillating. Limits damage from a persistently injected prompt.
Finance adapters	Adapters with `category: 'finance'` inject extra confirmation guidance and a warning banner.
Strict secret handling	Prevents credential exfiltration even if the model is jailbroken into quoting secrets.
Local network blocking	When disabled (default), `fetch_url` cannot reach private/RFC1918 addresses. Cloud-metadata endpoints (169.254.169.254) are always blocked.

What is NOT defended

The LLM provider itself: if the provider is compromised or malicious, it sees all conversation content including credentials the user types.
Extension-unique fingerprinting: websites could detect the content script (pulsing border, window.__wbElementMap, custom event handlers).
Timing-channel attacks: the agent's tool-call latency could be observable from page JS.

`/allow-api` Flag

Set per-conversation via the /allow-api slash command in the side panel. When active, it waives the permission prompt for write-method network egress only:

fetch_url / research_url with method: POST/PUT/PATCH/DELETE

It does NOT waive GET egress, execute_js, or any other capability — those still go through the capability × origin gate. (isNetworkMutation in permission-gate.js is what /allow-api keys off; execute_js is its own Capability.EXECUTE_JS and is always gated.)

The system prompt adds a preamble telling the model to:

State the URL, method, and payload in plain text before any destructive API call
Default to UI-first; only reach for the API when UI has actually failed

Cleared on conversation reset.

Trace Data Isolation

The trace recorder (trace/recorder.js) writes to IndexedDB on the user's machine when explicitly enabled (Settings → Display → "Record traces"). Data never leaves the browser:

runs store: model, provider, token totals, timestamps
events store: LLM requests/responses, tool calls, screenshot metadata
shots store: screenshot blobs

The traces page (ui/traces.html) reads from local IndexedDB only. Export produces a JSON blob identical to what the user sees on screen — no telemetry, no network calls.

Firefox Differences

Firefox has no CDP (debugger permission), so:

No trusted events (synthetic el.click() only)
No full-page screenshots
No shadow DOM piercing for closed roots
No offscreen document (CORS must be handled by LLM servers)
No tab recording (record_tab — Chrome's recorder/ is absent)
No duplicate-submit guard (the timestamp Map is declared but unwired)

Everything else — the permission gate, untrusted-content wrapping, credential detection, loop detection, adapter system, and the trace recorder (it ships identically in src/firefox/src/trace/recorder.js) — is the same.

Reporting Issues

See SECURITY.md for the disclosure contact and policy.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Security Model

Extension Privileges

Permissions

Authentication

Credential Handling

Detection

Strict Secret Mode

Profile Auto-Fill

Prompt Injection Defenses

Defense Layers

What is NOT defended

`/allow-api` Flag

Trace Data Isolation

Firefox Differences

Reporting Issues

FilesExpand file tree

security-model.md

Latest commit

History

security-model.md

File metadata and controls

Security Model

Extension Privileges

Permissions

Authentication

Credential Handling

Detection

Strict Secret Mode

Profile Auto-Fill

Prompt Injection Defenses

Defense Layers

What is NOT defended

/allow-api Flag

Trace Data Isolation

Firefox Differences

Reporting Issues

`/allow-api` Flag