This document describes the security architecture of WebBrain — what the extension can do, what it trusts, how it handles credentials, and how it defends against prompt injection.
For vulnerability disclosure, see SECURITY.md.

    {
      "permissions": [
        "sidePanel", "activeTab", "tabs", "tabGroups", "scripting", "storage",
        "webNavigation", "debugger", "downloads", "unlimitedStorage",
        "offscreen", "privateNetworkAccess", "tabCapture",
        "clipboardWrite", "clipboardRead"
      ],
      "host_permissions": ["<all_urls>", "http://localhost/*", "http://127.0.0.1/*", "http://*/*"]
    }

(This is the Chrome MV3 manifest. Firefox MV2 grants a narrower set:
`activeTab`, `tabs`, `tabGroups`, `storage`, `unlimitedStorage`, `clipboard*`,
`<all_urls>`. It has no `debugger` or `offscreen`; see Firefox Differences below.)

| Permission | Risk | Mitigation |
|---|---|---|
| `<all_urls>` | Content script injection anywhere: the agent can read and interact with any page the user visits | The user must explicitly switch to Act mode; Ask mode is read-only. The agent never auto-activates on new tabs. |
| `debugger` | CDP access provides trusted events and full DOM/network control on any tab | The debugger is only attached during active agent runs and detached on completion/abort. |
| `downloads` | Can save files to the user's Downloads folder without prompting | Only the agent's explicit tool calls (`download_files`, `download_file`, `download_resource_from_page`, `download_social_media`, `screenshot({save:true})`) use this, and each is gated by the capability × origin permission prompt. |
| `offscreen` | An offscreen document can make HTTP requests immune to user CSP | Only used for localhost LLM provider proxy and tab recording. Never forwards arbitrary URLs. |
The extension runs inside the user's authenticated browser session. There is no separate "AI account" — every site the user is logged into (GitHub, Gmail, banking, internal tools) is accessible to the agent with the user's full permissions, exactly as if they were clicking themselves.
The system prompt explicitly tells the model:
"You do NOT need API tokens, OAuth flows, or 'permission to act on the user's behalf'. The browser session already has all that."
This is a feature (it makes the agent useful with zero setup) but also the most important risk: the agent can do anything the user can do in a browser.
After every `set_field` / `type_ax` call, `credential-fields.js` checks whether the filled field is a credential input. Triggers:
- `<input type="password">`
- `autocomplete="current-password" | "new-password" | "one-time-code"`
- Field name / id / aria-label / placeholder / label text matches `SENSITIVE_NAME_RE`

The regex: `pwd|password|passwd|secret|token|api[-_\s]?key|otp|2fa|mfa|credential|recovery[-_\s]?code|backup[-_\s]?code|access[-_\s]?token|refresh[-_\s]?token|client[-_\s]?secret|private[-_\s]?key|seed[-_\s]?phrase|passphrase|pin[-_\s]?code`
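The three triggers can be sketched as a single predicate. This is a minimal illustration, not the code in `credential-fields.js`: the helper name `isCredentialField`, the plain-object field shape, and the case-insensitive flag on the regex are all assumptions made for the example.

```javascript
// Pattern copied from the text above; the "i" flag is an assumption.
const SENSITIVE_NAME_RE =
  /pwd|password|passwd|secret|token|api[-_\s]?key|otp|2fa|mfa|credential|recovery[-_\s]?code|backup[-_\s]?code|access[-_\s]?token|refresh[-_\s]?token|client[-_\s]?secret|private[-_\s]?key|seed[-_\s]?phrase|passphrase|pin[-_\s]?code/i;

const SENSITIVE_AUTOCOMPLETE = new Set([
  "current-password",
  "new-password",
  "one-time-code",
]);

// Hypothetical helper. `field` is a plain object describing the filled input
// ({ type, autocomplete, name, id, ariaLabel, placeholder, labelText }),
// not a live DOM element.
function isCredentialField(field) {
  // Trigger 1: <input type="password">
  if ((field.type || "").toLowerCase() === "password") return true;

  // Trigger 2: autocomplete may hold several space-separated tokens.
  const tokens = (field.autocomplete || "").toLowerCase().split(/\s+/);
  if (tokens.some((t) => SENSITIVE_AUTOCOMPLETE.has(t))) return true;

  // Trigger 3: any descriptive text matches the sensitive-name pattern.
  return [field.name, field.id, field.ariaLabel, field.placeholder, field.labelText]
    .some((text) => typeof text === "string" && SENSITIVE_NAME_RE.test(text));
}
```

Because the pattern is unanchored, short alternatives such as `otp` also match inside longer words, so a check like this errs toward flagging a field as sensitive.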
When enabled (Settings → "Strict secret handling"), the agent:
- Never quotes credentials in summaries, assistant text, or tool-call arguments — even when the user explicitly asks
- The `done` tool description is swapped for `DONE_TOOL_STRICT`, which adds a hard prohibition
- After filling a sensitive field, `CREDENTIAL_NOTE_STRICT` is injected into the tool result
When disabled (the default — this is a personal-computer tool, not a third-party deployment):
- The model gets soft hygiene guidance ("prefer generic phrasing unless the user asks for the value")
- The user can ask to see credentials and the model will show them
- The `done` tool description still encourages tidy summaries
Users can store a short profile (name, email, throwaway password) in Settings → Profile. This text is appended to the system prompt when enabled. Warnings in the UI:
- Stored in plaintext in `chrome.storage.local`
- Sent to the LLM provider on every turn as part of the system prompt
- Do not put passwords for important accounts here
The primary threat: a malicious page crafts content that, when read by the agent and fed to the LLM, causes the model to execute unintended actions.
| Layer | Mechanism |
|---|---|
| Untrusted-content wrapping | Page-derived tool results are wrapped in `<untrusted_page_content>` markers (`_wrapUntrusted` + `UNTRUSTED_CONTENT_TOOLS`) so the model treats them as data, not instructions. See prompt-injection-defense.md. |
| Capability × origin gate | Before a consequential tool runs (click/type/navigate/execute_js/network/download/…), the agent requires a (capability, host) grant — Allow once / Always / Deny. Language-agnostic, deterministic, human-in-the-loop (permission-gate.js). |
| Tool result cap | Individual tool results truncated at 8,000 chars (_limitToolResult). Injected text beyond that is silently dropped. |
| Ask/Act mode | In Ask mode, only read-only tools are available. The user must explicitly switch to Act for the agent to click/type/navigate. |
| `/allow-api` | A per-conversation `/allow-api` flag that waives the permission prompt for write-method network egress (`fetch_url`/`research_url` with POST/PUT/PATCH/DELETE). It does NOT waive GET egress or any other capability. Clears on conversation reset. |
| `done()` blocking | Before accepting completion, the agent probes for open dialogs/forms. If the summary claims "created"/"saved" but a modal is still open, the agent is forced to continue. |
| Duplicate-submit guard | Clicks on submit-like text (create/save/submit/add/post/publish/send/confirm/sign up/log in/pay/checkout/order, etc.) are blocked within a 45-second window per tab+URL (Chrome). |
| CLICK occlusion test | Before clicking, the resolver calls elementFromPoint(). If another element is visually on top, the click is refused. |
| Modal-scoped click | When a dialog is open, text clicks are scoped to that subtree so the agent doesn't click a dimmed background element. |
| Universal preamble | Every system prompt includes guidance on cookie banners and paywalls — two common injection vectors that look like benign page content. |
| Loop detection | Three independent detectors stop the agent if it's repeating the same action or oscillating. Limits damage from a persistently injected prompt. |
| Finance adapters | Adapters with category: 'finance' inject extra confirmation guidance and a warning banner. |
| Strict secret handling | Prevents credential exfiltration even if the model is jailbroken into quoting secrets. |
| Local network blocking | When local network access is disabled (the default), `fetch_url` cannot reach private/RFC1918 addresses. Cloud-metadata endpoints (169.254.169.254) are always blocked. |
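
As an illustration of one row in the table, the duplicate-submit guard can be sketched as a timestamp map keyed by tab and URL. This is an assumed shape, not the extension's code: `createSubmitGuard`, the key format, and the injectable clock are hypothetical, and the word list is only the subset spelled out in the table (the real list is longer, per the "etc.").

```javascript
// Assumed pattern: only the submit-like words named in the table.
const SUBMIT_LIKE_RE =
  /\b(create|save|submit|add|post|publish|send|confirm|sign up|log in|pay|checkout|order)\b/i;
const WINDOW_MS = 45 * 1000; // 45-second window from the table

// Hypothetical factory; `now` is injectable so the window can be tested.
function createSubmitGuard(now = () => Date.now()) {
  // key "tabId|url" -> timestamp of the last submit-like click that was allowed
  const lastSubmit = new Map();

  return function shouldBlockClick(tabId, url, clickText) {
    // Clicks on text that is not submit-like are never blocked by this guard.
    if (!SUBMIT_LIKE_RE.test(clickText)) return false;

    const key = `${tabId}|${url}`;
    const t = now();
    const prev = lastSubmit.get(key);
    // A second submit-like click inside the window is refused.
    if (prev !== undefined && t - prev < WINDOW_MS) return true;

    lastSubmit.set(key, t);
    return false;
  };
}
```

In this sketch a blocked click does not refresh the timestamp, so the window always runs from the last click that actually went through. Whether the real guard behaves the same way is not stated in this document.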
- The LLM provider itself: if the provider is compromised or malicious, it sees all conversation content including credentials the user types.
- Extension-unique fingerprinting: websites could detect the content script (pulsing border, `window.__wbElementMap`, custom event handlers).
- Timing-channel attacks: the agent's tool-call latency could be observable from page JS.
Set per-conversation via the /allow-api slash command in the side panel. When active, it waives the permission prompt for write-method network egress only:
- `fetch_url` / `research_url` with `method: POST/PUT/PATCH/DELETE`
It does NOT waive GET egress, execute_js, or any other capability — those still
go through the capability × origin gate. (isNetworkMutation in
permission-gate.js is what /allow-api keys off; execute_js is its own
Capability.EXECUTE_JS and is always gated.)
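
The waiver logic described above can be sketched as follows. The name `isNetworkMutation` comes from the text, but its body and signature here are assumptions, as are `needsPrompt`, the shape of a tool call, and the `grants` set standing in for stored Allow once / Always decisions.

```javascript
// Assumed implementation; only the function name is taken from the text.
const WRITE_METHODS = new Set(["POST", "PUT", "PATCH", "DELETE"]);
const NETWORK_TOOLS = new Set(["fetch_url", "research_url"]);

function isNetworkMutation(call) {
  const method = String((call.args && call.args.method) || "GET").toUpperCase();
  return NETWORK_TOOLS.has(call.tool) && WRITE_METHODS.has(method);
}

// Hypothetical helper: returns true when the user must still be prompted.
// `conversation.allowApi` models the per-conversation /allow-api flag;
// `conversation.grants` is a Set of "capability|host" strings.
function needsPrompt(call, conversation) {
  // The only thing /allow-api waives is write-method network egress.
  if (conversation.allowApi && isNetworkMutation(call)) return false;
  // Everything else, including GET egress and execute_js, goes through the
  // capability × origin gate.
  return !conversation.grants.has(`${call.capability}|${call.host}`);
}
```

Keeping the waiver as a single early return makes it easy to see that no other capability can slip past the gate when the flag is set.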
The system prompt adds a preamble telling the model to:
- State the URL, method, and payload in plain text before any destructive API call
- Default to UI-first; only reach for the API when UI has actually failed
Cleared on conversation reset.
The trace recorder (trace/recorder.js) writes to IndexedDB on the user's machine when explicitly enabled (Settings → Display → "Record traces"). Data never leaves the browser:
- `runs` store: model, provider, token totals, timestamps
- `events` store: LLM requests/responses, tool calls, screenshot metadata
- `shots` store: screenshot blobs
The traces page (ui/traces.html) reads from local IndexedDB only. Export produces a JSON blob identical to what the user sees on screen — no telemetry, no network calls.
Firefox has no CDP (debugger permission), so:
- No trusted events (synthetic `el.click()` only)
- No full-page screenshots
- No shadow DOM piercing for closed roots
- No offscreen document (CORS must be handled by LLM servers)
- No tab recording (`record_tab`; Chrome's `recorder/` is absent)
- No duplicate-submit guard (the timestamp Map is declared but unwired)
Everything else — the permission gate, untrusted-content wrapping, credential
detection, loop detection, adapter system, and the trace recorder (it ships
identically in src/firefox/src/trace/recorder.js) — is the same.
See SECURITY.md for the disclosure contact and policy.