feat: sidebar CSS inspector + per-tab agents (v0.13.9.0) #650

Open
garrytan wants to merge 31 commits into main from garrytan/sidebar-css-inspector
Conversation

@garrytan
Owner

@garrytan garrytan commented Mar 30, 2026

Summary

CSS Inspector: Full CDP-powered CSS inspection in the Chrome extension sidebar. Pick an element to see the complete rule cascade with specificity badges and source file:line, a box model visualization, and computed styles. Edit styles live with the $B style command and undo changes with $B style --undo.

Per-Tab Agents: Each browser tab gets its own independent Claude agent. BROWSE_TAB environment variable pins each agent to its tab, preventing cross-tab interference when multiple agents run simultaneously. Tab switching in the browser swaps the sidebar's chat context.

Tab Tracking: context.on('page') auto-tracks user-created tabs. page.on('close') cleans up closed tabs. The sidebar tab bar renders and syncs in real time.

Page Cleanup: $B cleanup --all removes ads, cookie banners, sticky headers, social widgets. $B prettyscreenshot combines cleanup + scroll + screenshot.

Sidebar UX: Stop button, "Browser co-pilot" banner, "Ask about this page..." placeholder, streaming narration, per-tab chat history. Cleanup and Screenshot buttons in the inspector toolbar.

CSP Fallback: Sites with strict CSP (SF Chronicle, etc.) get a basic picker via a content.js fallback, which provides computed styles, the box model, and same-origin CSS rules. Sites that allow injection get full CDP inspection.

Security (from main merge): XML escaping for user messages, prompt injection defense, allowed-commands whitelist, stderr capture for error reporting. Fixed inspector message allowlist (pre-existing bug found by Codex).

Commits (14)

  • f5daf7b CDP inspector module (persistent sessions, CSS cascade, style modification)
  • e084ca9 Browse server endpoints + inspect/style/cleanup/prettyscreenshot CLI
  • f395f58 Extension inspector (element picker, box model, rule cascade, quick edit)
  • da5d008 Docs (SKILL.md template + generated files)
  • dcf1b0d Auto-track user-created tabs + handle tab close
  • 54fec2d Per-tab agent isolation via BROWSE_TAB env var
  • a36a3ac Sidebar per-tab chat, tab bar sync, stop button, UX polish
  • 812882d Tests (140 tests for per-tab isolation, BROWSE_TAB, tab tracking, sidebar UX)
  • fe4441b Merge main (security audit round 2)
  • 91d2f73 Resolve merge conflicts (security + per-tab isolation)
  • 6238edd Fix inspector message allowlist + CSP fallback in background.js
  • 8d65628 Basic element picker in content.js for CSP-restricted pages
  • 7e31568 Cleanup + screenshot buttons in sidebar toolbar
  • 9698095 Tests for CSP fallback, buttons, allowlist

Test Coverage

156 tests across sidebar-agent.test.ts and sidebar-ux.test.ts covering:

  • CDP inspector data flow (element pick, rule cascade, specificity sorting)
  • Per-tab agent isolation (BROWSE_TAB env var, handleCommand tab pinning, save/restore)
  • Tab tracking (context.on('page'), page.on('close'), syncActiveTabByUrl)
  • Sidebar UX (tab bar, stop button, banner text, per-tab chat history)
  • CSP fallback (basic picker, captureBasicData, outline save/restore, message allowlist)
  • Cleanup + Screenshot buttons (POST /command, loading state, notifications)

Pre-Landing Review

Pre-Landing Review: No issues found. All code is local dev tooling (browser extension + Playwright server). No SQL, no user data persistence, no production deployment.

Design Review

Design review CLEAR (8/10) from plan phase. 12 design decisions: information hierarchy, interaction states (empty/loading/error), box model colors (gstack palette), design tokens, width constraints, picker overlay, click-to-edit, accessibility specs.

Documentation

  • BROWSER.md: added inspect, style, cleanup, prettyscreenshot to command table
  • CLAUDE.md: updated extension description to mention CSS inspector
  • README.md: updated /connect-chrome row to mention CSS inspection, per-tab agents, cleanup, screenshots
  • CHANGELOG.md: added CSP fallback, cleanup/screenshot buttons, allowlist fix entries

Test plan

  • All bun test runs pass (0 failures, 156 sidebar tests)
  • Merge conflicts resolved (security defenses + per-tab isolation)
  • Eng review CLEAR (plan phase)
  • Design review CLEAR (plan phase, 8/10)
  • 2x Codex reviews completed (cross-model tension resolved, allowlist bug found)

🤖 Generated with Claude Code

garrytan and others added 11 commits March 29, 2026 20:25
…modification

New browse/src/cdp-inspector.ts with full CDP inspection engine:
- inspectElement() via CSS.getMatchedStylesForNode + DOM.getBoxModel
- modifyStyle() via CSS.setStyleTexts with headless page.evaluate fallback
- Persistent CDP session lifecycle (create, reuse, detach on nav, re-create)
- Specificity sorting, overridden property detection, UA rule filtering
- Modification history with undo support
- formatInspectorResult() for CLI output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
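
The specificity sorting and overridden-property detection listed in the commit above could work roughly as follows. This is a minimal sketch, not the code in cdp-inspector.ts: the `MatchedRule` shape, `compareSpecificity`, and `annotateCascade` are hypothetical names, specificity is assumed to arrive precomputed as an (ids, classes, elements) triple, and `!important` and inline styles are deliberately ignored.

```typescript
// Specificity as (ids, classes/attributes/pseudo-classes, elements/pseudo-elements).
type Specificity = [number, number, number];

// Hypothetical shape for one matched rule; not the CDP response format.
interface MatchedRule {
  selector: string;
  specificity: Specificity;
  order: number; // source order; a higher number means declared later
  declarations: Record<string, string>;
}

interface AnnotatedRule {
  selector: string;
  declarations: { property: string; value: string; overridden: boolean }[];
}

// Compare component by component, most significant first.
function compareSpecificity(x: Specificity, y: Specificity): number {
  for (let i = 0; i < 3; i++) {
    if (x[i] !== y[i]) return x[i] - y[i];
  }
  return 0;
}

// Sort winning rules first, then flag any property that a
// higher-priority rule has already declared.
function annotateCascade(rules: MatchedRule[]): AnnotatedRule[] {
  const winnersFirst = rules.slice().sort(
    (r1, r2) =>
      compareSpecificity(r2.specificity, r1.specificity) || r2.order - r1.order,
  );
  const claimed = new Set<string>();
  return winnersFirst.map((rule) => ({
    selector: rule.selector,
    declarations: Object.keys(rule.declarations).map((property) => {
      const overridden = claimed.has(property);
      claimed.add(property);
      return { property, value: rule.declarations[property], overridden };
    }),
  }));
}
```

With rules for `p`, `.note`, and `#main`, the output lists `#main` first, and a `color` declared on `p` is flagged as overridden when `.note` also sets it.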
…yscreenshot CLI

Server endpoints: POST /inspector/pick, GET /inspector, POST /inspector/apply,
POST /inspector/reset, GET /inspector/history, GET /inspector/events (SSE).
CLI commands: inspect (CDP cascade), style (live CSS mod), cleanup (page clutter
removal), prettyscreenshot (clean screenshot pipeline).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, quick edit

Extension changes for the visual CSS inspector:
- inspector.js: element picker with hover highlight, CSS selector generation,
  basic mode fallback (getComputedStyle + CSSOM), page alteration handlers
- inspector.css: picker overlay styles (blue highlight + tooltip)
- background.js: inspector message routing (picker <-> server <-> sidepanel)
- sidepanel: Inspector tab with box model viz (gstack palette), matched rules
  with specificity badges, computed styles, click-to-edit quick edit,
  Send to Agent/Code button, empty/loading/error states

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
browser-manager.ts changes:
- context.on('page') listener: automatically tracks tabs opened by the user
  (Cmd+T, right-click open in new tab, window.open). Previously only
  programmatic newTab() was tracked, so user tabs were invisible.
- page.on('close') handler in wirePageEvents: removes closed tabs from the
  pages map and switches activeTabId to the last remaining tab.
- syncActiveTabByUrl: match Chrome extension's active tab URL to the correct
  Playwright page for accurate tab identity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
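
The tab bookkeeping described in the commit above can be sketched without Playwright. Everything here is an assumption for illustration: `TabRegistry`, `PageLike`, and `FakePage` are hypothetical names, tab IDs are assumed to be incrementing numbers, and the active tab is only reassigned when the closed tab was the active one. In the real server the `track` call would presumably be wired up from a `context.on('page')` listener.

```typescript
// Minimal stand-in for the part of a page object this sketch needs.
interface PageLike {
  on(event: "close", listener: () => void): void;
}

class TabRegistry<P extends PageLike> {
  readonly pages = new Map<number, P>();
  activeTabId: number | null = null;
  private nextId = 1;

  // Register a page (user-created or programmatic) and watch for its close.
  track(page: P): number {
    const id = this.nextId++;
    this.pages.set(id, page);
    if (this.activeTabId === null) this.activeTabId = id;
    page.on("close", () => this.handleClose(id));
    return id;
  }

  // Drop the closed tab; if it was active, fall back to the last remaining tab.
  private handleClose(id: number): void {
    this.pages.delete(id);
    if (this.activeTabId !== id) return;
    const remaining = Array.from(this.pages.keys());
    this.activeTabId = remaining.length > 0 ? remaining[remaining.length - 1] : null;
  }
}

// Test double that fires its close listeners on demand.
class FakePage implements PageLike {
  private listeners: Array<() => void> = [];
  on(_event: "close", listener: () => void): void {
    this.listeners.push(listener);
  }
  close(): void {
    this.listeners.forEach((fn) => fn());
  }
}
```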
Prevents parallel sidebar agents from interfering with each other's tab context.

Three-layer fix:
- sidebar-agent.ts: passes BROWSE_TAB=<tabId> env var to each claude process,
  per-tab processing set allows concurrent agents across tabs
- cli.ts: reads process.env.BROWSE_TAB and includes tabId in command request body
- server.ts: handleCommand() temporarily switches activeTabId when tabId is present,
  restores after command completes (safe: Bun event loop is single-threaded)

Also: per-tab agent state (TabAgentState map), per-tab message queuing,
per-tab chat buffers, verbose streaming narration, stop button endpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
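
The CLI and server layers of the fix above might look roughly like this. It is a sketch under stated assumptions: `buildCommandBody`, `withPinnedTab`, and the `CommandBody` and `TabState` shapes are hypothetical, the environment is passed in as a plain object instead of reading `process.env`, and tab IDs are assumed to be integers.

```typescript
interface CommandBody {
  command: string;
  args: string[];
  tabId?: number;
}

// CLI side: include tabId only when BROWSE_TAB holds an integer.
function buildCommandBody(
  command: string,
  args: string[],
  env: Record<string, string | undefined>,
): CommandBody {
  const raw = env["BROWSE_TAB"];
  const tabId = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isInteger(tabId) ? { command, args, tabId } : { command, args };
}

interface TabState {
  activeTabId: number;
}

// Server side: switch to the pinned tab for the duration of one command,
// then restore the previous active tab even if the command throws.
function withPinnedTab<T>(state: TabState, tabId: number | undefined, run: () => T): T {
  if (tabId === undefined) return run(); // no pinning requested
  const saved = state.activeTabId;
  state.activeTabId = tabId;
  try {
    return run();
  } finally {
    state.activeTabId = saved;
  }
}
```

Note that this sketch only covers a synchronous callback. If the command awaits, the save/restore window stays open across those awaits, so whether it remains safe depends on what else can run in between.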
Extension changes:
- sidepanel.js: per-tab chat history (tabChatHistories map), switchChatTab()
  swaps entire chat view, browserTabActivated handler for instant tab sync,
  stop button wired to /sidebar-agent/stop, pollTabs renders tab bar
- sidepanel.html: updated banner text ("Browser co-pilot"), stop button markup,
  input placeholder "Ask about this page..."
- sidepanel.css: tab bar styles, stop button styles, loading state fixes
- background.js: chrome.tabs.onActivated sends browserTabActivated to sidepanel
  with tab URL for instant tab switch detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
sidebar-agent.test.ts (new tests):
- BROWSE_TAB env var passed to claude process
- CLI reads BROWSE_TAB and sends tabId in body
- handleCommand accepts tabId, saves/restores activeTabId
- Tab pinning only activates when tabId provided
- Per-tab agent state, queue, concurrency
- processingTabs set for parallel agents

sidebar-ux.test.ts (new tests):
- context.on('page') tracks user-created tabs
- page.on('close') removes tabs from pages map
- Tab isolation uses BROWSE_TAB not system prompt hack
- Per-tab chat context in sidepanel
- Tab bar rendering, stop button, banner text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…inspector

# Conflicts:
#	browse/src/server.ts
#	browse/src/sidebar-agent.ts
…tion

Merged main's security improvements (XML escaping, prompt injection defense,
allowed commands whitelist, --model opus, Write tool, stderr capture) with
our branch's per-tab isolation (BROWSE_TAB env var, processingTabs set,
no --resume). Updated test expectations for expanded system prompt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions

github-actions bot commented Mar 30, 2026

E2E Evals: ✅ PASS

61/61 tests passed | $6.09 total cost | 12 parallel runners

Suite            Result  Status  Cost
e2e-browse       7/7             $0.34
e2e-deploy       6/6             $1.09
e2e-design       3/3             $0.54
e2e-plan         7/7             $1.2
e2e-qa-workflow  3/3             $0.89
e2e-review       6/6             $1.07
e2e-workflow     4/4             $0.46
llm-judge        25/25           $0.5

12x ubicloud-standard-2 (Docker: pre-baked toolchain + deps) | wall clock ≈ slowest suite

garrytan and others added 18 commits March 29, 2026 23:10
Pre-existing bug found by Codex: ALLOWED_TYPES in background.js was missing
all inspector message types (startInspector, stopInspector, elementPicked,
pickerCancelled, applyStyle, toggleClass, injectCSS, resetAll, inspectResult).
Messages were silently rejected, which broke the inspector on ALL pages.

Also: separate executeScript and insertCSS into individual try blocks in
injectInspector(), store inspectorMode for routing, and add content.js
fallback when script injection fails (CSP, chrome:// pages).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
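
The allowlist bug described above is easy to reproduce in miniature. In this sketch the nine inspector type names come from the commit message, while `buildAllowedTypes`, `isAllowedMessage`, and the caller-supplied list of previously allowed types are hypothetical; the real ALLOWED_TYPES in background.js may be structured differently.

```typescript
// Inspector message types named in the commit message.
const INSPECTOR_MESSAGE_TYPES = [
  "startInspector",
  "stopInspector",
  "elementPicked",
  "pickerCancelled",
  "applyStyle",
  "toggleClass",
  "injectCSS",
  "resetAll",
  "inspectResult",
] as const;

// Merge the types that were already allowed (supplied by the caller,
// since they are not listed in the source) with the inspector types.
function buildAllowedTypes(existingTypes: readonly string[]): Set<string> {
  return new Set<string>([...existingTypes, ...INSPECTOR_MESSAGE_TYPES]);
}

// Reject anything that is not an object with an allowlisted string `type`.
function isAllowedMessage(allowed: Set<string>, message: unknown): boolean {
  if (typeof message !== "object" || message === null) return false;
  const type = (message as { type?: unknown }).type;
  return typeof type === "string" && allowed.has(type);
}
```

A set built without the inspector types rejects `{ type: "elementPicked" }` without any error, which matches the silent failure the commit describes.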
When inspector.js can't be injected (CSP, chrome:// pages), content.js
provides a basic picker using getComputedStyle + CSSOM:
- startBasicPicker/stopBasicPicker message handlers
- captureBasicData() with ~30 key CSS properties, box model, matched rules
- Hover highlight with outline save/restore (never leaves artifacts)
- Click uses e.target directly (no re-querying by selector)
- Sends inspectResult with mode:'basic' for sidebar rendering
- Escape key cancels picker and restores outlines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
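
The outline save/restore behaviour mentioned above (hover highlight that never leaves artifacts) might be implemented along these lines. This is an illustrative sketch: `HoverHighlight` and `Outlinable` are hypothetical names, the highlight color is made up, and a plain object with a `style.outline` string stands in for a DOM element so the example runs outside a browser.

```typescript
// Minimal stand-in for an element with an inline outline style.
interface Outlinable {
  style: { outline: string };
}

class HoverHighlight {
  private current: Outlinable | null = null;
  private savedOutline = "";

  // Highlight a new element, restoring the previous one first.
  highlight(el: Outlinable): void {
    if (el === this.current) return;
    this.clear();
    this.current = el;
    this.savedOutline = el.style.outline; // remember the page's own outline
    el.style.outline = "2px solid #3b82f6"; // hypothetical highlight color
  }

  // Put back whatever outline the element had before it was highlighted.
  clear(): void {
    if (this.current === null) return;
    this.current.style.outline = this.savedOutline;
    this.current = null;
  }
}
```

An Escape-key handler would then only need to call `clear()` to cancel the picker and restore the page.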
Two action buttons in the inspector toolbar:
- Cleanup (🧹): POSTs cleanup --all to server, shows spinner, chat
  notification on success, resets inspector state (element may be removed)
- Screenshot (📸): POSTs screenshot to server, shows spinner, chat
  notification with saved file path

Shared infrastructure:
- .inspector-action-btn CSS with loading spinner via ::after pseudo-element
- chat-notification type in addChatEntry() for system messages
- package.json version bump to 0.13.9.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
16 new tests in sidebar-ux.test.ts:
- Inspector message allowlist includes all inspector types
- content.js basic picker (startBasicPicker, captureBasicData, CSSOM,
  outline save/restore, inspectResult with mode basic, Escape cleanup)
- background.js CSP fallback (separate try blocks, inspectorMode, fallback)
- Cleanup button (POST /command, inspector reset after success)
- Screenshot button (POST /command, notification rendering)
- Chat notification type and CSS styles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Quick actions toolbar (🧹 Cleanup, 📸 Screenshot) now appears above the chat
input, always visible. Both inspector and chat buttons share runCleanup() and
runScreenshot() helper functions. Clicking either set shows loading state on
both simultaneously.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests that chat toolbar exists (chat-cleanup-btn, chat-screenshot-btn,
quick-actions container), CSS styles (.quick-action-btn, .quick-action-btn.loading),
shared runCleanup/runScreenshot helper functions, and cleanup inspector reset.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…emoval

Massively expanded CLEANUP_SELECTORS with patterns from uBlock Origin and
Readability.js research:
- ads: 30+ selectors (Google, Amazon, Outbrain, Taboola, Criteo, etc.)
- cookies: OneTrust, Cookiebot, TrustArc, Quantcast + generic patterns
- overlays (NEW): paywalls, newsletter popups, interstitials, push prompts,
  app download banners, survey modals
- social: follow prompts, share tools
- Cleanup now defaults to --all when no args (sidebar button fix)
- Uses !important on all display:none (overrides inline styles)
- Unlocks body/html scroll (overflow:hidden from modal lockout)
- Removes blur/filter effects (paywall content blur)
- Removes max-height truncation (article teaser truncation)
- Collapses empty ad placeholder whitespace (empty divs after ad removal)
- Skips gstack-ctrl indicator in sticky removal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- setActionButtonsEnabled() toggles .disabled class on all cleanup/screenshot
  buttons (both chat toolbar and inspector toolbar)
- Called with false in updateConnection when server URL is null
- Called with true when connection established
- runCleanup/runScreenshot silently return when disconnected instead of
  showing 'Not connected' error notifications
- CSS .disabled style: pointer-events:none, opacity:0.3, cursor:not-allowed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
17 new tests:
- cleanup defaults to --all on empty args
- CLEANUP_SELECTORS overlays category (paywall, newsletter, interstitial)
- Major ad networks in selectors (doubleclick, taboola, criteo, etc.)
- Major consent frameworks (OneTrust, Cookiebot, TrustArc, Quantcast)
- !important override for inline styles
- Scroll unlock (body overflow:hidden)
- Blur removal (paywall content blur)
- Article truncation removal (max-height)
- Empty placeholder collapse
- gstack-ctrl indicator skip in sticky cleanup
- setActionButtonsEnabled function
- Buttons disabled when disconnected
- No error spam from cleanup/screenshot when disconnected
- CSS disabled styles for action buttons

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of brittle CSS selectors, the cleanup button now sends a prompt to
the sidebar agent (which IS an LLM). The agent:
1. Runs deterministic $B cleanup --all as a quick first pass
2. Takes a snapshot to see what's left
3. Analyzes the page semantically to identify remaining clutter
4. Removes elements intelligently, preserving site branding

This means cleanup works correctly on any site without site-specific selectors.
The LLM understands that "Your Daily Puzzles" is clutter, "ADVERTISEMENT" is
junk, but the SF Chronicle masthead should stay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Deterministic cleanup improvements (used as first pass before LLM analysis):
- New 'clutter' category: audio players, podcast widgets, sidebar puzzles/games,
  recirculation widgets (taboola, outbrain, nativo), cross-promotion banners
- Text-content detection: removes "ADVERTISEMENT", "Article continues below",
  "Sponsored", "Paid content" labels and their parent wrappers
- Sticky fix: preserves the topmost full-width element near viewport top (site
  nav bar) instead of hiding all sticky/fixed elements. Sorts by vertical
  position, preserves the first one that spans >80% viewport width.

Tests: clutter category, ad label removal, nav bar preservation logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
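
The sticky-element rule in the commit above (sort by vertical position, preserve the first element spanning more than 80% of the viewport width) can be expressed as a pure function. This is a sketch under assumptions: `StickyBox`, `splitStickyElements`, and the `nearTopPx` threshold of 50 are hypothetical; the source says only "near viewport top" and gives no pixel value.

```typescript
// Simplified geometry for one sticky/fixed element.
interface StickyBox {
  label: string;
  top: number; // distance from viewport top, in px
  width: number; // rendered width, in px
}

// Decide which sticky element to keep (the likely site nav bar)
// and which ones to hide.
function splitStickyElements(
  boxes: StickyBox[],
  viewportWidth: number,
  nearTopPx = 50, // assumed threshold, not taken from the source
): { keep: StickyBox | undefined; hide: StickyBox[] } {
  const byPosition = boxes.slice().sort((a, b) => a.top - b.top);
  const keep = byPosition.find(
    (box) => box.top <= nearTopPx && box.width > 0.8 * viewportWidth,
  );
  return { keep, hide: byPosition.filter((box) => box !== keep) };
}
```

Given a full-width nav at the top, a small chat widget, and a full-width cookie bar at the bottom, only the nav is kept.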
…y nav

22 new tests covering:
- Cleanup button uses /sidebar-command (agent) not /command (deterministic)
- Cleanup prompt includes deterministic first pass + agent snapshot analysis
- Cleanup prompt lists specific clutter categories for agent guidance
- Cleanup prompt preserves site identity (masthead, headline, body, byline)
- Cleanup prompt instructs scroll unlock and $B eval removal
- Loading state management (async agent, setTimeout)
- Deterministic clutter: audio/podcast, games/puzzles, recirculation
- Ad label text patterns (ADVERTISEMENT, Sponsored, Article continues)
- Ad label parent wrapper hiding for small containers
- Sticky nav preservation (sort by position, first full-width near top)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CHANGELOG: keep both main's entries (0.13.10.0 Office Hours, 0.13.9.0
Composable Skills) and our sidebar inspector entry, re-versioned to
0.13.11.0 since our features land after main's 0.13.10.0.

VERSION: 0.13.9.0 (ours) vs 0.13.10.0 (main) → 0.13.11.0 (combined).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: server persists chat to disk (chat.jsonl) and replays on restart.
Client had no dedup, so every reconnect re-rendered the entire history.
Messages from an old HN session would repeat endlessly on the SF Chronicle tab.

Fix: renderedEntryIds Set tracks which entry IDs have been rendered. addChatEntry
skips entries already in the set. Entries without an id (local notifications)
bypass the check. Clear chat resets the set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
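
The dedup fix above reduces to a Set check in front of the render step. A minimal sketch follows; the `renderedEntryIds` and `addChatEntry` names come from the commit message, while the `ChatView` class, the `ChatEntry` shape, the `clearChat` name, and the use of a string array in place of DOM rendering are assumptions.

```typescript
interface ChatEntry {
  id?: string | number; // local notifications have no id
  text: string;
}

class ChatView {
  private renderedEntryIds = new Set<string | number>();
  readonly rendered: string[] = []; // stands in for DOM output

  // Render an entry unless an entry with the same id was already rendered.
  addChatEntry(entry: ChatEntry): boolean {
    if (entry.id !== undefined) {
      if (this.renderedEntryIds.has(entry.id)) return false;
      this.renderedEntryIds.add(entry.id);
    }
    this.rendered.push(entry.text);
    return true;
  }

  // Clearing the chat also forgets which ids were rendered.
  clearChat(): void {
    this.renderedEntryIds.clear();
    this.rendered.length = 0;
  }
}
```

Replaying the same history after a reconnect then adds nothing, while id-less notifications still render every time.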
…ion safety

Three fixes for sidebar agent UX:
- System prompt: "Be CONCISE. STOP as soon as the task is done. Do NOT keep
  exploring or doing bonus work." Prevents agent from endlessly taking
  screenshots and highlighting elements after answering the question.
- switchTab(id, opts): new bringToFront option. Internal tab pinning
  (BROWSE_TAB) uses bringToFront: false so agent commands never steal
  window focus from the user's active app.
- Keep opus model (not sonnet) for prompt injection resistance on untrusted
  web pages. Remove Write from allowedTools (agent only needs Bash for $B).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
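
The opt-out `bringToFront` option described above could be shaped like this. It is a sketch only: `TabSwitcher` and `FrontablePage` are hypothetical, the real `switchTab` signature may differ, and a synchronous `bringToFront()` stub stands in for the asynchronous Playwright method.

```typescript
// Stand-in for the one page method this sketch needs.
interface FrontablePage {
  bringToFront(): void;
}

class TabSwitcher {
  activeTabId: number | null = null;

  constructor(private pages: Map<number, FrontablePage>) {}

  // bringToFront defaults to true; internal tab pinning passes false
  // so agent commands do not steal window focus.
  switchTab(id: number, opts: { bringToFront?: boolean } = {}): void {
    const page = this.pages.get(id);
    if (!page) throw new Error(`Tab ${id} not found`);
    this.activeTabId = id;
    if (opts.bringToFront !== false) page.bringToFront();
  }
}
```

A user-initiated switch calls `switchTab(id)`, while the pinning path would call `switchTab(id, { bringToFront: false })`.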
Tests for the three UX fixes:
- System prompt contains STOP/CONCISE/Do NOT keep exploring
- sidebar agent uses opus (not sonnet) for prompt injection resistance
- switchTab has bringToFront option, defaults to true (opt-out)
- handleCommand tab pinning uses bringToFront: false (no focus steal)
- Updated stale tests: switchTab signature, allowedTools excludes Write,
  narration -> conciseness, tab pinning restore calls

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New E2E test (periodic tier, ~$2/run) that exercises the full sidebar
agent pipeline with CSS interaction:
1. Agent navigates to Hacker News
2. Clicks into the top story's comments
3. Reads comments and identifies the most insightful one
4. Highlights it with a 4px solid orange outline via style injection

Tests: navigation, snapshot, text reading, LLM judgment, CSS modification.
Requires real browser + real Claude (ANTHROPIC_API_KEY).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
garrytan and others added 2 commits March 30, 2026 00:58
…14.0.0

Conflict cause: main merged v0.14.0.0 (design-to-code / design-html skill)
while this branch had v0.13.11.0 (sidebar CSS inspector). Both touched
VERSION, package.json, and CHANGELOG.md.

Resolution:
- VERSION/package.json: bump to 0.14.1.0 (our branch lands after main's 0.14.0.0)
- CHANGELOG: our 0.14.1.0 entry on top, main's 0.14.0.0 entry preserved below
- Version sequence verified contiguous: 0.14.1.0 → 0.14.0.0 → 0.13.10.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause of test failure: BROWSE_IDLE_TIMEOUT is in milliseconds, not
seconds. '600' = 0.6 seconds, server died immediately after health check.
Fixed to '600000' (10 minutes).

Also: use 'pipe' stdio instead of file descriptors (closing fds kills child
on macOS/bun), catch ConnectionRefused on poll retry, 4 min poll timeout
for the multi-step opus task.

Test passes: agent navigates to HN, reads comments, identifies most
insightful one, highlights it with orange CSS, stops. 114s, $0.00.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
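
The unit mix-up behind the timeout fix above is the kind of bug a small conversion helper can guard against. This is an illustrative sketch; `minutesToMs` and `buildIdleTimeoutEnv` are hypothetical helpers, and only the BROWSE_IDLE_TIMEOUT variable name and its millisecond unit come from the commit message.

```typescript
// Convert minutes to milliseconds so the unit is explicit at the call site.
function minutesToMs(minutes: number): number {
  return minutes * 60 * 1000;
}

// Build the env entry for the test server; the value must be a string.
function buildIdleTimeoutEnv(minutes: number): { BROWSE_IDLE_TIMEOUT: string } {
  return { BROWSE_IDLE_TIMEOUT: String(minutesToMs(minutes)) };
}

// The buggy value: "600" read as milliseconds is well under a second.
const buggySeconds = Number("600") / 1000;

// The fixed value: ten minutes expressed in milliseconds.
const fixedEnv = buildIdleTimeoutEnv(10);
```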