feat(browser): interactive browser viewing UI with CDP screencast #531
Add a browser tab to the settings page that shows active browser sessions and allows viewing them via CDP screencast. Users can log in to websites manually and export cookies for agent automation.
Backend:
- Add screencast module (ScreencastRegistry) for CDP frame relay
- Add mouse/keyboard input, cookie export/import to BrowserManager
- Add browser.screencast.frame protocol event
- Add /api/browser/action and /api/browser/sessions endpoints
- Wire screencast frame broadcasting to WebSocket clients
Frontend:
- Add page-browser.js with canvas-based screencast viewer
- Session list with view, export cookies, close actions
- Mouse/keyboard input relay to browser viewport
- Navigation bar for URL entry
- Register as "Browser" tab in settings page
https://claude.ai/code/session_015M64wR6GhhnyiEFodhAuAH
Merging this PR will improve performance by 21.75%
Greptile Summary
This PR introduces real-time browser session viewing and interactive control to the web UI. A new CDP screencast pipeline streams JPEG frames from Chromium via broadcast channels (
Confidence Score: 4/5
Safe to merge with caution: the three P1 findings affect screencast reliability but don't risk data loss or security; most core functionality works correctly.
Three P1 findings exist: the relay task can permanently stop if a frame arrives before the subscriber connects, duplicate relay tasks are spawned on repeated StartScreencast calls, and dead code in list_sessions performs unnecessary async work. None cause crashes or data corruption, but the screencast feature may silently stop delivering frames in some conditions and will deliver duplicate frames on reconnect.
crates/browser/src/screencast.rs (relay lifetime), crates/web/src/api.rs (relay dedup), crates/browser/src/manager.rs (dead code in list_sessions)
Important Files Changed
Sequence Diagram
sequenceDiagram
participant UI as Browser UI (page-browser.js)
participant API as Web API (api.rs)
participant GW as Gateway (services.rs)
participant REG as ScreencastRegistry (screencast.rs)
participant CDP as Chromium CDP
UI->>API: POST /api/browser/action {action: start_screencast}
API->>GW: browser.request(body)
GW->>REG: screencasts.start(session_id, page, ...)
REG->>CDP: Page.startScreencast
REG-->>GW: broadcast::Receiver (_rx, dropped immediately)
GW-->>API: BrowserResponse::success
API->>GW: subscribe_screencast(session_id)
GW->>REG: registry.subscribe(session_id)
REG-->>GW: raw broadcast::Receiver
GW->>GW: spawn adapter task (ScreencastFrame → Value)
GW-->>API: Value broadcast::Receiver
API->>API: spawn relay task (Value → WebSocket broadcast)
loop CDP screencast frames
CDP->>REG: EventScreencastFrame
REG->>REG: relay task: ack + tx.send(frame)
REG-->>GW: adapter task receives frame
GW-->>API: relay task receives Value
API-->>UI: WS event browser.screencast.frame
end
UI->>API: POST /api/browser/action {action: stop_screencast}
API->>GW: browser.request(body)
GW->>REG: screencasts.stop(session_id, page)
REG->>CDP: Page.stopScreencast
REG->>REG: remove from registry, abort relay task
Reviews (1): Last reviewed commit: "feat(browser): add browser viewing UI wi..."
crates/browser/src/screencast.rs
Outdated
if tx.send(frame).is_err() {
    debug!(session_id = %session_id, "no screencast subscribers, stopping relay");
    break;
Relay task permanently exits when no subscribers, silently breaking future subscriptions
When tx.send(frame).is_err(), the relay task exits (line 239). This becomes a problem because of how the initial receiver is handled in start_screencast (manager.rs:799–802): the returned _rx is discarded immediately when that function returns. The relay task then becomes the sole holder of the tx side. If even a single CDP frame arrives between _rx being dropped and spawn_screencast_relay subscribing (api.rs:1141-1142), the relay task finds no receivers, exits, and is gone permanently.
After this point, any call to subscribe_screencast() → registry.subscribe() will successfully hand out a new receiver from handle.tx.subscribe(), but since the relay task is dead no more frames will ever be sent into that channel. From the caller's perspective the screencast appears active but produces no frames.
A simple fix is to keep the relay running even when there are temporarily no receivers:
if tx.send(frame).is_err() {
    debug!(session_id = %session_id, "no screencast subscribers, frame dropped");
    // do NOT break: a subscriber may connect shortly
}
Alternatively, hold a permanent "keep-alive" receiver inside ActiveScreencast.
if is_start_screencast {
    if let Some(sid) = session_id {
        spawn_screencast_relay(state, &sid).await;
    }
Multiple relay tasks spawned per session causes duplicate frame delivery
Every successful StartScreencast request unconditionally calls spawn_screencast_relay, which in turn calls subscribe_screencast (services.rs:1398-1429). subscribe_screencast always spawns a brand-new adapter task and returns a new Value receiver. spawn_screencast_relay then always spawns an additional broadcast relay task.
There is no guard to check whether a relay is already running for the given session_id. Calling StartScreencast N times for the same session creates N adapter tasks and N relay tasks — every CDP frame gets forwarded N times to all WebSocket clients.
In the UI, startScreencast is called from the "View" button with no dedup, so a user double-clicking or page-refreshing will trigger duplicate relays.
Consider tracking active relay sessions server-side, e.g. via a DashSet<String> in AppState, and skipping spawn_screencast_relay if an entry already exists.
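The server-side guard suggested above could look roughly like the following. This is a minimal sketch using only the standard library: `RelayGuard`, `try_claim`, and `release` are hypothetical names, and a `Mutex<HashSet<String>>` stands in for the `DashSet<String>` mentioned in the review. Because the check and the insert happen under one lock, two concurrent callers cannot both claim the same session.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical guard recording which sessions already have a relay task.
/// (Illustrative only; not the project's actual AppState field.)
#[derive(Default)]
pub struct RelayGuard {
    active: Mutex<HashSet<String>>,
}

impl RelayGuard {
    /// Returns true only for the first caller per session. That caller is
    /// the one that should go on to spawn the relay task.
    pub fn try_claim(&self, session_id: &str) -> bool {
        // HashSet::insert returns false when the value was already present,
        // so check-and-insert is a single atomic step under the lock.
        self.active.lock().unwrap().insert(session_id.to_string())
    }

    /// Releases the claim, for example when the screencast is stopped.
    /// Returns true if a claim existed.
    pub fn release(&self, session_id: &str) -> bool {
        self.active.lock().unwrap().remove(session_id)
    }
}
```

With a guard like this, the StartScreencast handler would call `try_claim` first and skip `spawn_screencast_relay` when it returns false.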
pub async fn list_sessions(&self) -> Vec<crate::pool::BrowserSessionInfo> {
    let mut sessions = self.pool.list_sessions().await;
    // Annotate which sessions have active screencasts.
    let screencast_sessions = self.screencasts.active_sessions().await;
    for session in &mut sessions {
        // The BrowserSessionInfo doesn't have a screencast field yet,
        // but clients can check the screencast endpoint separately.
        let _ = screencast_sessions.contains(&session.session_id);
    }
    sessions
}
Dead code: screencast_sessions.contains(...) result is silently discarded
The loop body computes whether a session is screencasting but throws the result away with let _ = .... The BrowserSessionInfo struct has no screencasting field, so the information never reaches callers.
The enrichment is correctly performed independently in services.rs, so the actual API response is accurate — but the code inside this loop is completely useless: it performs an async allocation (active_sessions() returning a Vec<String>) and O(n) lookup per session for no effect.
Either add screencasting: bool to BrowserSessionInfo and populate it here (removing the duplicate logic in services.rs), or remove this loop entirely:
pub async fn list_sessions(&self) -> Vec<crate::pool::BrowserSessionInfo> {
    self.pool.list_sessions().await
}

// If already active, return a new subscriber.
{
    let active = self.active.read().await;
    if let Some(handle) = active.get(session_id) {
        debug!(session_id, "screencast already active, adding subscriber");
        return Ok(handle.tx.subscribe());
    }
}

// Start CDP screencast via the builder pattern.
let params = StartScreencastParams {
    format: Some(StartScreencastFormat::Jpeg),
    quality: Some(i64::from(quality.min(100))),
    max_width: Some(i64::from(max_width)),
    max_height: Some(i64::from(max_height)),
    every_nth_frame: Some(1),
};

page.execute(params)
    .await
    .map_err(|e| crate::error::Error::Cdp(format!("failed to start screencast: {e}")))?;

let (tx, rx) = broadcast::channel(FRAME_CHANNEL_CAPACITY);

// Spawn background task to relay CDP screencast frame events.
let tx_clone = tx.clone();
let sid = session_id.to_string();
let page_clone = page.clone();

let task = tokio::spawn(async move {
    relay_screencast_frames(page_clone, tx_clone, sid).await;
});

let inner = Arc::new(ActiveScreencast {
    tx: tx.clone(),
    abort: task.abort_handle(),
});

self.active
    .write()
    .await
    .insert(session_id.to_string(), inner);

debug!(session_id, "screencast started");
Ok(rx)
TOCTOU race: two concurrent start() calls for the same session can both spawn relay tasks
The existence check (line 107, read lock) and the insert (line 143, write lock) are not atomic. Two concurrent callers for the same session_id can both pass the read-lock check, both send the CDP StartScreencast command, and both spawn a relay task. The second insert overwrites the first Arc<ActiveScreencast>, dropping it and aborting the first relay task, while two CDP start commands have been sent.
Fix by holding the write lock for the entire operation:
let mut active = self.active.write().await;
if let Some(handle) = active.get(session_id) {
    return Ok(handle.tx.subscribe());
}
// … execute CDP command, spawn task …
active.insert(session_id.to_string(), inner);

function onMouse(e) {
    relayMouseEvent(e, canvas);
}
canvas.addEventListener("mousedown", onMouse);
canvas.addEventListener("mouseup", onMouse);
canvas.addEventListener("mousemove", onMouse);

// Keyboard: focus the canvas to receive key events
canvas.setAttribute("tabindex", "0");
canvas.addEventListener("keydown", relayKeyEvent);
canvas.addEventListener("keyup", relayKeyEvent);

return () => {
    canvas.removeEventListener("mousedown", onMouse);
    canvas.removeEventListener("mouseup", onMouse);
    canvas.removeEventListener("mousemove", onMouse);
    canvas.removeEventListener("keydown", relayKeyEvent);
Scroll/wheel events not relayed: MouseInputType::Wheel is unreachable from the UI
The canvas event listeners cover mousedown, mouseup, and mousemove, but there is no wheel listener. The backend has full support for MouseInputType::Wheel mapped to CDP MouseWheel, but the frontend never generates it. Users cannot scroll pages within the browser viewer.
canvas.addEventListener("mousedown", onMouse);
canvas.addEventListener("mouseup", onMouse);
canvas.addEventListener("mousemove", onMouse);
canvas.addEventListener("wheel", onMouse, { passive: false });
Also add a wheel case in relayMouseEvent and pass e.deltaX/e.deltaY in the action payload, mapping them to DispatchMouseEventParams on the Rust side.
Three fixes for the browser viewing UI:
1. Strip explicit null values from LLM tool call params before deserialization: serde(default) only handles missing keys, not null values, causing "invalid type: null, expected u64" errors.
2. Share a single BrowserManager between BrowserTool and RealBrowserService via Arc<OnceCell>. Previously each had its own manager, so sessions created by agents were invisible to the UI.
3. Add "New Session" button to the browser page so users can create sessions directly from the UI (navigate to about:blank + auto-start screencast). Agents share the same cookie profile.
Entire-Checkpoint: 6719842d0d2c
Use provider-btn-sm and provider-btn-secondary classes consistent with the MCP and Skills pages. Entire-Checkpoint: 769d0e7ad9b2
…on styles
- Allow about:blank in URL validation so "New Session" can create an empty browser session without navigating to an external URL.
- Use provider-btn / provider-btn-sm classes on all session card buttons to match the style used on MCP, Skills, and other pages.
Entire-Checkpoint: 0402a72e2760
Entire-Checkpoint: 4100596d19db
- Don't auto-start screencast on about:blank (no frames generated)
- Show navigate bar immediately so user can enter a URL
- Auto-start screencast after navigating to a real page
- Show "Enter a URL above" hint in the canvas placeholder
Entire-Checkpoint: 17379e8abfdf
The service's request() now injects sandbox=true when the field is absent, matching the behavior of the BrowserTool which reads from the sandbox router. Without this, clicking "New Session" in the UI would always launch a host browser even when sandbox is enabled. Entire-Checkpoint: af07d8ac7130
- Auto-prepend https:// for bare domains (e.g. "lemonde.fr")
- Fall back to Google search for non-URL queries
- Show Google suggestions, session URL history, and direct URL matches in a dropdown with keyboard navigation
- Check res.success on navigate responses to surface errors
- Match input and Go button heights using provider-btn-sm sizing
Entire-Checkpoint: eb57efdb4ba2
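The address-bar normalization described in this commit lives in the frontend, but its decision logic can be sketched in Rust for illustration. Everything here is an assumption rather than the PR's code: the function names, the "contains a dot and no whitespace" heuristic for bare domains, and the exact Google search URL format.

```rust
/// Percent-encodes everything except RFC 3986 unreserved characters.
fn percent_encode(s: &str) -> String {
    let mut out = String::new();
    for b in s.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(b as char)
            }
            _ => out.push_str(&format!("%{b:02X}")),
        }
    }
    out
}

/// Hypothetical normalizer: keeps full URLs and about:blank untouched,
/// prepends https:// to bare domains, and turns anything else into a search.
pub fn normalize_address_input(input: &str) -> String {
    let s = input.trim();
    let has_scheme = s.starts_with("http://") || s.starts_with("https://");
    if has_scheme || s == "about:blank" {
        return s.to_string();
    }
    // Assumed heuristic: a bare domain has a dot and no whitespace.
    let looks_like_domain =
        !s.is_empty() && s.contains('.') && !s.contains(char::is_whitespace);
    if looks_like_domain {
        format!("https://{s}")
    } else {
        format!("https://www.google.com/search?q={}", percent_encode(s))
    }
}
```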
browserAction() now checks res.success on every response and throws on failure. Previously errors returned as 200 with success=false were silently swallowed (e.g. invalid URLs, connection failures). Entire-Checkpoint: 76e24ac2897a
The relay task exited immediately when tx.send() had no receivers, which races with the UI subscribing shortly after. CDP sends the first frame before spawn_screencast_relay can subscribe, causing the relay to exit and all subsequent frames to be lost. The relay now drops frames when no one is listening instead of exiting. The task is still properly cleaned up via its abort_handle when the screencast is stopped. Entire-Checkpoint: 28b9900be3c1
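The behavioral change described in this commit can be modeled without an async runtime. The sketch below is a simplified, synchronous stand-in: `relay_frames` is a hypothetical function, and the `send` closure only imitates a broadcast sender by returning `Err(())` when nobody is subscribed. The point is that a failed send counts a dropped frame and the loop continues, instead of breaking.

```rust
/// Relays frames through `send`, which models a broadcast sender:
/// Ok(n) means n receivers got the frame, Err(()) means nobody is listening.
/// Returns (delivered, dropped) frame counts.
pub fn relay_frames<I, F>(frames: I, mut send: F) -> (usize, usize)
where
    I: IntoIterator<Item = Vec<u8>>,
    F: FnMut(Vec<u8>) -> Result<usize, ()>,
{
    let mut delivered = 0;
    let mut dropped = 0;
    for frame in frames {
        match send(frame) {
            Ok(_) => delivered += 1,
            // No subscribers right now: drop this frame but keep relaying,
            // because a subscriber may attach before the next frame.
            Err(()) => dropped += 1,
        }
    }
    (delivered, dropped)
}
```

In the real relay the loop would end when the task is aborted through its abort handle, as the commit message notes, rather than when the frame source is exhausted.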
- Add integration test that verifies CDP frames arrive through the ScreencastRegistry broadcast channel (passes in host mode).
- Add info/warn tracing at every stage of the screencast relay chain: CDP frame listener, service subscribe, API relay spawn. This will surface exactly where frames get stuck.
Entire-Checkpoint: fcdec9d1c2d8
The v4 protocol requires explicit event subscription. The browser.screencast.frame event was missing from the subscription list, so all frames were silently dropped by the broadcast filter. Entire-Checkpoint: 42a252e4a873
Rust integration tests (crates/browser):
- screencast_host: verifies CDP frames arrive in host mode
- screencast_sandbox: verifies CDP frames arrive in container mode (auto-skips when no container runtime available)
Playwright e2e tests (crates/web/ui/e2e/specs/browser.spec.js):
- Browser page renders with heading and buttons
- Empty state message shown when no sessions
- New session button shows creating state
- Navigate bar appears after session creation
- Bare domains auto-normalized with https://
- Screencast delivers frames after navigation (canvas visible)
- Session can be closed
- Sandbox badge rendered when sandbox enabled
Smoke test: added /settings/browser route.
Entire-Checkpoint: 404370604901
…on fixes
Interaction improvements:
- Add mouse wheel/scroll relay via CDP mouseWheel events with delta_x/delta_y support
- Prevent default on mousedown (no text selection/drag on canvas)
- Block context menu on canvas
- Auto-focus canvas on mousedown for keyboard capture
Session list UX:
- Clicking a session card selects it and auto-starts screencast
- Active session highlighted with accent border
- Removed redundant View/Stop Viewing buttons; card click handles it
- Buttons (Export Cookies, Close) use stopPropagation to not trigger card selection
Entire-Checkpoint: 872e5b20d786
When switching sessions, the canvas was blank until the first screencast frame arrived via WebSocket. Now fetches an immediate screenshot via REST API when selecting a session, displaying it on the canvas while the screencast stream connects. Also tracks frame MIME type (PNG for screenshots, JPEG for screencast frames) for correct canvas rendering. Entire-Checkpoint: 4cf5c9dfbb6b
- Show "Fetching browser view..." while taking screenshot on session switch instead of a blank canvas
- Show "paused" badge on sessions that have a URL but aren't being actively viewed; clicking them makes them live
- Clear frame data when switching sessions to avoid showing stale content from the previous session
Entire-Checkpoint: b8875017d863
…ify cards
- Prefetch screenshots for all sessions in the background after fetching the session list, so switching is instant from cache
- Clear frame data, screencast state, and stop previous screencast when creating a new session (no stale canvas from previous session)
- Reset URL bar to the selected session's URL when switching
- Remove Export Cookies button, keep only a small Close link
- Cache screenshots on first fetch to avoid re-fetching
Entire-Checkpoint: 3da290c6da8b
When clicking "New Session", a placeholder card with "creating" badge and "Starting browser..." appears immediately in the session list, selected and highlighted. Once the backend finishes, the placeholder is replaced with the real session. On failure, the placeholder is removed. Entire-Checkpoint: 325a71fa44b7
The canvas is conditionally rendered — placeholders show when there's no frame data. The useEffect with [] deps ran once on mount when the canvas didn't exist yet, so mouse/keyboard/scroll listeners were never attached. Switched to a ref callback that fires whenever the canvas DOM element appears or disappears, properly attaching and cleaning up event listeners. Clicks, scrolling, and keyboard input now work. Entire-Checkpoint: a2823835c7b2
Wheel events fire at 60fps and each one was creating a separate HTTP request, overwhelming the server and causing scroll to feel broken.
- Batch wheel deltas into a single request every 50ms
- Throttle mousemove to one event per 50ms (was flooding at 60fps)
- Click (mousedown/mouseup) events remain unthrottled for accuracy
Entire-Checkpoint: 3ce2f51400c7
Major fixes:
- Session switching no longer resets activeSession to null (was causing blank canvas and lost state)
- Frame listener stays active across session switches (no gap where frames are dropped)
- fetchSessions preserves placeholder "creating" entries
- Removed noisy start/stop screencast toasts during switches
- Screenshot fetch errors trigger session list refresh (handles dead sessions gracefully)
- Guard against stale async results when user switches rapidly
- Canvas auto-focuses when it appears for immediate keyboard input
- Click coordinates use offsetX/offsetY for accurate mapping (fixes clicks landing too high due to border offset)
- Invalidate screenshot cache when navigating to new URL
Entire-Checkpoint: 5d1ab2347def
Root cause of broken scrolling: CDP rejects mouseWheel events when deltaX or deltaY are missing. The code conditionally set them only when non-zero, so vertical-only scrolls omitted deltaX and failed with "deltaX and deltaY are expected for mouseWheel event". Fix: always set both deltas for mouseWheel events.
Click accuracy fixes:
- Add offset_top from screencast metadata to y-coordinate (accounts for browser chrome/infobars)
- Use dynamic aspect-ratio from frame metadata instead of hardcoded 16/10 (viewport is 16:9, mismatch caused y-axis distortion)
New integration tests:
- click_dispatches: verifies mousePressed+mouseReleased succeed
- scroll_dispatches: verifies mouseWheel with deltas succeeds
- screencast_metadata_valid: captures actual viewport dimensions for coordinate mapping validation
Tests use ephemeral profiles to avoid SingletonLock conflicts.
Entire-Checkpoint: f818e6d815b2
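The mouseWheel fix and the offset_top adjustment described above can be illustrated with a small parameter builder. This is a sketch under stated assumptions: `MouseKind`, `MouseParams`, and `build_mouse_params` are hypothetical stand-ins, not the CDP crate's `DispatchMouseEventParams` or the PR's actual types.

```rust
/// Hypothetical event kinds mirroring the CDP mouse event types.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MouseKind {
    Pressed,
    Released,
    Moved,
    Wheel,
}

/// Hypothetical stand-in for the parameters sent to CDP.
#[derive(Debug, PartialEq)]
pub struct MouseParams {
    pub kind: MouseKind,
    pub x: f64,
    pub y: f64,
    pub delta_x: Option<f64>,
    pub delta_y: Option<f64>,
}

/// Builds parameters for one mouse event. For wheel events both deltas are
/// always present (missing ones default to 0.0); other events carry none.
pub fn build_mouse_params(
    kind: MouseKind,
    x: f64,
    y: f64,
    offset_top: f64,
    delta_x: Option<f64>,
    delta_y: Option<f64>,
) -> MouseParams {
    let (dx, dy) = match kind {
        MouseKind::Wheel => (Some(delta_x.unwrap_or(0.0)), Some(delta_y.unwrap_or(0.0))),
        _ => (None, None),
    };
    // offset_top comes from screencast metadata and shifts the y-coordinate.
    MouseParams { kind, x, y: y + offset_top, delta_x: dx, delta_y: dy }
}
```

A vertical-only scroll therefore still sends an explicit zero for the horizontal delta, which is the case the original code omitted.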
Root cause of captcha/content appearing tiny and off-center: viewport was 2560x1440@2x (5K physical pixels). Websites laid out for that resolution, then screencast squished everything down. Changed default viewport to 1440x900@1x, a standard laptop resolution. Content renders at a normal size and fills the canvas.
Other improvements:
- rAF-gated canvas rendering: frames are drawn at display refresh rate instead of on every signal change, avoiding wasted draws
- Screenshot coordinate mapping uses actual image natural dimensions (img.naturalWidth/Height) instead of hardcoded assumptions
- Canvas aspect-ratio follows actual frame dimensions dynamically instead of hardcoded 16/10
Entire-Checkpoint: 992c2ba77b21
The session ID was truncated to 12 chars + "..." but the card already has CSS truncate class which handles overflow naturally. Entire-Checkpoint: 7ba05c18a610
The URL bar now tracks the remote browser's current URL:
- Switching sessions instantly shows that session's URL
- Clicking links in the remote browser updates the URL bar (polled via CDP get_url every 2 seconds)
- Navigating via the URL bar updates it after page loads
- While typing, the live URL is paused; pressing Escape or clicking away reverts to the live URL
This makes the browser viewer feel like a real browser tab: the URL bar always reflects what page you're looking at.
Entire-Checkpoint: 602407a16bde
Sessions were dying from a single CDP error (e.g. Chrome briefly busy during mouse event) and then flooding logs with cleanup warnings from every queued event hitting the dead session.
Fixes:
- Idle timeout: 5 min → 30 min (interactive browsing needs more time)
- Hard TTL: 30 min → 2 hours (matches container TIMEOUT via browserless_session_timeout_ms)
- cleanup_stale_session checks has_session() before closing, so only the first event triggers cleanup; subsequent queued events skip silently instead of spamming warnings
- Added BrowserPool::has_session() for the guard check
Entire-Checkpoint: 603c4e91eae2
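The has_session() guard described above amounts to making cleanup idempotent. The following is a minimal sketch with a hypothetical `Pool` type backed by a plain `HashMap`; the real `BrowserPool` is async and holds browser handles, which are omitted here.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified pool: maps session ids to their current URL.
#[derive(Default)]
pub struct Pool {
    sessions: HashMap<String, String>,
}

impl Pool {
    pub fn insert(&mut self, session_id: &str, url: &str) {
        self.sessions.insert(session_id.to_string(), url.to_string());
    }

    /// Guard check used before attempting cleanup.
    pub fn has_session(&self, session_id: &str) -> bool {
        self.sessions.contains_key(session_id)
    }

    /// Closes a stale session. Returns true only for the call that actually
    /// removed it; later queued events find nothing and return false quietly.
    pub fn cleanup_stale_session(&mut self, session_id: &str) -> bool {
        if !self.has_session(session_id) {
            return false; // already cleaned up: skip without warning
        }
        self.sessions.remove(session_id);
        true
    }
}
```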
… to idle Entire-Checkpoint: 82625f83d2c6
… lazily
Two root causes fixed:
1. action_name from Display includes params (e.g. "mouse_input(x=1,y=2)") but the is_input_event check used exact matches. Changed to starts_with() so mouse_input/keyboard_input/evaluate are correctly recognized as non-fatal.
2. Action hook was set at startup when manager_if_ready() returned None (manager lazy-inits on first use). Moved hook registration into the manager() init closure so it's applied as soon as the manager exists.
Session history now works for both agent and UI sessions.
Entire-Checkpoint: b696fa3b9617
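The first root cause above is easy to reproduce in isolation. In this sketch the `Action` enum and its `Display` output are assumptions modeled on the "mouse_input(x=1,y=2)" example from the commit message; only the prefix list (mouse_input, keyboard_input, evaluate) comes from the source.

```rust
use std::fmt;

/// Hypothetical action type whose Display output includes its parameters.
pub enum Action {
    MouseInput { x: i32, y: i32 },
    KeyboardInput { key: String },
    Navigate { url: String },
}

impl fmt::Display for Action {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Action::MouseInput { x, y } => write!(f, "mouse_input(x={x},y={y})"),
            Action::KeyboardInput { key } => write!(f, "keyboard_input(key={key})"),
            Action::Navigate { url } => write!(f, "navigate({url})"),
        }
    }
}

/// Prefix match, so names that carry parameters are still recognized.
/// An exact comparison against "mouse_input" would miss them.
pub fn is_input_event(action_name: &str) -> bool {
    ["mouse_input", "keyboard_input", "evaluate"]
        .iter()
        .any(|prefix| action_name.starts_with(prefix))
}
```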
Each session switch called start_screencast which spawns a new WebSocket relay task. Since we stopped calling stop_screencast on switch, old relay tasks were never cleaned up. After several switches, multiple relays broadcast duplicate frames, flooding the WebSocket and freezing the UI. Fix: check the session's screencasting field from the API before starting a new screencast. If already running, just ensure the frame listener is active without spawning another relay. Entire-Checkpoint: 2201c4b39939
Two issues fixed:
1. Sessions dying from connection errors were never marked as closed in the database: cleanup_stale_session removed from pool but didn't update SQLite. Now fires the action hook with "close" action and "connection lost" error so closed_at gets set.
2. History section only showed sessions with closed_at set. Sessions that died without proper close never appeared. Now shows all past sessions not currently active in the pool.
Entire-Checkpoint: 7148fca2419c
Replaced the stacked layout (live sessions + history below) with a tabbed interface:
- "Live" tab shows active browser sessions with count badge
- "History" tab shows past sessions (closed or lost) with count
- Switching to History tab refreshes the list
- Past sessions show "closed" or "lost" badge based on whether they were explicitly closed or died from connection loss
- Clicking a history session shows its action log in the right panel
Entire-Checkpoint: 4f9536295073
Entire-Checkpoint: c1bb8003a2af
…croll after switch Entire-Checkpoint: da2f35c55cc4
Clicking a session in the History tab now creates a new browser session and navigates to the last URL from the dead session. The view switches to the Live tab automatically. "View Log" link at the bottom of each history card still shows the action log in the right panel. Sessions without a URL (about:blank) show the action log instead of reviving. Entire-Checkpoint: 7815d593f70b
URL changes are now detected via CDP Page.frameNavigated events instead of polling get_url every 2 seconds. The screencast relay listens for navigation events alongside screencast frames and includes the new URL in the frame payload when it changes. This eliminates the get_url HTTP request every 2 seconds. Scroll info polling remains at 5-second intervals (reduced from 2s). Entire-Checkpoint: 72716eae41b2
The evaluate calls every 5 seconds for scroll info are gone. Scroll position now comes from the screencast frame metadata (scroll_offset_y) which is already included in every frame at zero cost. Page height (scrollHeight) is queried once via evaluate when a navigation event occurs, not polled. This means zero background HTTP requests while viewing a browser session. Entire-Checkpoint: e42cba470a49
Chrome by default discards session cookies (no expiry) when it exits. Sites like LinkedIn use session cookies for auth, so logins were lost between moltis restarts even with persistent profile directories. Added --restore-session-cookies flag to both host browser launches and containerized launches (via CHROME_FLAGS env var). This tells Chrome to save session cookies to the Cookies database on exit and restore them on next launch. Entire-Checkpoint: f7b599438e01
Only truly fatal actions (navigate retry failure, close, snapshot) trigger session cleanup on connection error. Screenshot, screencast, mouse, keyboard, evaluate, and get_url/get_title are all non-fatal. This was causing sessions to die when switching sessions triggered a screenshot that timed out. Entire-Checkpoint: 7fc3fb8d154f
When creating a new session with the same profile_id as an existing running session, the pool now creates a new tab in the existing browser instead of launching a new container. This shares cookies, local storage, and login state between sessions in real time. Implementation: BrowserInstance.browser is now Arc<Browser> so the handle can be shared. get_or_create checks for an existing sandboxed instance with the same profile_id and creates a new BrowserInstance pointing to the same Browser (new tab). This means: log into LinkedIn in one session, open a new session with the same profile → already logged in. Entire-Checkpoint: a1d6b26e98c8
Entire-Checkpoint: ef2938c1da7c
Entire-Checkpoint: 39f54ce3df66
Removed all connection-error cleanup from execute_action. Sessions were being killed by screenshot timeouts, refresh failures, and other transient errors. Navigate already has its own retry logic.
Now sessions only die from:
- Explicit close action
- Idle/TTL timeout (pool cleanup)
- Navigate retry failure (its own logic)
All other errors are returned to the caller without destroying the session. Chrome may recover on the next request.
Entire-Checkpoint: 71776737468e
New config option [tools.browser] sandbox = false forces browsers to run on the host instead of in containers. Saves ~200-400MB per browser instance. Defaults to following the global sandbox mode.
Example in moltis.toml:
[tools.browser]
sandbox = false # run browsers on host, not in containers
Entire-Checkpoint: 210224bcb5ef
Unit tests covering every major bug encountered during development:
types.rs (4 tests):
- null timeout_ms fails without stripping (Bug 1)
- all optional fields null after stripping (Bug 1)
- mouseWheel defaults deltas to zero (Bug 6)
- MouseInput Display includes params (Bug 13)
manager.rs (5 tests):
- key_to_vk Backspace=8, Enter=13 (Bug 7)
- key_to_vk arrows 37/38/39/40 (Bug 7)
- key_to_vk printable returns None (Bug 7)
- key_to_vk Delete=46, Escape=27, Tab=9 (Bug 7)
pool.rs (3 tests):
- dangling symlink detected by symlink_metadata (Bug 11)
- same profile_id produces same path (Bug 19)
- different profile_ids produce different paths (Bug 19)
screencast.rs (1 test):
- frame url serialized only when Some (Bug 15)
Entire-Checkpoint: b639577544f3
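The key_to_vk cases listed above suggest a mapping along the following lines. This is a sketch, not the function from manager.rs: the signature, the `i64` return type, and the use of DOM key names such as "ArrowLeft" are assumptions. The numeric values are the ones named in the test list, which match the standard Windows virtual-key codes.

```rust
/// Maps a DOM-style key name to a Windows virtual-key code.
/// Printable characters return None, as described in the test list.
pub fn key_to_vk(key: &str) -> Option<i64> {
    match key {
        "Backspace" => Some(8),
        "Tab" => Some(9),
        "Enter" => Some(13),
        "Escape" => Some(27),
        "ArrowLeft" => Some(37),
        "ArrowUp" => Some(38),
        "ArrowRight" => Some(39),
        "ArrowDown" => Some(40),
        "Delete" => Some(46),
        _ => None,
    }
}
```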
6 new e2e tests covering specific bugs:
- Bug 2: sessions created via REST API appear in UI list (validates shared BrowserManager between tool and service)
- Bug 15: URL bar shows target URL immediately on navigation (validates no flicker from poll overwriting)
- Bug 16: switching sessions rapidly causes no JS errors (validates no relay task accumulation)
- Bug 17: closed session appears in History tab (validates session history persistence and UI)
- Delayed highlight: creating session shows placeholder immediately
- Bug 12: dead session shows error and recovers (validates no stuck "Fetching browser view..." state)
Entire-Checkpoint: 0db97ecdd06f
The [tools.browser] sandbox config only affected UI sessions (via RealBrowserService). The BrowserTool always used the SandboxRouter (global exec sandbox mode), ignoring the browser-specific config. Added sandbox_override to BrowserTool that takes precedence over the router. Wired from config.tools.browser.sandbox in server.rs.
Also added stealth Chrome flags to reduce headless detection:
- --disable-blink-features=AutomationControlled (removes navigator.webdriver)
- --headless=new (more realistic headless mode)
- Realistic macOS Chrome user agent
- --disable-infobars, --disable-automation-extension
Entire-Checkpoint: f3232341748e
When the browser tool returns a result with a session_id, a clickable link appears below the tool card in the chat: "🌐 View browser session" → navigates to /settings/browser with the session auto-selected. The browser page reads the ?session= parameter from the URL hash and auto-selects the matching session (with screencast). Entire-Checkpoint: 9b682dff16a6
…ndering Entire-Checkpoint: be560945e6c6
Entire-Checkpoint: 7ec81c9e0cbc
Entire-Checkpoint: 368d96bc925a
Entire-Checkpoint: 418e519b6f11
…f query params
The settings router strips query strings, so ?session=xxx was lost.
Now uses navigateToBrowserSession() to set a pending session ID
before calling navigate("/settings/browser"). The browser page
reads it on init and auto-selects the session.
Works for both live tool cards and history re-rendering.
Entire-Checkpoint: fa56d7b1add5
Entire-Checkpoint: 40c737a0671b
Entire-Checkpoint: cf84cfb8472d
Two fixes:
1. Reviving a dead session now passes the old session_id to the navigate action. get_or_create registers the new browser under the same ID, so the session keeps its identity in the UI and history.
2. Host-mode browser launches now clean up stale SingletonLock/Cookie/Socket files before starting. Previously only sandbox launches did this, causing "Failed to create SingletonLock" errors when sandbox=false.
Entire-Checkpoint: ab20339a2d2b
Summary
Adds a full browser viewing and interaction UI to the Settings > Browser page. Users can create browser sessions, view them live via CDP screencast, interact with mouse/keyboard/scroll, and review session history with action logs. Per-agent browser profiles provide cookie isolation.
Key features:
Technical highlights:
Validation
Completed
- cargo +nightly-2025-11-30 fmt --all -- --check
- cargo clippy -p moltis-browser -p moltis-gateway -p moltis-tools --all-targets -- -D warnings
- cargo test -p moltis-browser -p moltis-gateway
- npx biome check --write (JS)
Remaining
- ./scripts/local-validate.sh
Manual QA
Future Work