feat(browser): interactive browser viewing UI with CDP screencast by penso · Pull Request #531 · moltis-org/moltis

penso · 2026-03-31T21:40:45Z

Summary

Adds a full browser viewing and interaction UI to the Settings > Browser page. Users can create browser sessions, view them live via CDP screencast, interact with mouse/keyboard/scroll, and review session history with action logs. Per-agent browser profiles provide cookie isolation.

Key features:

Live browser viewing via CDP screencast with mouse, keyboard, and scroll relay
URL bar with Google suggestions autocomplete, auto-https, and live URL sync
Session management: create, switch, close with instant screenshot prefetch
Persistent session history with per-session action log (SQLite)
Per-agent browser profiles for cookie isolation (UI=default, agents=session_key)
Shared BrowserManager between tool and UI service
Viewport auto-resize on screencast start for proper content rendering
Scrollbar overlay with click-to-scroll
Cookie persistence across restarts via shared profile directories

Technical highlights:

Screencast frames delivered via WebSocket event subscription (v4 protocol)
rAF-gated canvas rendering for smooth display
Coordinate mapping using image natural dimensions + offset_top correction
Wheel events batched (50ms), mousemove throttled to avoid flooding CDP
CDP requires both deltaX/deltaY for mouseWheel and windowsVirtualKeyCode for special keys
Sessions persist for 2 hours (idle + hard TTL), no stop-on-switch to avoid Chrome crashes
Screencast relay keeps running when no subscribers (avoids race condition)
Screenshot prefetch cache with live frame updates for instant session switching

Validation

Completed

cargo +nightly-2025-11-30 fmt --all -- --check
cargo clippy -p moltis-browser -p moltis-gateway -p moltis-tools --all-targets -- -D warnings
cargo test -p moltis-browser -p moltis-gateway
npx biome check --write (JS)
Integration tests: screencast_host, screencast_sandbox, click_dispatches, scroll_dispatches, screencast_metadata_valid
Playwright e2e test spec: browser.spec.js + smoke route

Remaining

./scripts/local-validate.sh
Manual QA on fresh install

Manual QA

Go to Settings > Browser
Click "New Session" — placeholder appears with "creating" badge
Enter a URL — autocomplete shows Google suggestions
Verify screencast displays and updates live
Click links — URL bar updates automatically
Scroll with mouse wheel or scrollbar overlay
Type in text fields (including backspace/special keys)
Create second session, switch between them — instant cached frame
Close a session — appears in History section
Click closed session — action log displayed
Restart moltis — cookies persist (no captcha on revisit)
Agent browser sessions use separate cookie profile from UI sessions

Future Work

WebRTC upgrade for sub-100ms latency (reference: neko project)
Binary WebSocket frames instead of base64 JSON (removes the ~33% base64 size overhead, roughly 25% less data per frame)
Video recording/replay of browser sessions
Scrollbar drag support (currently click-to-scroll only)
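
The base64 item above comes down to simple arithmetic: base64 emits 4 characters for every 3 raw bytes. A minimal sketch of that calculation follows; the 60 kB frame size is an assumed figure for illustration, not a measurement from this PR.

```javascript
// Base64 encodes every 3 raw bytes as 4 ASCII characters (padded to a multiple of 4).
function base64Length(rawBytes) {
  return Math.ceil(rawBytes / 3) * 4;
}

function overhead(rawBytes) {
  const encoded = base64Length(rawBytes);
  return {
    encoded,
    inflation: (encoded - rawBytes) / rawBytes, // growth relative to the raw frame
    savings: (encoded - rawBytes) / encoded,    // reduction when sending raw bytes instead
  };
}

// Hypothetical 60 kB JPEG frame.
const sample = overhead(60000);
```

Under that assumption the encoded frame is one third larger than the raw frame, so sending binary instead trims about a quarter of the base64 payload (JSON framing not counted).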

Add a browser tab to the settings page that shows active browser sessions and allows viewing them via CDP screencast. Users can log in to websites manually and export cookies for agent automation.

Backend:
- Add screencast module (ScreencastRegistry) for CDP frame relay
- Add mouse/keyboard input, cookie export/import to BrowserManager
- Add browser.screencast.frame protocol event
- Add /api/browser/action and /api/browser/sessions endpoints
- Wire screencast frame broadcasting to WebSocket clients

Frontend:
- Add page-browser.js with canvas-based screencast viewer
- Session list with view, export cookies, close actions
- Mouse/keyboard input relay to browser viewport
- Navigation bar for URL entry
- Register as "Browser" tab in settings page

https://claude.ai/code/session_015M64wR6GhhnyiEFodhAuAH

codspeed-hq · 2026-03-31T21:44:03Z

Merging this PR will improve performance by 21.75%

⚡ 1 improved benchmark
✅ 38 untouched benchmarks
⏩ 5 skipped benchmarks¹

Performance Changes

	Benchmark	`BASE`	`HEAD`	Efficiency
⚡	`env_substitution`	12.2 µs	10.1 µs	+21.75%

Comparing claude/plan-browser-viewing-HOHno (6663f11) with main (a113473)

¹ 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

greptile-apps · 2026-03-31T21:48:03Z

Greptile Summary

This PR introduces real-time browser session viewing and interactive control to the web UI. A new CDP screencast pipeline streams JPEG frames from Chromium via broadcast channels (ScreencastRegistry → adapter task → WebSocket relay), and new action types handle mouse/keyboard input and cookie import/export. The overall architecture is reasonable, but there are three issues that affect correctness of the streaming path and one dead-code inconsistency:

Relay task exits permanently on zero receivers (screencast.rs): The relay task breaks when tx.send returns an error (no receivers). Because the initial receiver _rx from start_screencast is immediately dropped, any frame arriving before spawn_screencast_relay subscribes causes the relay to exit permanently — subsequent subscribers get a live receiver but no frames.
Duplicate relay tasks per session (api.rs): spawn_screencast_relay is called unconditionally on every StartScreencast success with no check for an existing relay. Repeat calls spawn N adapter tasks and N broadcast relay tasks, delivering N copies of each frame to all WebSocket clients.
Dead code in BrowserManager::list_sessions (manager.rs): The screencasting enrichment loop computes contains() but discards the result via let _ = ...; the field has not been added to BrowserSessionInfo. The enrichment is handled correctly in services.rs, making this loop purely wasteful.
Scroll not forwarded (page-browser.js): No wheel event listener is attached to the canvas, so MouseInputType::Wheel (fully wired on the backend) is unreachable from the UI.

Confidence Score: 4/5

Safe to merge with caution — the three P1 findings affect screencast reliability but don't risk data loss or security; most core functionality works correctly

Three P1 findings exist: the relay task can permanently stop if a frame arrives before the subscriber connects, duplicate relay tasks are spawned on repeated StartScreencast calls, and dead code in list_sessions performs unnecessary async work. None cause crashes or data corruption, but the screencast feature may silently stop delivering frames in some conditions and will deliver duplicate frames on reconnect.

crates/browser/src/screencast.rs (relay lifetime), crates/web/src/api.rs (relay dedup), crates/browser/src/manager.rs (dead code in list_sessions)

Important Files Changed

Filename	Overview
crates/browser/src/screencast.rs	New module implementing CDP screencast relay — relay task exits permanently when no receivers are present, and has a TOCTOU race between the read-lock check and write-lock insert in `start()`
crates/web/src/api.rs	New browser action and session handlers — `spawn_screencast_relay` has no dedup guard, spawning N duplicate relay tasks on N `StartScreencast` calls for the same session
crates/browser/src/manager.rs	Extended with screencast/input/cookie actions — `list_sessions` contains dead code that computes screencasting status but discards the result with `let _ = ...`
crates/browser/src/types.rs	New action variants and supporting types added cleanly with proper serde defaults
crates/browser/src/pool.rs	Added `BrowserSessionInfo` and `list_sessions()` — clean implementation reading session metadata under the existing RwLock
crates/gateway/src/services.rs	Implements `list_sessions` and `subscribe_screencast` — each `subscribe_screencast` call spawns a new adapter task with no dedup, contributing to the duplicate relay problem
crates/web/src/assets/js/page-browser.js	New Preact browser viewer page — solid structure with coordinate scaling and canvas rendering, but missing `wheel` event relay for scroll support
crates/service-traits/src/lib.rs	Extended `BrowserService` trait with `list_sessions` and `subscribe_screencast` with sensible no-op defaults — clean trait extension
crates/protocol/src/lib.rs	Registers three new event types for screencast — straightforward additions to `KNOWN_EVENTS`
crates/web/src/assets/js/page-settings.js	Adds the Browser nav entry and page section handler — clean integration using the existing settings SPA pattern

Sequence Diagram

sequenceDiagram
    participant UI as Browser UI (page-browser.js)
    participant API as Web API (api.rs)
    participant GW as Gateway (services.rs)
    participant REG as ScreencastRegistry (screencast.rs)
    participant CDP as Chromium CDP

    UI->>API: POST /api/browser/action {action: start_screencast}
    API->>GW: browser.request(body)
    GW->>REG: screencasts.start(session_id, page, ...)
    REG->>CDP: Page.startScreencast
    REG-->>GW: broadcast::Receiver (_rx, dropped immediately)
    GW-->>API: BrowserResponse::success
    API->>GW: subscribe_screencast(session_id)
    GW->>REG: registry.subscribe(session_id)
    REG-->>GW: raw broadcast::Receiver
    GW->>GW: spawn adapter task (ScreencastFrame → Value)
    GW-->>API: Value broadcast::Receiver
    API->>API: spawn relay task (Value → WebSocket broadcast)

    loop CDP screencast frames
        CDP->>REG: EventScreencastFrame
        REG->>REG: relay task: ack + tx.send(frame)
        REG-->>GW: adapter task receives frame
        GW-->>API: relay task receives Value
        API-->>UI: WS event browser.screencast.frame
    end

    UI->>API: POST /api/browser/action {action: stop_screencast}
    API->>GW: browser.request(body)
    GW->>REG: screencasts.stop(session_id, page)
    REG->>CDP: Page.stopScreencast
    REG->>REG: remove from registry, abort relay task

Reviews (1): Last reviewed commit: "feat(browser): add browser viewing UI wi..."

greptile-apps · 2026-03-31T21:48:07Z

crates/browser/src/screencast.rs

+        if tx.send(frame).is_err() {
+            debug!(session_id = %session_id, "no screencast subscribers, stopping relay");
+            break;


Relay task permanently exits when no subscribers, silently breaking future subscriptions

When tx.send(frame).is_err(), the relay task exits (line 239). This becomes a problem because of how the initial receiver is handled in start_screencast (manager.rs:799–802): the returned _rx is discarded immediately when that function returns. The relay task then becomes the sole holder of the tx side. If even a single CDP frame arrives between _rx being dropped and spawn_screencast_relay subscribing (api.rs:1141-1142), the relay task finds no receivers, exits, and is gone permanently.

After this point, any call to subscribe_screencast() → registry.subscribe() will successfully hand out a new receiver from handle.tx.subscribe(), but since the relay task is dead no more frames will ever be sent into that channel. From the caller's perspective the screencast appears active but produces no frames.

A simple fix is to keep the relay running even when there are temporarily no receivers:

if tx.send(frame).is_err() {
    debug!(session_id = %session_id, "no screencast subscribers, frame dropped");
    // do NOT break: a subscriber may connect shortly
}

Alternatively, hold a permanent "keep-alive" receiver inside ActiveScreencast.

greptile-apps · 2026-03-31T21:48:08Z

crates/web/src/api.rs

+            if is_start_screencast {
+                if let Some(sid) = session_id {
+                    spawn_screencast_relay(state, &sid).await;
+                }


Multiple relay tasks spawned per session causes duplicate frame delivery

Every successful StartScreencast request unconditionally calls spawn_screencast_relay, which in turn calls subscribe_screencast (services.rs:1398-1429). subscribe_screencast always spawns a brand-new adapter task and returns a new Value receiver. spawn_screencast_relay then always spawns an additional broadcast relay task.

There is no guard to check whether a relay is already running for the given session_id. Calling StartScreencast N times for the same session creates N adapter tasks and N relay tasks — every CDP frame gets forwarded N times to all WebSocket clients.

In the UI, startScreencast is called from the "View" button with no dedup, so a user double-clicking or page-refreshing will trigger duplicate relays.

Consider tracking active relay sessions server-side, e.g. via a DashSet<String> in AppState, and skipping spawn_screencast_relay if an entry already exists.

greptile-apps · 2026-03-31T21:48:09Z

crates/browser/src/manager.rs

+    pub async fn list_sessions(&self) -> Vec<crate::pool::BrowserSessionInfo> {
+        let mut sessions = self.pool.list_sessions().await;
+        // Annotate which sessions have active screencasts.
+        let screencast_sessions = self.screencasts.active_sessions().await;
+        for session in &mut sessions {
+            // The BrowserSessionInfo doesn't have a screencast field yet,
+            // but clients can check the screencast endpoint separately.
+            let _ = screencast_sessions.contains(&session.session_id);
+        }
+        sessions
+    }


Dead code: screencast_sessions.contains(...) result is silently discarded

The loop body computes whether a session is screencasting but throws the result away with let _ = .... The BrowserSessionInfo struct has no screencasting field, so the information never reaches callers.

The enrichment is correctly performed independently in services.rs, so the actual API response is accurate — but the code inside this loop is completely useless: it performs an async allocation (active_sessions() returning a Vec<String>) and O(n) lookup per session for no effect.

Either add screencasting: bool to BrowserSessionInfo and populate it here (removing the duplicate logic in services.rs), or remove this loop entirely:

pub async fn list_sessions(&self) -> Vec<crate::pool::BrowserSessionInfo> {
    self.pool.list_sessions().await
}

greptile-apps · 2026-03-31T21:48:10Z

crates/browser/src/screencast.rs

+        // If already active, return a new subscriber.
+        {
+            let active = self.active.read().await;
+            if let Some(handle) = active.get(session_id) {
+                debug!(session_id, "screencast already active, adding subscriber");
+                return Ok(handle.tx.subscribe());
+            }
+        }
+
+        // Start CDP screencast via the builder pattern.
+        let params = StartScreencastParams {
+            format: Some(StartScreencastFormat::Jpeg),
+            quality: Some(i64::from(quality.min(100))),
+            max_width: Some(i64::from(max_width)),
+            max_height: Some(i64::from(max_height)),
+            every_nth_frame: Some(1),
+        };
+
+        page.execute(params)
+            .await
+            .map_err(|e| crate::error::Error::Cdp(format!("failed to start screencast: {e}")))?;
+
+        let (tx, rx) = broadcast::channel(FRAME_CHANNEL_CAPACITY);
+
+        // Spawn background task to relay CDP screencast frame events.
+        let tx_clone = tx.clone();
+        let sid = session_id.to_string();
+        let page_clone = page.clone();
+
+        let task = tokio::spawn(async move {
+            relay_screencast_frames(page_clone, tx_clone, sid).await;
+        });
+
+        let inner = Arc::new(ActiveScreencast {
+            tx: tx.clone(),
+            abort: task.abort_handle(),
+        });
+
+        self.active
+            .write()
+            .await
+            .insert(session_id.to_string(), inner);
+
+        debug!(session_id, "screencast started");
+        Ok(rx)


TOCTOU race: two concurrent start() calls for the same session can both spawn relay tasks

The existence check (line 107, read lock) and the insert (line 143, write lock) are not atomic. Two concurrent callers for the same session_id can both pass the read-lock check, both send the CDP StartScreencast command, and both spawn a relay task. The second insert overwrites the first Arc<ActiveScreencast>, dropping it and aborting the first relay task, while two CDP start commands have been sent.

Fix by holding the write lock for the entire operation:

let mut active = self.active.write().await;
if let Some(handle) = active.get(session_id) {
    return Ok(handle.tx.subscribe());
}
// … execute CDP command, spawn task …
active.insert(session_id.to_string(), inner);

greptile-apps · 2026-03-31T21:48:11Z

crates/web/src/assets/js/page-browser.js

+		function onMouse(e) {
+			relayMouseEvent(e, canvas);
+		}
+		canvas.addEventListener("mousedown", onMouse);
+		canvas.addEventListener("mouseup", onMouse);
+		canvas.addEventListener("mousemove", onMouse);
+
+		// Keyboard: focus the canvas to receive key events
+		canvas.setAttribute("tabindex", "0");
+		canvas.addEventListener("keydown", relayKeyEvent);
+		canvas.addEventListener("keyup", relayKeyEvent);
+
+		return () => {
+			canvas.removeEventListener("mousedown", onMouse);
+			canvas.removeEventListener("mouseup", onMouse);
+			canvas.removeEventListener("mousemove", onMouse);
+			canvas.removeEventListener("keydown", relayKeyEvent);


Scroll/wheel events not relayed — MouseInputType::Wheel is unreachable from the UI

The canvas event listeners cover mousedown, mouseup, and mousemove, but there is no wheel listener. The backend has full support for MouseInputType::Wheel mapped to CDP MouseWheel, but the frontend never generates it. Users cannot scroll pages within the browser viewer.

Suggested change

canvas.addEventListener("mousedown", onMouse);
canvas.addEventListener("mouseup", onMouse);
canvas.addEventListener("mousemove", onMouse);
canvas.addEventListener("wheel", onMouse, { passive: false });

Also add a wheel case in relayMouseEvent and pass e.deltaX/e.deltaY in the action payload, mapping them to DispatchMouseEventParams on the Rust side.
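
A minimal sketch of what such a wheel case could look like on the frontend. The payload shape here (the `type` strings and the `x`/`y` field names) is an assumption for illustration and is not taken from the PR's actual schema; only `mouse_input` and `delta_x`/`delta_y` appear elsewhere in this thread.

```javascript
// Hypothetical payload builder; field names other than delta_x/delta_y are assumptions.
function buildMouseAction(e, sessionId) {
  const base = {
    action: "mouse_input",
    session_id: sessionId,
    x: e.offsetX,
    y: e.offsetY,
  };
  if (e.type === "wheel") {
    // Always send both deltas, defaulting a missing axis to 0.
    return { ...base, type: "wheel", delta_x: e.deltaX || 0, delta_y: e.deltaY || 0 };
  }
  const typeMap = { mousedown: "pressed", mouseup: "released", mousemove: "moved" };
  return { ...base, type: typeMap[e.type] };
}
```

Sending both deltas unconditionally keeps a vertical-only scroll from producing a payload with a missing axis.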

codecov · 2026-03-31T22:01:29Z

Codecov Report

❌ Patch coverage is 19.48052% with 1054 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
crates/browser/src/manager.rs	8.68%	494 Missing ⚠️
crates/browser/src/screencast.rs	34.01%	130 Missing ⚠️
crates/gateway/src/browser_session_store.rs	0.00%	106 Missing ⚠️
crates/gateway/src/services.rs	0.00%	89 Missing ⚠️
crates/browser/src/pool.rs	28.68%	87 Missing ⚠️
crates/web/src/api.rs	0.00%	87 Missing ⚠️
crates/browser/src/types.rs	74.78%	30 Missing ⚠️
crates/gateway/src/server.rs	0.00%	23 Missing ⚠️
crates/service-traits/src/lib.rs	0.00%	8 Missing ⚠️


Three fixes for the browser viewing UI:

1. Strip explicit null values from LLM tool call params before deserialization. serde(default) only handles missing keys, not null values, which caused "invalid type: null, expected u64" errors.
2. Share a single BrowserManager between BrowserTool and RealBrowserService via Arc<OnceCell>. Previously each had its own manager, so sessions created by agents were invisible to the UI.
3. Add "New Session" button to the browser page so users can create sessions directly from the UI (navigate to about:blank + auto-start screencast). Agents share the same cookie profile.

Entire-Checkpoint: 6719842d0d2c
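
The null-stripping fix lives in the Rust tool layer ahead of serde, but the same pitfall can be illustrated in JavaScript, where an explicit null also overrides a default. This is an analogue only; `DEFAULTS`, `timeout_ms`, and the helper names are hypothetical.

```javascript
// Hypothetical default for illustration; not the PR's actual parameter.
const DEFAULTS = { timeout_ms: 30000 };

// Drop top-level keys whose value is explicitly null so defaults can apply.
function stripNulls(params) {
  return Object.fromEntries(
    Object.entries(params).filter(([, value]) => value !== null)
  );
}

function withDefaults(params) {
  return { ...DEFAULTS, ...stripNulls(params) };
}

const raw = { url: "https://example.com", timeout_ms: null };
const naive = { ...DEFAULTS, ...raw }; // timeout_ms stays null
const fixed = withDefaults(raw);       // timeout_ms falls back to the default
```

The sketch only strips top-level keys; whether nested objects need the same treatment depends on the tool schema.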

Use provider-btn-sm and provider-btn-secondary classes consistent with the MCP and Skills pages. Entire-Checkpoint: 769d0e7ad9b2

…on styles

- Allow about:blank in URL validation so "New Session" can create an empty browser session without navigating to an external URL.
- Use provider-btn / provider-btn-sm classes on all session card buttons to match the style used on MCP, Skills, and other pages.

Entire-Checkpoint: 0402a72e2760

Entire-Checkpoint: 4100596d19db

- Don't auto-start screencast on about:blank (no frames generated)
- Show navigate bar immediately so user can enter a URL
- Auto-start screencast after navigating to a real page
- Show "Enter a URL above" hint in the canvas placeholder

Entire-Checkpoint: 17379e8abfdf

The service's request() now injects sandbox=true when the field is absent, matching the behavior of the BrowserTool which reads from the sandbox router. Without this, clicking "New Session" in the UI would always launch a host browser even when sandbox is enabled. Entire-Checkpoint: af07d8ac7130

- Auto-prepend https:// for bare domains (e.g. "lemonde.fr")
- Fall back to Google search for non-URL queries
- Show Google suggestions, session URL history, and direct URL matches in a dropdown with keyboard navigation
- Check res.success on navigate responses to surface errors
- Match input and Go button heights using provider-btn-sm sizing

Entire-Checkpoint: eb57efdb4ba2
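
The URL-bar rules above (auto-https for bare domains, search fallback otherwise) can be sketched roughly as follows. The bare-domain heuristic and the exact search URL are assumptions; the PR's actual detection logic is not shown in this thread.

```javascript
// Hypothetical normalizer for URL-bar input.
function normalizeUrlInput(input) {
  const text = input.trim();
  if (text === "about:blank") return text;
  if (/^https?:\/\//i.test(text)) return text;
  // Assumed bare-domain heuristic: no whitespace and a dot followed by 2+ characters.
  if (/^[^\s]+\.[^\s]{2,}$/.test(text)) return "https://" + text;
  // Anything else is treated as a search query.
  return "https://www.google.com/search?q=" + encodeURIComponent(text);
}
```

With this heuristic, "lemonde.fr" becomes "https://lemonde.fr", while a phrase containing spaces is sent to search.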

browserAction() now checks res.success on every response and throws on failure. Previously errors returned as 200 with success=false were silently swallowed (e.g. invalid URLs, connection failures). Entire-Checkpoint: 76e24ac2897a

The relay task exited immediately when tx.send() had no receivers, which races with the UI subscribing shortly after. CDP sends the first frame before spawn_screencast_relay can subscribe, causing the relay to exit and all subsequent frames to be lost. The relay now drops frames when no one is listening instead of exiting. The task is still properly cleaned up via its abort_handle when the screencast is stopped. Entire-Checkpoint: 28b9900be3c1

- Add integration test that verifies CDP frames arrive through the ScreencastRegistry broadcast channel (passes in host mode).
- Add info/warn tracing at every stage of the screencast relay chain: CDP frame listener, service subscribe, API relay spawn. This will surface exactly where frames get stuck.

Entire-Checkpoint: fcdec9d1c2d8

The v4 protocol requires explicit event subscription. The browser.screencast.frame event was missing from the subscription list, so all frames were silently dropped by the broadcast filter. Entire-Checkpoint: 42a252e4a873

Rust integration tests (crates/browser):
- screencast_host: verifies CDP frames arrive in host mode
- screencast_sandbox: verifies CDP frames arrive in container mode (auto-skips when no container runtime available)

Playwright e2e tests (crates/web/ui/e2e/specs/browser.spec.js):
- Browser page renders with heading and buttons
- Empty state message shown when no sessions
- New session button shows creating state
- Navigate bar appears after session creation
- Bare domains auto-normalized with https://
- Screencast delivers frames after navigation (canvas visible)
- Session can be closed
- Sandbox badge rendered when sandbox enabled

Smoke test: added /settings/browser route.

Entire-Checkpoint: 404370604901

…on fixes

Interaction improvements:
- Add mouse wheel/scroll relay via CDP mouseWheel events with delta_x/delta_y support
- Prevent default on mousedown (no text selection/drag on canvas)
- Block context menu on canvas
- Auto-focus canvas on mousedown for keyboard capture

Session list UX:
- Clicking a session card selects it and auto-starts screencast
- Active session highlighted with accent border
- Removed redundant View/Stop Viewing buttons; card click handles it
- Buttons (Export Cookies, Close) use stopPropagation to not trigger card selection

Entire-Checkpoint: 872e5b20d786

When switching sessions, the canvas was blank until the first screencast frame arrived via WebSocket. Now fetches an immediate screenshot via REST API when selecting a session, displaying it on the canvas while the screencast stream connects. Also tracks frame MIME type (PNG for screenshots, JPEG for screencast frames) for correct canvas rendering. Entire-Checkpoint: 4cf5c9dfbb6b

- Show "Fetching browser view..." while taking screenshot on session switch instead of a blank canvas
- Show "paused" badge on sessions that have a URL but aren't being actively viewed; clicking them makes them live
- Clear frame data when switching sessions to avoid showing stale content from the previous session

Entire-Checkpoint: b8875017d863

…ify cards

- Prefetch screenshots for all sessions in the background after fetching the session list, so switching is instant from cache
- Clear frame data, screencast state, and stop previous screencast when creating a new session (no stale canvas from previous session)
- Reset URL bar to the selected session's URL when switching
- Remove Export Cookies button, keep only a small Close link
- Cache screenshots on first fetch to avoid re-fetching

Entire-Checkpoint: 3da290c6da8b

When clicking "New Session", a placeholder card with "creating" badge and "Starting browser..." appears immediately in the session list, selected and highlighted. Once the backend finishes, the placeholder is replaced with the real session. On failure, the placeholder is removed. Entire-Checkpoint: 325a71fa44b7

The canvas is conditionally rendered — placeholders show when there's no frame data. The useEffect with [] deps ran once on mount when the canvas didn't exist yet, so mouse/keyboard/scroll listeners were never attached. Switched to a ref callback that fires whenever the canvas DOM element appears or disappears, properly attaching and cleaning up event listeners. Clicks, scrolling, and keyboard input now work. Entire-Checkpoint: a2823835c7b2

Wheel events fire at 60fps and each one was creating a separate HTTP request, overwhelming the server and causing scroll to feel broken.

- Batch wheel deltas into a single request every 50ms
- Throttle mousemove to one event per 50ms (was flooding at 60fps)
- Click (mousedown/mouseup) events remain unthrottled for accuracy

Entire-Checkpoint: 3ce2f51400c7
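
One way to batch wheel deltas as described is an accumulator that flushes once per interval. This is a sketch under assumptions: `createWheelBatcher`, its parameters, and the injectable `schedule` function are hypothetical names, not the PR's code.

```javascript
// Accumulates wheel deltas and sends at most one payload per interval.
function createWheelBatcher(send, intervalMs = 50, schedule = (fn, ms) => setTimeout(fn, ms)) {
  let dx = 0;
  let dy = 0;
  let pending = false;

  function flush() {
    pending = false;
    if (dx === 0 && dy === 0) return;
    const payload = { delta_x: dx, delta_y: dy };
    dx = 0;
    dy = 0;
    send(payload);
  }

  return function onWheel(deltaX, deltaY) {
    dx += deltaX;
    dy += deltaY;
    if (!pending) {
      // First event in a window arms the timer; later ones only accumulate.
      pending = true;
      schedule(flush, intervalMs);
    }
  };
}
```

In a page this would be wired as `canvas.addEventListener("wheel", (e) => onWheel(e.deltaX, e.deltaY))`, with `send` posting the combined deltas. Making `schedule` injectable keeps the batching logic testable without real timers.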

Major fixes:
- Session switching no longer resets activeSession to null (was causing blank canvas and lost state)
- Frame listener stays active across session switches (no gap where frames are dropped)
- fetchSessions preserves placeholder "creating" entries
- Removed noisy start/stop screencast toasts during switches
- Screenshot fetch errors trigger session list refresh (handles dead sessions gracefully)
- Guard against stale async results when user switches rapidly
- Canvas auto-focuses when it appears for immediate keyboard input
- Click coordinates use offsetX/offsetY for accurate mapping (fixes clicks landing too high due to border offset)
- Invalidate screenshot cache when navigating to new URL

Entire-Checkpoint: 5d1ab2347def
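
The stale-result guard mentioned in the list above is commonly implemented with a monotonically increasing token. The following is a sketch of that general pattern, not the PR's implementation; `createSelectionGuard` is a hypothetical name.

```javascript
// Each session switch takes a new token; async results check it before applying.
function createSelectionGuard() {
  let current = 0;
  return {
    begin() {
      current += 1;
      return current;
    },
    isCurrent(token) {
      return token === current;
    },
  };
}

// Intended usage inside an async handler (fetchScreenshot is hypothetical):
//   const token = guard.begin();
//   const shot = await fetchScreenshot(sessionId);
//   if (!guard.isCurrent(token)) return; // user switched again, drop the result
```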

Root cause of broken scrolling: CDP rejects mouseWheel events when deltaX or deltaY are missing. The code conditionally set them only when non-zero, so vertical-only scrolls omitted deltaX and failed with "deltaX and deltaY are expected for mouseWheel event". Fix: always set both deltas for mouseWheel events.

Click accuracy fixes:
- Add offset_top from screencast metadata to y-coordinate (accounts for browser chrome/infobars)
- Use dynamic aspect-ratio from frame metadata instead of hardcoded 16/10 (viewport is 16:9, mismatch caused y-axis distortion)

New integration tests:
- click_dispatches: verifies mousePressed+mouseReleased succeed
- scroll_dispatches: verifies mouseWheel with deltas succeeds
- screencast_metadata_valid: captures actual viewport dimensions for coordinate mapping validation

Tests use ephemeral profiles to avoid SingletonLock conflicts.

Entire-Checkpoint: f818e6d815b2
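
The click mapping described above can be sketched as a scale from displayed canvas size to frame size, plus the offset_top correction. This assumes frame pixels correspond 1:1 to viewport CSS pixels; `mapToViewport` and its parameter names are hypothetical.

```javascript
// Maps a click on the displayed canvas to viewport coordinates.
// displayed: on-screen canvas size; natural: decoded frame size (img.naturalWidth/Height).
function mapToViewport(offsetX, offsetY, displayed, natural, offsetTop = 0) {
  const scaleX = natural.width / displayed.width;
  const scaleY = natural.height / displayed.height;
  return {
    x: Math.round(offsetX * scaleX),
    // offset_top from the screencast metadata shifts y past browser chrome/infobars.
    y: Math.round(offsetY * scaleY + offsetTop),
  };
}
```

Deriving both scale factors from the actual frame dimensions avoids the distortion a hardcoded aspect ratio would introduce.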

Root cause of captcha/content appearing tiny and off-center: viewport was 2560x1440@2x (5K physical pixels). Websites laid out for that resolution, then screencast squished everything down. Changed default viewport to 1440x900@1x, a standard laptop resolution. Content renders at a normal size and fills the canvas.

Other improvements:
- rAF-gated canvas rendering: frames are drawn at display refresh rate instead of on every signal change, avoiding wasted draws
- Screenshot coordinate mapping uses actual image natural dimensions (img.naturalWidth/Height) instead of hardcoded assumptions
- Canvas aspect-ratio follows actual frame dimensions dynamically instead of hardcoded 16/10

Entire-Checkpoint: 992c2ba77b21
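
rAF gating generally means keeping only the newest frame and drawing it once per animation frame. A minimal sketch of that idea follows; `createFrameGate` is a hypothetical name, and in a page `requestFrame` would be `requestAnimationFrame`.

```javascript
// Coalesces incoming frames so at most one draw happens per scheduled callback.
function createFrameGate(draw, requestFrame) {
  let latest = null;
  let scheduled = false;
  return function push(frame) {
    latest = frame; // newer frames overwrite older undrawn ones
    if (scheduled) return;
    scheduled = true;
    requestFrame(() => {
      scheduled = false;
      draw(latest);
    });
  };
}
```

If three frames arrive between two repaints, only the last one is drawn, which is where the saved work comes from.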

The session ID was truncated to 12 chars + "..." but the card already has CSS truncate class which handles overflow naturally. Entire-Checkpoint: 7ba05c18a610

The URL bar now tracks the remote browser's current URL:
- Switching sessions instantly shows that session's URL
- Clicking links in the remote browser updates the URL bar (polled via CDP get_url every 2 seconds)
- Navigating via the URL bar updates it after page loads
- While typing, the live URL is paused; pressing Escape or clicking away reverts to the live URL

This makes the browser viewer feel like a real browser tab: the URL bar always reflects what page you're looking at.

Entire-Checkpoint: 602407a16bde

Sessions were dying from a single CDP error (e.g. Chrome briefly busy during mouse event) and then flooding logs with cleanup warnings from every queued event hitting the dead session.

Fixes:
- Idle timeout: 5 min → 30 min (interactive browsing needs more time)
- Hard TTL: 30 min → 2 hours (matches container TIMEOUT via browserless_session_timeout_ms)
- cleanup_stale_session checks has_session() before closing, so only the first event triggers cleanup; subsequent queued events skip silently instead of spamming warnings
- Added BrowserPool::has_session() for the guard check

Entire-Checkpoint: 603c4e91eae2

… to idle Entire-Checkpoint: 82625f83d2c6

… lazily

Two root causes fixed:

1. action_name from Display includes params (e.g. "mouse_input(x=1,y=2)") but the is_input_event check used exact matches. Changed to starts_with() so mouse_input/keyboard_input/evaluate are correctly recognized as non-fatal.
2. Action hook was set at startup when manager_if_ready() returned None (manager lazy-inits on first use). Moved hook registration into the manager() init closure so it's applied as soon as the manager exists.

Session history now works for both agent and UI sessions.

Entire-Checkpoint: b696fa3b9617
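
The first root cause is easy to reproduce in isolation. The real check is Rust's starts_with(); the JavaScript below mirrors it for illustration, and both function names are hypothetical.

```javascript
const NON_FATAL_PREFIXES = ["mouse_input", "keyboard_input", "evaluate"];

// Old behaviour: exact comparison fails once params are appended to the name.
function isInputEventExact(actionName) {
  return NON_FATAL_PREFIXES.includes(actionName);
}

// Fixed behaviour: prefix comparison tolerates the "(x=1,y=2)" suffix.
function isInputEvent(actionName) {
  return NON_FATAL_PREFIXES.some((prefix) => actionName.startsWith(prefix));
}
```

For "mouse_input(x=1,y=2)" the exact check returns false while the prefix check returns true.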

Each session switch called start_screencast which spawns a new WebSocket relay task. Since we stopped calling stop_screencast on switch, old relay tasks were never cleaned up. After several switches, multiple relays broadcast duplicate frames, flooding the WebSocket and freezing the UI. Fix: check the session's screencasting field from the API before starting a new screencast. If already running, just ensure the frame listener is active without spawning another relay. Entire-Checkpoint: 2201c4b39939
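
The UI-side guard described above might look roughly like this. It is a sketch: `ensureScreencast` and the `attachListener`/`start` callbacks are hypothetical names, and only the `screencasting` and `session_id` fields come from this thread.

```javascript
// Starts a screencast only if the session is not already screencasting.
// Returns whatever start() returns, or null when nothing was started.
function ensureScreencast(session, { attachListener, start }) {
  // The frame listener is (re)attached in both cases.
  attachListener(session.session_id);
  if (session.screencasting) return null;
  return start(session.session_id);
}
```

This relies on the session list being fresh; a stale `screencasting` value could still lead to a duplicate start, which is why a server-side guard remains useful.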

Two issues fixed:

1. Sessions dying from connection errors were never marked as closed in the database: cleanup_stale_session removed them from the pool but didn't update SQLite. Now fires the action hook with "close" action and "connection lost" error so closed_at gets set.
2. History section only showed sessions with closed_at set. Sessions that died without proper close never appeared. Now shows all past sessions not currently active in the pool.

Entire-Checkpoint: 7148fca2419c

Replaced the stacked layout (live sessions + history below) with a tabbed interface:
- "Live" tab shows active browser sessions with count badge
- "History" tab shows past sessions (closed or lost) with count
- Switching to History tab refreshes the list
- Past sessions show "closed" or "lost" badge based on whether they were explicitly closed or died from connection loss
- Clicking a history session shows its action log in the right panel

Entire-Checkpoint: 4f9536295073

Entire-Checkpoint: c1bb8003a2af

…croll after switch Entire-Checkpoint: da2f35c55cc4

Clicking a session in the History tab now creates a new browser session and navigates to the last URL from the dead session. The view switches to the Live tab automatically. "View Log" link at the bottom of each history card still shows the action log in the right panel. Sessions without a URL (about:blank) show the action log instead of reviving. Entire-Checkpoint: 7815d593f70b

URL changes are now detected via CDP Page.frameNavigated events instead of polling get_url every 2 seconds. The screencast relay listens for navigation events alongside screencast frames and includes the new URL in the frame payload when it changes. This eliminates the get_url HTTP request every 2 seconds. Scroll info polling remains at 5-second intervals (reduced from 2s). Entire-Checkpoint: 72716eae41b2
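One way to picture "includes the new URL in the frame payload when it changes" is a small tracker that yields a URL only on change. The UrlTracker type and its method are hypothetical names for this sketch; the PR's relay listens for Page.frameNavigated events, which are not modeled here.

```rust
/// Hypothetical helper: remembers the last URL sent to the client.
#[derive(Default)]
pub struct UrlTracker {
    last: Option<String>,
}

impl UrlTracker {
    /// Returns Some(url) when `current` differs from the last reported URL,
    /// and None when nothing changed (so the frame payload can omit the field).
    pub fn changed(&mut self, current: &str) -> Option<String> {
        if self.last.as_deref() == Some(current) {
            return None;
        }
        self.last = Some(current.to_owned());
        Some(current.to_owned())
    }
}
```

Frames between navigations then carry no URL field at all, which fits the "frame url serialized only when Some" unit test mentioned later in the commit list.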

The evaluate calls every 5 seconds for scroll info are gone. Scroll position now comes from the screencast frame metadata (scroll_offset_y) which is already included in every frame at zero cost. Page height (scrollHeight) is queried once via evaluate when a navigation event occurs, not polled. This means zero background HTTP requests while viewing a browser session. Entire-Checkpoint: e42cba470a49

Chrome by default discards session cookies (no expiry) when it exits. Sites like LinkedIn use session cookies for auth, so logins were lost between moltis restarts even with persistent profile directories. Added --restore-session-cookies flag to both host browser launches and containerized launches (via CHROME_FLAGS env var). This tells Chrome to save session cookies to the Cookies database on exit and restore them on next launch. Entire-Checkpoint: f7b599438e01

Only truly fatal actions (navigate retry failure, close, snapshot) trigger session cleanup on connection error. Screenshot, screencast, mouse, keyboard, evaluate, and get_url/get_title are all non-fatal. This was causing sessions to die when switching sessions triggered a screenshot that timed out. Entire-Checkpoint: 7fc3fb8d154f

When creating a new session with the same profile_id as an existing running session, the pool now creates a new tab in the existing browser instead of launching a new container. This shares cookies, local storage, and login state between sessions in real time. Implementation: BrowserInstance.browser is now Arc<Browser> so the handle can be shared. get_or_create checks for an existing sandboxed instance with the same profile_id and creates a new BrowserInstance pointing to the same Browser (new tab). This means: log into LinkedIn in one session, open a new session with the same profile → already logged in. Entire-Checkpoint: a1d6b26e98c8
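A simplified sketch of sharing one browser handle across sessions with the same profile_id. The Browser, BrowserInstance, and Pool types below are stand-ins written for illustration; only the Arc<Browser> sharing and the get_or_create name come from the commit message, and tab creation is reduced to a counter.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Stand-in for the real browser handle (assumption for this sketch).
pub struct Browser {
    pub profile_id: String,
}

/// One session: a tab inside a possibly shared browser.
pub struct BrowserInstance {
    pub browser: Arc<Browser>,
    pub tab_id: usize,
}

#[derive(Default)]
pub struct Pool {
    browsers: HashMap<String, Arc<Browser>>,
    next_tab: usize,
}

impl Pool {
    /// Reuses the running browser for `profile_id` if there is one,
    /// otherwise creates it; every call yields a fresh tab id.
    pub fn get_or_create(&mut self, profile_id: &str) -> BrowserInstance {
        let shared = self
            .browsers
            .entry(profile_id.to_owned())
            .or_insert_with(|| {
                Arc::new(Browser {
                    profile_id: profile_id.to_owned(),
                })
            });
        let browser = Arc::clone(shared);
        self.next_tab += 1;
        BrowserInstance {
            browser,
            tab_id: self.next_tab,
        }
    }
}
```

Two sessions on the same profile point at the same Browser allocation, which is what lets cookies and login state be shared between them.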

Entire-Checkpoint: ef2938c1da7c

Entire-Checkpoint: 39f54ce3df66

Removed all connection-error cleanup from execute_action. Sessions were being killed by screenshot timeouts, refresh failures, and other transient errors. Navigate already has its own retry logic.

Now sessions only die from:

- Explicit close action
- Idle/TTL timeout (pool cleanup)
- Navigate retry failure (its own logic)

All other errors are returned to the caller without destroying the session. Chrome may recover on the next request.

Entire-Checkpoint: 71776737468e
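The resulting lifecycle rule can be written down as a small predicate. The SessionEvent enum and ends_session function are hypothetical names; the three terminating causes are the ones listed in the commit message.

```rust
/// Hypothetical classification of things that can happen to a session.
pub enum SessionEvent {
    ExplicitClose,
    IdleOrTtlTimeout,
    NavigateRetryFailed,
    /// Any other action error (screenshot timeout, refresh failure, ...).
    ActionError,
}

/// Returns true only for the three causes that are allowed to end a session.
/// Every other error is reported to the caller and the session is kept.
pub fn ends_session(event: &SessionEvent) -> bool {
    !matches!(event, SessionEvent::ActionError)
}
```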

New config option [tools.browser] sandbox = false forces browsers to run on the host instead of in containers. Saves ~200-400MB per browser instance. Defaults to following the global sandbox mode.

Example in moltis.toml:

    [tools.browser]
    sandbox = false  # run browsers on host, not in containers

Entire-Checkpoint: 210224bcb5ef

Unit tests covering every major bug encountered during development:

types.rs (4 tests):
- null timeout_ms fails without stripping (Bug 1)
- all optional fields null after stripping (Bug 1)
- mouseWheel defaults deltas to zero (Bug 6)
- MouseInput Display includes params (Bug 13)

manager.rs (5 tests):
- key_to_vk Backspace=8, Enter=13 (Bug 7)
- key_to_vk arrows 37/38/39/40 (Bug 7)
- key_to_vk printable returns None (Bug 7)
- key_to_vk Delete=46, Escape=27, Tab=9 (Bug 7)

pool.rs (3 tests):
- dangling symlink detected by symlink_metadata (Bug 11)
- same profile_id produces same path (Bug 19)
- different profile_ids produce different paths (Bug 19)

screencast.rs (1 test):
- frame url serialized only when Some (Bug 15)

Entire-Checkpoint: b639577544f3
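For readers unfamiliar with the key_to_vk tests, a mapping consistent with the listed values might look like the sketch below. The signature (string in, Option<u32> out) and the DOM-style key names such as "ArrowLeft" are assumptions; the numeric codes are the standard Windows virtual-key codes quoted in the test list.

```rust
/// Maps a special key name to its Windows virtual-key code.
/// Printable keys return None because they are sent as text instead.
pub fn key_to_vk(key: &str) -> Option<u32> {
    match key {
        "Backspace" => Some(8),
        "Tab" => Some(9),
        "Enter" => Some(13),
        "Escape" => Some(27),
        "ArrowLeft" => Some(37),
        "ArrowUp" => Some(38),
        "ArrowRight" => Some(39),
        "ArrowDown" => Some(40),
        "Delete" => Some(46),
        _ => None,
    }
}
```

The code would then be passed as windowsVirtualKeyCode when dispatching the key event over CDP, which the PR summary notes is required for special keys.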

6 new e2e tests covering specific bugs:

- Bug 2: sessions created via REST API appear in UI list (validates shared BrowserManager between tool and service)
- Bug 15: URL bar shows target URL immediately on navigation (validates no flicker from poll overwriting)
- Bug 16: switching sessions rapidly causes no JS errors (validates no relay task accumulation)
- Bug 17: closed session appears in History tab (validates session history persistence and UI)
- Delayed highlight: creating session shows placeholder immediately
- Bug 12: dead session shows error and recovers (validates no stuck "Fetching browser view..." state)

Entire-Checkpoint: 0db97ecdd06f

The [tools.browser] sandbox config only affected UI sessions (via RealBrowserService). The BrowserTool always used the SandboxRouter (global exec sandbox mode), ignoring the browser-specific config.

Added sandbox_override to BrowserTool that takes precedence over the router. Wired from config.tools.browser.sandbox in server.rs.

Also added stealth Chrome flags to reduce headless detection:

- --disable-blink-features=AutomationControlled (removes navigator.webdriver)
- --headless=new (more realistic headless mode)
- Realistic macOS Chrome user agent
- --disable-infobars, --disable-automation-extension

Entire-Checkpoint: f3232341748e
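The precedence rule for sandbox_override can be sketched as a one-line resolution, assuming the override is an Option<bool> and the router's decision is a plain bool. The function name and parameter names are hypothetical.

```rust
/// Resolves whether a browser should run sandboxed.
/// `sandbox_override` mirrors [tools.browser] sandbox when it is set;
/// `router_sandboxed` is what the global sandbox mode would choose.
pub fn resolve_sandbox(sandbox_override: Option<bool>, router_sandboxed: bool) -> bool {
    // An explicit browser-specific setting wins; otherwise follow the router.
    sandbox_override.unwrap_or(router_sandboxed)
}
```

With no override configured the behavior matches the earlier default of following the global sandbox mode.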

When the browser tool returns a result with a session_id, a clickable link appears below the tool card in the chat: "🌐 View browser session" → navigates to /settings/browser with the session auto-selected. The browser page reads the ?session= parameter from the URL hash and auto-selects the matching session (with screencast). Entire-Checkpoint: 9b682dff16a6

…ndering Entire-Checkpoint: be560945e6c6

Entire-Checkpoint: 7ec81c9e0cbc

Entire-Checkpoint: 368d96bc925a

Entire-Checkpoint: 418e519b6f11

…f query params The settings router strips query strings, so ?session=xxx was lost. Now uses navigateToBrowserSession() to set a pending session ID before calling navigate("/settings/browser"). The browser page reads it on init and auto-selects the session. Works for both live tool cards and history re-rendering. Entire-Checkpoint: fa56d7b1add5

Entire-Checkpoint: 40c737a0671b

Entire-Checkpoint: cf84cfb8472d

Two fixes:

1. Reviving a dead session now passes the old session_id to the navigate action. get_or_create registers the new browser under the same ID, so the session keeps its identity in the UI and history.
2. Host-mode browser launches now clean up stale SingletonLock/Cookie/Socket files before starting. Previously only sandbox launches did this, causing "Failed to create SingletonLock" errors when sandbox=false.

Entire-Checkpoint: ab20339a2d2b

greptile-apps bot reviewed Mar 31, 2026

penso added 25 commits April 1, 2026 09:09

style(browser): match button styles with other pages

fc38379

Use provider-btn-sm and provider-btn-secondary classes consistent with the MCP and Skills pages. Entire-Checkpoint: 769d0e7ad9b2

fix(browser): show creating state on new session button

aca05c1

Entire-Checkpoint: 4100596d19db

fix(browser): surface all browser action errors to user

ff9ba86

browserAction() now checks res.success on every response and throws on failure. Previously errors returned as 200 with success=false were silently swallowed (e.g. invalid URLs, connection failures). Entire-Checkpoint: 76e24ac2897a

fix(browser): subscribe to screencast frame events on WebSocket

2bf0150

The v4 protocol requires explicit event subscription. The browser.screencast.frame event was missing from the subscription list, so all frames were silently dropped by the broadcast filter. Entire-Checkpoint: 42a252e4a873

style(browser): show full session ID instead of truncated

3c5a521

The session ID was truncated to 12 chars + "..." but the card already has CSS truncate class which handles overflow naturally. Entire-Checkpoint: 7ba05c18a610

penso added 20 commits April 2, 2026 10:14

fix(browser): use server screencasting state for badge, rename paused to idle

2a2e232

Entire-Checkpoint: 82625f83d2c6

fix(browser): move stopPropagation from Close wrapper to button only

448fdfa

Entire-Checkpoint: c1bb8003a2af

fix(browser): remove screencasting guard from input handlers to fix scroll after switch

e0c5c63

Entire-Checkpoint: da2f35c55cc4

fix(browser): extend container lifetime to 24 hours for tab reuse

d7c3b6a

Entire-Checkpoint: ef2938c1da7c

fix(browser): agents use default profile to share UI cookies

ff40242

Entire-Checkpoint: 39f54ce3df66

github-actions bot mentioned this pull request Apr 3, 2026

🦞 OpenClaw Ecosystem Daily 2026-04-03 gsscsd/big_model_radar#126

Open

penso added 9 commits April 3, 2026 11:58

fix(browser): show view-session link in both live and history chat rendering

985bf33

Entire-Checkpoint: be560945e6c6

fix(browser): use app router for view-session links instead of raw hrefs

193d76b

Entire-Checkpoint: 7ec81c9e0cbc

fix: remove duplicate navigate import in sessions.js

8a6cef6

Entire-Checkpoint: 368d96bc925a

fix: remove duplicate navigate import in websocket.js

7fcf702

Entire-Checkpoint: 418e519b6f11

feat(browser): auto-revive dead sessions when navigating via chat link

d47c40c

Entire-Checkpoint: 40c737a0671b

fix(browser): revive dead session directly with URL, skip blank session

f2bc881

Entire-Checkpoint: cf84cfb8472d

penso wants to merge 78 commits into main from claude/plan-browser-viewing-HOHno

penso commented Mar 31, 2026 (edited)

Summary

Validation

Completed

Remaining

Manual QA

Future Work

codspeed-hq bot commented Mar 31, 2026 (edited)

Merging this PR will improve performance by 21.75%

Performance Changes

Footnotes

greptile-apps bot commented Mar 31, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

greptile-apps bot Mar 31, 2026 (five review comments)

codecov bot commented Mar 31, 2026 (edited)

Codecov Report

2 participants