@hannesrudolph (Collaborator) commented Oct 25, 2025

New capability: keyboard key-press actions.
Navigation reliability: enforced single-tab behavior.
UX improvements: top-bar navigation, unified details, inline reasoning, and cost display.
Bug fix: base64 output leakage to chat is resolved; binary/screenshot data no longer appears in the chat stream.


Important

Enhances browser interactions with keyboard actions, single-tab navigation, UX improvements, and fixes base64 leakage in chat.

  • Behavior:
    • Adds keyboard key-press actions to browser interactions, supporting single keys and combinations in browserActionTool() in browserActionTool.ts.
    • Enforces single-tab navigation by modifying forceLinksToSameTab() in BrowserSession.ts.
    • Fixes base64 output leakage to chat by ensuring binary/screenshot data does not appear in the chat stream.
  • UX Improvements:
    • Enhances top-bar navigation and unifies details in BrowserSessionRow.tsx and BrowserActionRow.tsx.
    • Adds inline reasoning and cost display in ChatView.tsx.
    • Introduces auto-expand setting for browser actions in BrowserSettings.tsx.
  • Testing:
    • Adds tests for coordinate scaling in browserActionTool.coordinateScaling.spec.ts.
    • Adds tests for browser session row behavior in BrowserSessionRow.aspect-ratio.spec.tsx and BrowserSessionRow.disconnect-button.spec.tsx.
  • Misc:
    • Updates i18n files for new browser session and action texts in multiple languages.
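
The key-press item above mentions both single keys and combinations. As a rough illustration of what handling a combination can involve, here is a minimal sketch that splits a string such as "Control+Shift+A" into modifiers and a final key. The `parseKeyCombo` helper, the `KeyCombo` shape, and the "+" separator are assumptions made for this example; they are not taken from browserActionTool.ts.

```typescript
// Hypothetical helper, not the PR's implementation.
// Assumes combinations are written with "+" separators, e.g. "Control+Shift+A".
interface KeyCombo {
	modifiers: string[] // keys to hold down, in the order given
	key: string // final key to press while the modifiers are held
}

function parseKeyCombo(input: string): KeyCombo {
	const parts = input
		.split("+")
		.map((part) => part.trim())
		.filter((part) => part.length > 0)
	if (parts.length === 0) {
		throw new Error(`Empty key combination: "${input}"`)
	}
	// The last segment is the key; everything before it is treated as a modifier.
	const key = parts.pop() as string
	return { modifiers: parts, key }
}
```

Under these assumptions, `parseKeyCombo("Enter")` yields no modifiers and `parseKeyCombo("Control+Shift+A")` yields two. A literal "+" key cannot be expressed in this simplified format.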

This description was created by Ellipsis for 8cecb79.

Copilot AI review requested due to automatic review settings October 25, 2025 04:55
@dosubot (bot) added the size:XL label (this PR changes 500-999 lines, ignoring generated files) Oct 25, 2025

@roomote (bot) commented Oct 25, 2025

Review Summary

The latest commit (8cecb79) has resolved 3 of the 6 previously flagged issues, and 1 more was confirmed as intentional. The remaining 2 issues are still present.

Issues Status

  • Missing icon for "press" action - RESOLVED. The getActionIcon function now includes a case for the "press" action in both BrowserActionRow.tsx (lines 95-96) and BrowserSessionRow.tsx (lines 99-100).

  • Missing translation key for "press" action - RESOLVED. The translation key "press": "Press {{key}}" is now present in all locale files (e.g., en/chat.json line 348).

  • Missing browser action text display - RESOLVED. The browser action text is properly displayed using the actionText computed value in BrowserActionRow.tsx (line 217) and the getBrowserActionText function is called in BrowserSessionRow.tsx (line 633).

  • Browser actions translation section removed - This was intentional and is now confirmed as the correct implementation. The translations are hardcoded in the components rather than using i18n keys.

Unresolved Issues

  • Hardcoded text breaks internationalization - BrowserActionRow.tsx line 294 uses hardcoded "Console Logs" instead of t("chat:browser.consoleLogs") like BrowserSessionRow.tsx does (line 662). This creates inconsistency and prevents translation.

  • Type parameter should use BrowserAction - BrowserActionRow.tsx line 90 uses string parameter instead of the stricter BrowserAction type used in BrowserSessionRow.tsx (line 94) for type safety and consistency.



@dosubot (bot) added the bug and UI/UX labels Oct 25, 2025

Copilot AI left a comment

Pull Request Overview

This PR adds keyboard key-press actions, enforces single-tab navigation behavior for browser sessions, improves UX with unified details display and cost tracking, and fixes a bug where base64 output was leaking to the chat stream.

  • Added new "press" browser action for keyboard key presses
  • Implemented single-tab navigation enforcement to prevent new tabs/popups
  • Redesigned browser session UI with collapsible details, inline reasoning, navigation controls, and API cost display
  • Fixed base64/binary data from appearing in chat by filtering browser action/result rendering

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

Summary per file

| File | Description |
| --- | --- |
| webview-ui/src/i18n/locales/en/chat.json | Removed "title" translation key for browser actions |
| webview-ui/src/components/chat/ChatView.tsx | Added "reasoning" to browser session message types |
| webview-ui/src/components/chat/ChatRow.tsx | Prevented raw JSON browser action/result from rendering |
| webview-ui/src/components/chat/BrowserSessionRow.tsx | Major UI refactor with collapsible details, navigation controls, cost display, and reasoning support |
| src/shared/ExtensionMessage.ts | Added "press" to browser actions enum |
| src/services/browser/tests/BrowserSession.spec.ts | Added tests for single-tab behavior and keyboard press action |
| src/services/browser/BrowserSession.ts | Implemented forceLinksToSameTab and press methods |
| src/core/tools/browserActionTool.ts | Added "press" action handling |
| src/core/prompts/tools/browser-action.ts | Updated documentation to include "press" action |


@hannesrudolph added the Issue/PR - Triage label (new issue; needs quick review to confirm validity and assign labels) Oct 25, 2025

@roomote (bot) left a comment

Review complete. Found 3 issues that should be addressed before merging. Please see the inline comments and checklist above.

@hannesrudolph force-pushed the feat/keyboard-actions-single-tab-ux-b64-fix branch from 1151ddd to 8c41675 October 25, 2025 05:36

@roomote (bot) left a comment

Re-reviewed the new commits. Two previous issues were resolved, but 3 issues remain that need to be addressed before approval.

@hannesrudolph changed the title from "Keyboard actions, single-tab nav, UX improvements, and base64 leakage fix" to "Browser Use Update: Keyboard actions, single-tab nav, UX improvements, and base64 leakage fix" Oct 25, 2025
@hannesrudolph force-pushed the feat/keyboard-actions-single-tab-ux-b64-fix branch from d576c30 to 82ee0f5 October 25, 2025 06:46
@dosubot (bot) removed the size:XL label (this PR changes 500-999 lines, ignoring generated files) Oct 28, 2025
@dosubot (bot) added the size:XXL label (this PR changes 1000+ lines, ignoring generated files) Oct 28, 2025
@hannesrudolph force-pushed the feat/keyboard-actions-single-tab-ux-b64-fix branch from 0cd3672 to e3b8dde October 28, 2025 05:11
}}>
<SquareTerminal className="w-3" />
<span className="text-xs" style={{ fontWeight: 500 }}>
Console Logs

Hardcoded text breaks internationalization. This uses "Console Logs" directly instead of t("chat:browser.consoleLogs") like BrowserSessionRow.tsx does (line 662). This creates inconsistency and prevents translation into other languages supported by the application.
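
To make the difference concrete, here is a minimal, self-contained sketch of a key-based lookup next to a hardcoded string. The `makeT` helper and the `messages` table are hypothetical stand-ins for the application's i18n setup, and the Spanish string is illustrative only, not copied from the repository's locale files.

```typescript
// Hypothetical stand-in for the app's i18n lookup; the real components call t().
type Locale = "en" | "es"

// Illustrative strings only, not copied from the repository's locale files.
const messages: Record<Locale, Record<string, string>> = {
	en: { "chat:browser.consoleLogs": "Console Logs" },
	es: { "chat:browser.consoleLogs": "Registros de la consola" },
}

function makeT(locale: Locale) {
	// Falls back to English, then to the key itself, when a translation is missing.
	return (key: string): string => messages[locale][key] ?? messages.en[key] ?? key
}

const hardcodedLabel = "Console Logs" // identical in every locale
const translatedLabel = makeT("es")("chat:browser.consoleLogs") // follows the active locale
```

The hardcoded label never changes, while the key-based label follows whatever locale is active.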

}

// Get icon for each action type
const getActionIcon = (action: string) => {

Type parameter should be BrowserAction instead of string for type safety. BrowserSessionRow.tsx uses the stricter BrowserAction type at line 94, and this should match for consistency and to catch invalid action types at compile time.
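
A minimal sketch of what the stricter parameter type buys, assuming `BrowserAction` is a string-literal union. The union members other than "press" and the returned icon names are placeholders for this example, not the project's actual definitions.

```typescript
// Assumed subset of action names; only "press" is confirmed by this PR's discussion.
type BrowserAction = "click" | "type" | "press"

// With a union-typed parameter, the compiler rejects unknown action names and
// flags a missing case through the `never` assignment in the default branch.
function getActionIconName(action: BrowserAction): string {
	switch (action) {
		case "click":
			return "mouse-pointer" // placeholder icon name
		case "type":
			return "keyboard" // placeholder icon name
		case "press":
			return "keyboard" // placeholder icon name
		default: {
			const unreachable: never = action
			throw new Error(`Unhandled action: ${String(unreachable)}`)
		}
	}
}
```

With a `string` parameter, a misspelled call such as `getActionIconName("pres")` would compile and only misbehave at runtime; with the union type it is rejected at compile time.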

@roomote (bot) left a comment

Latest commit reviewed. No new issues found, but 2 previously flagged issues remain unresolved.

@roomote (bot) left a comment

The latest commit has no new issues. However, 2 previously flagged issues remain unresolved and should be addressed before approval.

…mprovements

fix: prevent base64/binary output leakage to chat stream

test(prompts): update system prompt snapshot after removing browser session persistence guidance

chore(prompts): remove outdated browser session persistence guidance

Update webview-ui/src/i18n/locales/hi/settings.json

fix(webview-ui): align disconnect aria-label; remove unused vars; replace useSize to avoid timers in tests

fix: remove unused browser actions from chat localization
@hannesrudolph force-pushed the feat/keyboard-actions-single-tab-ux-b64-fix branch from 5296e3b to 8cecb79 October 28, 2025 13:49

@roomote (bot) left a comment

Review complete. The latest commit resolved 3 of the 6 previously flagged issues, and 1 more was confirmed as intentional. The remaining 2 issues are unresolved and should be addressed before approval.

Labels

bug, Issue/PR - Triage, size:XXL, UI/UX

Projects

Status: Triage

2 participants