Rescue coordinate taps with AX button press #690

Merged
shaun0927 merged 1 commit into develop from autopilot/ax-snap-coordinate-buttons
Apr 28, 2026

@shaun0927
Owner

Summary

  • Extends app_tap's AX rescue path beyond modal buttons.
  • If a non-raw coordinate falls inside an enabled AXButton frame, app_tap now invokes AXPress instead of relying only on OS-level coordinate injection.
  • Keeps existing modal radius snapping and raw: true coordinate-only opt-out.
  • Adds unit coverage for the non-modal AXButton rescue path.
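
The dispatch rule in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in src/tools/app-tap.ts: the `Frame`, `AXNode`, and `TapPlan` types and the `planTap` and `frameContains` helpers are hypothetical names invented here, and the modal radius snapping that the PR keeps is left out.

```typescript
// Hypothetical shapes for illustration; the real types in src/tools/app-tap.ts may differ.
interface Frame { x: number; y: number; width: number; height: number; }
interface AXNode { role: string; enabled: boolean; frame: Frame; path: string; }

type TapPlan =
  | { kind: "coordinate"; x: number; y: number }
  | { kind: "axPress"; elementPath: string };

// True when the point lies inside the frame (edges included).
function frameContains(f: Frame, x: number, y: number): boolean {
  return x >= f.x && x <= f.x + f.width && y >= f.y && y <= f.y + f.height;
}

// raw: true always keeps coordinate-only dispatch. Otherwise a point inside an
// enabled AXButton frame is rescued with a semantic press.
function planTap(x: number, y: number, raw: boolean, nodes: AXNode[]): TapPlan {
  if (raw) return { kind: "coordinate", x, y };
  const hit = nodes.find(
    (n) => n.role === "AXButton" && n.enabled && frameContains(n.frame, x, y),
  );
  return hit
    ? { kind: "axPress", elementPath: hit.path }
    : { kind: "coordinate", x, y };
}
```

The sketch takes the first containing button for brevity; which button to prefer when several frames contain the point is the subject of the review comments further down.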

Why

While validating Omofictions-App App Review issues, app_tree could reliably see Flutter semantics buttons, but app_tap/SimHID coordinate taps returned success without changing the app state and often failed post-tap context probing. PointerService probing also returned success without delivering the Flutter tap. The target controls were already exposed as actionable AXButton nodes, so a semantic AXPress fallback is the safer headless behavior.

Verification

  • npm test -- --runTestsByPath tests/unit/app-tap-safety.test.ts --runInBand
  • npm run lint -- --quiet

Notes

I did not live-retest through the Codex MCP host after this change because that host is already running the previously built OpenSafari server. The code path is covered at the tool unit level and should take effect after build/restart.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 13a8d97b00

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/tools/app-tap.ts Outdated
Comment on lines 509 to 510
if (!best || distance < best.distance) {
best = { x: cx, y: cy, elementPath: node.path, distance };

P2 Badge Choose innermost AXButton when multiple frames contain tap

The new non-modal rescue path selects candidates only by center distance (distance < best.distance), so if a tap point is inside overlapping/nested AXButton frames the algorithm can pick an ancestor button (or whichever is visited first when distances tie) instead of the intended inner control. In this flow that means AXPress can fire on the wrong element even though the coordinate was inside the correct target; this is especially likely in composed accessibility trees where wrapper and child buttons share similar centers. Consider adding a containment tie-breaker (e.g., smaller area/deeper path) before distance.
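
A small reproduction of the failure mode described above, under assumed data: the `ButtonNode` shape, the frame values, and the `pickByCenterDistance` helper are hypothetical and only mirror the `distance < best.distance` check quoted in the comment.

```typescript
// Hypothetical shapes for illustration; not the types used in src/tools/app-tap.ts.
interface Frame { x: number; y: number; width: number; height: number; }
interface ButtonNode { path: string; frame: Frame; }

function contains(f: Frame, x: number, y: number): boolean {
  return x >= f.x && x <= f.x + f.width && y >= f.y && y <= f.y + f.height;
}

function centerDistance(f: Frame, x: number, y: number): number {
  const cx = f.x + f.width / 2;
  const cy = f.y + f.height / 2;
  return Math.hypot(x - cx, y - cy);
}

// Distance-only selection among the frames that contain the tap point.
function pickByCenterDistance(nodes: ButtonNode[], x: number, y: number): ButtonNode | undefined {
  let best: { node: ButtonNode; distance: number } | undefined;
  for (const node of nodes) {
    if (!contains(node.frame, x, y)) continue;
    const distance = centerDistance(node.frame, x, y);
    if (!best || distance < best.distance) best = { node, distance };
  }
  return best?.node;
}

// An actionable cell (center 100, 50) wrapping a smaller button (center 150, 60).
const cell: ButtonNode = { path: "0/3", frame: { x: 0, y: 0, width: 200, height: 100 } };
const innerButton: ButtonNode = { path: "0/3/1", frame: { x: 110, y: 40, width: 80, height: 40 } };

// The tap at (112, 50) lies inside both frames, 12 pt from the cell's center
// and roughly 39 pt from the inner button's center, so the cell wins.
const chosen = pickByCenterDistance([cell, innerButton], 112, 50);
```

Here `chosen` is the outer cell even though the tap landed on the inner button, which is the mis-targeting the comment warns about.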

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request enhances the app_tap tool by introducing a rescue path for non-modal screens, allowing coordinates within an enabled AXButton to trigger an AX press. This change aims to improve reliability for headless Flutter and semantics buttons when OS-level coordinate injection fails. Feedback suggests refining the selection heuristic for non-modal buttons to prioritize the smallest containing element rather than the one with the closest center, ensuring more accurate targeting of nested components.

Comment thread src/tools/app-tap.ts Outdated
Comment on lines 509 to 511
if (!best || distance < best.distance) {
best = { x: cx, y: cy, elementPath: node.path, distance };
}

medium

The implementation uses the closest center heuristic (distance < best.distance) for both modal snapping and non-modal rescue. However, the docstring (line 481) and PR description specify that for non-modal screens, it should pick the smallest enabled AXButton.

Using distance to center can lead to incorrect results when a small button is nested within a larger actionable container (e.g., a cell that is also a button). If the tap is closer to the large container's center than the small button's center, the larger one will be selected. Consider using the frame area as the primary selection criterion when modals.length === 0 to ensure the most specific element is targeted.
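
One way to read the suggestion above is an area-first pass for the non-modal case. The sketch below is an assumption-laden illustration, not the change that landed: `ButtonNode`, `frameArea`, and `pickSmallestContaining` are hypothetical names, and the modal branch is omitted.

```typescript
// Hypothetical shapes for illustration; the real node type may differ.
interface Frame { x: number; y: number; width: number; height: number; }
interface ButtonNode { path: string; enabled: boolean; frame: Frame; }

function contains(f: Frame, x: number, y: number): boolean {
  return x >= f.x && x <= f.x + f.width && y >= f.y && y <= f.y + f.height;
}

function frameArea(f: Frame): number {
  return f.width * f.height;
}

// Non-modal case only: among enabled buttons whose frame contains the tap,
// prefer the smallest frame as the most specific target.
function pickSmallestContaining(nodes: ButtonNode[], x: number, y: number): ButtonNode | undefined {
  let best: ButtonNode | undefined;
  for (const node of nodes) {
    if (!node.enabled || !contains(node.frame, x, y)) continue;
    if (!best || frameArea(node.frame) < frameArea(best.frame)) best = node;
  }
  return best;
}
```

With a 200x100 actionable cell wrapping an 80x40 button, a tap inside the inner frame resolves to the inner button regardless of how close it is to the cell's center, while a tap elsewhere in the cell still resolves to the cell.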

OpenSafari validation against a Flutter iOS app showed SimHID/PointerService coordinate taps returning success while GestureDetector surfaces did not receive input. The accessibility tree already exposed those surfaces as AXButton nodes, so app_tap now uses AXPress when a non-raw coordinate falls inside an enabled AXButton frame. Modal snapping remains radius-based; raw=true still preserves coordinate-only dispatch.

Constraint: Xcode 26/iOS 26 coordinate injection remains experimental and can be a no-op even when the bridge returns success.
Rejected: Promote PointerService as the default | live probe still returned success without delivering the Flutter tap in this validation run.
Confidence: medium
Scope-risk: moderate
Directive: Keep raw=true available for callers that require exact coordinate injection rather than semantic button activation.
Tested: npm test -- --runTestsByPath tests/unit/app-tap-safety.test.ts --runInBand
Tested: npm run lint -- --quiet
Not-tested: Full MCP live tap after rebuilding/restarting the OpenSafari server in the current Codex tool host.

@shaun0927 shaun0927 force-pushed the autopilot/ax-snap-coordinate-buttons branch from 13a8d97 to bf4af9f Compare April 28, 2026 11:03
@shaun0927
Owner Author

Addressed Gemini feedback in commit bf4af9f: non-modal AXButton rescue now prefers the smallest containing button frame, with a nested-button unit test covering the selection heuristic. Re-ran npm test -- --runTestsByPath tests/unit/app-tap-safety.test.ts --runInBand and npm run lint -- --quiet locally.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bf4af9f017

Comment thread src/tools/app-tap.ts
Comment on lines +531 to +532
if (nodeArea !== bestArea) return nodeArea < bestArea;
return distance < best.distance;

P2 Badge Break ties by depth for non-modal AX snap candidates

In non-modal mode, candidate selection only compares area and then center distance; if nested AXButtons share the same frame (a common wrapper/child semantics pattern), both values tie and the strict < check preserves the first node visited, so preorder traversal can still press an ancestor instead of the intended inner control. Fresh evidence: isBetterAXSnapCandidate returns false when nodeArea === bestArea and distance === best.distance, leaving the choice order-dependent. Please add a deterministic descendant/depth tie-breaker (for example, prefer longer path) so equal-frame overlaps do not trigger the wrong AXPress.
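
A possible shape for the requested tie-breaker, sketched under stated assumptions: the `Candidate` type and the `isBetterWithDepthTieBreak`, `pathDepth`, and `selectCandidate` helpers are hypothetical, and the sketch assumes element paths are "/"-separated so that a longer path means a deeper node. The real `isBetterAXSnapCandidate` signature and path format are not shown in this thread.

```typescript
// Hypothetical candidate shape; area and distance are assumed to be precomputed.
interface Candidate { elementPath: string; area: number; distance: number; }

// Assumption: paths look like "0/3/1", so segment count approximates tree depth.
function pathDepth(path: string): number {
  return path.split("/").filter((segment) => segment.length > 0).length;
}

// Area first, then center distance, then prefer the deeper node so that
// equal-frame wrapper/child pairs no longer depend on traversal order.
function isBetterWithDepthTieBreak(node: Candidate, best: Candidate | undefined): boolean {
  if (!best) return true;
  if (node.area !== best.area) return node.area < best.area;
  if (node.distance !== best.distance) return node.distance < best.distance;
  return pathDepth(node.elementPath) > pathDepth(best.elementPath);
}

function selectCandidate(candidates: Candidate[]): Candidate | undefined {
  let best: Candidate | undefined;
  for (const candidate of candidates) {
    if (isBetterWithDepthTieBreak(candidate, best)) best = candidate;
  }
  return best;
}
```

With a wrapper at "0/3" and a child at "0/3/0" sharing the same frame, `selectCandidate` returns the child whether the wrapper or the child is visited first. Siblings with identical frames and equal depth would still resolve by visit order in this sketch.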

@shaun0927 shaun0927 merged commit c905083 into develop Apr 28, 2026
3 checks passed
@shaun0927 shaun0927 deleted the autopilot/ax-snap-coordinate-buttons branch April 28, 2026 13:39