chore(release): v0.6.1 — develop → main catch-up + AXSecureTextField paste fix (#761)#762
Conversation
…ication fix(#39): Tier-3 readback verification for app_type_element
Round-trip through the simulator pasteboard + Cmd+V via sim-hid to bypass the software keyboard entirely. Unicode-safe (Korean/Japanese/Chinese/emoji) regardless of the simulator's active keyboard layout — the primary failure mode that #39 was filed against. - Opt-in via new `backend: "pasteboard"` param on app_type_element (default `"auto"` preserves current simhid behaviour) - Auto-accepts iOS 16+ paste-permission dialog using a localised label corpus (en/ko/ja/zh) with negative-dismiss filtering so "Don't Allow" variants are never tapped by accident - Preserves original pasteboard content by default (opt-out via `restorePasteboard: false`) - Exposes `keyChord(key, modifier)` on SimulatorKitHIDInputBackend for any future chord use (Cmd+A, Cmd+X, etc.) Tier 1 (force-Latin keyboard) from #39 is deliberately deferred — the pasteboard path covers every locale without requiring a layout switch, and Tier 1's detection signal was flagged as investigative in #39's v2 addendum. Tier 3 (readback verification) will land in a follow-up. Live-verified on iPhone 16 / iOS 26.4 against flutter-qa-app: typing "안녕 [email protected] 🎉" produced the exact same string in the Flutter TextField (confirmed via mobile-context textFieldLabels). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…kend feat: pasteboard typing backend for app_type_element (#39 Tier 2)
fix(ax-bridge): recursive scored content-root search (#40, part 1/3)
test(ax-bridge): TS reference scorer + 6-fixture unit suite (#40, part 2/3)
Bumps the package to 0.6.0. Headline change is the ax-bridge recursive scored content-root search (#40, follow-up to #4), which closes the silent-empty-content bug on Xcode 26.4 / iOS 26.4 by refusing to return a chrome-only tree. The raw bridge now emits a typed DEVICE_CONTENT_ROOT_EMPTY error (exit 1) when no subtree exposes any app-level accessibility semantics, instead of silently falling back to the bare AXWindow. Version bumped in package.json and package-lock.json. CHANGELOG.md gains a [0.6.0] entry summarizing the scoring rubric, chrome denylist, typed-error contract, and the TS reference scorer + 6 fixture unit suite that land alongside. The previous Unreleased entries (app_tap_element coordinate fallback contract + raw mobile-bridge context diagnostics) are carried forward into 0.6.0. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
chore(release): 0.6.0 — ax-bridge recursive scored content-root (#40, part 3/3)
…ly-field fix(ax-bridge): emit chromeOnly on dump/query/inspect (#41)
…snapshot fix(ax-bridge): wrapper single-snapshot promotion via chromeOnly (#41)
…adme feat(ax-bridge): --help output and raw-bridge README contract (#42)
feat(alert): localized label corpus + reason enum + service hints (#43 pt.1/4)
feat(alert): isLikelyDialogButton detection predicate (#43 pt.2/4)
feat(alert): wire Tier 1 AX-scan + Tier 2 AppleScript + diagnostics (#43 pt.3/4)
…reset feat(alert): Tier 3 permission_reset fallback (#43 pt.4/4)
) Distinguishes "expected app is running but its UI is still loading / stuck on a spinner" from "no AX data at all" (simulator shell crashed, probe broken). Both currently collapse to FOREGROUND_CONTEXT_UNAVAILABLE, so downstream callers cannot choose between "retry the navigation step" and "abort the test run". The promotion is a wrapper-layer concern: the surface classifier in src/tools/raw-mobile-context.ts stays surface-scoped and never emits the new variant itself. Option A (re-probe) implementation: IF first probe == FOREGROUND_CONTEXT_UNAVAILABLE AND --expect-bundle is set AND expectBundle is in runningApps AND --max-settle-retries >= 1 re-probe once after another settleMs window if still UNAVAILABLE -> promote to TRANSITIONAL_STATE_TIMEOUT - src/tools/raw-mobile-context.ts: extend RawMobileClassification union. - src/tools/sim-hid-bridge-wrapper.ts (NEW): pure applyTransitionalPromotion helper + normalizeMaxSettleRetries. Keeps the wrapper behavior directly unit-testable without spawning the native bridge. - cli/sim-hid-bridge.ts: parse --max-settle-retries <0|1|2|3> (default 1), wire the promotion into both the `context` command and the tap/swipe post-action probe path. - tests/unit/sim-hid-bridge-wrapper.test.ts (NEW): 4-case spec from the issue — promotion, first-ok-wins, no-bundle-match, opt-out — plus the short-circuit guard and retry-range normalization. - tests/unit/raw-mobile-context.test.ts: add empty-tree regression guard and a compile-time type-literal assertion so removal of the new union member fails the build. - docs/headless-architecture.md, CHANGELOG.md: document the new classification value and --max-settle-retries flag. --max-settle-retries 0 restores the pre-issue single-probe behavior byte-for-byte for performance-sensitive callers. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…imeout feat(sim-hid-bridge): TRANSITIONAL_STATE_TIMEOUT classification + --max-settle-retries (#46)
…MEOUT live test (#46) Adds the deterministic integration fixture and live suite called for in issue #46's Post-deployment verification checklist, stacked on the core classification PR. Fixture — tests/fixtures/flutter_spinner/: - lib/main.dart: minimal Flutter app that renders a spinner-only phase for exactly 8000ms after first build (no Scaffold, no AppBar, no labels — matches the AX signature sim-hid-bridge keys on), then swaps to a 'Content ready' Text widget. - Bundle ID: com.opensafari.fixtures.flutterSpinnerQa (distinct from the existing flutter-qa-app fixture so both can coexist on one simulator). - build.sh + README document the flutter create bootstrap and the flutter build / simctl install commands. Live test — tests/integration/sim-hid-transitional.live.test.ts: - Gated behind OPENSAFARI_LIVE_TRANSITIONAL=1 (same pattern as OPENSAFARI_LIVE_SIMHID), so npm test / default integration runs stay green on machines without the fixture installed. - Test 1 (spinner phase): probing 500ms after launch returns classification === 'TRANSITIONAL_STATE_TIMEOUT', verified === false, with warnings naming the bundle and "transitional timeout". - Test 2 (content ready): probing after SPINNER_WINDOW + 1000ms buffer does NOT yield TRANSITIONAL_STATE_TIMEOUT. - Test 3 (no-regression): Settings.app (com.apple.Preferences) after 1500ms does not false-positive as TRANSITIONAL_STATE_TIMEOUT — guards against the promotion firing on normal fast launches. - Test 4 (opt-out): --max-settle-retries 0 keeps the classification at FOREGROUND_CONTEXT_UNAVAILABLE (pre-issue byte-for-byte behavior). Closes part 2 of #46. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…xture test(sim-hid-bridge): flutter_spinner fixture + TRANSITIONAL_STATE_TIMEOUT live test (#46)
The `require.main === module` guard does not survive the webpack transform used to bundle `dist/ax-bridge`. After the webpack pass, the generated condition (`__webpack_require__.c[__webpack_require__.s] === module`) never evaluates to true when the bundle is run as `node dist/ax-bridge`, which meant `main()` was never called — every invocation exited 0 with no output. The regression broke `tests/unit/ax-bridge-help.test.ts` and silently disabled the raw CLI's `--help` surface and every subcommand. Since this file is only ever shipped as the CLI entry point, drop the guard and call `main()` unconditionally. Verified: all 10 help tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…inel boots (#637) The previous help-surface fix made the ax-bridge bundle execute `main()` on module import, which broke the wrapper unit suite by treating pure test imports like real CLI invocations. At the same time, the HID sentinel still used a hard 30s exec timeout even though the suite budget had already been raised to 45s for slow macOS runners, so macos-15 could time out with empty stdout and fail the JSON-output assertion. This keeps the webpack-safe CLI detection by checking `process.argv[1]` for the ax-bridge entry name, preserving `node dist/ax-bridge ...` behavior without executing during Jest imports. It also aligns the sentinel subprocess timeout with `SLOW_BRIDGE_TIMEOUT_MS` so the workflow and the local sentinel command use the same budget. Constraint: Webpack breaks `require.main === module` for the bundled ax-bridge entry Constraint: macos-15 GitHub runners can exceed 30s on the first sentinel probe Rejected: Unconditional main() invocation | breaks import-based unit tests Rejected: Leave sentinel at 30s and relax assertions | hides the timeout instead of fixing it Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep ax-bridge import-safe; any future entry guard must still allow unit tests to import helpers without executing the CLI Tested: npm test -- --runInBand tests/unit/ax-bridge-wrapper.test.ts tests/unit/ax-bridge-help.test.ts Tested: npx jest tests/ci/sim-hid-sentinel.test.ts --runTestsByPath --testPathIgnorePatterns=/node_modules/ --runInBand Tested: npm run lint Tested: npx tsc --noEmit Tested: npm run test:ci Tested: workflow-equivalent sentinel command with --json output file Not-tested: GitHub-hosted macos-15 runner after merge
- Adds 'pointer-service' to `ALLOWED_BACKEND_KINDS` in `src/ci/audit-log-posture.ts` alongside `ax-press`, `flutter-vm`, `simhid`, and `webkit`. - Keeps `applescript` and `simctl` rejected. - Extends `tests/unit/audit-log-posture.test.ts` with a positive case for `pointer-service` and adds it to the top-level extraction fixture. ## Why PR #615 introduced `PointerServiceInputBackend` (`kind = 'pointer-service'`) as the Phase-1 opt-in Tier-1 backend for Xcode 26+ coordinate taps, gated behind `OPENSAFARI_ENABLE_POINTERSERVICE=1`. PR #612 admitted `ax-press` to the #540 posture allow-list but did not enumerate `pointer-service`, so any CI job that opts into the backend and then runs `assertAuditLogPosture` fails with `disallowed backendKind(s) present: pointer-service`. This blocks the PointerService promotion path (`#590` Phase 1 → Phase 2) in every pipeline that enforces the #540 criterion. The backend itself is headless by design: it shells out to `sim-hid-bridge tap-ps`, which uses the PointerService IPC surface with no foreground dependency on Simulator.app. That posture is attested by the sentinel symbol probe (PR #621) and the live integration harness (PR #629), so treating it as a first-class headless backend matches the observed behavior. ## Test plan - [x] `npx jest tests/unit/audit-log-posture.test.ts --runInBand` — 12/12 passing. - [x] `npx tsc --noEmit` clean. - [x] `npx eslint src/ci/audit-log-posture.ts tests/unit/audit-log-posture.test.ts` clean. ## Related - Refs: #540 - Follows: #612 (ax-press allow-list admission), #615 (PointerService Phase 1), #590 (PointerService epic)
…in ko-KR IAP recipe ## Summary - Replace `flutter build ios --simulator --profile` with `--debug` in both variants (GitHub Actions at L120 and shell at L283) and in gotcha #4. The Flutter toolchain rejects `--profile` on simulator targets with 'Profile mode is not supported for simulators.' - Add a prominent version-pin banner at the top noting the recipe only works on `[email protected]` and the three `app_storekit_*` tools it invokes were removed in 0.5.0 (PR #623 on #588). - Reword the prereq Xcode row ('for simctl storekit' implied a CLI that does not exist) and the 'Tool availability note' so the three storekit tools are no longer implied to be current. ## Why 1. `--profile --simulator` is a hard Flutter toolchain failure. PR #630 corrected `docs/ci-recipes.md` and `docs/flutter-inspector.md` to use `--debug` for simulator targets and documented the split (`--debug` for simulator, `--profile` for physical device). The ko-KR IAP recipe landed in PR #628, after #630 was merged, but kept the broken `--profile` command in three places. Anyone copy-pasting the recipe today hits an immediate Flutter CLI exit. 2. The recipe legitimately ships as a pinned-to-0.4.9 artifact, but the cross-references and "Tool availability note" read as if the `app_storekit_*` tools are current MCP surfaces. A reader who installed `opensafari-mcp` from npm (→ 0.5.0) and followed the recipe's `docs/storekit-automation.md` link lands on the post-#627 rewrite that explicitly says those tools do not exist — a contradiction the recipe does not prepare them for. The banner makes the pin and its expiry condition unambiguous. ## Test plan - [x] `git diff --stat` — 1 file, 8 insertions / 6 deletions. - [x] No remaining `--profile` occurrences on simulator commands: `grep -n "simulator --profile" docs/recipes/flutter-iap-ko-kr.md` → no matches. - [x] Banner renders cleanly in the doc preview. ## Related - Refs: #628 (recipe introduction), #630 (`--debug` for simulator targets), #623 (`app_storekit_*` removal), #627 (post-removal storekit-automation doc), #588 (storekit epic), #589 (ko-KR alert handle).
…ebview_connect
## Summary
- Track whether `app_webview_connect` freshly registered a
`WebKitClient` on the current call, and deregister via
`setWebKitClient(null, resolvedDeviceId)` in the `listTargets` catch
path so the next call does not reuse a stale, never-successfully-used
instance.
- Pre-existing registrations are intentionally left alone — they may be
owned by a concurrent tool surface that manages its own lifecycle.
- Adds two unit-test cases: one asserts the freshly-created client is
cleared after a `listTargets` failure; the mirror case asserts a
caller-registered client is retained.
## Why
Before this change, when `getWebKitClient(deviceId)` returned no client,
`app_webview_connect` would construct and register a new `WebKitClient`,
then immediately call `listTargets()`. If the proxy was briefly
unreachable the call threw `ECONNREFUSED`, the catch branch returned an
`isError` payload, and the newly-registered client stayed in the global
map. Every subsequent `app_webview_connect` (or other tool that goes
through `getWebKitClient`) then short-circuited to the failed client,
propagating the transient outage as a permanent state for the rest of
the server's lifetime.
The fix scopes the cleanup to what we created: concurrent callers that
had already registered a working client keep it.
## Test plan
- [x] `npx jest tests/unit/app-webview-connect.test.ts --runInBand` — 15/15 passing
(was 13/13; +2 new cases covering freshly-registered cleanup and
pre-existing preservation).
- [x] `npx tsc --noEmit` clean.
- [x] `npx eslint src/tools/app-webview-connect.ts tests/unit/app-webview-connect.test.ts` clean.
## Related
- Related: #592 (bundle-aware target classification), #606 (bundle-aware classifier), #593 (WebView cross-context).
Adds `dumpTreeWithRecovery` around `AccessibilityBridge.dumpTree` to absorb transient bridge failures that previously poisoned the verification lane. - Classify recoverable (`DEVICE_CONTENT_ROOT_EMPTY`, `AX_TIMEOUT`, `BRIDGE_EXEC_FAILED`, `AX_ERROR`) vs non-recoverable (`BRIDGE_NOT_FOUND`, `AX_PERMISSION_DENIED`). - Between attempts, sleep per the configured backoff schedule ([200, 500, 1200] ms by default) and force a fresh `ensureSemanticsActive` to rematerialise Flutter/native semantics. - Return a chronological `AxBridgeRecoveryReport` on every call so tool diagnostics can surface attempts/stages uniformly. - Best-effort reactivation: a failing `ensureSemanticsActive` is recorded in the report but does not abort the retry loop. - Thrown errors carry the report attached for downstream telemetry. No call sites change in this PR; wiring the wrapper into the tools happens in a follow-up so the module can be reviewed in isolation. Tests: 11 unit tests cover happy path, single/double-failure recovery, retry exhaustion, non-recoverable short-circuit, reactivation disable, reactivation failure, custom schedule, missing deviceId, and the error attachment contract. Refs: #643 Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
WebKitClient.listTargets previously used `parsed.every(isDeviceRedirect)` as an all-or-nothing guard, so a mixed /json response (some redirect stubs alongside inline page targets) fell through to the pass-through branch and leaked stubs with undefined webSocketDebuggerUrl to callers. That produced opaque WebSocket errors in app_webview_connect without any diagnostic pointing at the unresolved redirect. Resolve each entry independently: redirect stubs are expanded via their inner /json, inline targets pass through untouched, and a failure on one redirect only contributes [] rather than collapsing the whole array. Adds tests/unit/webkit-client-list-targets.test.ts covering all-inline, all-redirect, mixed, mixed-with-failure, and empty inputs, exercising the real code path by spying on the private httpGet rather than stubbing listTargets. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
… tap via Notes (#641) Apple review treats `xcrun simctl openurl` as indirect evidence for a Universal Link deep-link. Reviewers themselves paste links into Notes and tap the Data-Detector-produced link, which routes through the same `LSOpenURL` path as the reviewer does in production. This new tool automates that path end-to-end: 1. Launches Notes (`com.apple.mobilenotes`). 2. Focuses the note body (first AXTextArea/AXTextView/AXTextField). 3. Paste-injects the URL via the existing pasteboard-input backend (handles the iOS 16+ paste-permission dialog + restores the original pasteboard). 4. Polls the accessibility tree for an AXLink whose label contains the URL or its host. 5. Presses the detected link via the AX bridge. Returns a structured JSON result with `notesLaunchedAt`, `linkTappedAt`, the matched `linkElement`, and `durationMs`. Timeouts, Notes-launch failures, and `AXPress`-not-advertised failures surface as `isError: true` with a descriptive message. Unit tests cover happy path (URL match, host-only match), missing-url, invalid-scheme, editor-never-appears, link-never-detected, Notes-launch failure, and press-failed. Typecheck + jest both green. Ref: #641 (sub-task T1) Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
The inline comment on PointerServiceInputBackend.swipe claimed that a failure on Xcode 26+ would "surface via the tier chain when we bubble back up". That is not the runtime behaviour: getInputBackend caches PointerServiceInputBackend as the selected backend once OPENSAFARI_ENABLE_POINTERSERVICE=1 resolves, so a HeadlessInputUnavailable thrown by the SimHID delegate propagates straight back to the tool layer without any tier-chain re-entry. Option A (documentation-only) per the decision record on issue #649. Option B (an actual in-tool tier downgrade) is deferred to #590 Phase 2 alongside sim-hid-bridge swipe-ps so the fallback can be tested against a real alternative. Changes: - src/tools/pointer-service-input-backend.ts: rewrite the swipe() leading comment to describe the shipped behaviour and point callers at the two working escape hatches (leave the env flag unset, or use an element-targeted swipe so the AX-press tier handles the gesture). Amend the class-level Scope docstring so the "keeps non-tap input paths on their existing Tier-1 route" sentence is qualified for Xcode 26+. - CHANGELOG.md: add an Unreleased / Fixed bullet noting that only documentation changed and directing readers to the escape hatches. - tests/unit/pointer-service-input-backend.test.ts: add a guard assertion that PointerServiceInputBackend.swipe propagates HeadlessInputUnavailableError unchanged and invokes the delegate exactly once, so any future change that tries to silently retry or swallow the error has to update the comment and the test together. No runtime behaviour changed. Build + tests still green on this worktree. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Register two new MCP tools that will expose a simulator-level network toggle so native (Flutter / UIKit) apps see real SocketException / NSURLErrorNotConnectedToInternet. Unlike network_offline (which only shims window.fetch / XHR inside WebKit page contexts), these tools are intended to operate on the iOS Simulator host network path via pfctl or Network Link Conditioner. Issue #640 tracks the overall design; this PR ships only the scaffold: - src/tools/device-network.ts — handlers + input schemas, in-memory state map keyed by device UDID, mode === "online" is idempotent, mode === "offline" | "airplane" returns a structured not_implemented error pointing readers at the follow-up PRs (2-6) that wire the pfctl / NLC backends. - src/tools/index.ts — registerDeviceNetworkTools() called alongside the existing WebKit-scoped network tools. - tests/unit/device-network.test.ts — 12 Jest specs covering schema shape, mode/mechanism validation, online idempotence, UDID resolution (explicit vs getSoleDeviceId vs listBooted fallback), and device_network_get default + after-set behavior. No behavior change for existing tools. Existing 114 unit-test suites (1693 tests) continue to pass. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Both tools now accept an optional `captureLogs` object that
synchronously collects `os_log` entries around the URL-open event. This
lets close-gate tooling assert the presence of deep-link handler output
(e.g. `[UniversalLink] Resolved …`) in a single MCP call instead of
juggling `xcrun simctl log stream` by hand.
iOS does not expose a kernel "app is quiescent" signal outside of
Instruments, so the helper uses a silence heuristic:
1. Capture `preOpenAt` before invoking `simctl openurl`.
2. Poll `log show --start <preOpenAt - prerollMs>` with the same
NSPredicate already used by `app_logs` (reuses `buildLogPredicate`).
3. Deduplicate entries by `timestamp + traceID + composedMessage`.
4. Stop once no new matching entry has arrived for `silenceMs`, or
`maxDurationMs` elapses. Report the reason.
Shared across `app_open_url` and `app_deeplink` via
`src/observability/capture-logs-window.ts`. The option is additive — when
`captureLogs` is omitted the tools behave exactly as before, and the
pre-existing unit test for `app_open_url` continues to pass.
Unit tests (`tests/unit/capture-logs-window.test.ts`) cover:
- silence stop path (entries stable across polls → stopReason='silence')
- max_duration stop path (entries keep arriving → stopReason='max_duration')
- preroll is applied (windowStart = preOpenAt - prerollMs)
- simctl rejection → returned as `{error, stopReason:'error'}` instead
of thrown (so the caller still gets the open result)
- NSPredicate threading (bundleId + search)
- dedup by `traceID + composedMessage` across polls
Ref: #641 (sub-task T4)
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
…ormational stderr (#651) Address bug #2 from issue #651. `simctl io … screenshot` intermittently returns `Timeout waiting for screen surfaces` after high-velocity input events while the simulator is still compositing; a direct retry ~1-2s later succeeds. Two changes: 1. Retry the screenshot up to 2 times with a 1.5s backoff when the error message matches the transient-timeout signature. Surface a `retries` counter in the response metadata so operators can see how often the simulator asks to be backed off. 2. Strip the informational `Note: No display specified.` line from stderr before surfacing it as an error — it is emitted by simctl even on success paths and drowns out the real failure signal. Unit tests cover: - Retry once on transient error then succeed (retries: 1) - Retry-exhausted path still returns a clean error message - Non-transient errors are not retried - `Note:` line is filtered from error output - First-attempt success case preserved (retries: 0) Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
#706) Merge-readiness review completed for PR #732. This PR is valid for #706 because it keeps `WebKitClient` as the public compatibility facade while moving its internal responsibilities into focused modules: errors, protocol transport, target session state, browser commands / DOM input scripts, and typed event forwarding. That directly addresses the maintainability risk from the previous monolithic WebKit client without introducing CDP/Playwright abstractions or changing the external BrowserBackend-facing API. I also reviewed the automated feedback. The earlier source-comment cleanup and transport/event/cookie follow-ups were already addressed. Two latest Codex findings were still valid and are now fixed: - `selectOption` again guards on `SELECT`, so a mis-targeted selector cannot mutate an input/textarea while still reusing the shared prototype-walk value setter. - High-risk MCP audit logging now records `status: "error"` when a tool handler returns `isError: true` without throwing, preserving error telemetry. I also refreshed the PR title/body so they match the actual current scope, then merged the latest `develop` to remove the conflict while preserving develop's newer audit redaction policy. Validation evidence on the final head (`1d9abf4d`): - Local: `npm run lint -- --quiet` passed. - Local targeted regression coverage: 9 relevant suites / 152 tests passed after the develop merge. - Local full unit suite: 170 suites / 2599 tests passed. - Local build: `npm run build` passed. - Local whitespace check: `git diff --check` passed. - GitHub CI on the final head: lint, test, dependency-audit, and build all passed. Architectural review status: CLEAR. The remaining risk is limited to live Safari/simulator smoke behavior, but the PR is a behavior-preserving refactor with focused unit coverage around the extracted boundaries plus green CI/build. I consider this merge-ready.
refactor(input): split backends and resolver into focused modules (#707 a)
Constraint: Keep tracing dependency-free and opt-in so normal MCP automation behavior is unchanged. Rejected: Adopt Playwright trace or OpenTelemetry SDK directly | too heavy for the first OpenSafari-native artifact contract. Confidence: high Scope-risk: narrow Directive: Extend exporters only after the JSON schema proves useful in live simulator failures. Tested: npm test -- --runTestsByPath tests/unit/action-trace.test.ts; npm run lint -- --quiet src/observability/action-trace.ts src/orchestration/scenario-runner.ts tests/unit/action-trace.test.ts; npm run build; GitHub CI lint/test/build/dependency-audit. Not-tested: Live simulator trace artifact capture.
Constraint: Fast login support must not turn OpenSafari into an auth framework or secret store. Rejected: Add Auth.js/SimpleWebAuthn/IAM dependencies | those belong to target apps, not the automation MCP runtime. Confidence: high Scope-risk: narrow Directive: Implement tool gaps only after a non-secret fixture proves the recipe cannot be completed with existing tools. Tested: Documentation-only scope review; GitHub CI lint/test/build/dependency-audit/path-guard. Not-tested: Live passkey fixture run.
Constraint: Preserve existing app_wait_for defaults and add only optional diagnostics. Rejected: Introduce a new locator or action retry engine | broader behavior change than the wait-diagnostics issue needs. Confidence: high Scope-risk: narrow Directive: Keep timeout samples bounded and reset stability after unobserved polls so AX dumps do not become noisy or false-positive artifacts. Tested: npm test -- --runTestsByPath tests/unit/app-wait-for.test.ts tests/unit/app-wait-for-diagnostics.test.ts; npm run lint -- --quiet src/tools/app-wait-for.ts tests/unit/app-wait-for.test.ts tests/unit/app-wait-for-diagnostics.test.ts; npm run build; GitHub CI lint/test/build/dependency-audit. Not-tested: Live simulator intermittent AX query failure.
Constraint: Use existing VM Service allocation data without app-side dependencies. Rejected: Require leak_tracker or embed Flutter DevTools | would force changes in target apps and exceed OpenSafari's external QA role. Confidence: high Scope-risk: narrow Directive: Keep budgets caller-configured until live fixtures establish stable defaults. Tested: npm test -- --runTestsByPath tests/unit/flutter-memory-budget.test.ts; npm run lint -- --quiet src/metrics/flutter-memory-budget.ts tests/unit/flutter-memory-budget.test.ts; npm run build; GitHub CI lint/test/build/dependency-audit. Not-tested: Live Flutter debug/profile fixture budget run.
Constraint: Preserve existing network_intercept block/modify behavior while preventing cross-session rule leakage. Rejected: Keep module-global activeRules | long-lived MCP processes can leak stale rules into unrelated sessions. Confidence: high Scope-risk: narrow Directive: Future per-rule removal should build on session-scoped interceptors rather than reintroducing module-global state. Tested: npm test -- --runTestsByPath tests/unit/network-interceptor.test.ts tests/unit/network-intercept-tool.test.ts; npm run lint -- --quiet src/tools/network-intercept.ts src/tools/network-offline.ts tests/unit/network-intercept-tool.test.ts; npm run build Not-tested: Live two-session Safari interception.
Constraint: Preserve synchronous registry APIs and existing lock timeout/retry semantics. Rejected: Keep a JavaScript busy-spin loop | it wastes CPU during simulator cleanup contention. Confidence: high Scope-risk: narrow Directive: Any future async lock migration should preserve active-simulator protection semantics first. Tested: npm test -- --runTestsByPath tests/unit/zombie-cleanup-lock.test.ts tests/unit/zombie-cleanup-cross-session.test.ts; npm run lint -- --quiet src/reliability/zombie-cleanup.ts tests/unit/zombie-cleanup-lock.test.ts; npm run build Not-tested: Live multi-process lock contention CPU profile.
Constraint: Live OpenSafari validation for #752 showed device_boot could return success while qa_session_create failed with NO_WEBKIT_CLIENT and diagnose reported device=null.\nRejected: Making WebKit mandatory for device_boot | native simulator tools should remain usable when Safari target discovery is temporarily unavailable.\nConfidence: high\nScope-risk: narrow\nDirective: Preserve best-effort boot semantics; surface WebKit unavailability through diagnostics instead of hiding a booted device.\nTested: npm test -- --runInBand tests/unit/device-boot-webkit-retry.test.ts; npm run lint -- --quiet; npm run build:src\nNot-tested: Full iOS WebKit live retry after installing this branch into the active MCP runtime.
Constraint: PR #753 moves network_intercept and offline behavior onto the shared NetworkInterceptor implementation.\nRejected: Leave XHR send wrapped after clear/offline restore | clear would not fully restore page network behavior and repeated injections could wrap wrappers.\nConfidence: high\nScope-risk: narrow\nDirective: Keep fetch, XHR open, and XHR send restoration in sync when editing injected network hooks.\nTested: npm test -- --runTestsByPath tests/unit/network-interceptor.test.ts tests/unit/network-intercept-tool.test.ts; npm run lint -- --quiet src/network-interceptor.ts tests/unit/network-interceptor.test.ts tests/unit/network-intercept-tool.test.ts\nNot-tested: Live Safari XHR interception.
Scope network interception to MCP sessions
Avoid busy-spin waits in zombie cleanup lock
Recover device_boot when WebKit registration is late
Coalesce per-device WebInspectorProxy startup and make teardown wait for in-flight starts so parallel device boot/retry/shutdown paths preserve one proxy and one reserved port per simulator.\n\nConstraint: OpenSafari advertises parallel simulator sessions, so proxy lifecycle must remain idempotent under concurrent callers.\nRejected: relying on callers to serialize device boot | duplicates the invariant and leaves retry/shutdown paths unsafe.\nConfidence: high\nScope-risk: narrow\nDirective: Keep getProxyForDevice, stopProxyForDevice, and stopAll single-flight aware when changing proxy lifecycle state.\nTested: CI build/lint/test/dependency-audit; npm test -- --runTestsByPath tests/unit/proxy-manager.test.ts; npx eslint src/simulator/proxy-manager.ts tests/unit/proxy-manager.test.ts
Share per-device Flutter VM resolution while native input tools are entering Tier 0 so debug/profile Flutter automation does not issue overlapping discovery/connect calls against the same client.\n\nConstraint: Parallel app tools can start simultaneously against one Flutter simulator before the resolver cache is populated.\nRejected: letting FlutterVMClient absorb concurrent connect calls | connect mutates WebSocket and state, so resolver-level single-flight ownership is safer.\nConfidence: high\nScope-risk: narrow\nDirective: Keep Tier 0 Flutter VM discovery single-flight when changing input backend resolution.\nTested: CI build/lint/test/dependency-audit; npm test -- --runTestsByPath tests/unit/input-backend-resolver.test.ts; npx eslint src/input/flutter-resolver.ts tests/unit/input-backend-resolver.test.ts
…alse (#760) (#761) * fix(pasteboard): skip readback for AXSecureTextField + honor verify:false (#760) The #639 PR C readback contract in `assertPasteApplied` cannot succeed against `AXSecureTextField` (password input) because iOS masks the AXValue with bullet characters regardless of the underlying plaintext. Every successful password paste was being rejected as `PASTE_NOT_APPLIED`, and the simhid path's `verifyTypedText` was escalating the same OS-mask divergence into `TEXT_INPUT_DROPPED` / `TEXT_INPUT_LAYOUT_MISMATCH` with `isError: true`. Net effect: no working backend for typing into a password field, blocking every signup / login / password-reset automation that follows the #639 Problem 1 docstring's "use pasteboard for non-Latin layouts" recommendation. Secondary: `backend: "pasteboard"` was silently ignoring the `verify: false` opt-out — `app-type-element.ts` always forwarded `expected` + `focusedElementPath` to `typeViaPasteboard`, so callers had no way to work around the secure-field bug. The simhid path already honoured `verify: false` via `verifyOptIn`; pasteboard now does too. Changes: - `pasteboard-input.ts`: `assertPasteApplied` accepts an optional `{ role, traits }` descriptor and returns silently when it signals a secure text field. New `isSecureFieldDescriptor` helper exported for callers that need to surface the skip in their own response shape. `typeViaPasteboard` passes the inspected node's role/traits into the assert and adds `secureField?: true` to `PasteboardTypeResult` so the tool layer can echo the skip cause. - `app-type-element.ts` (pasteboard branch): only forward `expected` + `focusedElementPath` to `typeViaPasteboard` when `verifyOptIn === true`. Echo `pasteResult.secureField` in the success response so callers can distinguish "no readback because secure field" from "no readback because verify opted out". - `app-type-element.ts` (`verifyTypedText`, simhid branch): detect `AXSecureTextField` via role or trait (also accepting the lower-case `"secure text field"` alias) and return `{ verified: 'unknown', verify_method: 'ax-value-not-readable' }` with a secure-field reason, instead of falling through to the mismatch path. Tests: - `pasteboard-input.test.ts`: 10 new cases — `assertPasteApplied` with bullet-masked actual + secure-field descriptor (role variant, traits variant, lower-case-alias variant), non-secure regression, legacy no-descriptor path. `isSecureFieldDescriptor` truth table. - `app-type-element.test.ts`: 2 new cases inside the Tier-3 readback describe — secure field via trait, secure field via role. Both must return `verified: 'unknown'` and must NOT escalate to a structured input error. No public schema change beyond honouring the existing `verify` parameter on the pasteboard backend. No new tool, no backend auto-switching — those were considered and rejected as out of scope (the issue body lists them explicitly). Closes #760. * refactor(app-type-element): delegate secure-field check to isSecureFieldDescriptor The trait-alias corpus (`AXSecureTextField`, `secure text field`) was duplicated between `pasteboard-input.ts::isSecureFieldDescriptor` and `app-type-element.ts::verifyTypedText`. If a new alias surfaces from a future iOS release, the duplicated copy could rot unnoticed and re-introduce the #760 regression on the simhid path. Import and reuse the helper so the corpus lives in exactly one place. Behaviour is unchanged — all 2646 unit tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> --------- Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
Bump package version to 0.6.1 and stamp the full release note into CHANGELOG.md. 0.6.1 is a *catch-up release* — v0.6.0 was tagged on 2026-05-16 but never merged to `main` and never published to npm (npm registry latest was 0.5.0). 0.6.1 rolls the entire 0.6.0 body forward onto `main` together with one targeted pasteboard fix uncovered after the 0.6.0 tag. Headline fix (post-0.6.0): - app_type_element pasteboard backend now works on AXSecureTextField (password) elements (#760, #761) — the #639 PR C readback contract was rejecting every successful password paste as PASTE_NOT_APPLIED because iOS masks AXSecureTextField AXValue with bullet chars regardless of plaintext. Verifier now skips for secure-field signals (role/trait) and the pasteboard backend honours `verify: false` symmetrically with simhid. Headline change (rolled forward from 0.6.0): - ax-bridge recursive scored content-root search (#40, follow-up to #4). Closes the silent-empty-content bug on Xcode 26.4 / iOS 26.4 with a typed DEVICE_CONTENT_ROOT_EMPTY error instead of falling back to bare AXWindow. Also includes (full notes in CHANGELOG.md): - Boot/lifecycle/network reliability batch (#752, #753, #754, #755, #756, #757) - WebKit decomposition (#706 1/5–5/5), simulator decomposition (#708 1/4–4/4), input layer split (#707 a/b), protocol typing (#710 a/b), DOM-input centralization (#709) - Performance: CLI lazy-load (#700), WebKit fast paths (#702), native-input batching (#705), proxy readiness split (#701), bootstatus-aware polling (#703), Web Inspector socket cache (#704) - Security: HTTP MCP transport hardening (#714 series), auth profile persistence (#716), audit log retention (#711, #717) - Tap coordinate scaling macOS-pt → iOS-pt (#693 WU3, #720) with regression coverage (#722); ax-bridge --debug stderr (#660) - TRANSITIONAL_STATE_TIMEOUT classification + --max-settle-retries for sim-hid-bridge (#46); PointerService swipe semantics clarified (#649) No breaking changes since 0.5.0. No tool name or schema changes. verify: false now honoured on pasteboard backend (was a silent no-op); response gains optional `secureField: true` field on AXSecureTextField paste — review the migration notes section. Tests: 2646 unit pass; tsc clean.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d0e37d978b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if (!interceptor) { | ||
| interceptor = new NetworkInterceptor(); | ||
| interceptorsBySession.set(key, interceptor); | ||
| } |
There was a problem hiding this comment.
Remove interceptor state when MCP session is deleted
The new per-session interceptor cache only ever inserts into interceptorsBySession and never removes entries, so HTTP deployments that churn sessions (e.g., clients that open/close sessions or omit Mcp-Session-Id and get a new one) will leak one NetworkInterceptor object per session for the process lifetime. clear only disables rules but keeps the map entry, and there is no wiring from HTTP session deletion back to this map, so memory/state accumulates over time in long-running servers.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Deferred to a focused follow-up issue: #763. The cleanest fix touches MCP-server-side session-lifecycle plumbing (subscribing to HTTPTransport.onSessionDelete, which is defined at src/transports/http.ts:65-67 but currently has no callers), and that integration is out of scope for a release catch-up PR — adding new server wiring on top of the 355-commit develop→main bundle would dilute review scope. The deferral is intentional, not a dismissal: #763 has full repro, code references, and a proposed fix outline so it can be picked up against develop independently. The leak only applies to long-running HTTP deployments; STDIO sessions (the default) are unaffected since they live for the process lifetime.
main carried 3 commits not yet in develop (PR #696 audit auto-spawn proxy, PR #638 v0.5.0 sentinel-release-ci, sentinel timeout-budget follow-up). Folding them into the 0.6.1 release so the develop→main catch-up lands cleanly without losing main-only hotfixes. Conflict resolution (same content as the prior non-merge commit 14684a7, re-recorded with both parents so GitHub recognises it as a true merge): - cli/index.ts: keep develop's lazy-import pattern (#700 a perf work); add WebInspectorProxy to the lazy import block (needed by #696's audit auto-spawn path); add audit-port helper as eager import (used by commander option parser at module init time). - tests/unit/proxy-timing.test.ts: keep develop's renamed private method shape (waitForTarget / waitForProcessReady); drop the stray waitForForwarding test that duplicated existing waitForTarget coverage with the pre-rename API; add isProxyHealthy to the ProxyPrivate type cast (still present at src/simulator/proxy.ts:395); update two process-readiness mocks from <html>iOS Devices</html> to [] to match main's /json probe (#696 — -F-safe) and the body.trimStart().startsWith('[') predicate now used by isProxyHealthy / waitForProcessReady; switch two forwardingPolled assertions from u.includes('/json') (now matches both ports) to u.includes('localhost:${proxy.port}') (forwarding port only). Verification: 2656 unit tests pass; tsc clean.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e2ac93fe0a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| const interceptorsBySession = new Map<string, NetworkInterceptor>(); | ||
|
|
||
| export function getNetworkInterceptorForSession(sessionId?: string): NetworkInterceptor { | ||
| const key = sessionId || DEFAULT_INTERCEPTOR_SCOPE; |
There was a problem hiding this comment.
Scope interceptor state by device, not only by session
The interceptor cache key ignores device_id (getNetworkInterceptorForSession only uses sessionId), but the tool now accepts per-call device_id. In a single MCP session that targets multiple simulators, toggling network_offline/network_intercept on device B mutates the same interceptor state used for device A, so disabling or clearing on one device can leave stale JS hooks active on the other device while the server believes interception is disabled. This creates hard-to-recover cross-device state bleed for multi-device workflows.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in fabe0ab. getNetworkInterceptorForSession now takes an optional second deviceId argument and keys the cache by ${sessionId}|${deviceId}, with stable default scopes when either is omitted. Both call sites (network_intercept tool handler in this file and network_offline in src/tools/network-offline.ts) plumb device_id through, so multi-device sessions get distinct interceptors. Three new unit tests in tests/unit/network-intercept-tool.test.ts cover the device-scoping contract (distinct interceptors per (session, device); same interceptor on repeated access with the same pair; omitted-deviceId default scope distinct from explicit deviceIds).
| const action = ((params.action as string) || 'block') as 'block' | 'modify'; | ||
| if (action === 'block') return { urlPattern, action: 'block' }; | ||
|
|
There was a problem hiding this comment.
Reject unknown
action values instead of treating them as mock
mapRule treats any non-"block" action as a mock response path, so malformed inputs like "blok" are silently converted into action: "mock" with a default 200/empty body. Because MCP schemas are not enforced at runtime for all clients, this can turn a typo or untrusted payload into unintended request rewriting instead of a validation error, changing network behavior in ways callers won’t expect.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Addressed in fabe0ab. mapRule now rejects any action value other than "block", "modify", or undefined (the latter keeps the existing default-to-block behaviour) with Error('action must be "block" or "modify" (got <JSON>)'). Typos like "blok" and non-string actions (number / boolean / object) now surface as visible errors instead of silently rewriting requests. mapRule is exported so the schema-validation contract is unit-testable directly; seven new cases in tests/unit/network-intercept-tool.test.ts lock the contract down.
… — Codex P1/P2 on #762 Addresses two of the three Codex review findings on PR #762. The third (session-deletion leak) is split out to follow-up issue because the cleanest fix touches MCP-server-side session-lifecycle plumbing and is out of scope for a release catch-up PR. Codex P1 — Scope interceptor state by device, not only by session ------------------------------------------------------------------ `getNetworkInterceptorForSession` was keyed by sessionId only, but the tool accepts per-call `device_id`. In a single MCP session that targets multiple simulators, toggling `network_intercept` or `network_offline` on device B mutated the same interceptor used for device A — disabling or clearing on one device could leave stale JS hooks active on the other while the server reported "interception disabled". The composite key now combines sessionId + deviceId (`${sessionId}|${deviceId}`), with stable default scopes when either is omitted, so per-device state no longer bleeds across sibling devices. Callers updated: - `src/tools/network-intercept.ts` tool handler now plumbs `device_id` into `getNetworkInterceptorForSession`. - `src/tools/network-offline.ts` (which shares the same module) does the same so the offline toggle targets the per-device interceptor. Codex P2 — Reject unknown `action` values instead of silently mocking -------------------------------------------------------------------- `mapRule` treated any non-"block" action as the mock path, so a typo like `"blok"` was silently converted into `action: "mock"` with a default 200/empty body. The JSON Schema declares `enum: ['block', 'modify']` but MCP runtime enforcement is not guaranteed for every client. `mapRule` now throws a clear error when `action` is supplied with any value other than `"block"` or `"modify"` (or `undefined`, which keeps the existing default-to-block behaviour), turning misuse into a visible error instead of unintended request rewriting. Also exported so the schema-validation contract is unit- testable directly. Tests ----- `tests/unit/network-intercept-tool.test.ts` adds 10 new cases: - Device scoping (3): distinct interceptors per (session, device); same interceptor for repeated access with the same pair; omitted deviceId is a stable default scope distinct from explicit deviceIds. - Action validation (7): default-to-block, accepts "block", accepts "modify" with default 200 / empty body, threads explicit statusCode + body, rejects typo "blok", rejects non-string actions (number / boolean / object), still requires urlPattern. Verification: 2666 unit tests pass (+10 from baseline 2656); tsc clean. Refs: Codex review on PR #762 (commit e2ac93f).
|
@codex review Re-review request for commit fabe0ab, which addresses two of the three findings from the earlier review on
Verification: |
|
Codex Review: Didn't find any major issues. More of your lovely PRs please. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
Ready to merge — approve summaryCodex Review on this PR has signaled approval: 👍 reaction on the PR and a comment of "Didn't find any major issues. More of your lovely PRs please." as of 2026-05-17T08:48:13Z, against commit What this PR isA catch-up release that lands every change on Scope / direction review
Iteration history
Verification on
|
| Check | Status |
|---|---|
lint |
✅ SUCCESS |
test |
✅ SUCCESS |
dependency-audit |
✅ SUCCESS |
qa-lane |
⏭️ SKIPPED (intended — release PRs do not run the live QA lane) |
path-guard (Omofictions-App private-route QA) |
.github/workflows/omofictions-qa.yml updates with src/** changes (the guard is designed to keep dedicated QA-lane PRs from sneaking runtime code in). Not a blocker. |
build |
⏳ IN_PROGRESS (Swift bridge + webpack — historically 5–15 min) |
| Codex review | ✅ Approved (👍 reaction + "Didn't find any major issues" comment on fabe0ab1) |
Why merge is OK
- Scope discipline: every change either (a) already shipped through its own PR on
developor (b) is a Codex-driven follow-up landed in a single focused commit (fabe0ab1). The third Codex finding was triaged with full rationale and split to Evict NetworkInterceptor map entries on MCP session deletion (HTTP transport) #763 instead of being silently dropped or scope-creeping the release. - No new public-API breakage: optional-only signature additions; existing callers compile and behave unchanged.
- Test coverage strengthened: net +10 unit tests vs. baseline (
pasteboard-inputsecure-field,app-type-elementsecure-field,network-interceptdevice scoping + action validation). 2666 tests pass under a cleantsc. - Release sequencing is conservative: no
npm publishtriggers from this PR — that step is left for the maintainer to run frommainafter tag, so a bad release cannot escape automatically.
Post-merge to-do (not part of this PR)
- Tag
v0.6.1on the merge SHA and push. npm publishfrommain.- Pick up Evict NetworkInterceptor map entries on MCP session deletion (HTTP transport) #763 against
develop(HTTP session-deletion cleanup forinterceptorsByKey). - Backmerge
main→developso the[Unreleased]header on develop aligns with main's[0.6.1]history.
Approving on the strength of the Codex verdict, the green-everywhere CI (modulo the advisory path-guard), and the scope discipline above.
Final PR review —
|
| Commit | Files | LOC | What |
|---|---|---|---|
d0e37d97 |
3 (package.json, package-lock.json, CHANGELOG.md) |
+61 / −5 | Version bump 0.6.0 → 0.6.1 and the categorized release-notes section. |
e2ac93fe |
2 (cli/index.ts, tests/unit/proxy-timing.test.ts) |
+303 / −70 — merge | Two-parent merge commit; conflict resolution only. Folds the 3 main-only hotfixes (PR #696 / #638 / sentinel timeout follow-up) back into release/0.6.1. |
fabe0ab1 |
3 (src/tools/network-intercept.ts, src/tools/network-offline.ts, tests/unit/network-intercept-tool.test.ts) |
+156 / −12 | Codex review fixes: P1 device-scoping + P2 action validation. |
That's the entire net-new diff surface for this PR.
2. Code correctness
2.1 Version stamping (d0e37d97)
package.jsonandpackage-lock.jsonboth bumped to0.6.1(lock got auto-updated bynpm install, expected). No mismatch.webpack.config.jsinjects version viaDefinePluginreadingpackage.jsonat build time, so no other source file needs updating (verified againstsrc/version.ts:5).- CHANGELOG
[Unreleased]section emptied of its prior contents (TRANSITIONAL_STATE_TIMEOUT Story 1D.5: MVP Integration Test — Claude Code End-to-End #46 and PointerService [P2] PointerService swipe on Xcode 26+ — align tier-chain expectation with observed code path #649 entries are now categorised under[0.6.1]"Added"/"Fixed"). Headers retain their hierarchy. - No issues.
2.2 Merge conflict resolution (e2ac93fe)
Conflict 1 — cli/index.ts top-level imports: kept develop's lazy-import pattern (#700 a perf work), added one eager import for parseAuditProxyPort/resolveAuditProxyPort (correct — the commander --port option parser at line 353 needs them at module init time). Added WebInspectorProxy to the audit-action Promise.all([imports]) lazy-loader (correct — main's PR #696 introduced auto-spawn logic at lines 400–442 that needs the class).
Conflict 2 — tests/unit/proxy-timing.test.ts:
ProxyPrivatetype now haswaitForTarget,waitForProcessReady,isProxyHealthy. The first two match the rename in develop (Separate proxy process readiness from Safari target readiness #701 /Split process readiness from target readiness);isProxyHealthyconfirmed still present atsrc/simulator/proxy.ts:395so the type is in sync with the runtime API.- One stray test (the legacy
waitForForwardingvariant) was deleted — it was a duplicate of an existingwaitForTargettest that the merge had brought in twice. Removal preserves coverage. - Two mock fixtures updated from
<html>iOS Devices</html>(the old/-root probe response shape) to[](the new/jsonprobe response shape from main's PR fix(cli): audit auto-spawns proxy; doctor probes /json for -F compat #696). TwoforwardingPolledassertions switched fromu.includes('/json')(now matches both ports under -F mode) tou.includes(\localhost:${proxy.port}`)` (forwarding port only) — that's the precise behavioural distinction the original test was asserting; the rewording preserves intent. 28/28 tests in this file pass post-merge.
Verdict on the merge: correct, well-scoped, well-reasoned.
2.3 Codex-driven fixes (fabe0ab1)
P1 device scoping (src/tools/network-intercept.ts:9-37):
- New constants:
DEFAULT_INTERCEPTOR_SCOPE(existing) +DEFAULT_DEVICE_SCOPE(new).makeInterceptorKey(sessionId, deviceId)returns${s}|${d}with defaults when either is omitted. getNetworkInterceptorForSession(sessionId?, deviceId?)— second argument is optional so existing call sites that pass onlysessionIdkeep working (verifiedsrc/tools/network-offline.tsis the only other caller). Backward-compatible.- Pipe-character (
|) is a safe key separator: simulator UDIDs are hex strings, MCP session IDs in this repo are UUIDs or__default__; neither contains|. No key-collision risk.
P2 action validation (src/tools/network-intercept.ts:53-72):
- Pre-condition explicit:
rawActionmust beundefined,'block', or'modify'. Anything else (typo, number, boolean, object) throws with the actual offending value JSON-stringified into the error message. This matches the project's pattern of surfacing structured errors over silent coercion (compareapp-type-element.ts:170for the symmetric backend-choice validation). mapRuleis now exported with@internalJSDoc — appropriate marker for "exported for unit testing, not for runtime consumers".
Same fix applied to src/tools/network-offline.ts:18-26 — extracts deviceId once, threads through to getNetworkInterceptorForSession, with comment explaining the cross-device-bleed motivation.
Verdict on fabe0ab1: minimal, targeted, no over-reach.
3. Project conventions
| Convention | Adherence |
|---|---|
| Commit message style (subject + body + verification + refs) | ✅ All three release commits follow it; matches the existing develop-side cadence (fix(scope): subject (#PR), body explains why with file:line refs). |
| CHANGELOG format (Headline / Fixed / Added / Refactored / Performance / Security / Migration notes) | ✅ Matches [0.5.0] and [0.6.0] sections verbatim in structure. Per-PR refs included. |
| JSDoc on exported helpers | ✅ mapRule and isSecureFieldDescriptor both carry intent-explaining JSDoc rather than restating the signature. |
| Inline rationale comments tie back to issue/PR numbers | ✅ Codex-driven fixes carry "Codex review on PR #762" markers; pasteboard secure-field carries "#760" markers. Future readers can trace the why. |
import type discipline |
✅ Type-only imports preserved where existing (no unnecessary value imports of types). |
4. Performance implications
- CLI lazy-load (
d0e37d97indirectly, via inherited Lazy-load CLI commands and MCP tool handlers to improve startup speed #700 a) — confirmed still in place post-merge; the audit action's lazy-import block incli/index.ts:361-385actually grows by one entry (WebInspectorProxy), but the import shape staysPromise.all([import(...), ...])so cold-start cost remains proportional to the slowest module. network-interceptmap keying — composite key string concat is O(1); theMap<string, NetworkInterceptor>lookup is unchanged in complexity. No measurable perf delta.mapRulevalidation — three equality checks per call before the existing branch logic. Negligible; this is a per-tool-invocation cost.- No new hot paths introduced.
5. Test coverage
| Layer | New tests | Notes |
|---|---|---|
tests/unit/pasteboard-input.test.ts (#761) |
+10 | secure-field skip via role/traits/lower-case alias + non-secure regression + legacy 3-arg compat + isSecureFieldDescriptor truth table. |
tests/unit/app-type-element.test.ts (#761) |
+2 | simhid verifyTypedText returns 'unknown' on AXSecureTextField via trait and via role. |
tests/unit/network-intercept-tool.test.ts (fabe0ab1) |
+10 | device scoping (3) + action validation (7), including non-string action rejection and urlPattern required precondition. |
tests/unit/proxy-timing.test.ts (merge) |
+0 net (1 stray deleted, 2 fixtures updated) | Coverage intent preserved by the existing waitForTarget describe block. |
Total: +22 unit tests vs. baseline at the v0.6.0 tag. Suite is 2666 / 177 green.
Gap (acknowledged in #761 issue body and #763):
- No live integration test for AXSecureTextField paste (
tests/integration/app-type-element-secure-field.live.test.ts) — explicitly deferred per fix(pasteboard): skip readback for AXSecureTextField + honor verify:false (#760) #761; unit suite proves the four failure modes are corrected. - No live integration test for the new device-scoped interceptor — same rationale; unit tests assert the contract.
Both gaps are documented and ticketed. Not blockers for a catch-up release.
6. Security considerations
mapRuleaction validation is itself a defense-in-depth measure — it closes a soft path where a malformed MCP payload could silently turn anetwork_interceptcall into a request-rewrite. After this PR, the only way to reach the mock-response code path isaction === 'modify'exactly. Net security improvement.- Composite
(sessionId, deviceId)keying prevents a cross-device interceptor-state bleed that could in theory leak request bodies between simulators under multi-device MCP sessions. Net security improvement. - AXSecureTextField skip (fix(pasteboard): skip readback for AXSecureTextField + honor verify:false (#760) #761) does NOT weaken any security boundary: the only thing it changes is the verifier no longer compares against an OS-mandated bullet mask. The actual paste still happens through the same
simctl pbcopy+Cmd+Vpath with the same iOS paste-permission dialog (already auto-handled). - HTTP MCP transport hardening (inherited from develop fix(security): harden HTTP MCP transport defaults #714 series) lands here. Already reviewed on develop; no new HTTP-side surface introduced in
fabe0ab1. - No new auth surfaces, no new file-system write paths, no new shell-out invocations in the net-new commits.
7. Risks
| Risk | Severity | Mitigation |
|---|---|---|
| Catch-up release diff is large (269 files) | Low for review difficulty / Medium for rollback complexity | Every commit pre-reviewed on develop; rollback path is git revert --no-commit of the merge into main then re-tag — clean because the commit is a true two-parent merge. |
path-guard CI advisory failure |
None | Documented in the approve summary; release PRs always trip this guard because they legitimately bundle omofictions-qa.yml updates with src/** changes. Not a required check. |
| HTTP session-leak still unfixed (Codex P2 #3) | Low | Deferred to #763 with full repro + fix outline. Only affects long-running HTTP deployments; STDIO (default) unaffected. Will be addressed against develop independently. |
build CI still in progress at review time |
None | lint / test / dep-audit all green; the build job exercises Swift bridge compilation + webpack, which has run successfully on every prior commit in this chain. |
8. Verdict
APPROVE.
Rationale, in order of weight:
- Codex review verdict already approved: 👍 reaction on the PR + "Codex Review: Didn't find any major issues. More of your lovely PRs please." on commit
fabe0ab1(2026-05-17T08:48:13Z). - Net-new code is minimal and tightly scoped: three release commits totaling ~520 lines added across 8 files; the rest is inherited develop work that already passed its own review.
- All Codex line-level findings handled: two fixed inline with new unit tests; one deferred to Evict NetworkInterceptor map entries on MCP session deletion (HTTP transport) #763 with a public rationale and a fix outline — no findings dismissed.
- Test suite is healthier than baseline: +22 unit tests vs.
v0.6.0, all green; tsc clean. - No public-API breakage, no schema changes, no new tool surfaces. The only behavioural delta callers can observe is that
verify: falsenow actually takes effect on the pasteboard backend (which was already its documented contract). - Project conventions followed: commit style, CHANGELOG structure, JSDoc, error-shape, comment-ties-to-issue-number — all match the develop-side cadence.
Recommendation: merge once build CI turns green. path-guard advisory failure is non-blocking and is the expected behaviour of that guard on release PRs.
— /review skill, senior-engineer pass.
Release 0.6.1 — develop → main catch-up + secure-field paste fix
OpenSafari 0.6.1 is a catch-up release.
v0.6.0was tagged on 2026-05-16 but never merged tomainand never published to npm (npm view opensafari-mcp versionsconfirms0.5.0was the latest published until this PR). Rather than ship a separate 0.6.0 → 0.6.1 pair, this PR rolls the entire 0.6.0 body forward together with one targeted pasteboard fix uncovered after the 0.6.0 tag (#761).This PR contains 356 commits from
develop(= the v0.6.0 release body + #761). No breaking changes since 0.5.0.Headline
app_type_elementpasteboard backend now works onAXSecureTextField(password) elements. The iOS Simulator host→app text input lane is non-deterministic (IME reinterpretation, segmented-field drops, silent paste, AX recovery) — blocks Flutter signup/auth verification #639 PR C readback contract was rejecting every successful password paste asPASTE_NOT_APPLIEDbecause iOS masksAXSecureTextFieldAXValue with bullet chars (••••…) regardless of plaintext. Verifier now skips for secure-field signals (role / trait) and the pasteboard backend honoursverify: falsesymmetrically with simhid. Response gains optionalsecureField: truefield.ax-bridgerecursive scored content-root search (Story 1C.3: DOM Query — querySelector, querySelectorAll, inspect #40, follow-up to Epic 1A: SimulatorManager #4). Closes the silent-empty-content bug on Xcode 26.4 / iOS 26.4 with a typedDEVICE_CONTENT_ROOT_EMPTYerror instead of falling back to bareAXWindow.Themes (full per-PR notes in
CHANGELOG.md[0.6.1]section)feat(tap)macOS-pt → iOS-pt scaling (#693 WU3),feat(ax-bridge)--debugmachine-readable stderr (#660),TRANSITIONAL_STATE_TIMEOUT+--max-settle-retries(#46)no-explicit-anyenforcement on migrated paths (#710 b)Migration notes (since 0.5.0)
app_*/webkit_*/ bridge tools keep parameter shapes and response envelopes.app_type_elementresponse shape gains an optionalsecureField: truefield on the pasteboard backend when the focused element is anAXSecureTextField. Callers that ignore unknown fields are unaffected.verify: falseparameter onapp_type_elementis now honoured on both backends (auto/simhidandpasteboard). Callers relying on the previous silent-no-op on pasteboard should review their flows.Verification
Post-merge release checklist
release/0.6.1→mainPR merged (this PR)v0.6.1on the merge commit and push:npm publishfrommain(handled by user out-of-band)main→developso develop's[Unreleased]lines up with main's[0.6.1]historyCloses #760 (via #761 already merged to
develop).