Summary
I'm using sapoto-codex:implement (the --background task pipeline) to delegate plan-driven work from Claude Code while Claude orchestrates. In ~30 min of usage with gpt-5.4 --effort high, I hit three distinct failure modes that all produce the same verdict: blocked summary even though the underlying work was completed and verified. Surfacing all three because they compound: a downstream consumer (Claude) gets a single "blocked" signal with no actionable distinction between "I couldn't do the work" and "I did the work but the contract refuses to let me commit it."
Plugin: @openai/codex-plugin-cc v1.0.4
Codex CLI: codex-cli 0.114.0
Auth: ChatGPT (madanapallikalyan@gmail.com)
Host: macOS 24.6.0 (darwin), Apple Silicon
Plugin install: /Users/kalyanmadanapalli/Desktop/sapoto-codex-plugin/
Companion: node .../codex/scripts/codex-companion.mjs task ...
Issue 1 — Sandbox blocks git commit inside a linked git worktree
Reproduction
Branch is on a linked git worktree (git worktree add ...). The worktree's .git is a file pointing to <main>/.git/worktrees/<name>/. Codex makes file changes inside the worktree's working tree successfully, then tries git commit:
fatal: Unable to create '/Users/<me>/Desktop/automatic-document-fetcher/.worktrees/resilient-retry-scheduler/.git/worktrees/resilient-retry-scheduler/index.lock': Operation not permitted
The sandbox grants write access to the worktree's working tree but blocks writes to the linked git directory at <main>/.git/worktrees/<name>/. So git commit can't acquire index.lock.
This appears to be the same general class as #240 (sandbox config issues) but specific to git worktrees on macOS. The workaround in our project's CLAUDE.md is to have Codex only write files and to commit manually afterward, but that still surfaces as verdict: blocked in the companion output, which forces the orchestrator to read the raw .log file to recover what actually happened.
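To make the path relationship concrete, here is a minimal sketch of how an orchestrator could find the directory git writes to for a given checkout. It relies only on git's documented gitfile format (a .git file containing a "gitdir: <path>" line); the function name resolve_git_dir is my own and not part of the plugin.

```python
from pathlib import Path


def resolve_git_dir(worktree: Path) -> Path:
    """Return the directory git writes index.lock into for this checkout."""
    dot_git = worktree / ".git"
    if dot_git.is_dir():
        # Main checkout: .git is the repository directory itself.
        return dot_git
    # Linked worktree: .git is a small file of the form "gitdir: <path>".
    content = dot_git.read_text(encoding="utf-8").strip()
    prefix = "gitdir:"
    if not content.startswith(prefix):
        raise ValueError(f"unexpected .git file content: {content!r}")
    target = Path(content[len(prefix):].strip())
    # A relative gitdir is interpreted relative to the worktree directory.
    return target if target.is_absolute() else (worktree / target).resolve()
```

For a linked worktree the returned path is <main>/.git/worktrees/<name>/, which is outside the worktree's working tree and is the location the sandbox would need to allow for git commit to succeed.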
Concrete log line
[2026-04-26T20:25:17.005Z] Assistant message
- verdict: blocked
- ...
- notes:
- `git commit` was blocked by sandbox permissions: `fatal: Unable to create '.../index.lock': Operation not permitted`.
- Acceptance items 1-4 are implemented and verified; the only incomplete part of the completeness contract is the required commit.
Suggested fix
- Detect the sandbox-permission flavor of git commit failure separately and emit a distinct verdict like verdict: ready_to_commit or commit_blocked_by_sandbox so orchestrators can auto-commit. OR
- Allow the sandbox to write to .git/worktrees/<name>/ paths under the user's repo root, the same way it allows writing to the working tree. OR
- Explicitly document in CLAUDE.md-style guidance that linked-worktree commits will fail and the orchestrator should commit after a "blocked" report, and surface a structured field in the JSON result that flags this case.
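The first option could be as small as a pattern match on git's stderr. The sketch below is an illustration, not the plugin's code: the regex is derived from the single error line quoted in this report, and both the function name and the returned verdict strings are hypothetical.

```python
import re

# Pattern based on the one error line quoted in this report; other git
# versions or platforms may word the failure differently.
_LOCK_DENIED = re.compile(
    r"fatal: Unable to create '[^']*index\.lock': Operation not permitted"
)


def classify_commit_failure(stderr: str) -> str:
    """Map `git commit` stderr to a verdict string (names are hypothetical)."""
    if _LOCK_DENIED.search(stderr):
        # The work exists on disk; only the commit step was denied.
        return "commit_blocked_by_sandbox"
    # Anything else stays a generic failure that needs a human or a log read.
    return "blocked"
```

Note that a stale lock ("File exists") deliberately does not match, since that is a different problem from a sandbox denial.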
Issue 2 — result.rawOutput returns null despite a final summary existing
Reproduction
After a job completes, codex-companion.mjs result <task-id> --json returns:
{
"job": {
"status": "completed",
"summary": "- verdict: blocked",
"result": null
}
}
But the corresponding job log file (~/.claude/plugins/data/codex-openai-codex/state/<workspace>/jobs/<task-id>.log) clearly contains a [Final output] block with the structured report. The companion just doesn't surface it. This makes the orchestrator unable to programmatically extract files_changed, tests results, and notes — which are all in the log but not in result.rawOutput.
This is adjacent to #264 ("per-job state JSON stuck at status=running ...") but specific to the case where status=completed but result is null.
Suggested fix
- Plumb the final assistant message through from the in-memory turn state into the persisted result.rawOutput, even when the only emission is a structured summary. Or fall back to the last Assistant message in the log if rawOutput is not set.
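The log fallback could look roughly like the sketch below. It assumes every log entry starts with a header line shaped like the one excerpt in this report ("[<ISO timestamp>Z] Assistant message"); I have not checked the full log format, so treat the regex and the function name last_assistant_message as assumptions.

```python
import re
from typing import List, Optional

# Assumed entry header, e.g. "[2026-04-26T20:25:17.005Z] Assistant message".
_HEADER = re.compile(r"^\[\d{4}-\d{2}-\d{2}T[\d:.]+Z\] (?P<kind>.+)$")


def last_assistant_message(log_text: str) -> Optional[str]:
    """Return the body of the last 'Assistant message' entry, or None."""
    last: Optional[List[str]] = None
    current: Optional[List[str]] = None  # body of the entry being read
    for line in log_text.splitlines():
        match = _HEADER.match(line)
        if match:
            # A new entry begins: keep the previous assistant body, if any.
            if current is not None:
                last = current
            is_assistant = match.group("kind") == "Assistant message"
            current = [] if is_assistant else None
        elif current is not None:
            current.append(line)
    if current is not None:
        last = current
    return "\n".join(last).strip() if last is not None else None
```

This is what I currently do by eye with tail; having the companion do it would let result.rawOutput stay non-null whenever the log has a final message.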
Issue 3 — Codex runs the full test suite despite explicit "only run targeted tests" guidance
Reproduction
My brief said:
Pre-existing test failures (IMPORTANT — don't be blocked by these)
Pre-existing failures unrelated to this work in: browserWindowRegistry.test.ts, chromeSetupStatusResolver.test.ts, notificationService.test.ts, autoExportRepo.test.ts. Do NOT run the full pnpm test suite — only run the targeted tests above.
Plus a <verification> block listing the exact commands.
Codex ran the targeted tests (correctly) AND THEN ran pnpm test (full suite). The full suite hit the four pre-existing failures listed above. Codex treated these as part of the completeness contract → refused to commit → verdict: blocked. Files were correctly modified, targeted tests passed, typecheck clean — but the orchestrator gets a "blocked" verdict and has to manually verify and commit.
This is essentially the completeness contract failing to differentiate "tests I could have caused" from "tests already broken on the parent branch." On a large codebase with any flaky/red tests on main, this means every Codex implement task will go "blocked" unless the user is on a perfectly-green branch.
Suggested fix
- Have the verification step honor the brief's explicit list of commands and NOT run additional commands (especially pnpm test) without a prompt-side opt-in. OR
- Compare which tests were red BEFORE the diff vs. AFTER, and only block on tests the diff caused to fail. Existing red tests = warn-with-context, not block. OR
- Allow the brief to set a --no-full-suite flag or honor the project's CLAUDE.md "pre-existing failures" markers if present.
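The before/after comparison in the second option is just a set difference. A minimal sketch, assuming the failing test files have already been collected into sets for the parent branch and for the branch with the diff applied (how they are collected is out of scope here, and triage_failures is a hypothetical name):

```python
from typing import Dict, List, Set


def triage_failures(before: Set[str], after: Set[str]) -> Dict[str, List[str]]:
    """Split failing tests into ones the diff introduced and ones it inherited."""
    return {
        # Red only after the diff: these should block.
        "blocking": sorted(after - before),
        # Red both before and after: warn with context, do not block.
        "pre_existing": sorted(after & before),
    }
```

With the four files from my brief as the baseline, a run that fails only those four would produce an empty "blocking" list, so the task would not be blocked.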
Why these compound into a usability problem
A consumer like Claude Code orchestrating a 20-task implementation plan via sapoto-codex:implement gets the same verdict: blocked signal for three different causes:
| Cause | Real state | What surfaces |
| Sandbox can't commit | Done, just needs commit | verdict: blocked |
| Pre-existing test red | Done, tests pass for the diff | verdict: blocked |
| Genuine implementation failure | Not done | verdict: blocked |
The orchestrator can't distinguish these without reading the log file every time. Today I'm running these three queries on every "blocked" return:
node codex-companion.mjs result <task-id> --json | jq '.job.summary'
git status --short # was anything actually changed?
tail -120 .../jobs/<task-id>.log # what's the real verdict?
A structured verdict enum (done / commit_blocked_by_sandbox / red_tests_outside_diff / blocked) would make the result programmatically actionable without log spelunking.
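To show what "programmatically actionable" would mean on the consumer side, here is a rough sketch of how an orchestrator might consume such an enum. The four values are the ones proposed above; the parsing of the "- verdict: <value>" summary line and the action names returned by next_action are my own assumptions.

```python
from enum import Enum


class Verdict(Enum):
    DONE = "done"
    COMMIT_BLOCKED_BY_SANDBOX = "commit_blocked_by_sandbox"
    RED_TESTS_OUTSIDE_DIFF = "red_tests_outside_diff"
    BLOCKED = "blocked"


def parse_verdict(summary: str) -> Verdict:
    """Parse a '- verdict: <value>' line; unknown or missing means BLOCKED."""
    for line in summary.splitlines():
        key, sep, value = line.lstrip("- ").partition(":")
        if sep and key.strip() == "verdict":
            try:
                return Verdict(value.strip())
            except ValueError:
                return Verdict.BLOCKED
    return Verdict.BLOCKED


def next_action(verdict: Verdict) -> str:
    """Hypothetical orchestrator policy for each verdict."""
    return {
        Verdict.DONE: "continue",
        Verdict.COMMIT_BLOCKED_BY_SANDBOX: "commit_on_behalf",
        Verdict.RED_TESTS_OUTSIDE_DIFF: "commit_with_warning",
        Verdict.BLOCKED: "investigate",
    }[verdict]
```

Only the last case would still require reading the log.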
Reproducer
If useful I can share a sanitized log file + the exact brief that triggered all three failure modes in two consecutive Codex runs (T7 fix-up: triggered #3 → blocked despite green diff; T8 implement: triggered #1 → blocked on commit despite green tests + green typecheck + green diff).
Workarounds in use
- After every Codex completion, my orchestrator runs git status --short && pnpm typecheck:all && pnpm test --run <targeted-files> and commits manually if everything is green. This is the documented pattern in our project's CLAUDE.md ("If git commit is blocked in the codex sandbox (common), have codex write files only and commit yourself afterward") but it requires the orchestrator to second-guess every "blocked" verdict.
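For reference, the gate behind this workaround can be sketched as follows. The three commands are the ones quoted above; the function name safe_to_commit and the injectable run parameter (there so the logic can be exercised without touching a real repo) are illustrative choices, not something the plugin provides.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], subprocess.CompletedProcess]


def _run(cmd: Sequence[str]) -> subprocess.CompletedProcess:
    # Run in the current directory and capture output as text.
    return subprocess.run(list(cmd), capture_output=True, text=True)


def safe_to_commit(targeted_files: Sequence[str], run: Runner = _run) -> bool:
    """True if Codex changed something and the targeted checks are green."""
    status = run(["git", "status", "--short"])
    if status.returncode != 0 or not status.stdout.strip():
        return False  # git failed, or there is nothing to commit
    checks = [
        ["pnpm", "typecheck:all"],
        ["pnpm", "test", "--run", *targeted_files],
    ]
    # all() stops at the first failing check, mirroring the && chain.
    return all(run(cmd).returncode == 0 for cmd in checks)
```

The orchestrator commits only when this returns True, regardless of what verdict the companion reported.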
🤖 Generated with Claude Code