Self-heal Symphony: Symphony is degraded after a poll/reconciliation failure: linear_mcp_tool_error:...<truncated>#9
Conversation
…, :socket_closed_remote...<truncated>
…symphony-api-is-unreachable-on-port-8765-error-socket_closed_rem' into dev
…failure: linear_mcp_too...<truncated>
WalkthroughComplete rewrite of Symphony from Python to Elixir. All core functionality—orchestration, agent running, workspace management, tracking, templating, configuration, and HTTP server—migrated to new Elixir modules. Build system changed from setuptools to Mix, with new deployment scripts and launchd configurations. Documentation updated to reflect Elixir structure and added self-healing workflow patterns. Changes
Sequence Diagram(s)sequenceDiagram
participant Orchestrator
participant AgentRunner
participant CodexClient
participant Workspace
participant Tracker
participant LinearAPI
Orchestrator->>Orchestrator: Poll for candidate issues
Orchestrator->>Tracker: Fetch active issues
Tracker->>LinearAPI: Query issues by state
LinearAPI-->>Tracker: Return issue list
Tracker-->>Orchestrator: Normalised issues
Orchestrator->>AgentRunner: Dispatch eligible issue
AgentRunner->>Workspace: Create/materialise workspace
Workspace->>Workspace: Clone repos, setup branches
Workspace-->>AgentRunner: Workspace ready
AgentRunner->>CodexClient: Start Codex session
CodexClient->>CodexClient: Spawn subprocess, init thread
loop Turn loop (max turns)
AgentRunner->>CodexClient: Run turn with prompt
CodexClient->>CodexClient: JSONL RPC exchange
CodexClient-->>AgentRunner: Turn result + agent text
AgentRunner->>CodexClient: Handle tool requests (linear_graphql)
CodexClient->>LinearAPI: Execute GraphQL (if tool called)
LinearAPI-->>CodexClient: GraphQL response
AgentRunner->>AgentRunner: Evaluate termination condition
break on max turns or exit active state
end
end
AgentRunner->>Tracker: Update issue state / save workpad
Tracker->>LinearAPI: Commit handoff state
LinearAPI-->>Tracker: Ack
AgentRunner-->>Orchestrator: AgentRunResult
Orchestrator->>Orchestrator: Persist runtime state
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 0/1 reviews remaining, refill in 60 minutes.Comment |
There was a problem hiding this comment.
Actionable comments posted: 17
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
WORKFLOW.linear-mcp.example.md (1)
164-172:⚠️ Potential issue | 🟠 Major | ⚡ Quick winThis example still uses Liquid tags, but the runtime no longer does.
The prompt contract in this PR is strict variable rendering.
{% for %},{% unless %}, and{% if %}will not render the way this example implies, so anyone bootstrapping from it gets a broken prompt. Mirror the plain{{ attempt }}style used inWORKFLOW.mdand drop the control-flow tags.Suggested fix
-Labels: {% for label in issue.labels %}{{ label }}{% unless forloop.last %}, {% endunless %}{% endfor %} - -{% if attempt %} +Attempt: {{ attempt }} + Continuation context: -- This is retry/continuation attempt #{{ attempt }} because the issue was still in an active state. +- If `Attempt` is populated, this is a retry/continuation because the issue was still in an active state. - Resume from the current workspace and Linear workpad state. Do not restart from scratch. - Do not end the turn while the issue is still `Todo`, `In Progress`, `Rework`, or `Merging` unless a true external blocker remains. -{% endif %}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@WORKFLOW.linear-mcp.example.md` around lines 164 - 172, The example uses Liquid control-flow tags which the runtime no longer supports; update the snippet that renders labels and the attempt block to use plain variable interpolation (e.g., use a single {{ issue.labels }} or a flattened labels variable and {{ attempt }} directly) instead of `{% for %}`, `{% unless %}`, and `{% if %}`; remove the control-flow blocks and mirror the simple `{{ attempt }}` style used in WORKFLOW.md so the Labels line and the "Continuation context" section render correctly without Liquid.
🧹 Nitpick comments (1)
lib/symphony/tracker.ex (1)
161-173: ⚡ Quick winGraphQL errors are detected but details are discarded.
When Linear returns errors, you raise a generic message without the actual error content. This makes debugging harder when things go wrong.
Include error details in the exception
defp validate_graphql_body(body) when is_map(body) do - if body["errors"], - do: raise(Error, code: :linear_graphql_errors, message: "Linear GraphQL returned errors") + if errors = body["errors"] do + summary = errors |> Enum.take(3) |> Jason.encode!() + raise Error, code: :linear_graphql_errors, message: "Linear GraphQL errors: #{summary}" + end body end🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/symphony/tracker.ex` around lines 161 - 173, The validate_graphql_body/1 function raises a generic Error when body["errors"] exists, discarding the actual GraphQL error payload; update validate_graphql_body/1 to include the error details from body["errors"] in the raised Error (either in the message or as an additional field) so the exception contains the original errors for debugging—modify the branch in validate_graphql_body/1 that currently raises Error with code :linear_graphql_errors to incorporate body["errors"] (and/or a serialized representation) into the Error payload.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@launchd/com.caretta.symphony.local.plist`:
- Around line 10-17: The plist currently hardcodes the developer's home path in
the two string entries (the working directory string and the ProgramArguments cd
command), which pins the unit to one workstation and leaks local path metadata;
replace those hardcoded occurrences of /Users/omarelamin/... with a
configurable/install-time placeholder (e.g. ${INSTALL_DIR} or a known system
path like /opt/symphony) and update the ProgramArguments entry that runs cd ...
&& exec ./scripts/symphony-managed.sh run to use that placeholder or an
environment-safe reference (e.g. cd "${INSTALL_DIR}" or cd "$HOME/relative/path"
only if HOME is guaranteed), so the plist no longer contains user-specific paths
and can be populated at install/deploy time; ensure the same replacement is
applied to both the standalone string and the ProgramArguments array entry.
In `@launchd/com.caretta.symphony.watchdog.plist`:
- Around line 10-17: The plist currently hard-codes a developer home path in the
main <string> and ProgramArguments array (the path to
scripts/symphony-managed.sh), making the job non-portable; change those
hard-coded occurrences to use a portable reference (for example use ${HOME} (or
~) or a relative path) and/or add a WorkingDirectory key so launchd runs from
the repository directory instead of embedding /Users/omarelamin; update the
<string> value and the ProgramArguments entry that references
./scripts/symphony-managed.sh (and any other occurrences) to use ${HOME}/... or
a relative path and ensure ProgramArguments keeps /bin/zsh -lc and the exec
invocation unchanged.
In `@launchd/com.symphony.linear-mcp.example.plist`:
- Around line 14-15: The plist hard-codes /opt/symphony/runner/symphony and the
workflow path in the two <string> entries which conflicts with the runtime
resolver used by mix.exs and scripts/symphony-managed.sh; update the plist to
call the same resolver or wrapper used by scripts/symphony-managed.sh (or
reference an environment-derived install path) instead of the fixed
/opt/symphony/runner/symphony so the LaunchDaemon launches the same binary that
mix.exs and scripts expect and the WORKFLOW file path is resolved consistently.
In `@lib/symphony/cli.ex`:
- Around line 17-30: The OptionParser.parse call is discarding the third return
value (_invalid) so malformed flags (e.g. --port abc) silently fall back to
defaults; modify the CLI parsing to detect a non-empty invalid list returned by
OptionParser.parse/2 (the third element currently bound to _invalid) and
explicitly handle it by printing a clear error and exiting (or raising), or
replace the call with OptionParser.parse!/2 to let it raise on bad args; ensure
you update the block where opts, args, _invalid are assigned so invalid entries
are not ignored and the process terminates with a non-zero status when invalid
arguments are present.
In `@lib/symphony/codex_client.ex`:
- Around line 314-323: The receive block currently enforces
Utils.jsonl_read_limit_bytes() only against the incoming chunk `data`, allowing
`session.buffer` to grow unbounded across many chunks; change the check inside
the `{port, {:data, data}}` clause to compute and enforce the limit on the
combined size (byte_size(session.buffer) + byte_size(data)) before
concatenating, and raise the same Error (code: :response_error, message:
"app-server JSONL message exceeded reader limit") if the combined size exceeds
Utils.jsonl_read_limit_bytes(); then proceed to call read_message with the
appended buffer as before (i.e., `%{session | buffer: session.buffer <> data}`).
In `@lib/symphony/dashboard_summary.ex`:
- Line 56: The current line uses a loose truthiness conversion for needs_human
(needs_human: !!data["needs_human"]) which treats any non-empty string like
"false" as true; replace it with an explicit normalisation: read the raw value
(data["needs_human"]) and convert to boolean by checking for true, 1, "true", or
"1" (after downcasing/trimming strings) and return true only for those,
otherwise false; update the map construction where needs_human is set to use
this explicit check so the key reliably becomes true or false.
In `@lib/symphony/http_server.ex`:
- Around line 184-199: persisted_issue_snapshot currently only searches
payload["completed"], so issues present under other persisted-state buckets like
"blocked" or "retrying" are missed; update the function that calls
persisted_payload() (persisted_issue_snapshot) to search across all relevant
buckets (e.g., payload["completed"], payload["blocked"], payload["retrying"]) by
concatenating or iterating those lists before running Enum.find_value, keeping
the same entry handling (the is_map check and fallback_completed_entry usage) so
blocked/retrying entries return the same snapshot as completed ones.
- Around line 57-79: The code currently reads from :gen_tcp once into
request_line then always calls drain_headers(socket), which can block if the
initial recv already contained the full header; update the logic so you don't
read headers twice: in the with block that binds request_line, check whether the
received chunk already contains the header terminator ("\r\n\r\n") and only call
drain_headers(socket) when it does not; alternatively refactor to accumulate the
header data (pass the already-read buffer into a new drain_headers/2 or a loop)
and stop as soon as "\r\n\r\n" is seen, then parse the request line and call
route(socket, method, path, orchestrator) (keep references to the existing
route/3 and drain_headers name to locate code).
In `@lib/symphony/orchestrator.ex`:
- Around line 413-416: The synchronous code path in tick/1 and
dispatch_issue_sync/4 ignores injected dependencies by calling
Tracker.make_tracker/1 and using Symphony.AgentRunner directly; change the
tracker creation to call make_tracker(orchestrator, config.tracker) and use
orchestrator.agent_runner instead of hard-coded Symphony.AgentRunner so the
:tracker_factory and :agent_runner options passed into new/2 are honored; apply
the same replacement in the other synchronous block noted (the similar code
around the later dispatch area mentioned in the comment).
- Around line 2383-2387: In repo_slug_for_path/2 update the lookup inside the
with block where entry.repo_plan |> RepoPlan.all_repos() |> Enum.find_value(...)
so that it does not assume repo.path_name exists: when checking each repo, use
repo.path_name if present, otherwise derive a fallback path name from repo.slug
(e.g. Path.basename(repo.slug) or splitting on "/" and taking the last segment)
and compare that to repo_dir; return repo.slug as before. Ensure the conditional
uses the fallback value for matching so valid "repos/app/..." entries resolve
correctly.
- Around line 1605-1627: process_due_retries is deleting the retry before
refresh and then calling dispatch_issue_sync, which can dispatch the synthetic
placeholder from refresh_retry_issue/2; instead, call
refresh_retry_issue(tracker, retry) first, then use
is_dispatch_eligible(orchestrator, issue, tracker, retry.attempt) to guard
dispatch; only when is_dispatch_eligible returns true delete the retry from
orchestrator.state.retry_attempts and call dispatch_issue_sync; if not eligible
(e.g., refresh failed or labels/blockers fail) do not consume the retry —
reschedule it by updating its due_at_monotonic (use the same backoff/reschedule
logic used elsewhere for retries) and put it back into
orchestrator.state.retry_attempts so the attempt is retried later, leaving
available_global_slots logic unchanged.
In `@lib/symphony/repo_planner.ex`:
- Around line 455-459: The planner treats string booleans like "false"/"0"/"no"
as true because truthy?/1 only checks non-nil/non-false; fix by normalizing
string boolean inputs before calling truthy?/1 (or replace truthy?/1 with a
dedicated parser). Add or reuse a helper (e.g., parse_string_bool/1 or
Utils.parse_bool/1) that returns true for "true","1","yes","on"
(case-insensitive), false for "false","0","no","off", nil => nil, and otherwise
falls back to truthy?/1; then use it for coding_task, needs_human, edit_allowed
(replace truthy?(Utils.map_get(...)) with parse_string_bool(Utils.map_get(...))
or Utils.parse_bool(Utils.map_get(...))). Ensure human_reason still uses
clean_truncated as before.
In `@lib/symphony/templating.ex`:
- Around line 66-67: The helper format_labels currently only handles an empty
list and will crash when passed nil (e.g., from issue.labels); update
format_labels to treat nil like an empty list by adding a clause for nil (or
pattern-matching both nil and []) so it returns "(none)" and keep the existing
clause for non-empty lists (Enum.join(labels, ", "))—this ensures callers like
issue.labels won't raise when labels are missing.
In `@lib/symphony/workflow.ex`:
- Around line 11-13: The resolve_workflow_path/2 function ignores the supplied
cwd for non-nil paths and expands relative paths against the process cwd; update
resolve_workflow_path(path, cwd) (used by load_workflow/2) to call Path.expand
with the cwd (use Path.expand(path_string, cwd)) so relative workflow paths are
resolved against the provided cwd rather than the process working directory.
In `@lib/symphony/workspace.ex`:
- Around line 241-277: When reusing an existing checkout (the branch that reads
current_branch via git_output and installs install_pre_push_guard) you must
ensure the working copy is aligned to expected_branch; update the logic that
runs when File.exists?(repo_path) to call prepare_expected_branch(repo_path,
repo_config, expected_branch, base_branch, timeout_ms) instead of only reading
current_branch (or at minimum raise an error if current_branch !=
expected_branch). This ensures you reuse repo_checkout_matches?(repo_path,
repo_config) but then actively checkout/refresh from origin/#{base_branch} and
set branch_prepared rather than leaving a stale branch that will only fail at
push time.
In `@WORKFLOW.md`:
- Around line 15-16: The committed WORKFLOW.md contains host-specific absolute
paths (e.g., mcp_command: /Applications/Codex.app/Contents/Resources/codex
app-server and mcp_server: codex_apps) which must be made portable: replace
those hardcoded values with environment-variable placeholders (e.g.,
${CODEX_APP_PATH} or ${MCP_SERVER}) or reference an untracked local
overlay/config file and document the required env vars; move any other absolute
paths mentioned in the file (the ranges you noted) into the same VAR-backed
placeholders or remove them from the repo-default and add guidance for creating
a local version, ensuring the checked-in WORKFLOW.md contains only portable
examples and instructions.
---
Outside diff comments:
In `@WORKFLOW.linear-mcp.example.md`:
- Around line 164-172: The example uses Liquid control-flow tags which the
runtime no longer supports; update the snippet that renders labels and the
attempt block to use plain variable interpolation (e.g., use a single {{
issue.labels }} or a flattened labels variable and {{ attempt }} directly)
instead of `{% for %}`, `{% unless %}`, and `{% if %}`; remove the control-flow
blocks and mirror the simple `{{ attempt }}` style used in WORKFLOW.md so the
Labels line and the "Continuation context" section render correctly without
Liquid.
---
Nitpick comments:
In `@lib/symphony/tracker.ex`:
- Around line 161-173: The validate_graphql_body/1 function raises a generic
Error when body["errors"] exists, discarding the actual GraphQL error payload;
update validate_graphql_body/1 to include the error details from body["errors"]
in the raised Error (either in the message or as an additional field) so the
exception contains the original errors for debugging—modify the branch in
validate_graphql_body/1 that currently raises Error with code
:linear_graphql_errors to incorporate body["errors"] (and/or a serialized
representation) into the Error payload.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ddb764ea-3a27-4b45-a02c-1fbd1ebe2e8f
⛔ Files ignored due to path filters (1)
mix.lockis excluded by!**/*.lock
📒 Files selected for processing (73)
.formatter.exs.gitignoreREADME.mdWORKFLOW.linear-mcp.example.mdWORKFLOW.mddocs/IMPLEMENTATION.mdlaunchd/com.caretta.symphony.local.plistlaunchd/com.caretta.symphony.watchdog.plistlaunchd/com.symphony.linear-mcp.example.plistlib/symphony.exlib/symphony/agent_runner.exlib/symphony/cli.exlib/symphony/codex_client.exlib/symphony/coding_context.exlib/symphony/config.exlib/symphony/dashboard_summary.exlib/symphony/error.exlib/symphony/http_server.exlib/symphony/logging.exlib/symphony/models.exlib/symphony/orchestrator.exlib/symphony/repo_planner.exlib/symphony/review.exlib/symphony/self_heal.exlib/symphony/templating.exlib/symphony/tracker.exlib/symphony/utils.exlib/symphony/watchdog.exlib/symphony/workflow.exlib/symphony/workspace.exmix.exspyproject.tomlscripts/symphony-managed.shsymphony/__init__.pysymphony/__main__.pysymphony/agent_runner.pysymphony/cli.pysymphony/codex_client.pysymphony/coding_context.pysymphony/config.pysymphony/dashboard_summary.pysymphony/errors.pysymphony/http_server.pysymphony/logging.pysymphony/models.pysymphony/orchestrator.pysymphony/repo_planner.pysymphony/review.pysymphony/templating.pysymphony/tracker.pysymphony/utils.pysymphony/workflow.pysymphony/workspace.pytest/agent_runner_test.exstest/cli_test.exstest/codex_client_test.exstest/managed_script_test.exstest/orchestrator_test.exstest/repo_planner_http_test.exstest/review_test.exstest/self_heal_test.exstest/test_helper.exstest/tracker_test.exstest/watchdog_test.exstest/workflow_config_template_test.exstest/workspace_test.exstests/test_agent_runner.pytests/test_codex_client.pytests/test_orchestrator.pytests/test_review.pytests/test_tracker.pytests/test_workflow_config_template.pytests/test_workspace.py
💤 Files with no reviewable changes (21)
- pyproject.toml
- symphony/main.py
- symphony/logging.py
- symphony/errors.py
- symphony/http_server.py
- symphony/templating.py
- symphony/review.py
- symphony/workflow.py
- symphony/init.py
- symphony/dashboard_summary.py
- symphony/repo_planner.py
- symphony/coding_context.py
- symphony/cli.py
- symphony/workspace.py
- symphony/tracker.py
- symphony/codex_client.py
- symphony/models.py
- symphony/orchestrator.py
- symphony/config.py
- symphony/utils.py
- symphony/agent_runner.py
| <string>/Users/omarelamin/Documents/Codex/2026-04-28/implement-symphony-according-to-the-following</string> | ||
|
|
||
| <key>ProgramArguments</key> | ||
| <array> | ||
| <string>/bin/zsh</string> | ||
| <string>-lc</string> | ||
| <string>cd /Users/omarelamin/Documents/Codex/2026-04-28/implement-symphony-according-to-the-following && exec ./scripts/symphony-managed.sh run</string> | ||
| </array> |
There was a problem hiding this comment.
This local service file has the same host-locked path problem.
Line [10] and Line [16] are pinned to /Users/omarelamin/..., so the unit is unusable outside that workstation and exposes local user path metadata.
Proposed fix
- <string>/Users/omarelamin/Documents/Codex/2026-04-28/implement-symphony-according-to-the-following</string>
+ <string>/opt/symphony/runner</string>
@@
- <string>cd /Users/omarelamin/Documents/Codex/2026-04-28/implement-symphony-according-to-the-following && exec ./scripts/symphony-managed.sh run</string>
+ <string>cd /opt/symphony/runner && exec ./scripts/symphony-managed.sh run</string>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@launchd/com.caretta.symphony.local.plist` around lines 10 - 17, The plist
currently hardcodes the developer's home path in the two string entries (the
working directory string and the ProgramArguments cd command), which pins the
unit to one workstation and leaks local path metadata; replace those hardcoded
occurrences of /Users/omarelamin/... with a configurable/install-time
placeholder (e.g. ${INSTALL_DIR} or a known system path like /opt/symphony) and
update the ProgramArguments entry that runs cd ... && exec
./scripts/symphony-managed.sh run to use that placeholder or an environment-safe
reference (e.g. cd "${INSTALL_DIR}" or cd "$HOME/relative/path" only if HOME is
guaranteed), so the plist no longer contains user-specific paths and can be
populated at install/deploy time; ensure the same replacement is applied to both
the standalone string and the ProgramArguments array entry.
| <string>/Users/omarelamin/Documents/Codex/2026-04-28/implement-symphony-according-to-the-following</string> | ||
|
|
||
| <key>ProgramArguments</key> | ||
| <array> | ||
| <string>/bin/zsh</string> | ||
| <string>-lc</string> | ||
| <string>cd /Users/omarelamin/Documents/Codex/2026-04-28/implement-symphony-according-to-the-following && exec ./scripts/symphony-managed.sh watchdog</string> | ||
| </array> |
There was a problem hiding this comment.
Hard-coded personal path makes this job non-portable and leaks local identity.
Line [10] and Line [16] tie the service to one developer machine (/Users/omarelamin/...). On any other host, launchd will break immediately.
Proposed fix
- <string>/Users/omarelamin/Documents/Codex/2026-04-28/implement-symphony-according-to-the-following</string>
+ <string>/opt/symphony/runner</string>
@@
- <string>cd /Users/omarelamin/Documents/Codex/2026-04-28/implement-symphony-according-to-the-following && exec ./scripts/symphony-managed.sh watchdog</string>
+ <string>cd /opt/symphony/runner && exec ./scripts/symphony-managed.sh watchdog</string>📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| <string>/Users/omarelamin/Documents/Codex/2026-04-28/implement-symphony-according-to-the-following</string> | |
| <key>ProgramArguments</key> | |
| <array> | |
| <string>/bin/zsh</string> | |
| <string>-lc</string> | |
| <string>cd /Users/omarelamin/Documents/Codex/2026-04-28/implement-symphony-according-to-the-following && exec ./scripts/symphony-managed.sh watchdog</string> | |
| </array> | |
| <string>/opt/symphony/runner</string> | |
| <key>ProgramArguments</key> | |
| <array> | |
| <string>/bin/zsh</string> | |
| <string>-lc</string> | |
| <string>cd /opt/symphony/runner && exec ./scripts/symphony-managed.sh watchdog</string> | |
| </array> |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@launchd/com.caretta.symphony.watchdog.plist` around lines 10 - 17, The plist
currently hard-codes a developer home path in the main <string> and
ProgramArguments array (the path to scripts/symphony-managed.sh), making the job
non-portable; change those hard-coded occurrences to use a portable reference
(for example use ${HOME} (or ~) or a relative path) and/or add a
WorkingDirectory key so launchd runs from the repository directory instead of
embedding /Users/omarelamin; update the <string> value and the ProgramArguments
entry that references ./scripts/symphony-managed.sh (and any other occurrences)
to use ${HOME}/... or a relative path and ensure ProgramArguments keeps /bin/zsh
-lc and the exec invocation unchanged.
| <string>/opt/symphony/runner/symphony</string> | ||
| <string>/opt/symphony/config/WORKFLOW.linear-mcp.example.md</string> |
There was a problem hiding this comment.
Executable path is likely inconsistent with the build/runtime resolver.
Line [14] hard-codes /opt/symphony/runner/symphony, but mix.exs (Line [11]) and scripts/symphony-managed.sh resolve different locations. This can make the job fail to start on a default install path.
Proposed fix
<key>ProgramArguments</key>
<array>
- <string>/opt/symphony/runner/symphony</string>
- <string>/opt/symphony/config/WORKFLOW.linear-mcp.example.md</string>
+ <string>/opt/symphony/runner/scripts/symphony-managed.sh</string>
+ <string>run</string>
</array>
@@
<key>EnvironmentVariables</key>
<dict>
+ <key>SYMPHONY_WORKFLOW_PATH</key>
+ <string>/opt/symphony/config/WORKFLOW.linear-mcp.example.md</string>
<key>PATH</key>
<string>/usr/local/bin:/opt/homebrew/bin:/usr/bin:/bin:/usr/sbin:/sbin</string>
</dict>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@launchd/com.symphony.linear-mcp.example.plist` around lines 14 - 15, The
plist hard-codes /opt/symphony/runner/symphony and the workflow path in the two
<string> entries which conflicts with the runtime resolver used by mix.exs and
scripts/symphony-managed.sh; update the plist to call the same resolver or
wrapper used by scripts/symphony-managed.sh (or reference an environment-derived
install path) instead of the fixed /opt/symphony/runner/symphony so the
LaunchDaemon launches the same binary that mix.exs and scripts expect and the
WORKFLOW file path is resolved consistently.
| def fallback_workpad_body(%Issue{} = issue, agent_message_text, workspace_path) do | ||
| timestamp = Utils.isoformat_z(Utils.now_utc()) | ||
| summary = agent_message_text |> to_string() |> String.trim() |> Utils.truncate(5000) | ||
|
|
||
| """ | ||
| ## Codex Workpad | ||
|
|
||
| ```text | ||
| #{workspace_path} | ||
| ``` | ||
|
|
||
| ### Plan | ||
|
|
||
| - [x] Agent completed implementation work. | ||
| - [x] Agent attempted Linear workpad/state handoff. | ||
| - [x] Symphony applied tracker-owned delivery fallback after the in-agent Linear write was rejected. | ||
|
|
||
| ### Acceptance Criteria | ||
|
|
||
| - [x] #{issue.identifier} final agent handoff captured below. | ||
|
|
||
| ### Validation | ||
|
|
||
| - [x] See final agent handoff below. | ||
|
|
||
| ### Notes | ||
|
|
||
| - #{timestamp}: Symphony fallback created this workpad because the agent reported that Linear MCP writes were rejected inside the Codex turn. | ||
|
|
||
| #### Final Agent Handoff | ||
|
|
||
| ```text | ||
| #{summary} | ||
| ``` | ||
|
|
||
| ### Confusions | ||
|
|
||
| - In-agent Linear MCP writes were rejected; Symphony used tracker-owned Linear MCP writes for delivery. | ||
| """ |
There was a problem hiding this comment.
Fallback should not overwrite the existing workpad.
When comment_id already exists, this path replaces the whole ## Codex Workpad body with the generic fallback template. That throws away the agent’s earlier plan, validation notes, and any human-added context just because the final Linear write failed. Preserve the existing body and append/update only the fallback section instead of rewriting the comment wholesale.
Also applies to: 465-475
| {opts, args, _invalid} = | ||
| OptionParser.parse(argv, | ||
| switches: [ | ||
| port: :integer, | ||
| log_level: :string, | ||
| once: :boolean, | ||
| watchdog: :boolean, | ||
| self_heal_once: :boolean, | ||
| restart_managed: :boolean, | ||
| reason: :string, | ||
| help: :boolean | ||
| ], | ||
| aliases: [p: :port, h: :help] | ||
| ) |
There was a problem hiding this comment.
❓ Verification inconclusive
In the official Elixir documentation for OptionParser.parse/2, how are unknown options or invalid switch values returned in the third element, and does the parser ignore them unless the caller handles that list?
In the official Elixir documentation for OptionParser.parse/2, the function returns a three-element tuple {parsed, args, invalid}. The third element, invalid, is a list of invalid options as tuples {option_name, value}, where: - option_name is the raw option string (e.g., "--unknown") - value is nil if the option was not expected (unknown option), or the string value if the value did not match the expected type for a known option. Unknown options are returned in the invalid list when using :strict mode (preferred), where any unspecified switch goes to invalid. In :switches mode, unknown switches may be parsed dynamically if they correspond to existing atoms (set to true or their string value), but strict mode ensures unknowns are captured in invalid. The parser does not ignore them; it always includes them in the third element. The caller must handle (e.g., check or process) this list if they want to act on invalid options; otherwise, they can ignore it. The parse!/2 variant raises on invalid options instead of returning them.
Citations:
- 1: https://hexdocs.pm/elixir/main/OptionParser.html
- 2: https://github.com/elixir-lang/elixir/blob/master/lib/elixir/lib/option_parser.ex
- 3: https://hexdocs.pm/elixir/OptionParser.html
- 4: https://hexdocs.pm/elixir/1.13.4/OptionParser.html
Script executed:
# First, let's see the full context of how opts is used after parsing
cd lib/symphony && grep -n "opts" cli.ex | head -20Repository: CarettaAI/caretta-symphony
Repository: CarettaAI/caretta-symphony
Exit code: 0
stdout:
17: {opts, args, _invalid} =
34: if Keyword.get(opts, :help, false) do
41: port = Keyword.get(opts, :port) || config.server.port
44: Keyword.get(opts, :watchdog, false) ->
47: Keyword.get(opts, :self_heal_once, false) ->
50: reason: Keyword.get(opts, :reason) || "manual self-heal",
57: Keyword.get(opts, :restart_managed, false) ->
78: Keyword.get(opts, :once, false) ->
Script executed:
# Let's look at the full file to understand the flow
wc -l lib/symphony/cli.exRepository: CarettaAI/caretta-symphony
Repository: CarettaAI/caretta-symphony
Exit code: 0
stdout:
137 lib/symphony/cli.ex
Script executed:
# Get more context around the parsing and how opts is used
sed -n '1,80p' lib/symphony/cli.exRepository: CarettaAI/caretta-symphony
Repository: CarettaAI/caretta-symphony
Exit code: 0
stdout:
defmodule Symphony.CLI do
`@moduledoc` false
alias Symphony.Config.ConfigManager
alias Symphony.HTTPServer
alias Symphony.Orchestrator
alias Symphony.SelfHeal
alias Symphony.Watchdog
def main(argv) do
argv
|> run()
|> System.halt()
end
def run(argv) do
{opts, args, _invalid} =
OptionParser.parse(argv,
switches: [
port: :integer,
log_level: :string,
once: :boolean,
watchdog: :boolean,
self_heal_once: :boolean,
restart_managed: :boolean,
reason: :string,
help: :boolean
],
aliases: [p: :port, h: :help]
)
workflow_path = List.first(args)
if Keyword.get(opts, :help, false) do
IO.puts(usage())
0
else
try do
manager = ConfigManager.new(workflow_path)
{manager, _workflow, config} = ConfigManager.load_startup(manager)
port = Keyword.get(opts, :port) || config.server.port
cond do
Keyword.get(opts, :watchdog, false) ->
Watchdog.run(manager)
Keyword.get(opts, :self_heal_once, false) ->
result =
SelfHeal.run_once(config,
reason: Keyword.get(opts, :reason) || "manual self-heal",
force: true
)
IO.puts(Jason.encode!(SelfHeal.result_to_map(result), pretty: true))
if result.status == :ok, do: 0, else: 1
Keyword.get(opts, :restart_managed, false) ->
case SelfHeal.restart_managed(config) do
{:ok, results} ->
IO.puts(
Jason.encode!(%{"status" => "ok", "commands" => command_results(results)},
pretty: true
)
)
0
{:error, results} ->
IO.puts(
Jason.encode!(%{"status" => "error", "commands" => command_results(results)},
pretty: true
)
)
1
end
Keyword.get(opts, :once, false) ->
orchestrator =
manager
Reject invalid CLI arguments instead of silently falling back to defaults.
The invalid list from OptionParser.parse/2 is discarded here. When a user passes --port abc, it lands in the ignored _invalid list, leaving opts without a :port key. Downstream calls like Keyword.get(opts, :port) || config.server.port then silently fall back to the configuration file value. The process appears healthy but runs in the wrong mode. In an unattended daemon context, this is the worst kind of failure. Explicitly reject when the invalid list is non-empty. Alternatively, use OptionParser.parse!/2, which raises immediately on bad arguments.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/symphony/cli.ex` around lines 17 - 30, The OptionParser.parse call is
discarding the third return value (_invalid) so malformed flags (e.g. --port
abc) silently fall back to defaults; modify the CLI parsing to detect a
non-empty invalid list returned by OptionParser.parse/2 (the third element
currently bound to _invalid) and explicitly handle it by printing a clear error
and exiting (or raising), or replace the call with OptionParser.parse!/2 to let
it raise on bad args; ensure you update the block where opts, args, _invalid are
assigned so invalid entries are not ignored and the process terminates with a
non-zero status when invalid arguments are present.
| coding_task = | ||
| truthy?(Utils.map_get(data, "coding_task", Utils.map_get(data, "is_coding_task", true))) | ||
|
|
||
| needs_human = truthy?(Utils.map_get(data, "needs_human")) | ||
| human_reason = clean_truncated(Utils.map_get(data, "human_reason"), 500) |
There was a problem hiding this comment.
String booleans from the planner are interpreted backwards.
truthy?/1 makes any non-nil, non-false value true, so "false", "0", or "no" coming back from the LLM are treated as truthy. That can flip coding_task, needs_human, and edit_allowed in exactly the wrong direction. Parse common string booleans explicitly before you trust this output.
Suggested fix
- defp truthy?(false), do: false
- defp truthy?(nil), do: false
- defp truthy?(_), do: true
+ defp truthy?(value) when is_binary(value) do
+ case value |> String.trim() |> String.downcase() do
+ "false" -> false
+ "0" -> false
+ "no" -> false
+ "true" -> true
+ "1" -> true
+ "yes" -> true
+ _ -> true
+ end
+ end
+
+ defp truthy?(false), do: false
+ defp truthy?(nil), do: false
+ defp truthy?(0), do: false
+ defp truthy?(_), do: trueAlso applies to: 503-503, 537-539
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/symphony/repo_planner.ex` around lines 455 - 459, The planner treats
string booleans like "false"/"0"/"no" as true because truthy?/1 only checks
non-nil/non-false; fix by normalizing string boolean inputs before calling
truthy?/1 (or replace truthy?/1 with a dedicated parser). Add or reuse a helper
(e.g., parse_string_bool/1 or Utils.parse_bool/1) that returns true for
"true","1","yes","on" (case-insensitive), false for "false","0","no","off", nil
=> nil, and otherwise falls back to truthy?/1; then use it for coding_task,
needs_human, edit_allowed (replace truthy?(Utils.map_get(...)) with
parse_string_bool(Utils.map_get(...)) or Utils.parse_bool(Utils.map_get(...))).
Ensure human_reason still uses clean_truncated as before.
| defp format_labels([]), do: "(none)" | ||
| defp format_labels(labels), do: Enum.join(labels, ", ") |
There was a problem hiding this comment.
Handle nil labels in the continuation prompt.
This helper only treats [] as empty. If issue.labels is nil, Enum.join/2 raises and the continuation path falls over. The rest of the codebase is already defensive around missing labels.
Suggested fix
+ defp format_labels(nil), do: "(none)"
defp format_labels([]), do: "(none)"
defp format_labels(labels), do: Enum.join(labels, ", ")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| defp format_labels([]), do: "(none)" | |
| defp format_labels(labels), do: Enum.join(labels, ", ") | |
| defp format_labels(nil), do: "(none)" | |
| defp format_labels([]), do: "(none)" | |
| defp format_labels(labels), do: Enum.join(labels, ", ") |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/symphony/templating.ex` around lines 66 - 67, The helper format_labels
currently only handles an empty list and will crash when passed nil (e.g., from
issue.labels); update format_labels to treat nil like an empty list by adding a
clause for nil (or pattern-matching both nil and []) so it returns "(none)" and
keep the existing clause for non-empty lists (Enum.join(labels, ", "))—this
ensures callers like issue.labels won't raise when labels are missing.
| def resolve_workflow_path(path, cwd \\ File.cwd!()) | ||
| def resolve_workflow_path(nil, cwd), do: cwd |> default_workflow_path() |> Path.expand() | ||
| def resolve_workflow_path(path, _cwd), do: path |> to_string() |> Path.expand() |
There was a problem hiding this comment.
Resolve relative workflow paths against the supplied cwd.
Line 13 ignores the cwd argument. Relative workflow paths will be expanded against the process cwd instead, which makes load_workflow/2 behave differently from what its API promises.
Suggested fix
def resolve_workflow_path(path, cwd \\ File.cwd!())
def resolve_workflow_path(nil, cwd), do: cwd |> default_workflow_path() |> Path.expand()
- def resolve_workflow_path(path, _cwd), do: path |> to_string() |> Path.expand()
+ def resolve_workflow_path(path, cwd), do: path |> to_string() |> Path.expand(cwd)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def resolve_workflow_path(path, cwd \\ File.cwd!()) | |
| def resolve_workflow_path(nil, cwd), do: cwd |> default_workflow_path() |> Path.expand() | |
| def resolve_workflow_path(path, _cwd), do: path |> to_string() |> Path.expand() | |
| def resolve_workflow_path(path, cwd \\ File.cwd!()) | |
| def resolve_workflow_path(nil, cwd), do: cwd |> default_workflow_path() |> Path.expand() | |
| def resolve_workflow_path(path, cwd), do: path |> to_string() |> Path.expand(cwd) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/symphony/workflow.ex` around lines 11 - 13, The resolve_workflow_path/2
function ignores the supplied cwd for non-nil paths and expands relative paths
against the process cwd; update resolve_workflow_path(path, cwd) (used by
load_workflow/2) to call Path.expand with the cwd (use Path.expand(path_string,
cwd)) so relative workflow paths are resolved against the provided cwd rather
than the process working directory.
| if File.exists?(repo_path) do | ||
| unless File.dir?(repo_path), | ||
| do: | ||
| raise(Error, | ||
| code: :repo_path_not_directory, | ||
| message: "repo path exists and is not a directory: #{repo_path}" | ||
| ) | ||
|
|
||
| unless repo_checkout_matches?(repo_path, repo_config) do | ||
| raise Error, | ||
| code: :repo_checkout_mismatch, | ||
| message: | ||
| "repo path exists but remote does not match #{repo_config.slug}: #{repo_path}" | ||
| end | ||
|
|
||
| current_branch = | ||
| git_output( | ||
| repo_path, | ||
| ["branch", "--show-current"], | ||
| timeout_ms, | ||
| :repo_branch_read_failed, | ||
| "failed reading current branch for #{repo_config.slug}" | ||
| ) | ||
|
|
||
| install_pre_push_guard(repo_path, expected_branch) | ||
|
|
||
| %{ | ||
| "base_branch" => base_branch, | ||
| "base_ref" => "origin/#{base_branch}", | ||
| "base_sha" => nil, | ||
| "expected_branch" => expected_branch, | ||
| "expected_ref" => "refs/heads/#{expected_branch}", | ||
| "current_branch" => current_branch, | ||
| "branch_prepared" => false, | ||
| "pre_push_guard" => true, | ||
| "remote_url" => git_remote(repo_path) | ||
| } |
There was a problem hiding this comment.
Reusing an existing checkout never re-aligns the branch.
When repo_path already exists, this path only reads current_branch and installs the pre-push guard. It never checks out expected_branch or refreshes from origin/#{base_branch}. So a resumed workspace can keep editing a stale or wrong branch, and you only find out at push time.
At minimum, fail fast when current_branch != expected_branch; better, reuse prepare_expected_branch/5 here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/symphony/workspace.ex` around lines 241 - 277, When reusing an existing
checkout (the branch that reads current_branch via git_output and installs
install_pre_push_guard) you must ensure the working copy is aligned to
expected_branch; update the logic that runs when File.exists?(repo_path) to call
prepare_expected_branch(repo_path, repo_config, expected_branch, base_branch,
timeout_ms) instead of only reading current_branch (or at minimum raise an error
if current_branch != expected_branch). This ensures you reuse
repo_checkout_matches?(repo_path, repo_config) but then actively
checkout/refresh from origin/#{base_branch} and set branch_prepared rather than
leaving a stale branch that will only fail at push time.
| mcp_command: /Applications/Codex.app/Contents/Resources/codex app-server | ||
| mcp_server: codex_apps |
There was a problem hiding this comment.
Do not commit a host-specific workflow as the repo default.
These absolute macOS paths hardwire main to one operator machine and expose the local username/path layout. On any other host, Codex startup, skill loading, and repo materialisation will fail before Symphony does useful work. Move these to $VAR-backed config or keep them in an untracked local overlay, and keep the committed WORKFLOW.md portable.
Also applies to: 24-24, 46-46, 66-66, 89-179
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@WORKFLOW.md` around lines 15 - 16, The committed WORKFLOW.md contains
host-specific absolute paths (e.g., mcp_command:
/Applications/Codex.app/Contents/Resources/codex app-server and mcp_server:
codex_apps) which must be made portable: replace those hardcoded values with
environment-variable placeholders (e.g., ${CODEX_APP_PATH} or ${MCP_SERVER}) or
reference an untracked local overlay/config file and document the required env
vars; move any other absolute paths mentioned in the file (the ranges you noted)
into the same VAR-backed placeholders or remove them from the repo-default and
add guidance for creating a local version, ensuring the checked-in WORKFLOW.md
contains only portable examples and instructions.
Self-Heal Summary
Symphony detected a local failure and produced a validated generic repair.
Trigger:
Symphony is degraded after a poll/reconciliation failure: linear_mcp_tool_error: {"content":[{"text":"Unknown tool: linear mcp server_list_issues","type":"text"}],"isError":true}
Local validated commit:
788b892
Evidence artifact:
/Users/omarelamin/Documents/Codex/2026-04-28/implement-symphony-according-to-the-following/.symphony-self-heal/runs/20260501T083453-symphony-is-degraded-after-a-poll-reconciliation-failure-linear_/evidence.json
Validation:
mix format --check-formattedmix testmix escript.buildNotes:
main.main.Summary by CodeRabbit
Release Notes
New Features
WORKFLOW.mdconfiguration system with comprehensive execution contracts.Refactor
Documentation