CarettaAI · omar-elamin · Apr 29, 2026
diff --git a/.formatter.exs b/.formatter.exs
@@ -0,0 +1,3 @@
+[
+  inputs: ["{mix,.formatter}.exs", "{config,lib,test}/**/*.{ex,exs}"]
+]
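
Review note: the `inputs` globs above are what `mix format --check-formatted` reads, and that command is the first entry in the `validation_commands` list added later in this PR. The sketch below shows one way to run such a list in order and stop at the first failure. It is not the project's implementation; `run_validation` is a hypothetical helper, and it assumes each entry is a plain command string that `sh -c` can execute.

```bash
# Hypothetical runner: execute each command string in order and stop at the
# first failure, printing which step failed.
run_validation() {
  for cmd in "$@"; do
    echo "running: $cmd"
    if ! sh -c "$cmd"; then
      echo "failed: $cmd"
      return 1
    fi
  done
  echo "validation passed"
}
```

With the commands from the example config this would be called as `run_validation "mix format --check-formatted" "mix test" "mix escript.build"`.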
diff --git a/.gitignore b/.gitignore
@@ -29,6 +29,13 @@ dist/
 build/
 *.egg-info/
 
+# Elixir
+_build/
+deps/
+.elixir_ls/
+tmp/
+/symphony
+
 # JavaScript and frontend build artifacts
 node_modules/
 npm-debug.log*

diff --git a/README.md b/README.md
@@ -2,7 +2,7 @@
 
 Caretta Symphony runs many Codex agents from Linear, across many repositories, with a live dashboard for the whole queue.
 
-It is an independent Python implementation of OpenAI's draft [Symphony service specification](https://github.com/openai/symphony/blob/main/SPEC.md). OpenAI defined the core pattern: poll Linear, create an isolated workspace, run a coding agent, and reconcile the issue state. Caretta Symphony keeps that model and adds the operating layer we needed for a real multi-repo product.
+It is an independent Elixir implementation of OpenAI's draft [Symphony service specification](https://github.com/openai/symphony/blob/main/SPEC.md). OpenAI defined the core pattern: poll Linear, create an isolated workspace, run a coding agent, and reconcile the issue state. Caretta Symphony keeps that model and adds the operating layer we needed for a real multi-repo product.
 
 The core value is control: get the work out of the issue tracker, put each agent in the right repos, and see what all of them are doing while they run.
 
@@ -57,11 +57,17 @@ Raw Linear GraphQL mode is also supported.
 
 Agents hand work to a review state. Symphony can move an issue to `Done` only after the required GitHub pull requests are merged into the configured base branch.
 
+While an issue is in a review state, Symphony also polls Linear comments and linked GitHub PR feedback. New human-authored feedback moves the issue to the configured `rework_state` so Codex can continue from the existing workspace.
+
+### Blocked issue escalation
+
+When an agent or repository plan cannot continue without human input, Symphony creates or updates one Linear comment headed `## Symphony Blocked Escalation`. It mentions the Linear assignee when the tracker payload includes one; otherwise it uses `tracker.blocked_escalation_mentions`. A later human comment on the issue releases the block so the normal dispatcher can retry.
+
 ## Relationship to OpenAI Symphony
 
 OpenAI's `openai/symphony` repository contains a language-agnostic spec and an experimental Elixir implementation.
 
-Caretta Symphony is not an official OpenAI project. It does not include OpenAI's reference implementation code. It implements the public Symphony spec in Python and extends it for:
+Caretta Symphony is not an official OpenAI project. It does not include OpenAI's reference implementation code. It implements the public Symphony spec in Elixir and extends it for:
 
 - multi-repo product work
 - repo planning before dispatch
@@ -73,7 +79,8 @@ Caretta Symphony is not an official OpenAI project. It does not include OpenAI's
 ## Install
 
 ```bash
-python3 -m pip install -e ".[dev]"
+mix deps.get
+mix escript.build
 ```
 
 ## Run
@@ -89,8 +96,11 @@ tracker:
   terminal_states: ["Closed", "Cancelled", "Canceled", "Duplicate", "Done"]
   review_states: ["In Review", "Merging"]
   handoff_state: In Review
+  rework_state: Rework
   done_state: Done
   merge_base_branch: dev
+  blocked_escalation_enabled: true
+  blocked_escalation_mentions: ["@operator"]
   required_labels: ["codex"]
   mcp_command: /Applications/Codex.app/Contents/Resources/codex app-server
 workspace:
@@ -107,6 +117,30 @@ codex:
     networkAccess: true
 server:
   port: 8765
+self_healing:
+  enabled: false
+  base_branch: main
+  branch_prefix: codex/self-heal
+  workspace_root: ./.symphony-self-heal
+  stale_poll_ms: 120000
+  cooldown_ms: 900000
+  max_attempts: 3
+  validation_commands:
+    - mix format --check-formatted
+    - mix test
+    - mix escript.build
+  codex:
+    model: gpt-5.5
+    effort: xhigh
+    approval_policy: never
+    thread_sandbox: workspace-write
+    turn_sandbox_policy:
+      type: workspaceWrite
+      networkAccess: true
+  restart:
+    tmux_session: symphony-elixir
+    port: 8765
+    workflow_path: ./WORKFLOW.md
 context:
   coding:
     enabled: true
@@ -152,13 +186,29 @@ Issue: {{ issue.identifier }} - {{ issue.title }}
 Start Symphony:
 
 ```bash
-symphony WORKFLOW.md --port 8765
+./symphony WORKFLOW.md --port 8765
 ```
 
 Open `http://127.0.0.1:8765` for the dashboard.
 
 For a larger anonymized workflow, see [`WORKFLOW.linear-mcp.example.md`](WORKFLOW.linear-mcp.example.md). For macOS background operation, adapt [`launchd/com.symphony.linear-mcp.example.plist`](launchd/com.symphony.linear-mcp.example.plist).
 
+## Self-healing watchdog
+
+Symphony can run a separate local watchdog when `self_healing.enabled: true` is configured. The watchdog polls `GET /api/v1/state`. When Symphony is unreachable, degraded, or stale, it runs a high-reasoning Codex repair agent in `.symphony-self-heal/worktrees/<run-id>`, validates the repair, deploys the validated escript artifact locally, and restarts the managed tmux session. It then pushes a `codex/self-heal/...` branch, opens a ready PR into `main`, and requests GitHub auto-merge without bypassing branch protection.
+
+Local deployment intentionally comes from the validated artifact, not from `main`. The PR is the audit and sync path back to GitHub.
+
+Useful commands:
+
+```bash
+./symphony WORKFLOW.md --watchdog
+./symphony WORKFLOW.md --self-heal-once --reason "manual diagnosis"
+./symphony WORKFLOW.md --restart-managed
+```
+
+On macOS, `scripts/symphony-managed.sh` and `launchd/com.caretta.symphony.watchdog.plist` provide a launcher path that goes through `/bin/zsh -lc`, which avoids direct launchd escript startup failures.
+
 ## Status API
 
 When `server.port` is set, Symphony exposes:
@@ -173,13 +223,14 @@ The status surface is unauthenticated. Keep `server.host` bound to `127.0.0.1` u
 ## Testing
 
 ```bash
-python3 -m pytest
+mix test
 ```
 
 ## Project layout
 
-- `symphony/` - service implementation
-- `tests/` - pytest coverage
+- `lib/symphony/` - service implementation
+- `test/` - ExUnit coverage
+- `mix.exs` - Elixir project metadata and escript configuration
 - `docs/IMPLEMENTATION.md` - implementation notes and conformance summary
 - `WORKFLOW.linear-mcp.example.md` - anonymized multi-repo Linear MCP workflow
 - `launchd/` - example macOS launch agent
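
Review note: the watchdog trigger described in the README hunk (unreachable, degraded, or stale) can be approximated from a shell for manual diagnosis. This is a minimal sketch under stated assumptions, not the Elixir watchdog. It assumes `curl` is installed and treats any failed request to `GET /api/v1/state` as unreachable. It does not parse the response body, because the payload shape is not shown in this diff, so the "degraded" case is left out. `classify_health` and `probe_state` are hypothetical names. The default threshold mirrors `stale_poll_ms: 120000`; how the real watchdog measures staleness is not shown, so the caller passes the age in.

```bash
# Hypothetical helper: classify Symphony health from two observations.
#   $1 = "1" if the status endpoint answered, anything else otherwise
#   $2 = milliseconds since the last observed poll (supplied by the caller)
#   $3 = stale threshold in ms (defaults to stale_poll_ms from the example config)
classify_health() {
  reachable="$1"
  age_ms="$2"
  stale_ms="${3:-120000}"

  if [ "$reachable" != "1" ]; then
    echo "unreachable"
  elif [ "$age_ms" -gt "$stale_ms" ]; then
    echo "stale"
  else
    echo "healthy"
  fi
}

# Hypothetical probe: prints "1" when GET /api/v1/state succeeds, else "0".
# Port 8765 matches server.port in the example workflow.
probe_state() {
  port="${1:-8765}"
  if curl --silent --fail --max-time 5 --output /dev/null \
      "http://127.0.0.1:${port}/api/v1/state"; then
    echo "1"
  else
    echo "0"
  fi
}
```

A manual check could combine the two as `classify_health "$(probe_state)" 0`, which only distinguishes reachable from unreachable until a real poll age is supplied.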

diff --git a/WORKFLOW.linear-mcp.example.md b/WORKFLOW.linear-mcp.example.md
@@ -6,8 +6,11 @@ tracker:
   terminal_states: ["Closed", "Cancelled", "Canceled", "Duplicate", "Done"]
   review_states: ["In Review", "Merging"]
   handoff_state: In Review
+  rework_state: Rework
   done_state: Done
   merge_base_branch: dev
+  blocked_escalation_enabled: true
+  blocked_escalation_mentions: ["@operator"]
   required_labels: ["codex"]
   mcp_command: /Applications/Codex.app/Contents/Resources/codex app-server
   mcp_server: codex_apps
@@ -27,6 +30,31 @@ codex:
     networkAccess: true
 server:
   port: 8765
+self_healing:
+  enabled: false
+  base_branch: main
+  branch_prefix: codex/self-heal
+  workspace_root: ./.symphony-self-heal
+  stale_poll_ms: 120000
+  cooldown_ms: 900000
+  max_attempts: 3
+  validation_commands:
+    - mix format --check-formatted
+    - mix test
+    - mix escript.build
+  codex:
+    command: /Applications/Codex.app/Contents/Resources/codex app-server
+    model: gpt-5.5
+    effort: xhigh
+    approval_policy: never
+    thread_sandbox: workspace-write
+    turn_sandbox_policy:
+      type: workspaceWrite
+      networkAccess: true
+  restart:
+    tmux_session: symphony-elixir
+    port: 8765
+    workflow_path: ./WORKFLOW.md
 context:
   coding:
     enabled: true
@@ -61,12 +89,12 @@ repositories:
       local_path: /opt/symphony/example-repos/client/desktop-runtime
       remote_url: https://github.com/ExampleOrg/desktop-runtime.git
       aliases: ["desktop-runtime", "electron shell", "overlay", "live workflow", "local transcription", "runtime orchestrator"]
-      description: Desktop shell and live-session runtime; local capture, transcript batching, in-app suggestions, and host-side provider calls.
+      description: Desktop shell and live in-call runtime; local capture, transcript batching, in-app suggestions, and host-side provider calls. Do not choose this for saved-call history pages, post-call detail tabs, or follow-up email drafts unless the issue explicitly says desktop overlay or live runtime.
     - slug: ExampleOrg/web-console
       local_path: /opt/symphony/example-repos/product/web-console
       remote_url: https://github.com/ExampleOrg/web-console.git
-      aliases: ["web-console", "web app", "Next.js", "onboarding", "settings", "history", "calendar", "CRM", "in-app assistant"]
-      description: Customer-facing web console, authenticated routes, calendar/CRM settings, history views, folders, and browser-side gateway proxy.
+      aliases: ["web-console", "web app", "Next.js", "onboarding", "settings", "history", "history tab", "post-call", "saved call", "call details", "follow-up email", "email draft", "calendar", "CRM", "in-app assistant"]
+      description: Customer-facing web console, authenticated routes, calendar/CRM settings, saved-call history views, post-call detail tabs, follow-up email drafts/templates, folders, and browser-side gateway proxy.
     - slug: ExampleOrg/shared-contracts
       local_path: /opt/symphony/example-repos/libs/shared-contracts
       aliases: ["shared-contracts", "shared schema", "shared types", "API contracts"]
@@ -157,6 +185,16 @@ Symphony only releases the issue when it leaves the configured active states. Yo
 - Do not post separate completion summary comments.
 - Final assistant message should report completed actions and blockers only. Do not ask the human to do routine follow-up work.
 
+## Credentialed And Data Operations
+
+- You run under the same macOS user context as Symphony. Before declaring missing non-GitHub auth, inspect configured local auth and secret sources without printing secret values:
+  - `which supabase && supabase projects list`
+  - `which aws && aws sts get-caller-identity`
+  - local repo `.env*` files, Vercel env, AWS Secrets Manager/SSM names, Supabase project links, and connected MCP tools when relevant.
+- Never paste secret values into Linear, PRs, terminal summaries, or final messages. Load credentials into the command environment or an untracked temporary file only when required for the operation.
+- For Supabase/Postgres data migrations, a PR or migration script alone is not completion. Record dry-run output and either apply output or a concrete verified reason the data operation must not be run.
+- If the issue explicitly asks to move, copy, backfill, delete, or repair production rows or cloud resources, do not move it to `In Review` just because code was written. Move it to `In Review` only after the operation has been executed and verified, or after the requester explicitly converts the issue to a code-only preparatory task.
+
 ## State Routing
 
 - `Backlog`: out of scope. Do not modify the issue. Stop.
@@ -175,20 +213,21 @@ Symphony only releases the issue when it leaves the configured active states. Yo
    - add or refine the implementation plan,
    - mirror any issue-provided validation/test-plan items as required checklist items,
    - record a compact environment stamp with host, absolute workspace path, and short commit SHA when available.
-4. Reproduce or inspect the current behavior enough to make the fix target explicit, then implement the requested change.
+4. Reproduce or inspect the current behavior enough to make the fix target explicit, then implement the requested change. For credentialed data or cloud operations, prove the access path first using configured CLIs, local env files, or secret stores, then run the required dry-run/apply or read-only verification without exposing secrets.
 5. Run validation appropriate to the changed surface. Treat issue-provided validation instructions as mandatory.
 6. Commit and push only the Symphony-prepared branch recorded in `.symphony-workspace.json` when changes are ready. Never push an inherited source checkout branch; if the current branch differs from the expected branch, stop and report the mismatch. Open or update the PR and attach/link the PR to the Linear issue. Prefer Linear attachments/links; use the workpad only if attachments are unavailable.
 7. Before handoff, sweep existing PR feedback and checks:
    - address or explicitly respond to actionable comments,
    - confirm checks/validation are green or document a real external blocker,
    - refresh the workpad so plan, acceptance criteria, validation, commit, and PR status match reality.
-8. Move the issue to `In Review` only after the handoff bar below is satisfied. If blocked by missing non-GitHub auth, permissions, or required tooling, document the blocker in the workpad and move to `In Review` with a concise unblock note. `Done` is reserved for Symphony's merge gate after every required PR has merged into `dev`.
+8. Move the issue to `In Review` only after the handoff bar below is satisfied. If blocked by missing non-GitHub auth, permissions, or required tooling after checking the configured local CLIs/env/secret stores, document the blocker in the workpad, leave the issue active, and report the blocker in the final message. `Done` is reserved for Symphony's merge gate after every required PR has merged into `dev`.
 
 ## Handoff Bar Before `In Review`
 
 - Workpad exists and is current.
 - Implementation is complete for the issue scope.
 - Required validation/test-plan items are complete and recorded.
+- For credentialed data or cloud operations, the requested operation is executed and verified, with dry-run/apply output or read-only verification recorded. A code-only helper script is not enough unless the requester explicitly asked only for a helper script.
 - Symphony-prepared branch is pushed and PR is linked on the issue.
 - PR feedback has been swept; no known actionable comments remain unaddressed.
 - PR checks are passing, or any failure is documented as an external blocker that cannot be resolved in-session.
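
Review note: the "Credentialed And Data Operations" section asks agents to prove an access path without printing secret values. The sketch below is one way to do that for the two CLI checks the workflow names. It is an illustration, not part of this PR; `probe_auth` is a hypothetical helper. It discards everything the CLI prints and reports only whether the tool is installed and whether the command succeeded.

```bash
# Hypothetical helper: report whether a CLI is installed and its check command
# succeeds, without echoing anything the CLI itself prints.
#   $1 = label for the report, remaining args = the command to run
probe_auth() {
  label="$1"
  shift
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "${label}: not installed"
    return 1
  fi
  if "$@" >/dev/null 2>&1; then
    echo "${label}: ok"
  else
    echo "${label}: installed but command failed"
    return 2
  fi
}
```

For the checks listed in the workflow this would be `probe_auth supabase supabase projects list` and `probe_auth aws aws sts get-caller-identity`. Suppressing output is a deliberate choice for the auth probe only; the dry-run and apply output that the handoff bar requires still has to be recorded separately.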