Skip to content

fix(cli): preserve soundwave-authored slug across handoff to decepticon#180

Closed
Lempkey wants to merge 2 commits intoPurpleAILAB:mainfrom
Lempkey:fix/cli-handoff-preserve-slug
Closed

fix(cli): preserve soundwave-authored slug across handoff to decepticon#180
Lempkey wants to merge 2 commits intoPurpleAILAB:mainfrom
Lempkey:fix/cli-handoff-preserve-slug

Conversation

@Lempkey
Copy link
Copy Markdown
Contributor

@Lempkey Lempkey commented May 8, 2026

Summary

  • Adds queuedEngagementSlugRef to capture the slug emitted by Soundwave's complete_engagement_planning tool via the engagement_ready custom event (useAgent.ts:303)
  • Extracts buildSubmitInput() as a pure helper that applies slug precedence: the queued handoff slug wins over DECEPTICON_ENGAGEMENT env var, which may be stale or empty for engagements created during the Soundwave interview
  • Clears the queued slug in handleStreamComplete (alongside the existing pendingHandoffRef clear) so it is not re-injected on subsequent Decepticon submits on the same thread
  • Leaves pendingHandoffRef boolean semantics untouched; the handoff still fires even when the engagement_ready event carries no slug

Root cause

pendingHandoffRef captured the slug at L303 but the slug string was discarded at L535 (pendingHandoffRef.current = null) before the next Decepticon submit at L687 could use it. The submit fell back to process.env.DECEPTICON_ENGAGEMENT, which is the launcher-set value and is stale when the operator created a new engagement during the interview.

Test plan

  • npm run typecheck — TypeScript clean
  • npm test — 8 new tests in buildSubmitInput.test.ts pass, 24 total pass
  • End-to-end: run a full Soundwave interview (make dev + make cli), complete planning, verify the first Decepticon message includes the correct engagement_name (requires full stack — impractical for CI)

Note on downstream failures

Filesystem tool failures (#152, #173, #175) are downstream symptoms that still require a follow-up PR fixing _normalize_engagement_workspace at decepticon/middleware/filesystem.py:32 — that function currently rejects /workspace as "no engagement workspace" and will continue to do so until repaired. This PR fixes the slug-drop at the handoff point; the middleware fix unblocks the filesystem tools.

Closes #159

On the engagement_ready event, the CLI captured the freshly-authored
engagement slug into pendingHandoffRef but then discarded it before the
next Decepticon submit could use it. The submit fell back to the
DECEPTICON_ENGAGEMENT env var, which is stale or empty for engagements
created during the Soundwave interview.

Fix: add queuedEngagementSlugRef (set on engagement_ready, cleared in
handleStreamComplete) and extract buildSubmitInput() as a pure helper
that applies slug precedence — handoff slug wins over env var. The
helper is independently unit-tested.

Closes PurpleAILAB#159
@Lempkey Lempkey requested a review from PurpleCHOIms as a code owner May 8, 2026 19:52
…e teardown

handleStreamComplete() was nulling queuedEngagementSlugRef during the
Soundwave stream's teardown — before the auto-submitted Decepticon run
could read it. The slug was always null by the time submit() reached
buildSubmitInput(), so it fell back to the stale DECEPTICON_ENGAGEMENT
env var.

Fix: capture the slug into a local before the stream starts, and clear
the ref only after processStream() resolves successfully. On failure the
ref is left intact, preserving the slug for a safe retry.
@Lempkey
Copy link
Copy Markdown
Contributor Author

Lempkey commented May 8, 2026

Thanks for the thorough review. The high-severity finding was correct — `queuedEngagementSlugRef` was being nulled inside `handleStreamComplete()` (during the Soundwave teardown), which ran before the auto-submitted Decepticon `submit()` could consume it. The ref arrived as `null` at `buildSubmitInput`, so it always fell back to the stale env var.

Fix (7247f00):

Two changes in `useAgent.ts`:

  1. Removed `queuedEngagementSlugRef.current = null` from `handleStreamComplete`. A comment explains the intent: the ref must survive until the Decepticon run consumes it.

  2. In `submit()`, capture the slug into a local before the stream starts, then clear the ref only after `processStream()` resolves successfully:

```typescript
const handoffSlug = queuedEngagementSlugRef.current;
const { input, streamConfig } = buildSubmitInput({ handoffSlug, ... });

try {
const stream = client.runs.stream(...);
await processStream(stream, abortController);
if (handoffSlug) queuedEngagementSlugRef.current = null; // clear after success
} catch (err) {
// slug retained on failure — safe to retry
...
}
```

On the lifecycle test gap: Agreed the existing tests don't cover the `engagement_ready → handleStreamComplete → next submit` sequence. Adding those requires `@testing-library/react` + jsdom (the current vitest config uses `environment: "node"`), which is out of scope for this fix. Happy to follow up in a separate PR if the maintainers want to expand the test infrastructure.

@PurpleCHOIms
Copy link
Copy Markdown
Member

PurpleCHOIms commented May 9, 2026

After re-examining the root cause of this regression (#159), I concluded that the architecture itself — letting the LLM be the source of truth for the engagement slug — is the actual problem. Replacing this PR with #182 (refactor/engagement-handoff-signal-only), which takes the opposite approach.

Why the redirect

The natural product flow:

  1. Launcher (Go picker UI / web env) decides the engagement slug — the user picks or types it once at session start.
  2. Soundwave writes RoE/CONOPS/Deconfliction into the already-mounted /workspace/plan/.
  3. Decepticon picks up the same /workspace after handoff.

Under this flow the slug is launcher-owned. The slug Soundwave's LLM emits via complete_engagement_planning(engagement_name=…) is just an echo, and there is no reason to treat it as authoritative. This PR's queuedEngagementSlugRef + queue/clear lifecycle was designed to privilege the LLM-emitted slug, but if the launcher-mounted directory and the LLM-emitted slug ever diverge (hallucination, normalization variance), state and filesystem can mismatch — a worse failure mode than the original drop.

What #182 does instead

  • Drops the engagement_name argument from complete_engagement_planning → pure boolean handoff signal.
  • The CLI injects launcher's env-set slug into config.configurable on every submit.
  • EngagementContextMiddleware.before_agent hydrates configurable into state.
  • The "slug drop" bug class no longer exists — there is no slug channel to drop from.
  • queuedEngagementSlugRef becomes unnecessary.

What survives

Your regression tests (buildSubmitInput.test.ts) and root-cause analysis were spot on — #159 was diagnosed correctly here, and #182 was designed so that the same issue closes. If you'd like to dispatch follow-up work after this closes, that would be welcome.

Closing this PR.

PurpleCHOIms added a commit that referenced this pull request May 9, 2026
…able (#182)

* refactor(engagement): launcher-authoritative slug via config.configurable

Move engagement_name and workspace_path from input state to runnable
config.configurable. The launcher (and web terminal-server env) is the
single source of truth; the LLM no longer decides the engagement slug.

Backend:
- complete_engagement_planning: drop engagement_name arg; pure boolean
  handoff signal. The emitted engagement_ready event carries no slug.
- EngagementContextMiddleware: add before_agent hydration that copies
  engagement_name + workspace_path from configurable into state on the
  first agent step. Idempotent — state wins when present. Downstream
  middleware (OPPLAN, filesystem) read state as before.
- filesystem._normalize_engagement_workspace: strict 4-case contract
  (empty -> None, /workspace accepted as engagement root for launcher
  mode, /workspace/<slug> normalized, traversal/invalid -> None, no
  silent coercion).

Frontend:
- useAgent.ts: send engagement_name/workspace_path via config.configurable
  on every run; pendingHandoffRef downgraded to boolean.
- types.ts: drop engagement? field from SubagentCustomEvent.
- soundwave.md: prompt no longer instructs the LLM to pass a slug.

Tests:
- test_engagement.py: +6 tests covering hydration paths and workspace
  resolution precedence.
- test_filesystem.py: +5 tests for the new contract; flips the old
  test_root_workspace_fails_closed to test_root_workspace_accepted_as
  _engagement_root.

Closes #152, #159, #173. Addresses the filesystem failure in #175.
Supersedes PR #180 (slug queuing) and PR #181 (filesystem normalize).

* style(scripts): ruff format render_benchmark_charts.py

Pre-existing format drift on main; CI's repo-wide ruff format --check
was failing on this file unrelated to the refactor. Whitespace-only.

* fix(subagent_streaming): guard None tool_call_id from corrupting active dict

LangChain's ToolCall.id is `str | None` per the spec. The previous code
keyed `active_tool_calls[tc["id"]] = tc` directly, so multiple id-less
tool calls in the same iteration would collide under the same `None`
key — the second overwrite wins and the subsequent ToolMessage lookup
returns the wrong tool's name and args. Tool-result events were not
dropped (they were emitted regardless), but they surfaced under the
wrong tool identity.

Changes:
- Skip the active_tool_calls write when tc.get("id") is None and emit a
  warning so we can spot subagent/provider configurations that drop the
  id. Subsequent ToolMessage lookups for those calls fall through to the
  "unknown" branch instead of mis-attributing.
- Tighten the active_tool_calls type from dict[str, dict] to
  dict[str, ToolCall] (the actual TypedDict) — both signature and
  call-site annotations.
- Add three regression tests: None-id storage skipped + warning emitted,
  ToolMessage with tc-less id falls through to "unknown", and a valid
  id keeps its mapping when a concurrent None-id call is present.

Supersedes the corresponding fix in PR #178 (which closed without
landing) — the diagnosis "results dropped from the streaming pipeline"
was inaccurate; the practical effect is wrong-name attribution. The
type annotation also lands as `dict[str, ToolCall]` rather than the
laxer `dict[str, Any]` from that PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Soundwave→Decepticon handoff drops engagement slug — workspace_path lost on thread transition

2 participants