Skip to content

Add agent suspend/resume for out-of-process tool results#137

Merged
myzie merged 19 commits intomainfrom
agent-suspend-resume
Apr 12, 2026
Merged

Add agent suspend/resume for out-of-process tool results#137
myzie merged 19 commits intomainfrom
agent-suspend-resume

Conversation

@myzie
Copy link
Copy Markdown
Collaborator

@myzie myzie commented Apr 10, 2026

Summary

Tools can return dive.NewSuspendResult(prompt) to pause the agent mid-turn and hand control back to the caller. The caller resumes by calling CreateResponse with the missing outputs supplied via WithToolResults.

Sessions are optional. Suspend/resume works with no Session, a plain Session, or a SuspendableSession for auto-persistence:

// Stateless — no session at all
resp, _ := agent.CreateResponse(ctx, dive.WithMessages(history...))
if resp.Status == dive.ResponseStatusSuspended {
    saveHistoryAndState(history, resp.OutputMessages, resp.Suspension)
    return
}

// Resume — still no session
resp, _ = agent.CreateResponse(ctx,
    dive.WithMessages(append(history, savedOutput...)...),
    dive.WithSuspension(savedState),
    dive.WithToolResults(results),
)

When a SuspendableSession is used, the agent auto-persists on the caller's behalf and the behavior is identical to today — users just call CreateResponse with WithToolResults to resume.

Key types (all in root dive package):

  • SuspensionState { PendingToolCalls, CompletedToolCalls, TurnMessageCount } — symmetric in/out payload. Returned on Response.Suspension, passed back via WithSuspension.
  • PendingToolCall { ID, Name, Input, Prompt, Metadata } — a tool call awaiting an external result. Includes UnmarshalInput(into any) error and generic DecodePendingInput[T] helpers for typed input decoding.
  • SuspendableSession — 3 methods: LoadSuspension, SaveSuspendedTurn, SaveResumedTurn. Plain Session implementations or nil work too.

Hooks & streaming:

  • OnSuspendHook fires when the agent transitions into a suspended state, before persistence and before PostGeneration. Aborting via HookAbortError leaves the session untouched.
  • Terminal ResponseItemTypeSuspended stream item with the SuspensionState payload marks end-of-stream for suspended turns.
  • Response.Status is ResponseStatusCompleted or ResponseStatusSuspended (empty treated as Completed for back-compat).

Ergonomics & validation:

  • Resume is explicit. A suspended session called with neither WithToolResults nor WithSuspension nor new input returns ErrSuspendedSessionNoOptIn instead of silently no-op-rewriting the suspended turn.
  • ToolResult.Suspend is a validated tagged union: setting both Suspend and regular result fields surfaces as an IsError result routed through PostToolUseFailure, not a silent-ignore.
  • Parallel tool batches drain all siblings on suspend (no work lost); sequential batches re-execute any "not started" tools on resume.

Hardening pass — locked-in semantics

Future contributors: do not regress these.

  • Persisted pending payload. When a SuspendableSession is used, the full SuspensionState (including Prompt, Metadata, CompletedToolCalls) is persisted per suspended turn. Cross-process resume and partial-resume-again both surface the original payload unchanged.
  • Unknown ID → hard error. Supplying a result for a tool_use ID not in the pending set returns ErrUnknownPendingToolCall without mutating session state.
  • Fork clears suspension. A fork of a suspended session is not suspended. The fork inherits messages but not the pending set.
  • Compact refuses. Session.Compact returns ErrSuspendedSession when suspended.
  • SaveResumedTurn is guarded. Returns ErrNotSuspended if the session is not suspended.
  • Save failure is fatal. If SaveSuspendedTurn fails, CreateResponse returns the error — no phantom suspended state.
  • OnSuspend abort is safe. Hooks run before persistence, so a HookAbortError in OnSuspend leaves the session in its previous state — no compensation needed.
  • OutputMessages invariant. Generate-driven suspends populate Response.OutputMessages with the assistant tool_use message plus any partial tool_result. Partial-resume-only suspends (no generate call) return empty OutputMessages; stateless callers who want the full turn should read from the messages they passed in via WithMessages.

Breaking changes on this unmerged branch

The v0 API from earlier commits on this branch is gone:

  • Response.PendingToolCalls / Response.CompletedToolCallsResponse.Suspension (*SuspensionState).
  • SuspendableSession.Suspended(), .PendingCalls(), .LastEventMessageCount(), .AbandonSuspension() → replaced by LoadSuspension() *SuspensionState.
  • session.PendingCall deleted; use *dive.PendingToolCall directly.
  • ErrSessionNotSuspendable deleted — any session (or no session) can suspend.
  • New sentinel: ErrSuspendedSessionNoOptIn.
  • JSONL session header field pending_callspending_tool_calls (+ new completed_tool_calls). Delete any on-disk suspended sessions created with earlier builds of this branch.

Test plan

  • 14 baseline agent-level suspend/resume tests in agent_suspend_test.go covering simple suspend/resume, suspend-again, parallel & sequential batches, partial resume, no-regression on non-suspending tools, OnSuspend hook ordering, streaming terminal item, SetModel between suspend/resume, etc.
  • 14 adversarial invariant tests covering partial-resume event-count stability, cross-process metadata survival, mixed sequential/parallel agents on the same session, resume-time tool panic recovery, last-iteration-boundary suspend, ctx cancel mid-execution, non-pending ID rejection, fork clearing, compact refusal, SaveResumedTurn guard, SaveSuspendedTurn error propagation, OnSuspend abort rollback, turn-message non-aliasing, and OutputMessages invariants.
  • 5 session-store round-trip tests updated to the new LoadSuspension / SaveSuspendedTurn(state) shape.
  • New tests for the stateless flow: TestStatelessSuspendAndResume, TestStatelessMultiPendingResume, TestSuspendOnPlainSessionReturnsSuspendedResponse, TestSuspendWithoutSessionReturnsSuspendedResponse, TestSuspendedSessionNoOptInErrors, TestDecodePendingInputHelpers, TestMalformedSuspendResultBecomesError.
  • go test -race -count=1 ./... clean across the whole module.
  • go vet ./... clean.
  • Examples updated to the new API: human_approval, partial_resume, async_webhook, plus new stateless example that demonstrates suspend/resume with no session and a JSON state file.

Docs

  • docs/plans/plan-04-suspend-resume-review.md marks items 1/2/3 of the must-do list as shipped.
  • docs/plans/plan-04-suspend-resume.md and docs/prds/prd-04-suspend-resume.md retain the earlier design context.

🤖 Generated with Claude Code

Tools can now return dive.NewSuspendResult(prompt) to pause the agent
mid-turn and hand control back to the caller. Resume is a subsequent
CreateResponse call on the same session with WithToolResults(...) that
supplies the missing tool outputs. Suspend state is tracked authoritatively
on the session via a new SuspendableSession interface, so a process can
crash and restart and still resume cleanly.

The design is additive — Tool.Call keeps its (*ToolResult, error)
signature, Response.Status defaults to Completed, and third-party session
implementations that don't opt in continue to work (they just can't be
resumed). Parallel tool batches drain all siblings on suspend so no work
is lost; sequential batches re-execute any "not started" tools on resume.

Adds a new OnSuspend hook, a terminal ResponseItemTypeSuspended stream
item, and FileStore/MemoryStore persistence for the suspended flag and
pending tool-call IDs. Subagents return an error if they suspend.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds end-to-end suspend/resume support: tools can return suspension, agent detects and persists suspended turns (with pending tool-call state), callers can resume by supplying out-of-band ToolResults, and agent control flow, hooks, responses, and session storage are made suspension-aware.

Changes

Cohort / File(s) Summary
Core Agent Logic
agent.go
Implements suspend/resume in CreateResponse/generate: detect suspended sessions, validate/merge caller ToolResults, prepare resume state, run resume pre-loop, re-execute not-started tool calls, handle generate-loop suspension, add centralized finishSuspended and wire Hooks.OnSuspend.
Response & Tool Types
response.go, tool.go
Adds Response.Status (completed/suspended), ResponseItemTypeSuspended, Pending/CompletedToolCall types and SuspendedItem; adds ToolResult.Suspend and SuspendResult with constructors.
Public API / Options & Hooks
dive.go, hooks.go
Adds SuspendableSession interface, new resume-related errors, CreateResponseOptions.ToolResults and WithToolResults, and new exported OnSuspendHook type and phase semantics.
Session Core & Storage
session/session.go, session/file_store.go, session/memory_store.go
Persist Suspended flag and PendingCalls; add Session methods SaveSuspendedTurn, SaveResumedTurn, AbandonSuspension, accessors, SessionInfo.Suspended; file-store header updated and writes made atomic (temp file + rename + fsync).
Agent Tests
agent_suspend_test.go
New comprehensive integration tests and scripted LLM/tool doubles exercising suspend/resume flows, partial/parallel resumes, error cases, hooks, streaming semantics, persistence, and adversarial invariants.
Session Tests
session/session_test.go
Converted to external tests; added tests for suspend/resume persistence, cross-process round-trips, and pending-call preservation.
Task Integration
experimental/toolkit/extended/task.go
Treats subagent suspended CreateResponse as failure (convert suspend into error so tasks mark failure).
Examples & Dialog Spec
examples/suspend/..., examples/suspend/dialogspec/dialogspec.go
New examples demonstrating suspend/resume flows (async webhook, human approval, partial resume) and a dialogspec package to serialize dialog specs into suspend metadata.
Docs / Plans / PRD
docs/plans/*, docs/prds/*
Adds design, plan, and PRD documents describing suspend/resume behavior, tests, and follow-ups.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant Agent
    participant LLM
    participant Tool
    participant Session

    Caller->>Agent: CreateResponse(messages)
    Agent->>LLM: Generate (may emit tool_call)
    LLM-->>Agent: ToolUseMessage
    Agent->>Tool: Execute tool call(s)
    Tool-->>Agent: ToolResult (Suspend)
    Agent->>Session: SaveSuspendedTurn(messages, pendingCalls)
    Session-->>Agent: Persisted
    Agent->>Agent: fire OnSuspend hooks / PostGeneration
    Agent-->>Caller: Stream ResponseItemTypeSuspended (Status: suspended, PendingToolCalls)
Loading
sequenceDiagram
    participant Caller
    participant Agent
    participant Session
    participant Tool
    participant LLM

    Caller->>Agent: CreateResponse(..., ToolResults={id:result})
    Agent->>Session: Load suspended state
    Session-->>Agent: Suspended data (pending IDs)
    Agent->>Agent: Validate & merge provided ToolResults
    alt Not-started tools exist
        Agent->>Tool: Re-execute not-started tool calls
        Tool-->>Agent: ToolResult outcomes (complete or suspend)
    end
    Agent->>LLM: Generate with merged tool_result message
    LLM-->>Agent: Completion response
    Agent->>Session: SaveResumedTurn(messages, usage)
    Session-->>Agent: Cleared suspension
    Agent-->>Caller: Response(Status: completed, Output)
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Poem

🐰 A hop, a pause, a patient wait—

Pending calls tucked safe in state,
I nibble results and keep the thread,
Then bounce back when the answers are fed,
Resume the run and bound ahead!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 44.21% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Add agent suspend/resume for out-of-process tool results' clearly and concisely summarizes the main feature being added to the codebase, with strong alignment to the comprehensive changes across agent control flow, session persistence, and public APIs.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch agent-suspend-resume

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@agent.go`:
- Around line 414-421: The code currently routes partial resumes (when
rs.RemainingPending is non-empty) through finishSuspended which re-triggers the
session suspend handler and calls OnSuspend again; modify the partial-resume
path so it does not re-fire OnSuspend: detect the partial-resume case in the
block that creates suspendedSnapshot (uses RemainingPending,
RemainingPendingCalls, CompletedToolCalls()) and call a new or existing finish
helper that skips suspend notifications (e.g., finishPartialResume or
finishSuspended with a flag to suppress OnSuspend/eventCallback invocation),
ensuring suspendedSnapshot is preserved but eventCallback/OnSuspend is not
invoked for this branch.
- Around line 578-594: The suspended terminal ResponseItem is emitted before
verifying the session is suspendable and before persisting, so move the callback
invocation (the code creating ResponseItem with Type ResponseItemTypeSuspended
and Suspended SuspendedItem) to after the sess != nil check, the type assertion
to SuspendableSession, and after the persistence logic that creates the
suspended turn; also handle and return any error from callback instead of
ignoring it (i.e., check callback(ctx, ...) error and propagate) so consumers
don't receive a terminal suspended event when persistence or suspension
capability fails (references: callback,
ResponseItem/ResponseItemTypeSuspended/SuspendedItem, sess, SuspendableSession,
ErrSessionNotSuspendable).
- Around line 603-636: The suspended-turn persistence
(suspendable.SaveSuspendedTurn) is happening before running OnSuspend and
PostGeneration hooks, which can leave a saved suspended turn even if a hook
aborts; move the call to suspendable.SaveSuspendedTurn so it runs after the
loops that execute a.hooks.OnSuspend and a.hooks.PostGeneration and only when
those hooks return no error; keep existing error handling for HookAbortError and
other hook errors (HookAbortError.HookType assignments for "OnSuspend" and
"PostGeneration") and ensure hctx.Response/OutputMessages/Usage remain set
before running the hooks so the saved suspended turn persists the final state
after hooks succeed.
- Around line 330-356: When resuming (rs != nil) we currently drop caller input
by rebuilding messages from sessionMsgs; instead, if a session is suspended and
the caller supplied new input (len(inputMessages) > 0) we must return
ErrSuspendedSessionInput. Add a guard after prepareResume/rs creation that
checks if rs != nil && len(inputMessages) > 0 and returns
ErrSuspendedSessionInput, so prepareResume/sessionMsgs/rs logic remains
unchanged but we never silently discard input from callers (referencing rs,
prepareResume, sessionMsgs, inputMessages, and ErrSuspendedSessionInput).

In `@session/session.go`:
- Around line 266-299: SaveResumedTurn currently overwrites the last persisted
event even when the session isn't suspended; add a guard that rejects the call
if there is no suspended turn to replace. Concretely, at the start of
SaveResumedTurn check s.data.Suspended (and optionally that len(s.data.Events) >
0) and return a non-nil error (e.g., fmt.Errorf or errors.New) if the session is
not suspended or there are no events, so you never replace the last event unless
a suspended turn exists; keep the rest of the logic (creating evt, replacing
last event, clearing s.data.Suspended, updating timestamps, calling
s.appender.putSession) unchanged.
- Around line 110-116: Session fork and compaction are dropping or ignoring the
new persisted suspend state (fields Suspended and PendingToolCallIDs), causing
header/event-log mismatch; update Session.Fork to copy Suspended and
PendingToolCallIDs into the new forked Session and update Session.Compact to
preserve and respect suspension (i.e., prevent compaction or ensure compacted
header retains Suspended and PendingToolCallIDs from the latest events), and
adjust any helpers that construct session headers so they include Suspended and
PendingToolCallIDs consistently to keep resumability intact.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 63a81072-e2ef-4d4d-ada0-0a1388d063ed

📥 Commits

Reviewing files that changed from the base of the PR and between 7afd882 and 0fe7de6.

📒 Files selected for processing (11)
  • agent.go
  • agent_suspend_test.go
  • dive.go
  • experimental/toolkit/extended/task.go
  • hooks.go
  • response.go
  • session/file_store.go
  • session/memory_store.go
  • session/session.go
  • session/session_test.go
  • tool.go

Comment thread agent.go Outdated
Comment thread agent.go Outdated
Comment thread agent.go Outdated
Comment thread agent.go Outdated
Comment thread session/session.go Outdated
Comment thread session/session.go Outdated
Persist the full SuspendResult payload (Prompt, Metadata, Name, Input)
per pending tool call in a new session.PendingCall type, so cross-process
resume and partial-resume-again flows surface the same payload the tool
originally returned instead of dropping it.

Tighten suspend/resume semantics:
- Compact refuses on suspended sessions (ErrSuspendedSession).
- Fork clears suspension on the forked copy.
- SaveResumedTurn is guarded (ErrNotSuspended) against accidental overwrite.
- SaveSuspendedTurn failures propagate from CreateResponse instead of
  being swallowed, so callers never receive pending IDs that don't exist.
- OnSuspend HookAbortError rolls back via AbandonSuspension so the session
  is not stranded when the caller sees an error.
- Supplying a result for a non-pending ID (including already-completed)
  returns ErrUnknownPendingToolCall without mutating the session.

Interface rename: SuspendableSession.PendingToolCallIDs → PendingCalls,
SaveSuspendedTurn takes []session.PendingCall, ClearSuspension →
AbandonSuspension. The JSONL header field pending_tool_call_ids is
replaced by pending_calls — the branch has not shipped, so this is a
clean break.

14 new adversarial tests cover: partial-resume event count stability,
cross-process metadata survival, mixed sequential/parallel agents on the
same session, resume-time tool panic recovery, suspend on the last tool
iteration boundary, ctx cancel mid-execution, non-pending ID rejection,
fork clearing, compact refusal, SaveResumedTurn guard, SaveSuspendedTurn
error propagation, OnSuspend abort rollback, turn-message non-aliasing,
and OutputMessages invariants for generate-driven vs partial-resume
suspends. go test -race ./... clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
session/file_store.go (1)

323-369: ⚠️ Potential issue | 🟠 Major

Make full-session rewrites atomic.

Line 323 truncates the live JSONL with os.Create. The new suspend/resume transitions in session/session.go now rely on this full-rewrite path, so a crash or ENOSPC between truncate and Flush can wipe the pending-call state you're explicitly persisting for cross-process resume. Please write to a temp file and rename it into place after a successful flush/sync.

Suggested shape of the rewrite
-	f, err := os.Create(p)
+	tmp, err := os.CreateTemp(filepath.Dir(p), filepath.Base(p)+".*.tmp")
 	if err != nil {
 		return err
 	}
-	defer f.Close()
+	defer func() {
+		_ = tmp.Close()
+		_ = os.Remove(tmp.Name())
+	}()
 
-	w := bufio.NewWriter(f)
+	w := bufio.NewWriter(tmp)
 ...
-	return w.Flush()
+	if err := w.Flush(); err != nil {
+		return err
+	}
+	if err := tmp.Sync(); err != nil {
+		return err
+	}
+	if err := tmp.Close(); err != nil {
+		return err
+	}
+	return os.Rename(tmp.Name(), p)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@session/file_store.go` around lines 323 - 369, The current full-session
rewrite uses os.Create(p) and writes directly, which can truncate the live JSONL
and lose pending-call state on crash; change the implementation that writes the
sessionHeader/jsonlLine and events to instead create a temp file (os.CreateTemp)
in the same directory, write via bufio.NewWriter, call w.Flush(), call f.Sync()
(and then f.Close()), and only after a successful flush/fsync/close do an atomic
os.Rename(tempPath, p) to replace the original file; preserve the existing JSON
encoding steps for sessionHeader and jsonlLine, and ensure errors during
write/flush/sync cause removal of the temp file to avoid littering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@session/session.go`:
- Around line 76-99: clonePendingCalls currently only shallow-copies the
top-level Metadata map, allowing nested maps/slices in PendingCall.Metadata to
be mutated by callers; update clonePendingCalls (and/or add a helper) to
deep-copy nested JSON-like values for each Metadata entry (e.g., implement a
recursive deepCopyJSONValue(value any) any or marshal+unmarshal each Metadata
value using encoding/json) and assign the fully deep-copied result to
cp.Metadata so callers cannot mutate session internals; reference the
clonePendingCalls function and the PendingCall.Metadata field when making the
change.
- Around line 224-237: SaveTurn currently clears s.data.Suspended, drops
s.data.PendingCalls and appends an event, which breaks resumability by losing
partial tool messages; instead detect suspended sessions in SaveTurn and return
ErrSuspendedSession. Modify the SaveTurn implementation to check
s.data.Suspended at the start (or before mutating PendingCalls/Events), and if
true return ErrSuspendedSession (preserving state) rather than clearing the flag
or calling s.appender.putSession; ensure SaveResumedTurn remains the only path
that clears Suspended and rewrites via s.appender.putSession when resuming.

---

Outside diff comments:
In `@session/file_store.go`:
- Around line 323-369: The current full-session rewrite uses os.Create(p) and
writes directly, which can truncate the live JSONL and lose pending-call state
on crash; change the implementation that writes the sessionHeader/jsonlLine and
events to instead create a temp file (os.CreateTemp) in the same directory,
write via bufio.NewWriter, call w.Flush(), call f.Sync() (and then f.Close()),
and only after a successful flush/fsync/close do an atomic os.Rename(tempPath,
p) to replace the original file; preserve the existing JSON encoding steps for
sessionHeader and jsonlLine, and ensure errors during write/flush/sync cause
removal of the temp file to avoid littering.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6dd180ca-a7f6-41e9-9243-fda05f1b4426

📥 Commits

Reviewing files that changed from the base of the PR and between 0fe7de6 and b50a1dd.

📒 Files selected for processing (6)
  • agent.go
  • agent_suspend_test.go
  • dive.go
  • session/file_store.go
  • session/session.go
  • session/session_test.go
✅ Files skipped from review due to trivial changes (1)
  • agent.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • session/session_test.go

Comment thread session/session.go Outdated
Comment thread session/session.go Outdated
myzie and others added 2 commits April 10, 2026 13:42
Align finishSuspended with the design doc: OnSuspend and PostGeneration
run before SaveSuspendedTurn so a hook abort leaves the session in its
previous state instead of requiring AbandonSuspension compensation. The
terminal suspended stream item is now emitted after persistence succeeds
and callback errors propagate, so stream consumers never see a suspended
terminal for a call that ultimately returned an error.

Partial resumes (remaining pending non-empty) skip OnSuspend and the
terminal stream item since they continue an existing suspension rather
than announcing a new one.

Tighten the resume input guard to reject caller-supplied input whenever
the session is suspended, not only when tool results are absent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Deep-copy nested PendingCall metadata so callers cannot mutate session
internals through the returned map. Make SaveTurn return
ErrSuspendedSession instead of silently clearing suspend state, keeping
SaveResumedTurn as the only path that unwinds a suspension. Write
FileStore sessions to a temp file + atomic rename so a crash mid-write
cannot truncate the live JSONL and lose pending-call state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@session/file_store.go`:
- Around line 401-404: The rename of tmpPath to p must be followed by syncing
the parent directory to make the rename durable: after os.Rename(tmpPath, p)
succeeds in session/file_store.go (around the tmpPath -> p rename and committed
variable), open the parent directory (use filepath.Dir(p)), call Sync() on the
directory file, close it, and propagate any error from the dir sync so callers
know the rename wasn't durable; only set committed = true after both the rename
and the directory sync succeed. Ensure errors from opening or syncing the
directory are returned (and do not swallow them).

In `@session/session.go`:
- Around line 343-352: The code mutates s.data (e.g., s.data.Suspended,
s.data.PendingCalls, s.data.UpdatedAt and any replaced-last-event fields) before
calling s.appender.putSession; if putSession returns an error these in-memory
changes must be rolled back to their prior values to avoid divergence. For each
affected method (SaveSuspendedTurn, SaveResumedTurn, AbandonSuspension and the
shown block calling s.appender.putSession) snapshot the previous s.data fields
you will change, perform the mutations, call s.appender.putSession, and if it
returns an error restore the saved snapshot to s.data before returning the
error. Ensure the snapshot covers PendingCalls (deep clone or equivalent),
Suspended flag, UpdatedAt and any fields modified by the replace-last
optimization so the in-memory Session state remains consistent on failure.
- Around line 255-265: CreateResponse is currently swallowing the
ErrSuspendedSession returned by Session.SaveTurn; change CreateResponse so that
when Session.SaveTurn (and other save calls like SaveSuspendedTurn) return an
error (including ErrSuspendedSession) it is propagated back to the caller
instead of only logging and continuing. Locate the CreateResponse implementation
and replace the log-and-continue pattern around the call to s.SaveTurn(...) (the
same pattern used at agent.go:645-646 for SaveSuspendedTurn) with immediate
return of the error (wrapping or annotating as needed) so persistence failures
for suspended sessions are bubbled up to the caller.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a2201c0c-8933-44d6-8a55-99d4b2a65ffe

📥 Commits

Reviewing files that changed from the base of the PR and between 5efeb98 and ea429da.

📒 Files selected for processing (2)
  • session/file_store.go
  • session/session.go

Comment thread session/file_store.go
Comment thread session/session.go
Comment thread session/session.go Outdated
myzie and others added 5 commits April 11, 2026 13:15
Three correctness fixes from review:

1. session/file_store.go: fsync the parent directory after the rename
   so the rename itself is durable across power loss, not just the
   file contents.

2. session/session.go: snapshot mutated fields in SaveSuspendedTurn,
   SaveResumedTurn, and AbandonSuspension and roll back on putSession
   failure so the in-memory Session never diverges from disk.

3. agent.go: propagate SaveTurn and SaveResumedTurn errors out of
   CreateResponse instead of logging and continuing. Persistence
   failures must reach the caller; the SaveSuspendedTurn path already
   did this.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Four runnable examples illustrating the suspend/resume API:

- human_approval: synchronous in-process approval prompt
- async_webhook: cross-process resume via persisted FileStore session
- partial_resume: parallel tool batch with staged result delivery
- dialogspec: shared example helper

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Design doc lands under docs/plans alongside the other plans; PRD lands
under docs/prds following the existing numbering convention.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- session/file_store.go: add a Concurrency model section to the
  FileStore godoc making the single-writer-per-session contract
  explicit. Sequential cross-process handoff is supported and
  crash-consistent; concurrent access from multiple processes is
  not supported and should use a database-backed Session backend.

- docs/plans/plan-04-suspend-resume-review.md: combined review notes
  from two independent reviewers, organized by severity and themed.
  Top item is the "sessions are optional, not required" redesign
  bundle (H2 + H3 + M4) — suspend/resume must work for users who
  manage history themselves and don't use Session.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
session/session.go (1)

260-277: ⚠️ Potential issue | 🟠 Major

Roll back SaveTurn mutations when append-to-disk fails.

SaveTurn now propagates append errors, but it still appends to s.data.Events and updates UpdatedAt before calling appendEvent. If the file write fails, the live Session is ahead of disk, so reusing the same object after the error will diverge from a freshly reopened session.

🧯 Minimal rollback pattern
 func (s *Session) SaveTurn(ctx context.Context, messages []*llm.Message, usage *llm.Usage) error {
 	s.mu.Lock()
 	defer s.mu.Unlock()
 	if s.data.Suspended {
 		return ErrSuspendedSession
 	}
 	evt := &event{
 		ID:        newEventID(),
 		Type:      eventTypeTurn,
 		Timestamp: time.Now(),
 		Messages:  messages,
 		Usage:     usage,
 	}
+	prevUpdatedAt := s.data.UpdatedAt
 	s.data.Events = append(s.data.Events, evt)
 	s.data.UpdatedAt = evt.Timestamp
 	if s.appender != nil {
-		return s.appender.appendEvent(ctx, s.data.ID, evt)
+		if err := s.appender.appendEvent(ctx, s.data.ID, evt); err != nil {
+			s.data.Events = s.data.Events[:len(s.data.Events)-1]
+			s.data.UpdatedAt = prevUpdatedAt
+			return err
+		}
 	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@session/session.go` around lines 260 - 277, SaveTurn currently mutates
s.data.Events and s.data.UpdatedAt before calling s.appender.appendEvent, so if
appendEvent fails the in-memory Session diverges from disk; fix by either
performing appendEvent first and then updating in-memory state, or by capturing
the previous state (len(s.data.Events) and previous UpdatedAt) and rolling back
those mutations if appendEvent returns an error; update the function SaveTurn to
create evt as now, call s.appender.appendEvent(ctx, s.data.ID, evt) while
holding the lock (or without if safe), and only append evt to s.data.Events and
set s.data.UpdatedAt when appendEvent succeeds—or if you keep the current
ordering, restore s.data.Events (truncate to previous length) and
s.data.UpdatedAt to the prior value when appendEvent returns an error.
♻️ Duplicate comments (1)
session/session.go (1)

386-423: ⚠️ Potential issue | 🟠 Major

Reject resume when there is no suspended turn to replace.

This still appends a brand-new event when Suspended is true but the event log is empty. That silently “repairs” corrupted state instead of failing fast, and it breaks the core invariant that resume replaces the persisted suspended turn rather than manufacturing a new one.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@session/session.go` around lines 386 - 423, SaveResumedTurn currently
creates/appends a new event when s.data.Suspended is true but s.data.Events is
empty; instead, detect this corrupted state and reject the resume. In
SaveResumedTurn, after the Suspended check validate that len(s.data.Events) > 0
(or hasLastEvent) and if not return ErrNotSuspended (or a new
ErrNoSuspendedEvent if you prefer); do not generate evtID or append a new event
when there is no existing event to replace (keep the rollback snapshot logic
unchanged). Reference: function SaveResumedTurn, s.data.Suspended,
s.data.Events, ErrNotSuspended.
🧹 Nitpick comments (2)
examples/suspend/async_webhook/main.go (1)

48-57: Add a production warning around OnSuspend side effects.

This example is close to something users will copy into real code. Since OnSuspend runs before the suspended turn is durably committed, a real webhook POST from here can escape even if SaveSuspendedTurn later fails. A short warning or outbox note next to webhookNotifier would prevent a pretty nasty integration footgun.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/suspend/async_webhook/main.go` around lines 48 - 57, The
webhookNotifier function (used from OnSuspend) performs external network
side-effects at suspend time; add a clear production warning comment immediately
above webhookNotifier explaining that OnSuspend executes before the suspended
turn is durably committed and that performing real POSTs here can leak actions
if SaveSuspendedTurn later fails — instruct consumers to instead use an
outbox/durable queue (enqueue the webhook payload for delivery after commit) or
only log/print in this example; mention the functions OnSuspend and
SaveSuspendedTurn and the preferred outbox pattern so maintainers know how to
fix real integrations.
docs/plans/plan-04-suspend-resume.md (1)

958-958: Add language identifiers to fenced code blocks.

The fenced blocks at Line 958 and Line 1011 should declare a language (for example, text) to satisfy markdownlint MD040.

Also applies to: 1011-1011

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/plans/plan-04-suspend-resume.md` at line 958, Two fenced code blocks in
the document are missing language identifiers (triggering markdownlint MD040);
update the two triple-backtick blocks referenced around the existing examples
(the blocks at the locations mentioned in the review) to include an explicit
language token (e.g., ```text or ```bash) so the fenced code is annotated;
ensure both the block at the earlier occurrence and the one at the later
occurrence are updated consistently to satisfy MD040.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/plans/plan-04-suspend-resume.md`:
- Around line 30-33: Update the "additive only" claim so it no longer asserts
there are no renames/removals; explicitly state that this PR introduces breaking
contract/storage changes — mention the changed pending-call field/name and the
renamed suspension APIs (and the new Suspend field behaving as a nil-pointer
check) so readers are aware this is not fully additive and must handle
migration/compatibility considerations.
- Line 6: Update the stale PRD path references in
docs/plans/plan-04-suspend-resume.md by replacing occurrences of the string
"tasks/prd-agent-suspend-resume.md" (found at the link targets on line 6 and
line 14) with the new path "docs/prds/prd-agent-suspend-resume.md" so the links
point to the moved PR notes; ensure both instances are updated consistently.
- Around line 742-746: Update the design doc to match the current public API by
renaming ClearSuspension to AbandonSuspension and aligning method signatures and
semantics with the PR: replace any references to ClearSuspension and its older
signature with AbandonSuspension(ctx context.Context) error, update descriptions
to reflect the new resume semantics (how CreateResponse transitions out of
resume mode and the updated suspended/resumed error behavior), and make the same
changes in the other occurrences noted (around the other sections referenced).

In `@examples/suspend/async_webhook/main.go`:
- Line 66: The FileStore path passed to session.NewFileStore is unnormalized
which triggers the FileStore path validation to mismatch the root string and
cause ErrInvalidSessionID; update the example to normalize the path before
passing it to session.NewFileStore (e.g., use filepath.Clean or filepath.Abs on
"./async_webhook_sessions") so the store root used by NewFileStore matches the
validated root in the FileStore implementation and prevents ErrInvalidSessionID
when creating or reopening sessions.

In `@session/session.go`:
- Around line 444-471: AbandonSuspension currently clears Suspended and
PendingCalls while leaving the unmatched tool_use event in history, poisoning
the session; change it to detect the last event/unanswered tool_use and remove
or roll back that event from s.data.Events (so the session no longer contains a
half-completed turn) before clearing PendingCalls and setting
s.data.Suspended=false, still cloning previous state (use clonePendingCalls and
prevUpdatedAt) and persisting via s.appender.putSession; if persistence fails,
restore s.data.Events, s.data.PendingCalls, s.data.Suspended and UpdatedAt as
you do now to keep rollback behavior.

---

Outside diff comments:
In `@session/session.go`:
- Around line 260-277: SaveTurn currently mutates s.data.Events and
s.data.UpdatedAt before calling s.appender.appendEvent, so if appendEvent fails
the in-memory Session diverges from disk; fix by either performing appendEvent
first and then updating in-memory state, or by capturing the previous state
(len(s.data.Events) and previous UpdatedAt) and rolling back those mutations if
appendEvent returns an error; update the function SaveTurn to create evt as now,
call s.appender.appendEvent(ctx, s.data.ID, evt) while holding the lock (or
without if safe), and only append evt to s.data.Events and set s.data.UpdatedAt
when appendEvent succeeds—or if you keep the current ordering, restore
s.data.Events (truncate to previous length) and s.data.UpdatedAt to the prior
value when appendEvent returns an error.

---

Duplicate comments:
In `@session/session.go`:
- Around line 386-423: SaveResumedTurn currently creates/appends a new event
when s.data.Suspended is true but s.data.Events is empty; instead, detect this
corrupted state and reject the resume. In SaveResumedTurn, after the Suspended
check validate that len(s.data.Events) > 0 (or hasLastEvent) and if not return
ErrNotSuspended (or a new ErrNoSuspendedEvent if you prefer); do not generate
evtID or append a new event when there is no existing event to replace (keep the
rollback snapshot logic unchanged). Reference: function SaveResumedTurn,
s.data.Suspended, s.data.Events, ErrNotSuspended.

---

Nitpick comments:
In `@docs/plans/plan-04-suspend-resume.md`:
- Line 958: Two fenced code blocks in the document are missing language
identifiers (triggering markdownlint MD040); update the two triple-backtick
blocks referenced around the existing examples (the blocks at the locations
mentioned in the review) to include an explicit language token (e.g., ```text or
```bash) so the fenced code is annotated; ensure both the block at the earlier
occurrence and the one at the later occurrence are updated consistently to
satisfy MD040.

In `@examples/suspend/async_webhook/main.go`:
- Around line 48-57: The webhookNotifier function (used from OnSuspend) performs
external network side-effects at suspend time; add a clear production warning
comment immediately above webhookNotifier explaining that OnSuspend executes
before the suspended turn is durably committed and that performing real POSTs
here can leak actions if SaveSuspendedTurn later fails — instruct consumers to
instead use an outbox/durable queue (enqueue the webhook payload for delivery
after commit) or only log/print in this example; mention the functions OnSuspend
and SaveSuspendedTurn and the preferred outbox pattern so maintainers know how
to fix real integrations.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 6e8fcbb1-cbbf-486f-86d5-695f554c5ab2

📥 Commits

Reviewing files that changed from the base of the PR and between ea429da and 0dac708.

📒 Files selected for processing (10)
  • agent.go
  • docs/plans/plan-04-suspend-resume-review.md
  • docs/plans/plan-04-suspend-resume.md
  • docs/prds/prd-04-suspend-resume.md
  • examples/suspend/async_webhook/main.go
  • examples/suspend/dialogspec/dialogspec.go
  • examples/suspend/human_approval/main.go
  • examples/suspend/partial_resume/main.go
  • session/file_store.go
  • session/session.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • agent.go

Comment thread docs/plans/plan-04-suspend-resume.md Outdated
Comment thread docs/plans/plan-04-suspend-resume.md Outdated
Comment thread docs/plans/plan-04-suspend-resume.md Outdated
flag.Parse()

ctx := context.Background()
store, err := session.NewFileStore("./async_webhook_sessions")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use a normalized store path here.

Line 66 currently passes "./async_webhook_sessions", but the current FileStore path validation compares against the unnormalized root string. That makes this example fail with ErrInvalidSessionID before it can even create or reopen the session.

🛠️ Minimal example fix
-	store, err := session.NewFileStore("./async_webhook_sessions")
+	store, err := session.NewFileStore("async_webhook_sessions")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
store, err := session.NewFileStore("./async_webhook_sessions")
store, err := session.NewFileStore("async_webhook_sessions")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/suspend/async_webhook/main.go` at line 66, The FileStore path passed
to session.NewFileStore is unnormalized which triggers the FileStore path
validation to mismatch the root string and cause ErrInvalidSessionID; update the
example to normalize the path before passing it to session.NewFileStore (e.g.,
use filepath.Clean or filepath.Abs on "./async_webhook_sessions") so the store
root used by NewFileStore matches the validated root in the FileStore
implementation and prevents ErrInvalidSessionID when creating or reopening
sessions.

Comment thread session/session.go Outdated
myzie and others added 10 commits April 11, 2026 14:29
Combined changes. The first block is the "sessions are optional"
redesign (H2+H3+M4+M3+M5 from plan-04-suspend-resume-review.md);
the second block adds the production hardening follow-ups (M1,
L1, M2, L4).

Sessions-optional redesign:
- WithSuspension(*SuspensionState) passes authoritative resume state
  explicitly, so stateless callers and cross-process resumers no
  longer depend on a SuspendableSession being present.
- SuspendableSession shrunk to 3 methods (LoadSuspension,
  SaveSuspendedTurn, SaveResumedTurn); session.PendingCall deleted;
  session package consumes dive.PendingToolCall / dive.SuspensionState
  directly.
- Response.PendingToolCalls/CompletedToolCalls collapsed into
  Response.Suspension *SuspensionState with TurnMessageCount so
  stateless callers can locate the turn boundary without session
  plumbing.
- Resume is opt-in: suspended-with-no-opt-in now returns
  ErrSuspendedSessionNoOptIn instead of silently re-saving.
  ErrSessionNotSuspendable and AbandonSuspension removed.
- ToolResult tagged-union contract validated at the agent boundary:
  a tool that returns both Suspend and regular result fields flows
  through PostToolUseFailure as IsError.
- Input-decoding helpers on *PendingToolCall: UnmarshalInput(into any)
  and generic DecodePendingInput[T]. New stateless example.

Hardening follow-ups:
- M1: NewFileStoreWithSync(dir, sync bool) opts the hot-path
  appendEvent into f.Sync(); default NewFileStore keeps pagecache
  durability. Trade-off documented on FileStore.
- L1: CreateResponse accumulates OutputMessages, Items, and Usage
  across Stop-hook continuations; the synthetic user reason message
  injected by a Stop hook is appended to the accumulator so a suspend
  on the re-entered generate preserves both iterations plus the Stop
  reason in the persisted turn and on
  Response.Suspension.TurnMessageCount. Pinned by
  TestStopHookContinueThenSuspend.
- M2: package-level sessionLocks sync.Map keyed on Session.ID()
  serializes concurrent CreateResponse calls that share a session
  (including across agents). Stateless callers skip the lock. Pinned
  by TestConcurrentCreateResponseOnSameSessionSerialized under -race.
  dive.go godoc (Session interface, WithToolResults) updated to
  describe the guarantee.
- L4: OnSuspendHook godoc now references the Response.Suspension
  shape and explicitly states no compensating rollback is required
  because the hook runs before persistence.

plan-04-suspend-resume-review.md marks items 4, 5, 6, 8 of the
Follow-up PRs list as shipped.

go test -race -count=1 ./... and go vet ./... clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- SuspensionState.TurnMessages replaces TurnMessageCount. Callers
  (and the agent internally) work with the turn as an actual message
  slice rather than an index count — simpler to reason about and
  eliminates the off-by-one risk.
- WithResume(state, results) bundles the stateless-resume pair that
  previously required WithSuspension + WithToolResults. Session-backed
  callers still use WithToolResults only.
- Response.Suspension is now populated on both suspended responses
  AND on resume-completed responses (with PendingToolCalls nil), so
  stateless callers can flush the final merged turn into their local
  history via a single append without reconciling a stale partial
  tool_result from their saved state.
- NewSuspendResult(prompt, metadata) replaces NewSuspendResult +
  NewSuspendResultWithMetadata. One constructor, same capability.
- Renames: ErrSuspendedSessionNoOptIn → ErrResumeRequired,
  ErrSuspendedSessionInput → ErrInputOnSuspendedSession,
  ErrNoSuspendedState → ErrNoSuspendedTurn. Error messages no longer
  reference a non-existent AbandonSuspension API.
- SuspendResult.Metadata godoc explicitly warns about JSON
  round-tripping (int→float64, struct→map[string]any).
- session.ListOptions.Suspended *bool filter honored by both
  MemoryStore and FileStore for finding stale suspended sessions.
- withRollback helper DRYs the three snapshot/restore code paths in
  Session write methods.
- MemoryStore.List documents its store→session lock order.
- Compact godoc no longer references the removed AbandonSuspension.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Tests: rewrite stateless suspend/resume tests to use the
  preHistory + WithResume pattern, assert the merged TurnMessages on
  resume-completed Responses, and replace NewSuspendResult callsites.
  Parametrize a ListOptions.Suspended filter test across MemoryStore
  and FileStore. Update round-trip tests to assert LoadSuspension
  surfaces TurnMessages.
- Examples: rewrite the stateless example to demonstrate the new
  pattern (preHistory/suspension split, WithResume, flush-on-completion).
  dialogspec switches from NewSuspendResultWithMetadata to the single
  NewSuspendResult constructor.
- Review doc: append a "Final API polish pass" section summarizing
  the rename and shape changes shipped in this pass and mark the
  remaining L2 / cleanup checklist items complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a dedicated suspend-resume guide and cross-link it from the agents,
hooks, and custom-tools guides so the feature is discoverable from every
natural entry point. Extend the hook flow, error table, and next-steps
lists to cover OnSuspend. Update the top-level README, llms.txt, and
CLAUDE.md with pointers to the new material and the suspend/resume key
types so downstream AI assistants and new users land on the full flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Append a second round of adversarial tests that pin invariants the
existing suite didn't explicitly cover. Each test locks in a contract
that a careless refactor would otherwise be free to break:

- OnSuspend does not re-fire on a partial resume (no new transition).
- OnSuspend DOES re-fire when not-started sequential tools re-suspend.
- OnSuspend hook chain: HookAbortError stops the rest; regular errors
  are logged and the chain continues through PostGeneration + persist.
- SetSystemPrompt between suspend and resume is honored (counterpart to
  the existing SetModel test).
- WithResume overrides the session's stored suspension state, enabling
  cross-process resumers that hold a newer snapshot than on-disk.
- WithToolResults validation is atomic: a mix of valid and unknown IDs
  fails with no session mutation and no PostToolUse/PostToolUseFailure
  dispatch for the valid entries.
- An empty WithToolResults map is not a resume opt-in and returns
  ErrResumeRequired instead of silently no-op re-saving.
- A suspended tool's Call method is not re-invoked on resume — the
  caller-supplied result is authoritative.
- SuspensionState round-trips through JSON without losing the fields
  needed to drive a resume (the stateless wire-format contract).
- LoadSuspension returns a deep copy; mutating it does not affect the
  session.
- Mutating Response.Suspension on the caller side does not corrupt the
  session's persisted turn.
- Cancelling a multi-pending suspension with IsError results for every
  pending call fires PostToolUseFailure once per cancelled ID and
  completes the turn normally.

Also fix a stale "All three patterns" line in the suspend/resume guide
that listed four runnable examples, and add a brief note that the
dialogspec directory is a shared helper package rather than an example.

Full suite green under `go test -race -count=5 ./...`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verified each criterion against the implementation, tests, and docs;
checked off everything that landed across the suspend/resume series
(commits 1362d580912c76). Two substantive doc fixes alongside the
checkbox sweep:

- US-001 acceptance criterion split into two bullets to be honest:
  documentation + runnable examples are done; an async-mode toolkit
  AskUser is not (toolkit/ask_user.go is sync-by-design via Dialog).
  The pattern is demonstrated in examples/suspend/ for now.

- US-003 acceptance criterion rewritten to remove an internal
  contradiction with US-005. The original wording said partial resume
  "errors clearly," but US-005 (and TestPartialResumeTwice) defines
  partial resume as keeping the agent suspended. The bullet now
  describes the unknown-ID case, which is what ErrUnknownPendingToolCall
  and TestResumeUnknownID actually pin.

Status flipped Draft -> Implemented.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The PRD's last open US-001 acceptance criterion required at least one
built-in toolkit tool to demonstrate suspend/resume. AskUserTool was
the natural fit but until now it was sync-by-design: it called into a
dive.Dialog and blocked until the user answered, which is wrong for
SaaS workflows where the wait may take hours or days.

Add an Async option to AskUserToolOptions. When set, Call returns
NewSuspendResult(question, {"type": <input.Type>}) instead of invoking
Dialog. The integrator surfaces the question to a human out-of-band
and resumes the agent later with a tool result whose text content is
JSON-marshaled AskUserOutput. In async mode the Dialog field is
ignored and may be left nil; sync mode is unchanged.

Tests:
- toolkit/ask_user_test.go: 12 new tests covering both modes — name,
  schema, annotations, sync no-dialog error, sync confirm/select with
  AutoApproveDialog, sync deny path, async returns SuspendResult with
  clean tagged-union shape, async carries the question type in
  metadata, async ignores Dialog when both set, async works with nil
  Dialog, and Input round-trips through DecodePendingInput[*AskUserInput].
- agent_suspend_test.go: TestAskUserToolAsyncEndToEnd drives the real
  toolkit.AskUserTool through agent.CreateResponse end-to-end. Pins
  the suspend → integrator-supplies-AskUserOutput → resume → final
  answer flow including the LLM-side tool_result body, closing the
  loop on the PRD acceptance criterion.

Docs:
- docs/guides/suspend-resume.md: new "Built-in: toolkit.AskUserTool in
  async mode" subsection under "Suspending from a tool" so readers
  find the out-of-the-box option for the most common case.
- docs/prds/prd-04-suspend-resume.md: US-001's last bullet flipped
  from [ ] to [x] with a pointer to the new test.

Full suite green under go test -race -count=1 ./...

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ToolResultContent.Content is an `any` field, so once a message is
round-tripped through JSON — as happens transparently in Message.Copy,
session persistence, and cross-process suspend/resume — reading Content
directly yields the generic decoded shape (map[string]any, []any,
float64), discarding the caller's original Go types.

Add a custom UnmarshalJSON that preserves the original content bytes in
an unexported field alongside the generic Content value, plus a
DecodeContent helper (and DecodeToolResultContent[T] generic wrapper)
that replays those bytes into a typed destination. Integers stay
integers, typed structs decode into themselves, slices keep their
element types. Reading Content directly still gives the generic shape,
so existing callers are unaffected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… doc drift

- SaveTurn: roll back in-memory Events/UpdatedAt if appendEvent fails, keeping
  session consistent with what was actually persisted
- SaveResumedTurn: reject corrupted state (Suspended=true but empty Events)
  instead of creating an orphaned event
- FileStore: normalize dir with filepath.Clean in constructor so path()
  prefix check works with relative paths like "./sessions"
- Design doc: fix stale PRD path references (tasks/ → docs/prds/), update
  "additive only" claim to acknowledge breaking session contract changes,
  align SuspendableSession interface and sessionData to match actual code
  (LoadSuspension/SaveResumedTurn, PendingToolCalls, CompletedToolCalls),
  add language identifiers to two unfenced code blocks
- Example: add production warning on webhookNotifier about OnSuspend running
  before SaveSuspendedTurn commits (recommend outbox pattern)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…oolCalls preservation

Populate HookContext.Tool during resume post-hooks, propagate hook mutations
(Result rewriting, AdditionalContext injection) into the merged tool_result
message, and preserve rich CompletedToolCall data across partial resumes
instead of lossy reconstruction from message content blocks.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@myzie myzie merged commit 63a6394 into main Apr 12, 2026
2 checks passed
@myzie myzie deleted the agent-suspend-resume branch April 12, 2026 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant