Move subagent, compaction, todo, a2a, toolkit/firecrawl, toolkit/google, and toolkit/kagi from experimental/ to top-level stable paths. Update all import paths, documentation, and CLAUDE.md.
Fix production-readiness issues:
- data races in google/kagi search clients (package-level baseURL mutation)
- input mutation in google Search
- kagi type name stuttering
- a2a handler allocation per request
- subagent MapLoader returning internal map
Settings deferred from this batch due to experimental/sandbox dependency.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
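The baseURL data race mentioned in the commit message can be illustrated with a small sketch. This is not the actual toolkit/google or toolkit/kagi code: `Client`, `Option`, `WithBaseURL`, `NewClient`, `searchURL`, and the placeholder endpoint are all hypothetical names chosen for illustration. The idea is only that the endpoint lives on the client instance, so a test that overrides it no longer mutates a package-level variable shared by other goroutines.

```go
package main

import (
	"fmt"
	"net/url"
)

// defaultBaseURL is a placeholder endpoint (assumption), not a real search API URL.
const defaultBaseURL = "https://search.example.invalid/v1"

// Client keeps its endpoint on the instance, so overriding it never
// touches state shared with other clients or goroutines.
type Client struct {
	baseURL string
}

// Option configures a Client at construction time.
type Option func(*Client)

// WithBaseURL overrides the endpoint for this client only.
func WithBaseURL(u string) Option {
	return func(c *Client) { c.baseURL = u }
}

// NewClient applies the default endpoint first, then any options.
func NewClient(opts ...Option) *Client {
	c := &Client{baseURL: defaultBaseURL}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

// searchURL builds the request URL from the per-client base.
func (c *Client) searchURL(query string) string {
	return c.baseURL + "/search?q=" + url.QueryEscape(query)
}

func main() {
	prod := NewClient()
	test := NewClient(WithBaseURL("http://127.0.0.1:9999"))
	fmt.Println(prod.searchURL("go race detector"))
	fmt.Println(test.searchURL("go race detector"))
}
```

With this shape, parallel tests can each point their own client at a local test server without interfering with one another.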
📝 Walkthrough
Promotes several experimental packages to top-level (a2a, compaction, subagent, todo), updates imports/docs, introduces a new todo package and Extension with reminder/state hooks, adds todo/compaction integration tests, caches the A2A server handler, refactors toolkit clients (per-client baseURLs and cancellation-aware backoff), and hardens Firecrawl handling.
Sequence Diagram(s)
sequenceDiagram
participant Agent
participant TodoExt as TodoExtension
participant History as MessageHistory
participant Tool as TodoWriteTool
Agent->>TodoExt: PreGeneration/PreIteration hook
TodoExt->>History: scan messages (newest→oldest) for <todo-state> / tool-use
alt latest state found
History-->>TodoExt: returns todos + turnsSinceWrite
TodoExt->>TodoExt: compute staleness vs threshold
alt stale
TodoExt->>Agent: inject <system-reminder name="todos"> into first user message
else recent
TodoExt->>Agent: remove any existing reminder
end
else no state
TodoExt->>Agent: ensure reminder removed
end
Agent->>Tool: may call TodoWrite tool during generation
Tool-->>TodoExt: successful tool result captured by PostToolUse hook
TodoExt->>Agent: append hidden <todo-state> block to AdditionalContext
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
experimental/toolkit/extended/todo.go (1)
19-23: ⚠️ Potential issue | 🟡 Minor
Stale comment: update reference to canonical types location.
The comment says "canonical types are in experimental/todo" but they have been moved to the stable `todo` package.
📝 Proposed fix
-// Type aliases — canonical types are in experimental/todo
+// Type aliases — canonical types are in todo
 type (
 	TodoStatus = todo.TodoStatus
 	TodoItem   = todo.TodoItem
 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@experimental/toolkit/extended/todo.go` around lines 19 - 23, The file comment above the type aliases TodoStatus and TodoItem is stale; update it to reflect that the canonical types now live in the stable todo package (not experimental/todo). Edit the comment text near the type alias block (the line above the type (...) declaration) to reference the stable "todo" package or remove the "experimental" qualifier so it accurately documents that TodoStatus and TodoItem are aliases to the canonical types in todo.
docs/guides/a2a.md (1)
386-393: ⚠️ Potential issue | 🟡 Minor
Fix contradiction about package stability in the boundaries section.
This section says A2A is stable, but then says both A2A and MCP live in experimental packages.
✏️ Suggested doc fix
-MCP support remains a separate story. MCP is for tools and data; A2A is
-for agents. They live in different experimental packages and are
-independently adoptable.
+MCP support remains a separate story. MCP is for tools and data; A2A is
+for agents. MCP currently lives under `experimental/`, while A2A is in
+stable `a2a/`; they are independently adoptable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/guides/a2a.md` around lines 386 - 393, The paragraph contradicts itself about stability; decide which is correct and make the wording consistent: either (A) mark both a2a and MCP as experimental and state that a2a lives in an experimental package that intentionally does not leak types into the stable dive package (so *dive.Agent, *dive.Response, and dive.Session remain unchanged), or (B) if a2a is intended to be stable, change the sentence that calls it experimental to say MCP is experimental while clarifying that a2a’s public surface does not force core-API changes. Update the sentences mentioning a2a, dive, and MCP so they all reflect the chosen stability status consistently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1997a133-bef3-4f00-b648-cb6a82956b5a
📒 Files selected for processing (40)
- CLAUDE.md
- a2a/a2a_test.go
- a2a/client.go
- a2a/doc.go
- a2a/interop_test.go
- a2a/remoteagent.go
- a2a/rpc.go
- a2a/server.go
- a2a/task_store.go
- a2a/types.go
- compaction/compaction.go
- compaction/compaction_test.go
- compaction/hooks.go
- docs/README.md
- docs/guides/a2a.md
- docs/guides/agents.md
- docs/guides/compaction.md
- docs/guides/hooks.md
- docs/guides/subagents.md
- docs/guides/todo-lists.md
- examples/a2a_example/main.go
- experimental/cmd/dive/app.go
- experimental/cmd/dive/app_interactive_test.go
- experimental/cmd/dive/app_test.go
- experimental/cmd/dive/main.go
- experimental/cmd/dive/render.go
- experimental/toolkit/extended/task.go
- experimental/toolkit/extended/task_test.go
- experimental/toolkit/extended/todo.go
- subagent/loader.go
- subagent/subagent.go
- subagent/subagent_test.go
- todo/todo_tracker.go
- todo/types.go
- toolkit/firecrawl/firecrawl.go
- toolkit/firecrawl/firecrawl_test.go
- toolkit/firecrawl/firecrawl_types.go
- toolkit/google/search.go
- toolkit/google/types.go
- toolkit/kagi/kagi.go
- Translate wonton's "raw_html" format to Firecrawl's "rawHtml" (the v2 API rejects snake_case format names).
- Drop the redundant "v2 specific defaults" block (blockAds, removeBase64Images, proxy: auto, storeInCache); these match Firecrawl's own defaults and forcing them removed any future ability for callers to override.
- Map metadata.keywords (which Firecrawl returns as either a string or array) into fetch.Metadata.Keywords via a custom unmarshaler.
- Nil-check scrapeResp.Data and Data.Metadata before dereferencing.
- Add 400, 401, 404, and 502/503/504 to the friendly error branches.
- Cover the new behavior with unit tests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The TodoWrite tool was the last piece left in experimental/toolkit/extended/
after the previous promotion batch. Move it to the stable todo/ package and
restructure it the way the rest of Dive's tools work — as a dive.Extension
modeled after skill.Loader.
Architecture changes:
- Tool is stateless. The previous TodoWriteTool stored todos []TodoItem on
its struct, which meant a single instance reused across sessions or agents
bled state between them. The conversation message history is now the sole
source of truth for the current list. GetTodos / GetCurrentTask are
removed (zero callers in-tree outside the tool's own tests).
- New todo.Extension implements dive.Extension. It returns the TodoWrite
tool plus a PreGenerationHook that walks message history, finds the most
recent TodoWrite tool_use, and injects a <system-reminder name="todos">
block via dive.SetSystemReminder when the model has not used the tool in
the last N assistant turns (default 10, matching Claude Code's
TODO_REMINDER_CONFIG.TURNS_SINCE_WRITE). When the model uses TodoWrite
again, the next iteration removes the block.
- Both tool and hook are fully stateless, so a single Extension instance is
safe to share across agents, sessions, and subagents. No parallel store,
no per-tool-instance bleed. This is structurally simpler than Claude
Code's appState.todos[agentId ?? sessionId] map because Dive does not
have a process-singleton app state to begin with.
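The stateless reminder logic described in the architecture notes above can be sketched as a pure function over message history. The `Message` and `ToolUse` types here are simplified stand-ins, not Dive's `llm` types, and the `>=` comparison against the threshold is an assumption about how "has not used the tool in the last N assistant turns" is evaluated.

```go
package main

import "fmt"

// toolName mirrors the tool name used in the PR.
const toolName = "TodoWrite"

// ToolUse and Message are simplified, hypothetical stand-ins for the
// real message types.
type ToolUse struct {
	Name  string
	Input string
}

type Message struct {
	Role     string // "user" or "assistant"
	ToolUses []ToolUse
}

// latestTodoWrite walks history newest to oldest and returns the most recent
// TodoWrite input plus the number of assistant turns seen after it.
func latestTodoWrite(history []Message) (input string, turnsSince int, found bool) {
	for i := len(history) - 1; i >= 0; i-- {
		msg := history[i]
		if msg.Role != "assistant" {
			continue
		}
		// Scan tool uses in reverse so the last call in a message wins.
		for j := len(msg.ToolUses) - 1; j >= 0; j-- {
			if msg.ToolUses[j].Name == toolName {
				return msg.ToolUses[j].Input, turnsSince, true
			}
		}
		turnsSince++
	}
	return "", turnsSince, false
}

// needsReminder reports whether a list exists and has gone stale.
// With no list in history there is nothing to remind about.
func needsReminder(history []Message, threshold int) bool {
	_, turns, ok := latestTodoWrite(history)
	return ok && turns >= threshold
}

func main() {
	history := []Message{
		{Role: "user"},
		{Role: "assistant", ToolUses: []ToolUse{{Name: toolName, Input: `{"todos":[]}`}}},
		{Role: "user"},
		{Role: "assistant"},
		{Role: "user"},
		{Role: "assistant"},
	}
	_, turns, found := latestTodoWrite(history)
	fmt.Println(turns, found, needsReminder(history, 2), needsReminder(history, 10))
}
```

Because the function only reads its argument, the same instance can serve any number of agents or sessions, which is the property the commit message emphasizes.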
Behavioral changes:
- Tool result text now matches Claude's canonical TodoWrite acknowledgment
("Todos have been modified successfully. Ensure that you continue to use
the todo list to track your progress...") followed by a one-line progress
summary, instead of returning a JSON counts blob. Claude is tuned on the
canonical text shape.
- Empty list with all tasks complete appends a one-line nudge to either
write a fresh list or wrap up, but the list itself is retained (not
auto-cleared as Claude Code does — Dive's persistence is more useful).
API changes:
- experimental.NewTodoWriteTool(...) -> todo.NewTool(...) or todo.New(...)
- TodoWriteToolOptions{OnUpdate: fn} -> todo.WithOnUpdate(fn) /
todo.WithExtensionOnUpdate(fn)
- TodoWriteInput -> todo.WriteInput
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 4
🧹 Nitpick comments (1)
todo/tool_test.go (1)
36-137: Add a regression case for multiple `in_progress` todos.
Given the tool contract says only one active item at a time, please add a test that submits two `in_progress` items and asserts an error result. This will lock behavior and prevent future regressions.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@todo/tool_test.go` around lines 36 - 137, Add a new subtest inside TestTool_Call named "MultipleInProgress" that constructs a NewTool(), builds a WriteInput with two TodoItem entries both using TodoStatusInProgress, calls tool.Call(ctx, input), and asserts the call returns no error but result.IsError is true and result.Content[0].Text contains the expected validation message (e.g., "only one active item" or "only one active item at a time") to lock the behavior that only one in_progress item is allowed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 63acf304-592f-402f-a042-d3b9a53e964f
📒 Files selected for processing (8)
- CLAUDE.md
- docs/guides/todo-lists.md
- experimental/toolkit/extended/todo.go
- experimental/toolkit/extended/todo_test.go
- todo/extension.go
- todo/extension_test.go
- todo/tool.go
- todo/tool_test.go
💤 Files with no reviewable changes (2)
- experimental/toolkit/extended/todo_test.go
- experimental/toolkit/extended/todo.go
✅ Files skipped from review due to trivial changes (2)
- CLAUDE.md
- docs/guides/todo-lists.md
for _, c := range msg.Content {
	tu, ok := c.(*llm.ToolUseContent)
	if !ok || tu.Name != ToolName {
		continue
	}
	parsed, err := parseTodoInput(tu.Input)
	if err != nil {
		continue
	}
	return parsed, turnsSince, true
}
Pick the latest TodoWrite call within the same assistant message.
Line 142 iterates content in forward order. If one assistant message contains multiple TodoWrite tool uses, this can return an older call from that message. Iterate message content from end to start.
🔧 Suggested fix
-for _, c := range msg.Content {
-	tu, ok := c.(*llm.ToolUseContent)
+for j := len(msg.Content) - 1; j >= 0; j-- {
+	tu, ok := msg.Content[j].(*llm.ToolUseContent)
 	if !ok || tu.Name != ToolName {
 		continue
 	}
 	parsed, err := parseTodoInput(tu.Input)
 	if err != nil {
 		continue
 	}
 	return parsed, turnsSince, true
 }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
for j := len(msg.Content) - 1; j >= 0; j-- {
	tu, ok := msg.Content[j].(*llm.ToolUseContent)
	if !ok || tu.Name != ToolName {
		continue
	}
	parsed, err := parseTodoInput(tu.Input)
	if err != nil {
		continue
	}
	return parsed, turnsSince, true
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@todo/extension.go` around lines 142 - 152, The loop over msg.Content
currently scans forward and may return an older TodoWrite call; change the
iteration in the block that inspects msg.Content (looking for
*llm.ToolUseContent with tu.Name == ToolName and calling parseTodoInput) to
iterate from the end to the start (for i := len(msg.Content)-1; i >= 0; i--) so
the function picks the latest TodoWrite within the same assistant message and
returns parsed, turnsSince, true for the first matching (most recent) tool use.
func (t *Tool) Schema() *schema.Schema {
	return &schema.Schema{
		Type:     "object",
		Required: []string{"todos"},
		Properties: map[string]*schema.Property{
			"todos": {
				Type:        "array",
				Description: "The complete updated todo list",
				Items: &schema.Property{
					Type: "object",
					Properties: map[string]*schema.Property{
						"content": {
							Type:        "string",
							Description: "The task description in imperative form (e.g., 'Run tests')",
						},
						"status": {
							Type:        "string",
							Enum:        []any{"pending", "in_progress", "completed"},
							Description: "The task status: pending, in_progress, or completed",
						},
						"activeForm": {
							Type:        "string",
							Description: "The task in present continuous form (e.g., 'Running tests')",
						},
					},
				},
			},
		},
	}
🧩 Analysis chain
🏁 Script executed:
find . -name "tool.go" -path "*/todo/*" -type f
Repository: deepnoodle-ai/dive
Length of output: 76
🏁 Script executed:
cat -n ./todo/tool.go
Repository: deepnoodle-ai/dive
Length of output: 9016
Add Required slice to Items property schema to align with validation.
Lines 161–169 reject todo items missing content, status, or activeForm, but the schema at lines 115–131 does not declare these fields as required on the Items object. This forces the LLM to learn the constraints through rejection loops rather than schema inspection.
♻️ Schema alignment fix
 "todos": {
 	Type:        "array",
 	Description: "The complete updated todo list",
 	Items: &schema.Property{
 		Type: "object",
+		Required: []string{"content", "status", "activeForm"},
 		Properties: map[string]*schema.Property{
📝 Committable suggestion
func (t *Tool) Schema() *schema.Schema {
	return &schema.Schema{
		Type:     "object",
		Required: []string{"todos"},
		Properties: map[string]*schema.Property{
			"todos": {
				Type:        "array",
				Description: "The complete updated todo list",
				Items: &schema.Property{
					Type:     "object",
					Required: []string{"content", "status", "activeForm"},
					Properties: map[string]*schema.Property{
						"content": {
							Type:        "string",
							Description: "The task description in imperative form (e.g., 'Run tests')",
						},
						"status": {
							Type:        "string",
							Enum:        []any{"pending", "in_progress", "completed"},
							Description: "The task status: pending, in_progress, or completed",
						},
						"activeForm": {
							Type:        "string",
							Description: "The task in present continuous form (e.g., 'Running tests')",
						},
					},
				},
			},
		},
	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@todo/tool.go` around lines 107 - 135, The Item object in Tool.Schema() lacks
a Required list so the LLM can't rely on schema validation; update the Items
property inside Schema() (the nested *schema.Property under "todos" with Type
"object") to include Required: []string{"content","status","activeForm"} so each
todo item enforces those fields during validation; locate the Items property in
the Schema() method and add the Required slice to that nested object property.
func (t *Tool) PreviewCall(_ context.Context, input *WriteInput) *dive.ToolCallPreview {
	pending, inProgress, completed, _ := countByStatus(input.Todos)
	return &dive.ToolCallPreview{
		Summary: fmt.Sprintf("Update todos: %d pending, %d in progress, %d completed",
			pending, inProgress, completed),
	}
}

func (t *Tool) Call(_ context.Context, input *WriteInput) (*dive.ToolResult, error) {
	if input.Todos == nil {
		return dive.NewToolResultError("todos array is required"), nil
Guard nil input to avoid panics in preview/call paths.
Line 149 and Line 157 dereference input without a nil check. A nil payload can panic before producing a tool error result.
🛡️ Suggested nil guards
 func (t *Tool) PreviewCall(_ context.Context, input *WriteInput) *dive.ToolCallPreview {
+	if input == nil {
+		return &dive.ToolCallPreview{
+			Summary: "Update todos: 0 pending, 0 in progress, 0 completed",
+		}
+	}
 	pending, inProgress, completed, _ := countByStatus(input.Todos)
 	return &dive.ToolCallPreview{
 		Summary: fmt.Sprintf("Update todos: %d pending, %d in progress, %d completed",
 			pending, inProgress, completed),
 	}
 }
 func (t *Tool) Call(_ context.Context, input *WriteInput) (*dive.ToolResult, error) {
+	if input == nil {
+		return dive.NewToolResultError("input is required"), nil
+	}
 	if input.Todos == nil {
 		return dive.NewToolResultError("todos array is required"), nil
 	}
📝 Committable suggestion
func (t *Tool) PreviewCall(_ context.Context, input *WriteInput) *dive.ToolCallPreview {
	if input == nil {
		return &dive.ToolCallPreview{
			Summary: "Update todos: 0 pending, 0 in progress, 0 completed",
		}
	}
	pending, inProgress, completed, _ := countByStatus(input.Todos)
	return &dive.ToolCallPreview{
		Summary: fmt.Sprintf("Update todos: %d pending, %d in progress, %d completed",
			pending, inProgress, completed),
	}
}

func (t *Tool) Call(_ context.Context, input *WriteInput) (*dive.ToolResult, error) {
	if input == nil {
		return dive.NewToolResultError("input is required"), nil
	}
	if input.Todos == nil {
		return dive.NewToolResultError("todos array is required"), nil
	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@todo/tool.go` around lines 148 - 158, Both PreviewCall and Call dereference
input without checking for nil; add a nil guard at the start of PreviewCall and
Call to return an appropriate error/result when input == nil. In PreviewCall
(func (t *Tool) PreviewCall) check input == nil and return a ToolCallPreview
that indicates the missing payload (or an error preview) instead of calling
countByStatus; in Call (func (t *Tool) Call) check input == nil and return
dive.NewToolResultError("todos array is required") (or a clearer error message)
to avoid panics when accessing input.Todos. Ensure both guards run before any
access to input or calling countByStatus.
for i, item := range input.Todos {
	if item.Content == "" {
		return dive.NewToolResultError(fmt.Sprintf("todo[%d].content is required", i)), nil
	}
	if item.ActiveForm == "" {
		return dive.NewToolResultError(fmt.Sprintf("todo[%d].activeForm is required", i)), nil
	}
	if item.Status != TodoStatusPending && item.Status != TodoStatusInProgress && item.Status != TodoStatusCompleted {
		return dive.NewToolResultError(fmt.Sprintf("todo[%d].status must be 'pending', 'in_progress', or 'completed'", i)), nil
	}
}
Enforce the “one in_progress item” contract.
The description/rules state only one active task at a time, but Line 160-170 accepts multiple in_progress items. This weakens consistency of the todo state.
✅ Suggested validation
 func (t *Tool) Call(_ context.Context, input *WriteInput) (*dive.ToolResult, error) {
 	if input.Todos == nil {
 		return dive.NewToolResultError("todos array is required"), nil
 	}
+	inProgressCount := 0
 	for i, item := range input.Todos {
 		if item.Content == "" {
 			return dive.NewToolResultError(fmt.Sprintf("todo[%d].content is required", i)), nil
 		}
 		if item.ActiveForm == "" {
 			return dive.NewToolResultError(fmt.Sprintf("todo[%d].activeForm is required", i)), nil
 		}
 		if item.Status != TodoStatusPending && item.Status != TodoStatusInProgress && item.Status != TodoStatusCompleted {
 			return dive.NewToolResultError(fmt.Sprintf("todo[%d].status must be 'pending', 'in_progress', or 'completed'", i)), nil
 		}
+		if item.Status == TodoStatusInProgress {
+			inProgressCount++
+		}
 	}
+	if inProgressCount > 1 {
+		return dive.NewToolResultError("only one todo may be 'in_progress'"), nil
+	}
📝 Committable suggestion
func (t *Tool) Call(_ context.Context, input *WriteInput) (*dive.ToolResult, error) {
	if input.Todos == nil {
		return dive.NewToolResultError("todos array is required"), nil
	}
	inProgressCount := 0
	for i, item := range input.Todos {
		if item.Content == "" {
			return dive.NewToolResultError(fmt.Sprintf("todo[%d].content is required", i)), nil
		}
		if item.ActiveForm == "" {
			return dive.NewToolResultError(fmt.Sprintf("todo[%d].activeForm is required", i)), nil
		}
		if item.Status != TodoStatusPending && item.Status != TodoStatusInProgress && item.Status != TodoStatusCompleted {
			return dive.NewToolResultError(fmt.Sprintf("todo[%d].status must be 'pending', 'in_progress', or 'completed'", i)), nil
		}
		if item.Status == TodoStatusInProgress {
			inProgressCount++
		}
	}
	if inProgressCount > 1 {
		return dive.NewToolResultError("only one todo may be 'in_progress'"), nil
	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@todo/tool.go` around lines 160 - 170, The loop over input.Todos currently
validates each item's fields but does not enforce the rule that only one todo
may be in the TodoStatusInProgress state; add validation (after or during the
loop) that counts occurrences of TodoStatusInProgress in input.Todos and if
count > 1 return dive.NewToolResultError with a clear message like "only one
todo item may have status 'in_progress'". Use the existing input.Todos slice and
the TodoStatusInProgress constant to locate the check and return the error in
the same style as the other validations.
The todo.Extension now captures each successful TodoWrite as a hidden <todo-state> block on the tool_result message via a PostToolUse hook, so the stale-list reminder and downstream tooling can read authoritative state from conversation history instead of re-parsing raw tool_use input. compaction.CompactMessages carries the latest block forward into the summary message so todo state survives context rewriting. The state capture hook refuses to emit a block for failed tool results as a defensive guard against future dispatch refactors.
Adds end-to-end integration tests covering multi-turn persistence, reminder self-clearing after a second write, empty-list handling, malformed-block fail-safe behavior, suspend/resume preservation, and compaction-driven state carry-forward.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
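The state-capture idea in the commit message above can be sketched as a render/decode pair with a fail-safe on malformed input. The PR does not show the block's payload format, so the JSON-array encoding here is an assumption, and `renderStateBlock` and `decodeStateBlock` are hypothetical helper names; only the `<todo-state>` tag and the item field names come from the text.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// TodoItem uses the field names from the tool schema in the review.
type TodoItem struct {
	Content    string `json:"content"`
	Status     string `json:"status"`
	ActiveForm string `json:"activeForm"`
}

const (
	openTag  = "<todo-state>"
	closeTag = "</todo-state>"
)

// renderStateBlock wraps the list in a tagged block.
// Assumption: the payload is a JSON array of items.
func renderStateBlock(todos []TodoItem) (string, error) {
	payload, err := json.Marshal(todos)
	if err != nil {
		return "", err
	}
	return openTag + string(payload) + closeTag, nil
}

// decodeStateBlock extracts the last block in text. Any malformed block
// yields ok == false, so callers can fall back instead of failing.
func decodeStateBlock(text string) (todos []TodoItem, ok bool) {
	start := strings.LastIndex(text, openTag)
	if start < 0 {
		return nil, false
	}
	rest := text[start+len(openTag):]
	end := strings.Index(rest, closeTag)
	if end < 0 {
		return nil, false
	}
	if err := json.Unmarshal([]byte(rest[:end]), &todos); err != nil {
		return nil, false
	}
	return todos, true
}

func main() {
	block, err := renderStateBlock([]TodoItem{
		{Content: "Run tests", Status: "in_progress", ActiveForm: "Running tests"},
	})
	if err != nil {
		panic(err)
	}
	todos, ok := decodeStateBlock("Summary of earlier turns. " + block)
	fmt.Println(ok, len(todos))

	_, ok = decodeStateBlock("<todo-state>{not json</todo-state>")
	fmt.Println(ok)
}
```

Since `encoding/json` escapes `<` and `>` inside strings by default, a todo's text cannot accidentally close the block early in this encoding.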
Adds table cases for 400, 401, 404 and additional status codes to pin the client's HTTP error surface.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
🧹 Nitpick comments (3)
compaction/compaction.go (1)
212-226: Token estimate doesn't account for appended todo-state block.The
tokensAftercalculation on lines 222-223 only usessummaryPrefix + summaryTextlength, but if a<todo-state>block is appended (lines 214-216), this underestimates the actual token count. Consider including the state block length:🔧 Suggested fix
- // Step 7: Build compaction event - // TokensAfter is estimated from full summary message length (rough heuristic: ~4 chars per token) - fullSummaryLen := len(summaryPrefix) + len(summaryText) + // Step 7: Build compaction event + // TokensAfter is estimated from full summary message length (rough heuristic: ~4 chars per token) + fullSummaryLen := 0 + for _, c := range summaryMsg.Content { + if tc, ok := c.(*llm.TextContent); ok { + fullSummaryLen += len(tc.Text) + } + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@compaction/compaction.go` around lines 212 - 226, The token estimate ignores any appended todo-state block; update the tokensAfter calculation to include the length of text added to summaryMsg (including the todo.StateBlock content) rather than only fullSummaryLen derived from summaryPrefix+summaryText. Locate summaryMsg and, after appending possible todo.StateBlock (todo.StateBlock / summaryMsg.Content), compute the total summary length by summing summaryPrefix+summaryText plus any Text from summaryMsg.Content (or iterate summaryMsg.Content to accumulate Text lengths), then derive tokensAfter from that total and keep the existing minimum cap (tokensAfter = max(totalLen/4, 100)).todo/extension_integration_test.go (1)
22-93: Consider extracting shared test helpers.
scriptedLLM and todoWriteToolUse are duplicated between todo/extension_integration_test.go and compaction/todo_integration_test.go. Consider extracting to a shared internal/testutil package if more tests are added.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@todo/extension_integration_test.go` around lines 22 - 93, The test helpers scriptedLLM (type and methods: Name, Generate, calls, lastReceived) and todoWriteToolUse are duplicated across tests; extract them into a shared internal test helper package (e.g., internal/testutil) and update tests to import and use the helpers. Move the scriptedLLM type and its methods plus todoWriteToolUse and any dependent types (TodoItem, WriteInput, ToolName references) into that package, keep their APIs identical, update imports in todo/extension_integration_test.go and compaction/todo_integration_test.go to use testutil.scriptedLLM and testutil.todoWriteToolUse, and run tests to ensure no missing dependencies or visibility issues (export symbols if needed).
todo/state.go (1)
54-66: Consider iterating message content from end to start for consistency.
If a single message contains multiple TextContent blocks with <todo-state> markers (edge case but possible), the forward iteration returns the first (oldest) block in that message. For consistency with the newest-to-oldest message scan, consider iterating content in reverse:
🔧 Suggested fix
 for i := len(messages) - 1; i >= 0; i-- {
 	msg := messages[i]
-	for _, c := range msg.Content {
-		tc, ok := c.(*llm.TextContent)
+	for j := len(msg.Content) - 1; j >= 0; j-- {
+		tc, ok := msg.Content[j].(*llm.TextContent)
 		if !ok {
 			continue
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@todo/state.go` around lines 54 - 66, The current scan walks messages newest-to-oldest but iterates a single message's Content forward, which can return an older <todo-state> inside a message; change the inner loop over msg.Content to iterate from end to start (e.g., index from len(msg.Content)-1 down to 0) so you examine the newest TextContent first; keep the existing checks and returns that use llm.TextContent, parseStateBlock, cloneTodos, snap.Todos, snap.TurnsSinceWrite and turnsAfterState unchanged.
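To make the ordering concern in the todo/state.go comment concrete, here is a minimal standalone sketch of a scan that is newest-first at both levels. The `Message`, `Content`, and `TextContent` types below are simplified stand-ins rather than the real `llm` package types, and `latestStateText` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// Content is a stand-in for a message content block.
type Content interface{ isContent() }

// TextContent is a stand-in for a text block.
type TextContent struct{ Text string }

func (*TextContent) isContent() {}

// Message is a stand-in for a conversation message.
type Message struct{ Content []Content }

const marker = "<todo-state>"

// latestStateText scans messages newest to oldest and, within each message,
// content blocks last to first, so the most recently written block wins.
func latestStateText(messages []Message) (string, bool) {
	for i := len(messages) - 1; i >= 0; i-- {
		content := messages[i].Content
		for j := len(content) - 1; j >= 0; j-- {
			tc, ok := content[j].(*TextContent)
			if !ok {
				continue
			}
			if strings.Contains(tc.Text, marker) {
				return tc.Text, true
			}
		}
	}
	return "", false
}

func main() {
	msgs := []Message{
		{Content: []Content{&TextContent{Text: "<todo-state>v1</todo-state>"}}},
		{Content: []Content{
			&TextContent{Text: "<todo-state>v2</todo-state>"},
			&TextContent{Text: "<todo-state>v3</todo-state>"},
		}},
	}
	text, ok := latestStateText(msgs)
	fmt.Println(ok, text) // the v3 block is found first
}
```

With a forward inner loop the same input would yield the v2 block, which is the inconsistency the review comment points out.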
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e91f1e98-b103-463e-9c9f-d01fa7e8fe68
📒 Files selected for processing (10)
- CLAUDE.md
- compaction/compaction.go
- compaction/compaction_test.go
- compaction/todo_integration_test.go
- docs/guides/todo-lists.md
- todo/extension.go
- todo/extension_integration_test.go
- todo/extension_test.go
- todo/state.go
- toolkit/firecrawl/firecrawl_test.go
✅ Files skipped from review due to trivial changes (2)
- CLAUDE.md
- todo/extension_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- docs/guides/todo-lists.md
- toolkit/firecrawl/firecrawl_test.go
Summary
- Move subagent, compaction, todo, a2a, toolkit/firecrawl, toolkit/google, and toolkit/kagi from experimental/ to top-level stable paths
- Update all import paths, including experimental/cmd/dive and examples
- Move guides from docs/guides/experimental/ to docs/guides/ for promoted packages
- google/kagi search clients: baseURL was a package-level var mutated by option functions; moved to Client struct field. Retry sleeps now respect context cancellation.
- kagi: fixed type name stuttering (KagiClient→Client, WithKagiAPIKey→WithAPIKey, etc.)
- google: Search no longer mutates caller's input struct. Removed dead commented-out code.
- a2a: Handler() allocated a new http.ServeMux per call; now built once and cached.
- os.Getwd() call.
- subagent: MapLoader.Load() returns a defensive copy.
- settings deferred from this batch: it imports experimental/sandbox, which is staying experimental, and nothing currently imports it.
Test plan
- go vet ./... clean
- go test ./... all passing
- cd experimental/cmd/dive && go build succeeds
- cd examples && go vet ./... clean
- No stale experimental/ references in code, guides, or README (historical PRD/plan docs left as-is)
- docs/guides/experimental/ only contains still-experimental guides (mcp, sandboxing, media-generation)
🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Improvements
Tests
Documentation