Skip to content

Codex/agent core roadmap#2

Merged
echoVic merged 16 commits intomainfrom
codex/agent-core-roadmap
Apr 6, 2026
Merged

Codex/agent core roadmap#2
echoVic merged 16 commits intomainfrom
codex/agent-core-roadmap

Conversation

@echoVic
Copy link
Copy Markdown
Owner

@echoVic echoVic commented Apr 4, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Memory management system with read/write tools and persistent storage
    • Enhanced background agent execution with task management (create, get, update, list, stop)
    • Automatic context compression with four optimization tiers
    • Tool discovery and dynamic tool exposure via discovery interface
    • Context-aware skill activation based on file conditions
    • Environment context omission for subagents to reduce token usage
  • Improvements

    • Streaming tool execution support with incremental result handling
    • Tool source policies and permission tracking via denial history
    • Enhanced hook runtime for session-level event handling
    • Dynamic tool behavior resolution based on parameters

echoVic added 10 commits April 3, 2026 08:32
- 引入 SubagentRegistry 支持会话级别的子代理注册,不同会话间不共享自定义代理
- 新增 MemoryRead/MemoryWrite 工具,仅在显式传入 MemoryManager 时注册
- 添加结构化任务管理工具(TaskCreate/Get/Update/List/Stop)
- 扩展 AgentSession 接口,支持输出文件路径记录
- 改进令牌预算逻辑,添加收益递减检测和接近限制警告
- 完善测试覆盖和文档更新
将工具参数修复逻辑提取为独立模块,并在 executeToolCalls 中增加 onAfterToolExec 钩子
新增 StreamingToolExecutor 以支持流式响应中的工具调用,能够实时派发可解析的工具
修改 AgentLoop 以集成流式执行器,避免重复调用钩子并确保事件顺序正确
添加全面的单元测试覆盖新功能和集成场景
新增 RecoveryEvent 类型用于追踪恢复流程各阶段
提取独立的 isOverflowRecoverable 函数,增强对嵌套错误链(如 CannotRetryError)的检测能力
优化 AgentLoop 中的恢复逻辑,确保同一轮次内仅尝试一次恢复
添加对应测试用例验证恢复流程的正确性
- 在子代理执行时支持传递 `omitEnvironment` 配置以裁剪运行时上下文
- 为技能元数据添加 `runtimeEffects` 字段,显式定义运行时效果(允许的工具、模型ID)
- 引入 `HookBus` 和 `HookRuntime` 统一管理会话钩子,优化提示提交处理
- 扩展 `ExecutionContext` 和 `ChatContext` 以支持后台代理管理器
- 实现 `sideQuery` 方法用于旁路查询,避免污染主对话历史
- 新增 `microcompact` 策略,在 LLM 压缩前智能替换大型历史工具输出
- 重构后台代理管理器,分离生命周期控制器和工作控制器,改进取消逻辑
- 工具注册表添加排序缓存,提升工具列表获取性能
- 为 MCP 服务器添加能力投影,提供连接状态、健康检查和工具信息
- 新增多个单元测试,覆盖压缩处理、钩子运行时、循环状态等场景
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 4, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7ff8f7c2-c7b5-4012-807e-bdc88ce03326

📥 Commits

Reviewing files that changed from the base of the PR and between 6fcfa9e and ee7cb3f.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (113)
  • package.json
  • src/__tests__/rootExports.test.ts
  • src/acp/AcpServiceContext.ts
  • src/agent/Agent.ts
  • src/agent/AgentEvent.ts
  • src/agent/AgentLoop.ts
  • src/agent/LoopHookBuilder.ts
  • src/agent/LoopRunner.ts
  • src/agent/PlanExecutor.ts
  • src/agent/RuntimePatchManager.ts
  • src/agent/StreamingToolExecutor.ts
  • src/agent/__tests__/Agent.stream.test.ts
  • src/agent/__tests__/Agent.test.ts
  • src/agent/__tests__/AgentLoop.streaming.test.ts
  • src/agent/__tests__/AgentLoop.test.ts
  • src/agent/__tests__/LoopRunner.test.ts
  • src/agent/__tests__/StreamingToolExecutor.test.ts
  • src/agent/isOverflowRecoverable.ts
  • src/agent/loop/__tests__/executeToolCalls.test.ts
  • src/agent/loop/__tests__/planToolExecution.test.ts
  • src/agent/loop/executeToolCalls.ts
  • src/agent/loop/planToolExecution.ts
  • src/agent/loop/runToolCall.ts
  • src/agent/loop/toolInterruptBehavior.ts
  • src/agent/state/LoopState.ts
  • src/agent/state/TurnState.ts
  • src/agent/subagents/AgentSessionStore.ts
  • src/agent/subagents/BackgroundAgentManager.ts
  • src/agent/subagents/SubagentExecutor.ts
  • src/agent/subagents/__tests__/AgentSessionStore.test.ts
  • src/agent/subagents/__tests__/BackgroundAgentManager.test.ts
  • src/agent/subagents/__tests__/SubagentExecutor.test.ts
  • src/agent/types.ts
  • src/context/ContextManager.ts
  • src/context/README.md
  • src/context/strategies/MicrocompactStrategy.ts
  • src/hooks/HookRuntime.ts
  • src/hooks/HookStage.ts
  • src/hooks/PostToolUseHookStage.ts
  • src/hooks/__tests__/HookRuntime.test.ts
  • src/index.ts
  • src/prompts/builder.ts
  • src/runtime/RuntimeContextPatch.ts
  • src/runtime/RuntimePatch.ts
  • src/runtime/__tests__/RuntimePatch.test.ts
  • src/runtime/index.ts
  • src/services/FileSystemService.ts
  • src/session/Session.ts
  • src/session/SessionRuntime.ts
  • src/session/__tests__/SessionContext.test.ts
  • src/session/__tests__/SessionRuntime.test.ts
  • src/session/types.ts
  • src/skills/SkillLoader.ts
  • src/skills/SkillRegistry.ts
  • src/skills/__tests__/SkillLoader.test.ts
  • src/skills/__tests__/SkillRegistry.test.ts
  • src/skills/__tests__/activation.test.ts
  • src/skills/activation.ts
  • src/skills/index.ts
  • src/skills/injectSkillsMetadata.ts
  • src/skills/types.ts
  • src/tools/__tests__/createTool.test.ts
  • src/tools/builtin/file/edit.ts
  • src/tools/builtin/file/read.ts
  • src/tools/builtin/file/sensitivePathCheck.ts
  • src/tools/builtin/file/write.ts
  • src/tools/builtin/index.ts
  • src/tools/builtin/memory/memoryRead.ts
  • src/tools/builtin/memory/memoryWrite.ts
  • src/tools/builtin/search/glob.ts
  • src/tools/builtin/search/grep.ts
  • src/tools/builtin/shell/bash.ts
  • src/tools/builtin/shell/killShell.ts
  • src/tools/builtin/system/__tests__/askUserQuestion.test.ts
  • src/tools/builtin/system/__tests__/discoverTools.test.ts
  • src/tools/builtin/system/__tests__/skill.test.ts
  • src/tools/builtin/system/askUserQuestion.ts
  • src/tools/builtin/system/discoverTools.ts
  • src/tools/builtin/system/index.ts
  • src/tools/builtin/system/skill.ts
  • src/tools/builtin/task/__tests__/taskTools.test.ts
  • src/tools/builtin/task/task.ts
  • src/tools/builtin/task/taskOutput.ts
  • src/tools/builtin/todo/todoWrite.ts
  • src/tools/builtin/web/webFetch.ts
  • src/tools/builtin/web/webSearch.ts
  • src/tools/catalog/ToolCatalog.ts
  • src/tools/catalog/__tests__/ToolCatalog.test.ts
  • src/tools/catalog/index.ts
  • src/tools/core/ToolInvocation.ts
  • src/tools/core/createTool.ts
  • src/tools/execution/ExecutionPipeline.ts
  • src/tools/execution/FileLockManager.ts
  • src/tools/execution/PipelineStages.ts
  • src/tools/execution/ResultArtifactStore.ts
  • src/tools/execution/__tests__/ExecutionPipeline.test.ts
  • src/tools/execution/__tests__/FileLockManager.test.ts
  • src/tools/exposure/ToolExposurePlanner.ts
  • src/tools/exposure/__tests__/ToolExposurePlanner.test.ts
  • src/tools/exposure/index.ts
  • src/tools/registry/ToolRegistry.ts
  • src/tools/registry/__tests__/ToolRegistry.test.ts
  • src/tools/search/toolSearch.ts
  • src/tools/types/ExecutionTypes.ts
  • src/tools/types/ToolEffects.ts
  • src/tools/types/ToolTypes.ts
  • src/tools/types/__tests__/ToolEffects.test.ts
  • src/tools/types/index.ts
  • src/tools/validation/__tests__/zodSchemas.test.ts
  • src/tools/validation/lazySchema.ts
  • src/tools/validation/zodSchemas.ts
  • src/types/__tests__/permissions.test.ts
  • src/types/permissions.ts

📝 Walkthrough

Walkthrough

This pull request introduces comprehensive runtime management capabilities: a memory persistence system with read/write tools, background agent task execution with lifecycle management, a hook runtime system for callback integration, tool catalog and runtime patch mechanisms for skill/tool effects, streaming tool execution with progress tracking, context compression with microcompaction, and subagent registry improvements. Documentation updates cover automatic context compression, skills metadata specifications, and MCP capabilities reporting.

Changes

Cohort / File(s) Summary
Memory System
src/memory/..., src/tools/builtin/memory/...
New memory persistence layer with FileSystemMemoryStore, MemoryManager, and optional memory read/write tools (createMemoryReadTool/createMemoryWriteTool). Memory types include user, feedback, project, and reference categories with CRUD operations.
Background Agents & Tasks
src/agent/subagents/BackgroundAgentManager.ts, src/tools/builtin/task/...
Enhanced background agent execution with separate lifecycle/work controllers, session output files, and progress tracking. Refactored task tool from singleton to factory-based (TaskCreate, TaskGet, TaskUpdate, TaskList, TaskStop with per-session TaskStore).
Hook Runtime System
src/hooks/HookRuntime.ts, src/hooks/HookBus.ts, src/hooks/BashClassifier.ts
New HookRuntime for session-level hook management with pre/post-tool-use, permission request, user prompt, and session lifecycle hooks. BashClassifier categorizes bash commands (destructive/write/readonly). Removed legacy HookStage and PostToolUseHookStage.
Runtime Patches & Effects
src/runtime/RuntimePatch.ts, src/agent/RuntimePatchManager.ts
Runtime patch system enabling skill/tool effects to modify tool policy, context overlay, environment, model, and hook registrations. RuntimePatchManager tracks patch applications with provenance and manages turn/session-scoped effects.
Streaming Tool Execution
src/agent/StreamingToolExecutor.ts, src/agent/loop/runToolCall.ts, src/agent/loop/toolInterruptBehavior.ts
Streaming-aware tool execution with incremental content/tool-call assembly, concurrent batch processing with sibling-tool cascade cancellation, and interrupt behavior resolution. New ToolExecutionUpdate events for progress, messages, and effect propagation.
Context Compression
src/context/strategies/MicrocompactStrategy.ts, src/agent/CompactionHandler.ts
Microcompaction strategy as pre-compaction tier before soft/LLM/emergency tiers. Token recalculation after microcompact determines whether LLM/emergency compaction is needed.
Agent Loop & State Management
src/agent/AgentLoop.ts, src/agent/state/LoopState.ts, src/agent/state/TurnState.ts, src/agent/LoopRunner.ts
Refactored turn-scoped state via LoopState and TurnState interfaces. Per-turn prepareTurnState replaces static AgentLoopConfig dependencies. Enhanced with recovery tracking, tool planning via planning algorithms, and new tool execution pipelines.
Tool System Enhancements
src/tools/execution/ExecutionPipeline.ts, src/tools/execution/DenialTracker.ts, src/tools/catalog/...
ToolCatalog with source-based filtering and trust levels. Concurrency control with maxConcurrency batching (parallel vs serial). DenialTracker for permission denial history. Updated tool schemas (ToolSchemas.flag, ToolSchemas.semanticNumber).
Subagent Registry & Execution
src/agent/subagents/SubagentRegistry.ts, src/agent/subagents/SubagentExecutor.ts
Per-session SubagentRegistry with registration override semantics. SubagentExecutor accepts optional registry. Built-in agents (general-purpose, Explore, Plan) with omitEnvironment configuration for token optimization.
File Tools Refactoring
src/tools/builtin/file/..., src/tools/builtin/shell/bash.ts
Replaced extractSignatureContent/abstractPermissionRule with preparePermissionMatcher. Added resolveBehavior for dynamic tool classification, validateInput for pre-execution checks, maxResultSizeChars limits. Removed ACP mode logic.
System Tools Expansion
src/tools/builtin/system/discoverTools.ts, src/tools/builtin/system/skill.ts
New DiscoverTools tool for deferred tool activation. Skill tool now accepts args, applies activation conditions (paths), compiles runtime hooks, and returns runtimePatch effects with model/tool-policy overrides.
Session & Runtime
src/session/SessionRuntime.ts, src/session/Session.ts, src/services/ChatServiceInterface.ts
SessionRuntime now manages SubagentRegistry, BackgroundAgentManager, HookRuntime, and ToolCatalog. Added sideQuery to IChatService for non-dialogue queries. Tool-related stream events (tool_progress, tool_message, tool_runtime_patch, etc.) now emitted.
Documentation
docs/agents.md, docs/api-reference.md, docs/session.md, docs/tools.md, docs/skills.md, docs/hooks.md, docs/mcp.md, docs/blade-agent-sdk.md, README.md
Updated to document context auto-compression tiers, SubagentRegistry registration semantics, memory tools, background agents, skill activation conditions, hook architecture, MCP health-check states, and task tool registry isolation.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 Through memory's halls and hooks' refrain,
Tasks bloom in background's gentle plane,
With streaming whispers, patches dance,
The agent learns each new advance,
Context compressed, spirits soar—
Blade SDK opens wider door! 🚀

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/agent-core-roadmap

- 新增“自动上下文管理”章节,详细说明SDK内置的多层上下文压缩策略
- 在MCP文档中添加健康检查状态说明和工具排序规则
- 完善Hooks文档架构说明,明确内联回调和文件脚本的执行顺序
- 更新README和工具文档,反映新增的TaskStop工具和工具排序规则
- 在API参考中添加BashClassifier和DenialTracker等新增组件
- 新增“上下文自动压缩”章节,详细说明压缩层级和溢出恢复机制
- 完善子Agent文档,说明contextOmissions字段和后台Agent生命周期
- 扩展Skills文档,增加完整元数据字段说明和运行时效果
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
src/tools/registry/ToolRegistry.ts (1)

272-280: ⚠️ Potential issue | 🟠 Major

Remove the previous MCP tool from the indexes before hot-replacing it.

registerMcpTool() overwrites the map entry, but the old tool's category/tag memberships are never removed. If the replacement changes category or tags, getByCategory() and getByTag() can still surface the new tool from stale buckets.

🔧 Proposed fix
  registerMcpTool(tool: Tool): void {
-    if (this.mcpTools.has(tool.name)) {
-      // MCP工具可以覆盖(支持热更新)
-      this.mcpTools.delete(tool.name);
-    }
+    const previous = this.mcpTools.get(tool.name);
+    if (previous) {
+      // MCP工具可以覆盖(支持热更新)
+      this.removeFromIndexes(previous);
+    }

     this.mcpTools.set(tool.name, tool);
     this.updateIndexes(tool);
     this.invalidateSortedToolCaches();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/registry/ToolRegistry.ts` around lines 272 - 280, registerMcpTool
currently overwrites this.mcpTools without removing the old tool from
category/tag buckets, causing stale entries in getByCategory()/getByTag(); fix
by reading the existing tool before replacement (const existing =
this.mcpTools.get(tool.name)), and if present remove its memberships from the
indexes (the inverse of updateIndexes) — e.g., call a new helper like
removeFromIndexes(existing) or inline logic to remove existing.category and
existing.tags entries from the same index structures updateIndexes populates —
then proceed to set the new tool, call updateIndexes(tool) and
invalidateSortedToolCaches().
src/agent/Agent.ts (2)

301-309: ⚠️ Potential issue | 🟠 Major

Remove the new non-null assertions; CI is already failing on them.

Biome's noNonNullAssertion rule is red on both outcome.result! and result.metadata!, so this branch won't merge as-is. Narrow explicitly instead of asserting.

💡 Suggested fix
     const wrapper = async function* (): AsyncGenerator<AgentEvent, LoopResult> {
       const outcome = await generator;
       if ('events' in outcome && outcome.events) {
         for (const event of outcome.events) yield event;
       }
       if ('continuation' in outcome && outcome.continuation) {
         return yield* outcome.continuation;
       }
-      return outcome.result!;
+      if ('result' in outcome && outcome.result) {
+        return outcome.result;
+      }
+      throw new Error('Agent stream terminated without a result');
     };
     return wrapper();
   }
...
   private async executePlanApproval(
     enhancedMessage: UserMessageContent,
     context: ChatContext,
     loopOptions: LoopOptions,
     result: LoopResult,
   ): Promise<string> {
-    const targetMode = result.metadata!.targetMode as PermissionMode;
-    const planContent = result.metadata!.planContent as string | undefined;
+    const metadata = result.metadata;
+    const targetMode = metadata?.targetMode as PermissionMode | undefined;
+    const planContent = metadata?.planContent as string | undefined;
+    if (!targetMode) {
+      throw new Error('Missing targetMode in plan approval result');
+    }
     this.logger.debug(`🔄 Plan 模式已批准,切换到 ${targetMode} 模式并重新执行`);

Also applies to: 457-458

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/Agent.ts` around lines 301 - 309, The wrapper async generator uses
non-null assertions (outcome.result! and elsewhere result.metadata!) which
violates noNonNullAssertion; instead add explicit type narrowing: check the
shape of outcome (e.g. if 'result' in outcome && outcome.result !== undefined)
before returning outcome.result, and likewise guard any use of result.metadata
by verifying result and result.metadata are present (e.g. if ('result' in
outcome && outcome.result && outcome.result.metadata) { ... }). Update the
wrapper function and the similar branch around the other occurrence (the
result.metadata lines referenced) to use these explicit runtime checks rather
than the ! operator.

239-241: ⚠️ Potential issue | 🟠 Major

Carry the injected backgroundAgentManager into the approved rerun.

The plan phase uses withBackgroundAgentManager(context), but the follow-up execution rebuilds newContext from the original context. When the caller didn't provide a manager, the approved run loses the injected instance and task/background-agent tools can resolve against different state.

Also applies to: 461-464

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/Agent.ts` around lines 239 - 241, The approved rerun loses the
injected backgroundAgentManager because the plan-phase uses
withBackgroundAgentManager(context) but executePlanApproval(enhancedMessage,
context, loopOptions, result) and the later newContext rebuild (also around the
461-464 block) reconstruct context from the original context without copying
backgroundAgentManager; update the code paths that call executePlanApproval and
that recreate newContext to carry context.backgroundAgentManager (or the
injected manager property name used by withBackgroundAgentManager) into the
newContext passed to the approved execution so the injected manager instance is
preserved for task/background-agent tools.
src/agent/loop/executeToolCalls.ts (1)

83-92: ⚠️ Potential issue | 🟠 Major

Keep the toolUseUuid when execution fails.

If onBeforeToolExec creates a tool-use record and executionPipeline.execute() throws, the catch path resets toolUseUuid to null. That breaks correlation and cleanup for failed executions.

💡 Suggested fix
 async function executeToolCall(
   toolCall: FunctionToolCall,
   input: ExecuteToolCallsInput,
 ): Promise<ToolExecutionOutcome> {
   const logger = input.logger ?? NOOP_LOGGER.child(LogCategory.AGENT);
   let outcome: ToolExecutionOutcome;
+  let toolUseUuid: string | null = null;

   try {
     const params = JSON.parse(toolCall.function.arguments) as Record<string, unknown>;
     await repairToolCallParams(toolCall, params);

-    const toolUseUuid = await input.hooks?.onBeforeToolExec?.({
+    toolUseUuid = await input.hooks?.onBeforeToolExec?.({
       toolCall,
       params,
     }) ?? null;
...
     outcome = {
       toolCall,
       result: {
         success: false,
@@
           message: error instanceof Error ? error.message : 'Unknown error',
         },
       },
-      toolUseUuid: null,
+      toolUseUuid,
     };
   }

Also applies to: 107-121

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/loop/executeToolCalls.ts` around lines 83 - 92, The code clears
toolUseUuid to null when executionPipeline.execute() throws, breaking
correlation; update executeToolCalls so that the toolUseUuid variable (set from
input.hooks?.onBeforeToolExec in executeToolCalls) is preserved on error instead
of being reset to null—remove any catch-path assignments that overwrite
toolUseUuid and ensure subsequent error-handling calls (the catch that handles
executionPipeline.execute() failure and the later similar block) use the
preserved toolUseUuid when invoking hooks or cleanup logic so failed executions
remain correlated.
src/agent/subagents/BackgroundAgentManager.ts (2)

328-350: ⚠️ Potential issue | 🟠 Major

Publish terminal output on failures too.

The success path writes session.outputFile, but this catch path returns without updating it. Failed or cancelled background agents will therefore never emit a terminal file payload, which can leave TaskOutput-style readers stuck on missing or stale data.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/subagents/BackgroundAgentManager.ts` around lines 328 - 350, The
catch block in BackgroundAgentManager currently marks the session completed but
never publishes the terminal output file (session.outputFile), so
failed/cancelled agents don't emit the terminal payload; update the catch to
write/publish the terminal output just like the success path: retrieve the
session/output content (session.outputFile or equivalent), ensure
sessionStore.markCompleted includes the terminal output metadata, call the same
publish/write routine used on success (or invoke the same helper that writes
session.outputFile), and then log and return the error response; reference
this.sessionStore.markCompleted, session.outputFile, and this.logger.warn when
implementing the fix.

414-446: ⚠️ Potential issue | 🟠 Major

Don’t let resume swap the subagent config mid-session.

This reuses session.messages but trusts the caller-supplied config. Resuming an old transcript under a different config.name changes the system prompt/tool whitelist mid-session and can widen capabilities unexpectedly. Reject mismatches or resolve the config from session.subagentType instead.

💡 Minimal guard
   if (!session) {
     this.logger.warn(`Cannot resume agent ${agentId}: session not found`);
     return undefined;
   }
+
+  if (session.subagentType !== config.name) {
+    this.logger.warn(
+      `Cannot resume agent ${agentId}: expected ${session.subagentType}, got ${config.name}`,
+    );
+    return undefined;
+  }
 
   if (this.isRunning(agentId)) {
     this.logger.warn(`Cannot resume agent ${agentId}: still running`);
     return undefined;
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/subagents/BackgroundAgentManager.ts` around lines 414 - 446,
resumeAgent currently accepts a caller-supplied config while reusing
session.messages, which can swap the subagent type mid-session; update
resumeAgent to prevent that by either resolving the SubagentConfig from the
stored session.subagentType or by validating the incoming config.name (or
equivalent identifier) against session.subagentType and rejecting the resume if
they differ; locate the resumeAgent method and change the call-site before
startBackgroundAgent to (a) fetch the canonical config for session.subagentType
or (b) throw/log and return undefined when config.name !== session.subagentType,
ensuring existingMessages: session.messages is preserved and
startBackgroundAgent continues to receive the session-aligned config.
src/session/SessionRuntime.ts (1)

361-385: ⚠️ Potential issue | 🔴 Critical

The cwd-only canUseTool path bypasses static permission rules and forces confirmation on every tool.

Tools allowed by static permissions.allow rules will now prompt for confirmation if the session has a project directory but no permission hooks or baseCanUseTool.

Here's why:

  1. PermissionStage respects static rules and sets needsConfirmation = false for allowed tools
  2. ConfirmationStage immediately calls canUseTool if present (early return, ignoring needsConfirmation)
  3. createCanUseTool() activates whenever hasProjectDir is true, returning a callback that yields { behavior: 'ask' } when no hooks/baseCanUseTool provide a decision
  4. The 'ask' result overrides the static allow decision and triggers confirmation via handleLegacyConfirmation()

Static permission configurations become ineffective for cwd-only sessions. Move the hasProjectDir check into applyPermissionRequestHooks() so that static permissions are respected before the fallback to 'ask'.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/session/SessionRuntime.ts` around lines 361 - 385, The current
createCanUseTool early-returns when hasProjectDir is true which causes cwd-only
sessions to bypass static permission rules; instead remove the top-level
hasProjectDir gating and always return the async canUseTool that first calls
this.hookRuntime.applyPermissionRequestHooks(toolName, input, options) (pass
hasProjectDir via context or options so hooks can honor static rules), then
honor hookResult.decision if present, then call baseCanUseTool if defined, and
only after those fallbacks return { behavior: 'ask' } as a PermissionResult;
update references to hasProjectDir and ensure PermissionResult behavior 'ask' is
only reached when neither hooks nor baseCanUseTool decide.
src/agent/LoopRunner.ts (1)

384-397: ⚠️ Potential issue | 🔴 Critical

Compaction is still reading stale message state.

After executeWithAgentLoop() switches the loop over to loopState.messages, context.messages no longer tracks the live transcript. Both hooks below ignore the current message array and compact context.messages instead, so reactive recovery can retry the same oversized prompt unchanged and turn-limit compaction can drop the current turn from the summary. Use the live loop messages (ctx.messages / loopState.messages) as the compaction source of truth and update that same array on success.

Also applies to: 549-565

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/LoopRunner.ts` around lines 384 - 397, The onReactiveCompact
handler is compacting the stale context.messages instead of the live loop
messages; change the implementation used by onReactiveCompact (and the similar
handler at the other location) to call self.compactionHandler.reactiveCompact
with the live messages array (ctx.messages / loopState.messages) as the source
of truth, iterate the async generator as now, and when compaction succeeds
replace/update that same live array (ctx.messages or loopState.messages) with
the compacted results rather than mutating context.messages so reactive recovery
and turn-limit compaction operate on the current transcript.
🟠 Major comments (20)
src/tools/builtin/task/TaskStore.ts-121-147 (1)

121-147: ⚠️ Potential issue | 🟠 Major

Keep dependency edges consistent on update and delete.

addBlocks / addBlockedBy record ids on the current task even when the referenced task is missing, and delete() removes a task without pruning its id from peers. That leaves dangling blocks / blockedBy references, so tasks can stay blocked by ids that no longer resolve.

Also applies to: 154-156

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/builtin/task/TaskStore.ts` around lines 121 - 147, The update logic
for addBlocks/addBlockedBy currently inserts IDs even when the referenced task
is missing and delete() doesn't prune peers, leaving dangling edges; update the
block-handling in the addBlocks/addBlockedBy branches (the code that mutates
updated.blocks and updated.blockedBy and the loops that set this.tasks for
blockedTask/blockingTask) to only add an ID if this.tasks.has(referencedId)
(skip otherwise) and maintain symmetry by also adding the reciprocal edge on the
peer; additionally, modify delete(id) to iterate peers and remove id from their
blocks and blockedBy arrays (updating their updatedAt and persisting back into
this.tasks) so deleted tasks are pruned from others' references, and apply the
same cleanup for any removeBlocks/removeBlockedBy paths (lines around 154-156)
to keep edges consistent.
src/tools/builtin/task/taskUpdate.ts-33-35 (1)

33-35: ⚠️ Potential issue | 🟠 Major

This lookup misses the disk-backed task store.

TaskStore.getInstance(sessionId, configDir) and TaskStore.getInstance(sessionId) are different singleton keys, and recovery only happens via load(). As written, updates after a restart will hit a fresh in-memory store and report persisted tasks as missing.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/builtin/task/taskUpdate.ts` around lines 33 - 35, The code in
execute uses TaskStore.getInstance(sid) which misses the disk-backed singleton
key; change the lookup to include the configDir so the same persisted store is
returned (e.g., TaskStore.getInstance(sid, configDir) or
TaskStore.getInstance(context?.configDir ?? configDir) if configDir comes from
context) and then ensure recovery by calling store.load() if not already
invoked; update the reference in the execute function to use the two-argument
getInstance and invoke load() when necessary so updates post-restart see
persisted tasks.
src/tools/builtin/task/taskList.ts-8-10 (1)

8-10: ⚠️ Potential issue | 🟠 Major

TaskList is misclassified as a write tool.

This only reads task state. Marking it as ToolKind.Write will route it through write permissions and confirmations, which can block task inspection in stricter modes.

♻️ Proposed fix
-    kind: ToolKind.Write,
+    kind: ToolKind.ReadOnly,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/builtin/task/taskList.ts` around lines 8 - 10, The TaskList tool is
incorrectly classified as a write tool; change its kind from ToolKind.Write to
ToolKind.Read in the TaskList definition so it is treated as a read-only
operation. Update the object where name: 'TaskList' / displayName: 'List Tasks'
is defined to set kind: ToolKind.Read (ensure the ToolKind enum/import is
available) so task inspection uses read permissions instead of write
confirmations.
src/tools/builtin/task/TaskStore.ts-171-183 (1)

171-183: ⚠️ Potential issue | 🟠 Major

Don't silently reset the store on non-ENOENT failures.

The bare catch treats malformed JSON, permission errors, and truncated writes the same as a missing file. That turns persistence failures into silent task loss.

🛠️ Proposed fix
-    } catch {
-      // File doesn't exist yet — start fresh
+    } catch (error) {
+      if ((error as NodeJS.ErrnoException).code === 'ENOENT') {
+        return;
+      }
+      throw error;
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/builtin/task/TaskStore.ts` around lines 171 - 183, The load method
currently swallows all errors and silently resets the store; update
TaskStore.async load to only ignore a missing-file error (ENOENT) but rethrow or
surface any other exceptions (e.g., JSON.parse errors, permission errors from
readFile) so persistence failures don’t become silent data loss: when calling
readFile(this.persistPath, 'utf-8') and JSON.parse(raw) inside load(), catch the
error, check error.code === 'ENOENT' (or equivalent) and only then return/start
fresh; for any other error rethrow or log/propagate it instead of clearing
this.tasks.
src/tools/builtin/task/TaskStore.ts-115-117 (1)

115-117: ⚠️ Potential issue | 🟠 Major

Honor the advertised null-means-delete metadata contract.

The tool schema says metadata keys set to null should be removed, but this merge just stores null values. That leaves stale metadata behind and makes the update behavior misleading.

🧹 Proposed fix
-      ...(input.metadata !== undefined && {
-        metadata: { ...task.metadata, ...input.metadata },
-      }),
+      ...(input.metadata !== undefined && {
+        metadata: Object.fromEntries(
+          Object.entries({
+            ...(task.metadata ?? {}),
+            ...input.metadata,
+          }).filter(([, value]) => value !== null)
+        ),
+      }),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/builtin/task/TaskStore.ts` around lines 115 - 117, The current
merge stores null metadata values instead of removing keys; change the merge
logic that builds metadata from task.metadata and input.metadata so that for
each key in input.metadata you delete that key from the resulting metadata when
input.metadata[key] === null, otherwise set/overwrite it; implement this by
creating a shallow copy of task.metadata, iterating over
Object.entries(input.metadata) to delete or assign keys, and then assign the
cleaned result to metadata (omit metadata entirely if empty) so no null values
are persisted.
src/tools/builtin/task/taskList.ts-28-35 (1)

28-35: ⚠️ Potential issue | 🟠 Major

Filter blockedBy down to unresolved prerequisites.

The description says this field lists open tasks that still block execution, but the implementation returns every prerequisite id verbatim. After a dependency is completed, TaskList will still make the task look blocked.

♻️ Proposed fix
       const store = TaskStore.getInstance(sid);
       const tasks = await store.list();
+      const openTaskIds = new Set(
+        tasks.filter((t) => t.status !== 'completed').map((t) => t.id),
+      );
       const summary = tasks.map((t) => ({
         id: t.id,
         subject: t.subject,
         status: t.status,
         owner: t.owner ?? '',
-        blockedBy: t.blockedBy,
+        blockedBy: t.blockedBy.filter((id) => openTaskIds.has(id)),
       }));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/builtin/task/taskList.ts` around lines 28 - 35, The summary
currently returns t.blockedBy verbatim; change it to only include unresolved
prerequisites by filtering each t.blockedBy to IDs whose corresponding task in
the retrieved tasks list is not completed (e.g., status !== 'done' && status !==
'completed'); use the in-scope tasks array from store.list() to look up each
blocked ID (via tasks.find or a map of id->task) and include the ID only if the
found task exists and its status is not a terminal/completed state (if no
matching task, you may conservatively keep it); update the mapping that builds
summary in taskList.ts (the block that constructs summary from tasks) to apply
this filter to the blockedBy field.
src/agent/AgentEvent.ts-150-155 (1)

150-155: ⚠️ Potential issue | 🟠 Major

Carry turn and retry context on RecoveryEvent.

A bare { phase, reason } is too thin for the shared event stream promised in this file. Once CLI/Web consumers observe recovery asynchronously, retrying and failed are hard to correlate to the right turn without at least turn and the current retry attempt.

🔧 Proposed fix
 export interface RecoveryEvent {
   type: 'recovery';
+  turn: number;
+  attempt: number;
   phase: 'started' | 'retrying' | 'failed';
   reason: 'context_overflow' | 'reactive_compact' | 'recovery_exhausted';
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/AgentEvent.ts` around lines 150 - 155, RecoveryEvent is too minimal
to correlate async recovery notifications; update the RecoveryEvent interface to
include the originating turn identifier and current retry attempt (e.g., add
turn: string | number and attempt: number) and ensure producers set these fields
when emitting 'retrying' and 'failed' phases (look for uses of RecoveryEvent,
emit/replay logic, and any event constructors that create recovery events) so
consumers can correlate phase/reason to the correct turn and retry count.
src/tools/builtin/task/taskGet.ts-11-11 (1)

11-11: ⚠️ Potential issue | 🟠 Major

TaskGet 的工具类型应为只读,不应标记为 Write。

Line 11 当前会把纯读取操作纳入写入权限路径,导致额外确认或在严格模式下被错误拦截。

💡 Proposed fix
-    kind: ToolKind.Write,
+    kind: ToolKind.Read,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/builtin/task/taskGet.ts` at line 11, The TaskGet tool is
incorrectly marked as a write operation; locate the TaskGet definition where it
sets kind: ToolKind.Write and change it to the read-only variant (e.g.,
ToolKind.Read or ToolKind.ReadOnly depending on your ToolKind enum) so the tool
is treated as a pure read operation; update any related references/tests that
assert the kind if present to use the read enum value and run tests to confirm
no permission/strict-mode checks are triggered.
src/agent/loop/repairToolCallParams.ts-18-24 (1)

18-24: ⚠️ Potential issue | 🟠 Major

todos 修复逻辑应限制在 Task 工具内。

Line 18–24 目前会对任意工具的 params.todosJSON.parse,存在跨工具参数被意外改写的风险。

💡 Proposed fix
-  if (typeof params.todos === 'string') {
+  if (toolCall.function.name === 'Task' && typeof params.todos === 'string') {
     try {
       params.todos = JSON.parse(params.todos) as unknown;
     } catch {
       // Let the validation layer handle malformed todos payloads.
     }
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/loop/repairToolCallParams.ts` around lines 18 - 24, The current
repairToolCallParams logic mutates params.todos for any tool which risks
cross-tool corruption; restrict the JSON.parse so only the Task tool handles
parsing by removing/parsing code from repairToolCallParams and moving it into
the Task tool's params normalization (or guard here by checking the tool
identity, e.g., only parse when tool.name === 'Task' or tool.id === 'Task'),
ensuring you operate on params.todos only for the Task handler and let other
tools remain untouched; update repairToolCallParams to stop globally parsing
params.todos and implement parsing/validation inside the Task tool's request
handling or its normalize/validate function (refer to params.todos and the Task
tool handler).
src/tools/builtin/index.ts-61-64 (1)

61-64: ⚠️ Potential issue | 🟠 Major

Add error handling for loadFromStandardLocations call

The synchronous call to loadFromStandardLocations at line 63 can throw if file system operations fail (e.g., fs.readdirSync fails on the agents directory). While individual file parsing errors are handled internally, directory-level I/O errors propagate unhandled. Other callers like Agent.ts wrap this in try-catch for safety. Wrap the call in a try-catch block to gracefully handle potential failures.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/builtin/index.ts` around lines 61 - 64, The call to
registry.loadFromStandardLocations (when opts?.subagentRegistry is not provided)
can throw on directory-level I/O errors; wrap that invocation in a try-catch
around the branch that creates a new SubagentRegistry so filesystem errors are
caught, and in the catch log the error (using the file's logger or
console.error) with clear context including
"SubagentRegistry.loadFromStandardLocations" and the configDir, then continue
without crashing; keep the behavior unchanged when opts?.subagentRegistry is
supplied.
src/tools/builtin/task/taskStop.ts-37-54 (1)

37-54: ⚠️ Potential issue | 🟠 Major

Don't report success when the final update didn't happen.

The code checks existence with get(), but it still returns success: true when store.update() comes back empty. If another worker deletes the task between those calls, the user gets a false "stopped" response even though nothing was updated. Treat updated === undefined as a not-found/conflict result instead of falling back to a synthetic success payload.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/builtin/task/taskStop.ts` around lines 37 - 54, The current
taskStop implementation returns success even if store.update(taskId, ...)
returns undefined; change it to treat updated === undefined as a failure: after
calling store.update in the taskStop handler, if updated is undefined return
success: false with an appropriate llmContent/displayContent like `Task
#${taskId} not found or already removed` and an error object (e.g., type:
ToolErrorType.VALIDATION_ERROR) instead of falling back to a synthetic updated
payload; update any references to llmContent/displayContent/metadata to reflect
the failed update and ensure store.get and store.update are still used as-is.
src/tools/execution/DenialTracker.ts-21-31 (1)

21-31: ⚠️ Potential issue | 🟠 Major

Normalize denial reasons before putting them back into prompt context.

toSummary() injects reason verbatim, and this value can come from user rejection flows. A multiline or instruction-like reason will break the summary format and persist across later turns. Collapse whitespace / strip control characters before storing or rendering it.

Also applies to: 65-70

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/execution/DenialTracker.ts` around lines 21 - 31, The denial reason
must be normalized to remove control characters and collapse whitespace before
storing and before rendering: update DenialTracker.record(signature, toolName,
reason) to sanitize the incoming reason (e.g., replace \r/\n/t and other control
chars with a single space, collapse multiple spaces into one, and trim) and
store that sanitized string instead of the raw input; also ensure toSummary()
and the code around lines 65-70 use the sanitized reason (or sanitize again on
render) so multiline or instruction-like text cannot break the summary format.
src/memory/FileSystemMemoryStore.ts-26-39 (1)

26-39: ⚠️ Potential issue | 🟠 Major

Persist updatedAt; don't regenerate it on every read.

Right now save() returns a fresh timestamp but never writes it, and get() fabricates updatedAt: Date.now(). Anything that sorts or compacts memories by freshness will treat old entries as newly updated after every load.

💡 Suggested fix
   async save(memory: MemoryInput): Promise<Memory> {
     this.ensureSlug(memory.name);
     const filename = this.nameToFilename(memory.name);
     const contentPath = path.join(this.dir, filename);
+    const updatedAt = Date.now();

     await mkdir(this.dir, { recursive: true });
     await writeFile(
       contentPath,
       matter.stringify(memory.body, {
         name: memory.name,
         description: memory.description,
         type: memory.type,
+        updatedAt,
       }),
       'utf8'
     );

     const stored: Memory = {
       ...memory,
-      updatedAt: Date.now(),
+      updatedAt,
     };
...
       const fm = parsed.data as {
         name?: string;
         description?: string;
         type?: Memory['type'];
+        updatedAt?: number;
       };

-      if (!fm.name || !fm.description || !fm.type) {
+      if (
+        !fm.name ||
+        !fm.description ||
+        !fm.type ||
+        typeof fm.updatedAt !== 'number'
+      ) {
         return undefined;
       }

       return {
         name: fm.name,
         description: fm.description,
         type: fm.type,
         body: parsed.content.trim(),
-        updatedAt: Date.now(),
+        updatedAt: fm.updatedAt,
       };

Also applies to: 72-77

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/memory/FileSystemMemoryStore.ts` around lines 26 - 39, save() and get()
are currently fabricating updatedAt (save() returns a new timestamp but doesn't
persist it, and get() sets updatedAt = Date.now()), so update persistence and
reads to use a stored timestamp: in FileSystemMemoryStore.save(), set
memory.updatedAt = Date.now() and include that value in the data written via
matter.stringify (so the file contains updatedAt), and in
FileSystemMemoryStore.get() (and the similar load/deserialize area around lines
72-77) parse and return the updatedAt from the file contents rather than
assigning Date.now(); reference the save(), get(), contentPath,
matter.stringify, and Memory object to locate the spots to change.
src/memory/FileSystemMemoryStore.ts-96-101 (1)

96-101: ⚠️ Potential issue | 🟠 Major

Don't rewrite the index after a failed unlink().

Catching every unlink() error and still removing the entry makes transient EACCES/EBUSY failures look like a successful delete, leaving an orphaned .md file that list() can no longer discover. Only suppress ENOENT; otherwise keep the index intact and surface the failure.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/memory/FileSystemMemoryStore.ts` around lines 96 - 101, The current
deletion flow always writes the updated index even if unlink() failed, which
hides transient filesystem errors; change the try/catch around
unlink(path.join(this.dir, filename)) to only suppress ENOENT (file already
gone) and rethrow any other errors so the index is not rewritten on failure;
then proceed to call readIndex() and writeIndex(...) only if unlink succeeded or
the caught error was ENOENT. Reference unlink, this.dir, filename, readIndex,
writeIndex, and list() in FileSystemMemoryStore to locate and implement this
behavior.
src/agent/loop/executeToolCalls.ts-125-126 (1)

125-126: ⚠️ Potential issue | 🟠 Major

Guard onAfterToolExec so hook failures don't masquerade as tool failures.

This callback runs after the tool already executed. If it throws, executeToolCall() rejects and the caller can abort/retry even though the tool may have already mutated state. Wrap it in its own try/catch and report the hook failure separately.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/loop/executeToolCalls.ts` around lines 125 - 126, The post-tool
hook call in executeToolCall (input.hooks?.onAfterToolExec) must be guarded so
hook exceptions don't cause executeToolCall to reject and be mistaken for tool
failures; wrap the invocation of input.hooks?.onAfterToolExec(outcome) in its
own try/catch, log or attach the hook error to the returned outcome (e.g.,
outcome.hookError or outcome.metadata.hookError) and ensure the function still
returns the original outcome even if the hook throws.
src/memory/FileSystemMemoryStore.ts-41-51 (1)

41-51: ⚠️ Potential issue | 🟠 Major

Serialize MEMORY.md mutations or concurrent writes will drop entries.

save() and delete() both do a read-modify-write of the index with no coordination. With mixed/parallel execution and background agents in this PR, two writers can read the same old index and the last write wins, silently losing the other change.

Also applies to: 100-101, 121-125

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/memory/FileSystemMemoryStore.ts` around lines 41 - 51, The save() and
delete() implementations perform an uncoordinated read-modify-write on the index
(via readIndex() / writeIndex()) which allows concurrent writers to clobber each
other; fix by serializing index mutations — implement a simple file-level mutex
or optimistic-compare-and-replace around the MEMORY.md index: acquire a lock (or
load index along with a version/hash), readIndex(), apply the change (the code
that builds next in save() and deletes in delete()), then writeIndex() only if
the version/hash is unchanged (or while holding the lock), retry on conflict a
few times, and ensure writeIndex() is atomic (write temp file + fs.rename).
Reference functions: save(), delete(), readIndex(), writeIndex(), and the
MEMORY.md index file.
src/agent/Agent.ts-547-551 (1)

547-551: ⚠️ Potential issue | 🟠 Major

getAllNames().length > 0 is too broad as an "already loaded" sentinel.

This registry can now be injected/shared. If it already contains builtins or session agents from another agent/context, this early return skips loading user/project agents for the current storageRoot/cwd and leaves stale subagents in place. Track loaded sources explicitly, or reload file-backed sources with override: true.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/Agent.ts` around lines 547 - 551, The current early-return uses
subagentRegistry.getAllNames().length > 0 which is too broad and can skip
loading agents for the current project because the registry may contain agents
from another context; modify Agent initialization to detect whether the registry
already contains agents from THIS project/context (e.g., compare project dir or
storageRoot stored on the registry) or explicitly reload file-backed sources
with override: true instead of returning early. Concretely, replace the
getAllNames().length check with a targeted check (e.g.,
subagentRegistry.hasSourceForProject(getContextCwd(this.defaultContext) or
subagentRegistry.getLoadedSources().includes(this.storageRoot)) and if
file-backed sources exist call the registry's file-load/reload method with
override: true (or set a per-source loaded flag) so user/project agents are
always (re)loaded for the current project; keep setLogger and setProjectDir
calls as-is and remove the global early return that skips per-project loading.
src/agent/subagents/BackgroundAgentManager.ts-157-162 (1)

157-162: ⚠️ Potential issue | 🟠 Major

Rotate or clear the output file for each execution.

The path is derived only from agentId, and resumeAgent() reuses that same id. If the previous run already wrote a result, a resumed agent starts with stale output at the same path, so any reader polling the file can observe the old result before the new run finishes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/subagents/BackgroundAgentManager.ts` around lines 157 - 162, The
output file path derived from agentId (const id / outputFile) can reuse stale
results across runs; update BackgroundAgentManager (where resumeAgent and the
outputFile are created) to either (a) append a new execution-specific suffix
(e.g., timestamp or nanoid) so use `blade-agent-${id}-${executionId}.output` to
isolate runs, or (b) atomically clear/rotate the existing file at the start of
each execution (unlink or truncate the existing outputFile before
opening/writing, or rename it to a .old backup) so readers won't observe
previous results; make this change where id and outputFile are computed to
ensure each run has a fresh/unique output path or is cleared before use.
src/agent/__tests__/StreamingToolExecutor.test.ts-51-56 (1)

51-56: ⚠️ Potential issue | 🟠 Major

These async-generator mocks currently fail Biome.

Each of these async function* stubs returns or throws without ever yielding, and Biome reports lint/correctness/useYield on them. Add a no-op yield* [] as never[]; (like the other test mocks in this PR) or change the mock shape so CI lint passes.

Also applies to: 353-378, 434-439, 465-467

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/__tests__/StreamingToolExecutor.test.ts` around lines 51 - 56, The
async-generator test stubs (the anonymous async function* returned in the mocked
streams inside StreamingToolExecutor.test.ts) never yield, triggering Biome lint
rule lint/correctness/useYield; update each async function* mock (the ones at
the shown diff and also the other occurrences around the test file) to include a
no-op yield such as "yield* [] as never[];" before returning so the generator
yields at least once and the linter is satisfied, keeping the same return shape
({ content: '', toolCalls: [] }) afterwards.
src/hooks/HookRuntime.ts-91-124 (1)

91-124: ⚠️ Potential issue | 🟠 Major

skip from post-tool callbacks is currently a no-op.

afterExecute() can return { action: 'skip' } for PostToolUse / PostToolUseFailure, but ExecutionPipeline.applyAfterExecuteHooks() only honors abort and modifiedOutput. A session callback that asks to skip a post-tool result will therefore be silently ignored. Either translate skip to a concrete result here or stop exposing it from the post-exec path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/HookRuntime.ts` around lines 91 - 124, afterExecute currently
returns { action: 'skip' } from post-tool callbacks but
ExecutionPipeline.applyAfterExecuteHooks only honors 'abort' and
'modifiedOutput', so 'skip' is ignored; update afterExecute to translate any
output.action === 'skip' into a concrete modifiedOutput result (e.g., return {
modifiedOutput: undefined, reason: output.reason } or another explicit sentinel
your pipeline expects) instead of returning { action: 'skip' }, referencing the
afterExecute handler and HookEvent.PostToolUse / PostToolUseFailure so
ExecutionPipeline.applyAfterExecuteHooks will receive a handled value.
🟡 Minor comments (7)
src/tools/registry/ToolRegistry.ts-289-297 (1)

289-297: ⚠️ Potential issue | 🟡 Minor

Don't infer MCP membership from the tool name.

getSortedTools() uses this.mcpTools.has(tool.name) as a proxy for source. If an MCP tool shares a name with a builtin tool, the builtin instance is classified as MCP too, so getAll() no longer preserves the builtin-before-MCP ordering this cache is meant to guarantee.

🔧 Proposed fix
  private getSortedTools(tools: Tool[]): Tool[] {
+    const mcpToolSet = new Set(this.mcpTools.values());
     return [...tools].sort((left, right) => {
-      const leftIsMcp = this.mcpTools.has(left.name);
-      const rightIsMcp = this.mcpTools.has(right.name);
+      const leftIsMcp = mcpToolSet.has(left);
+      const rightIsMcp = mcpToolSet.has(right);
       if (leftIsMcp !== rightIsMcp) {
         return leftIsMcp ? 1 : -1;
       }
       return left.name.localeCompare(right.name);
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/registry/ToolRegistry.ts` around lines 289 - 297, getSortedTools is
incorrectly using this.mcpTools.has(tool.name) to infer MCP membership by name;
change the membership test to use object identity (e.g., this.mcpTools.has(left)
/ this.mcpTools.has(right)) so a builtin tool that shares a name with an MCP
tool isn’t misclassified. Ensure the mcpTools collection stores Tool instances
(Set<Tool>) rather than names, and update the code that populates mcpTools to
add the actual Tool objects (where mcpTools is populated) so getSortedTools can
reliably call this.mcpTools.has(tool).
src/tools/builtin/task/taskCreate.ts-23-25 (1)

23-25: ⚠️ Potential issue | 🟡 Minor

建议禁止空字符串任务标题/描述。

Line 23–24 的 z.string() 会接受 '',可能创建无效任务。建议至少 trim().min(1)

💡 Proposed fix
-      subject: z.string().describe('A brief, actionable title in imperative form (e.g., "Fix authentication bug in login flow")'),
-      description: z.string().describe('What needs to be done'),
+      subject: z.string().trim().min(1).describe('A brief, actionable title in imperative form (e.g., "Fix authentication bug in login flow")'),
+      description: z.string().trim().min(1).describe('What needs to be done'),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/builtin/task/taskCreate.ts` around lines 23 - 25, The schema
currently allows empty strings for subject and description; update the zod
validators for subject and description in taskCreate.ts to prevent blank values
by using z.string().trim().min(1). Also consider applying the same guard to
activeForm (z.string().optional().transform or
z.string().trim().min(1).optional()) if you want to disallow empty optional
strings; update the schema entries for subject, description (and optionally
activeForm) accordingly so blank strings are rejected.
src/agent/__tests__/LoopRunner.test.ts-205-207 (1)

205-207: ⚠️ Potential issue | 🟡 Minor

Generator mock doesn't yield, which may cause issues if the caller iterates.

The chatWithRetryEvents mock is an async generator but never yields. If the actual code iterates this generator expecting events, it will receive nothing. Consider either making this a regular async function if yields aren't needed, or adding a minimal yield.

🔧 Suggested fix: Return directly or yield the result
-        chatWithRetryEvents: vi.fn(async function* (...args: Parameters<typeof chatFn>) {
-          return await chatFn(...args);
-        }),
+        chatWithRetryEvents: vi.fn(async function* (...args: Parameters<typeof chatFn>) {
+          yield { type: 'start' };
+          return await chatFn(...args);
+        }),

Or if the mock just needs to return the result without yielding events:

-        chatWithRetryEvents: vi.fn(async function* (...args: Parameters<typeof chatFn>) {
-          return await chatFn(...args);
-        }),
+        chatWithRetryEvents: vi.fn(async (...args: Parameters<typeof chatFn>) => {
+          return await chatFn(...args);
+        }),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/__tests__/LoopRunner.test.ts` around lines 205 - 207, The mock for
chatWithRetryEvents is declared as an async generator but never yields, so
callers iterating it will get no events; update the mock in the test to either
make chatWithRetryEvents a regular async function that returns await
chatFn(...args) or, if iteration is expected, yield at least one value (e.g.,
yield the result of await chatFn(...args)) so consumers of the async generator
see events; locate the mock definition of chatWithRetryEvents in
LoopRunner.test.ts and change its implementation accordingly to match the real
chatWithRetryEvents behavior relative to chatFn.
src/agent/state/LoopState.ts-114-119 (1)

114-119: ⚠️ Potential issue | 🟡 Minor

Clear transitionReason when recovery is reset.

After resetRecovery(), buildTurnState() will keep surfacing the last recovery reason until some caller clears it separately, so later turns can carry stale recovery metadata.

💡 Proposed fix
   resetRecovery(): void {
     this.recovery = {
       attempt: 0,
       hasAttemptedReactiveCompact: false,
     };
+    this.transitionReason = undefined;
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/state/LoopState.ts` around lines 114 - 119, In
LoopState.resetRecovery(), also clear the transitionReason field so that
subsequent calls to buildTurnState() won't surface stale recovery metadata;
update the LoopState.resetRecovery() implementation to reset
this.transitionReason (or the containing state property name) alongside
resetting this.recovery so recovery.attempt and hasAttemptedReactiveCompact are
cleared and transitionReason becomes undefined/null.
src/agent/StreamingToolExecutor.ts-377-384 (1)

377-384: ⚠️ Potential issue | 🟡 Minor

Fix non-null assertion flagged by pipeline at line 383.

Same pattern as before - store the outcome in a variable before assigning to the results array.

🔧 Proposed fix
-      input.executionResults[input.index] = {
+      const outcome: ToolExecutionOutcome = {
         toolCall: input.toolCall,
         result,
         toolUseUuid,
       };
+      input.executionResults[input.index] = outcome;
 
-      await input.executionConfig.onAfterToolExec?.(input.executionResults[input.index]!);
+      await input.executionConfig.onAfterToolExec?.(outcome);
       await input.executionConfig.onToolComplete?.(input.toolCall, result);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/StreamingToolExecutor.ts` around lines 377 - 384, The non-null
assertion on input.executionResults[input.index]! should be avoided; create a
local const (e.g., const outcome = { toolCall: input.toolCall, result,
toolUseUuid }) then assign input.executionResults[input.index] = outcome and
pass outcome into both await input.executionConfig.onAfterToolExec?.(outcome)
and await input.executionConfig.onToolComplete?.(input.toolCall, result) to
remove the forced non-null and ensure the same object is used for callbacks.
src/agent/StreamingToolExecutor.ts-401-412 (1)

401-412: ⚠️ Potential issue | 🟡 Minor

Fix non-null assertion flagged by pipeline at line 411.

The assertion is safe due to the has/set logic, but the linter requires removal. Restructure to avoid the assertion.

🔧 Proposed fix
-    if (!accumulator.has(index)) {
-      accumulator.set(index, {
+    let entry = accumulator.get(index);
+    if (!entry) {
+      entry = {
         id: toolCallChunk.id || '',
         name: toolCallChunk.function?.name || '',
         arguments: '',
         dispatched: false,
         cancelled: false,
-      });
+      };
+      accumulator.set(index, entry);
     }
 
-    const entry = accumulator.get(index)!;
-
     if (toolCallChunk.id && !entry.id) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/StreamingToolExecutor.ts` around lines 401 - 412, Replace the
non-null assertion on accumulator.get(index) by ensuring variable is assigned
safely: change to a let entry = accumulator.get(index) and if it's undefined
create the same object (using toolCallChunk.id, toolCallChunk.function?.name,
etc.), set it into accumulator, and then continue using entry; this removes the
need for the "!" while preserving the existing has/set semantics for accumulator
and uses the existing symbols accumulator, index, and toolCallChunk.
src/agent/StreamingToolExecutor.ts-281-294 (1)

281-294: ⚠️ Potential issue | 🟡 Minor

Fix non-null assertions flagged by pipeline.

The pipeline reports lint/style/noNonNullAssertion errors. While the assertions are safe here (value was just assigned), storing the outcome in a variable before assignment eliminates the need for assertions.

🔧 Proposed fix
       if (input.batchController.signal.aborted) {
         entry.cancelled = true;
-        input.executionResults[index] = {
+        const outcome: ToolExecutionOutcome = {
           toolCall: this.toFunctionToolCall(entry.id, entry.name, entry.arguments),
           result: this.buildCascadeAbortResult(),
           toolUseUuid: null,
         };
-        await input.executionConfig.onAfterToolExec?.(input.executionResults[index]!);
+        input.executionResults[index] = outcome;
+        await input.executionConfig.onAfterToolExec?.(outcome);
         await input.executionConfig.onToolComplete?.(
-          input.executionResults[index]!.toolCall,
-          input.executionResults[index]!.result,
+          outcome.toolCall,
+          outcome.result,
         );
         continue;
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/StreamingToolExecutor.ts` around lines 281 - 294, The non-null
assertion usage on input.executionResults[index]! should be removed by creating
a local variable for the entry result before assigning it into
input.executionResults; e.g., construct a const execResult = { toolCall:
this.toFunctionToolCall(entry.id, entry.name, entry.arguments), result:
this.buildCascadeAbortResult(), toolUseUuid: null } then assign
input.executionResults[index] = execResult and pass execResult to
input.executionConfig.onAfterToolExec and input.executionConfig.onToolComplete
so you can drop the `!`s; update the block inside StreamingToolExecutor where
input.batchController.signal.aborted is handled to use this pattern and
reference the existing helpers toFunctionToolCall, buildCascadeAbortResult,
onAfterToolExec, and onToolComplete.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3ae8cf28-37e0-4c57-aa79-e9daa8adafbe

📥 Commits

Reviewing files that changed from the base of the PR and between c9d2167 and e1d56ad.

📒 Files selected for processing (89)
  • docs/agents.md
  • docs/api-reference.md
  • docs/session.md
  • docs/tools.md
  • src/agent/Agent.ts
  • src/agent/AgentEvent.ts
  • src/agent/AgentLoop.ts
  • src/agent/CompactionHandler.ts
  • src/agent/LoopRunner.ts
  • src/agent/StreamingToolExecutor.ts
  • src/agent/TokenBudget.ts
  • src/agent/__tests__/AgentLoop.streaming.test.ts
  • src/agent/__tests__/AgentLoop.test.ts
  • src/agent/__tests__/CompactionHandler.test.ts
  • src/agent/__tests__/LoopRunner.test.ts
  • src/agent/__tests__/LoopState.test.ts
  • src/agent/__tests__/StreamingToolExecutor.test.ts
  • src/agent/loop/__tests__/decisions.test.ts
  • src/agent/loop/__tests__/repairToolCallParams.test.ts
  • src/agent/loop/executeToolCalls.ts
  • src/agent/loop/repairToolCallParams.ts
  • src/agent/recovery/isOverflowRecoverable.ts
  • src/agent/state/LoopState.ts
  • src/agent/state/TurnState.ts
  • src/agent/subagents/AgentSessionStore.ts
  • src/agent/subagents/BackgroundAgentManager.ts
  • src/agent/subagents/SubagentExecutor.ts
  • src/agent/subagents/SubagentRegistry.ts
  • src/agent/subagents/__tests__/BackgroundAgentManager.test.ts
  • src/agent/subagents/__tests__/SubagentExecutor.test.ts
  • src/agent/subagents/__tests__/SubagentRegistry.test.ts
  • src/agent/subagents/builtinAgents.ts
  • src/agent/subagents/types.ts
  • src/agent/types.ts
  • src/context/CompactionService.ts
  • src/context/__tests__/CompactionService.test.ts
  • src/context/strategies/MicrocompactStrategy.ts
  • src/context/strategies/__tests__/MicrocompactStrategy.test.ts
  • src/hooks/BashClassifier.ts
  • src/hooks/HookBus.ts
  • src/hooks/HookRuntime.ts
  • src/hooks/__tests__/BashClassifier.test.ts
  • src/hooks/__tests__/HookRuntime.test.ts
  • src/index.test.ts
  • src/index.ts
  • src/mcp/McpCapabilityProjector.ts
  • src/memory/FileSystemMemoryStore.ts
  • src/memory/MemoryManager.ts
  • src/memory/MemoryStore.ts
  • src/memory/MemoryTypes.ts
  • src/memory/__tests__/FileSystemMemoryStore.test.ts
  • src/memory/__tests__/MemoryManager.test.ts
  • src/memory/index.ts
  • src/services/ChatServiceInterface.ts
  • src/services/VercelAIChatService.ts
  • src/session/Session.ts
  • src/session/SessionRuntime.ts
  • src/session/__tests__/SessionContext.test.ts
  • src/session/__tests__/SessionRuntime.subagents.test.ts
  • src/session/__tests__/SessionRuntime.test.ts
  • src/skills/SkillLoader.ts
  • src/skills/__tests__/SkillLoader.test.ts
  • src/skills/types.ts
  • src/tools/builtin/index.ts
  • src/tools/builtin/memory/__tests__/memoryTools.test.ts
  • src/tools/builtin/memory/index.ts
  • src/tools/builtin/memory/memoryRead.ts
  • src/tools/builtin/memory/memoryWrite.ts
  • src/tools/builtin/shell/bash.ts
  • src/tools/builtin/system/skill.ts
  • src/tools/builtin/task/TaskStore.ts
  • src/tools/builtin/task/__tests__/TaskStore.test.ts
  • src/tools/builtin/task/__tests__/taskTool.registry.test.ts
  • src/tools/builtin/task/__tests__/taskTools.test.ts
  • src/tools/builtin/task/index.ts
  • src/tools/builtin/task/task.ts
  • src/tools/builtin/task/taskCreate.ts
  • src/tools/builtin/task/taskGet.ts
  • src/tools/builtin/task/taskList.ts
  • src/tools/builtin/task/taskOutput.ts
  • src/tools/builtin/task/taskStop.ts
  • src/tools/builtin/task/taskUpdate.ts
  • src/tools/execution/DenialTracker.ts
  • src/tools/execution/ExecutionPipeline.ts
  • src/tools/execution/PipelineStages.ts
  • src/tools/execution/__tests__/ExecutionPipeline.test.ts
  • src/tools/registry/ToolRegistry.ts
  • src/tools/registry/__tests__/ToolRegistry.test.ts
  • src/tools/types/ExecutionTypes.ts

Comment on lines +244 to +306
if (streamHandler && turnTools.length > 0) {
const streamingExecutor = new StreamingToolExecutor(
streamHandler,
() => turnChatService,
config.logger,
);
const pendingEvents: AgentEvent[] = [];
let waitForEventsResolve: (() => void) | null = null;
let executionDone = false;
let executionError: unknown;

const flushWaiter = () => {
const resolve = waitForEventsResolve;
waitForEventsResolve = null;
resolve?.();
};

const enqueueEvent = (event: AgentEvent) => {
pendingEvents.push(event);
flushWaiter();
};

const executionPromise = streamingExecutor
.collectAndExecute(messages, turnTools, signal, {
executionPipeline,
executionContext: turnExecutionContext,
logger: config.logger,
permissionMode: turnPermissionMode,
hooks: {
onBeforeToolExec: config.onBeforeToolExec,
},
onAfterToolExec: config.onAfterToolExec,
onContentDelta: (delta) => {
enqueueEvent({ type: 'content_delta', delta });
},
onThinkingDelta: (delta) => {
enqueueEvent({ type: 'thinking_delta', delta });
},
onStreamEnd: () => {
if (!signal?.aborted) {
enqueueEvent({ type: 'stream_end' });
}
},
onToolReady: (toolCall) => {
const toolDef = executionPipeline.getRegistry().get(toolCall.function.name);
const toolKind = toolDef?.kind as 'readonly' | 'write' | 'execute' | undefined;
enqueueEvent({ type: 'tool_start', toolCall, toolKind });
},
onToolComplete: (toolCall, result) => {
enqueueEvent({ type: 'tool_result', toolCall, result });
},
})
.then(({ chatResponse, executionResults }) => {
turnResult = chatResponse;
streamingExecutionResults = executionResults;
})
.catch((error: unknown) => {
executionError = error;
})
.finally(() => {
executionDone = true;
flushWaiter();
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Don't reactive-retry a streamed turn after tool dispatch.

StreamingToolExecutor.collectAndExecute() can surface a recoverable context-overflow error after it has already started tool execution, and in that failure path streamingExecutionResults is still undefined. The catch block then compacts and reruns the turn, which can execute write/execute tools twice. Track whether any streamed tool was dispatched/completed and skip reactive retry once the turn has already produced side effects.

Also applies to: 380-425

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/AgentLoop.ts` around lines 244 - 306, The turn retries can re-run
write/execute tools if a stream error happens after tools were already
dispatched; add a boolean flag (e.g., streamedToolSideEffectOccurred)
initialized false and set it true inside the streaming hooks that indicate side
effects (onToolReady/onToolComplete or when toolKind === 'write'|'execute')
within the StreamingToolExecutor.collectAndExecute call, then update the retry
decision logic that currently checks streamingExecutionResults/executionError to
also skip the reactive retry when streamedToolSideEffectOccurred is true; apply
the same change to the duplicate block at the other location (the block
referenced by the reviewer for lines 380-425) so turns that have already
dispatched/completed streamed tools are not compacted and retried.

Comment on lines +55 to +61
private constructor(
private readonly sessionId: string,
configDir?: string,
) {
this.persistPath = configDir
? path.join(configDir, 'tasks', `${sessionId}.json`)
: undefined;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Hash or sanitize sessionId before using it as a filename.

path.join(configDir, 'tasks', \${sessionId}.json`)lets asessionIdcontaining/or..escape thetasks/` directory and overwrite arbitrary files. Persist to a safe filename instead of using the raw identifier.

🔐 Proposed fix
+import { createHash } from 'node:crypto';
 import { mkdir, readFile, writeFile } from 'node:fs/promises';
 import * as path from 'node:path';
 import { nanoid } from 'nanoid';
@@
-    this.persistPath = configDir
-      ? path.join(configDir, 'tasks', `${sessionId}.json`)
-      : undefined;
+    const fileName = `${createHash('sha256').update(sessionId).digest('hex')}.json`;
+    this.persistPath = configDir
+      ? path.join(configDir, 'tasks', fileName)
+      : undefined;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
private constructor(
private readonly sessionId: string,
configDir?: string,
) {
this.persistPath = configDir
? path.join(configDir, 'tasks', `${sessionId}.json`)
: undefined;
import { createHash } from 'node:crypto';
import { mkdir, readFile, writeFile } from 'node:fs/promises';
import * as path from 'node:path';
import { nanoid } from 'nanoid';
// ... other code ...
private constructor(
private readonly sessionId: string,
configDir?: string,
) {
const fileName = `${createHash('sha256').update(sessionId).digest('hex')}.json`;
this.persistPath = configDir
? path.join(configDir, 'tasks', fileName)
: undefined;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/builtin/task/TaskStore.ts` around lines 55 - 61, The constructor in
TaskStore currently uses the raw sessionId to build persistPath (persistPath =
path.join(configDir, 'tasks', `${sessionId}.json`)), which allows
path-traversal; change it to derive a safe filename from sessionId (e.g.,
compute a deterministic hash like SHA-256 or sanitize/encode the ID to remove
path separators) and use that safeName when building persistPath
(path.join(configDir, 'tasks', `${safeName}.json`)); ensure you still handle
undefined configDir the same way and create/ensure the 'tasks' directory exists
before writing.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/tools.md (1)

114-137: ⚠️ Potential issue | 🟠 Major

Update tool inventory to include 4 missing task storage management tools

The documentation claims 18 standard built-in tools, but getBuiltinTools currently returns 22 core tools. The table omits TaskCreate, TaskGet, TaskUpdate, and TaskList—four task storage management tools that are already implemented and exported from the codebase. Update both the tool count and the table to reflect the complete inventory, adding these four tools to the Tasks section.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/tools.md` around lines 114 - 137, The docs claim 18 built-in tools but
the code's getBuiltinTools actually returns 22; update the tool count and add
the four missing task storage management tools (TaskCreate, TaskGet, TaskUpdate,
TaskList) into the Tasks section of the table, listing their Kind and brief
descriptions consistent with the other entries so the documentation matches the
exported implementations from getBuiltinTools.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/agents.md`:
- Around line 15-17: Remove the nonexistent contextOmissions documentation and
replace it with the actual omitEnvironment behavior: remove any mention of
contextOmissions?: Array<'environment'> from the AgentDefinition docs, update
the tip on line 16 to state that Explore and Plan built-in agents omit
environment context via omitEnvironment: true, and clarify in the
AgentDefinition field table that environment omission is not configurable from
AgentDefinition but is controlled at subagent runtime (see omitEnvironment in
src/agent/subagents/types.ts and built-in behavior in
src/agent/subagents/builtinAgents.ts); ensure references to AgentDefinition,
omitEnvironment, and the subagent runtime config are used so readers can locate
the correct implementation.

---

Outside diff comments:
In `@docs/tools.md`:
- Around line 114-137: The docs claim 18 built-in tools but the code's
getBuiltinTools actually returns 22; update the tool count and add the four
missing task storage management tools (TaskCreate, TaskGet, TaskUpdate,
TaskList) into the Tasks section of the table, listing their Kind and brief
descriptions consistent with the other entries so the documentation matches the
exported implementations from getBuiltinTools.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8a13fe4c-1686-4168-addb-d72f8292d73a

📥 Commits

Reviewing files that changed from the base of the PR and between e1d56ad and 6fcfa9e.

📒 Files selected for processing (9)
  • README.md
  • docs/agents.md
  • docs/api-reference.md
  • docs/blade-agent-sdk.md
  • docs/hooks.md
  • docs/mcp.md
  • docs/session.md
  • docs/skills.md
  • docs/tools.md
✅ Files skipped from review due to trivial changes (5)
  • docs/mcp.md
  • docs/blade-agent-sdk.md
  • README.md
  • docs/hooks.md
  • docs/session.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/api-reference.md

Comment on lines +15 to +17
::: tip
Explore 和 Plan 子 Agent 默认设置了 `contextOmissions: ['environment']`,省略环境上下文信息以节省 token,让更多上下文空间用于实际任务内容。
:::
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify actual API fields used by docs claims
rg -n -C2 "interface AgentDefinition|contextOmissions|omitEnvironment" src/session/types.ts src/agent/subagents/types.ts src/agent/subagents/builtinAgents.ts

Repository: echoVic/blade-agent-sdk

Length of output: 2092


🏁 Script executed:

cat -n src/session/types.ts | sed -n '88,110p'

Repository: echoVic/blade-agent-sdk

Length of output: 699


🏁 Script executed:

cat -n docs/agents.md | sed -n '1,80p'

Repository: echoVic/blade-agent-sdk

Length of output: 2478


Documentation shows non-existent contextOmissions field; use actual omitEnvironment configuration

The docs incorrectly display contextOmissions: ['environment'] as part of AgentDefinition (lines 16, 52-60, 70), but this field does not exist in the actual type definition (src/session/types.ts).

The correct mechanism is:

  • Built-in Explore and Plan agents use omitEnvironment: true internally (src/agent/subagents/builtinAgents.ts)
  • Custom agents configured via AgentDefinition cannot control environment omission; this is only available for subagent runtime config (src/agent/subagents/types.ts)

Update the documentation to:

  1. Remove contextOmissions?: Array<'environment'> from the AgentDefinition interface block
  2. Change line 16 tip to describe the actual behavior: "Explore and Plan built-in agents omit environment context (omitEnvironment: true) to save tokens"
  3. Update the field table to clarify that AgentDefinition does not expose environment omission control
Suggested changes
-Explore 和 Plan 子 Agent 默认设置了 `contextOmissions: ['environment']`,省略环境上下文信息以节省 token,让更多上下文空间用于实际任务内容。
+Explore 和 Plan 内置 Agent 在运行时默认启用环境裁剪(`omitEnvironment: true`),以节省 token,让更多上下文空间用于实际任务内容。
 interface AgentDefinition {
   name: string;
   description: string;
   systemPrompt?: string;
   allowedTools?: string[];
   model?: string;
-  contextOmissions?: Array<'environment'>;
 }
-| `contextOmissions` | 省略的上下文部分。设置 `['environment']` 可跳过环境信息注入,节省 token |
+| (无该字段) | `AgentDefinition` 不暴露环境裁剪选项;仅内置 Explore/Plan Agent 运行时启用环境裁剪 |
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
::: tip
Explore 和 Plan Agent 默认设置了 `contextOmissions: ['environment']`,省略环境上下文信息以节省 token,让更多上下文空间用于实际任务内容。
:::
::: tip
Explore 和 Plan 内置 Agent 在运行时默认启用环境裁剪(`omitEnvironment: true`),以节省 token,让更多上下文空间用于实际任务内容。
:::
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/agents.md` around lines 15 - 17, Remove the nonexistent contextOmissions
documentation and replace it with the actual omitEnvironment behavior: remove
any mention of contextOmissions?: Array<'environment'> from the AgentDefinition
docs, update the tip on line 16 to state that Explore and Plan built-in agents
omit environment context via omitEnvironment: true, and clarify in the
AgentDefinition field table that environment omission is not configurable from
AgentDefinition but is controlled at subagent runtime (see omitEnvironment in
src/agent/subagents/types.ts and built-in behavior in
src/agent/subagents/builtinAgents.ts); ensure references to AgentDefinition,
omitEnvironment, and the subagent runtime config are used so readers can locate
the correct implementation.

echoVic added 5 commits April 5, 2026 20:56
- 新增 `RuntimePatch` 类型,支持技能激活时的工具策略、模型覆盖、系统提示追加和环境变量注入
- 扩展技能元数据,支持路径条件激活和运行时钩子注册
- 重构技能注册表,支持多来源技能加载和基于激活上下文的技能过滤
- 在 `LoopRunner` 中集成运行时补丁应用,支持技能激活时的状态管理
- 新增技能激活路径收集和条件匹配逻辑
- 扩展钩子系统支持运行时注册的钩子回调
- 更新技能工具,支持基于路径条件的激活验证
- 添加完整的单元测试覆盖新功能
- 新增 ToolCatalog 用于工具发现和信任级别管理
- 新增 RuntimePatch 系统,支持运行时环境、提示词和工具策略的动态修改
- 新增 DiscoverTools 工具,允许会话中按需发现和激活隐藏工具
- 重构权限匹配接口,统一使用 preparePermissionMatcher 方法
- 新增工具效果系统,支持运行时补丁、上下文补丁、消息注入和权限更新
- 新增敏感文件路径检测和工具行为解析机制
- 改进工具执行流程,支持进度更新、消息注入和中断行为控制
- 更新测试以覆盖新功能,确保向后兼容性
移除不再使用的 @agentclientprotocol/sdk 依赖包
删除 ACP 服务上下文文件和相关类型引用
清理工具代码中与 ACP 模式相关的逻辑和注释
更新文件系统服务文档以反映 ACP 移除
将 LoopRunner 中的运行时补丁管理逻辑提取到 RuntimePatchManager 类,负责:
- 管理运行时状态(skill、tool policy、context overlay、discovered tools、hooks、patch history)
- 派生和应用 RuntimePatch
- 构建运行时上下文快照
- 管理 Skill 激活/清除
- 管理工具发现和目录消息同步
- 清理 turn-scoped 状态

将循环钩子构建逻辑提取到 buildLoopConfig 函数,负责:
- 构建 AgentLoopConfig 对象(包含所有 hooks)
- 统一 JSONL 持久化模式

同时清理了过时的上下文模块文档,并修复了类型导入和测试文件中的类型声明。
- 将 BackgroundAgentManager 和 AgentSessionStore 从单例模式改为实例模式,支持每个 SessionRuntime 独立实例
- 在 LoopHookBuilder 中添加 onProgress 回调,在每次工具调用后触发进度更新
- 扩展 IBackgroundAgentManager 接口,分离读写与控制职责,添加进度和消息队列支持
- 修改任务工具以通过执行上下文获取后台代理管理器,移除硬编码的单例依赖
- 在任务输出中显示后台代理的运行时进度信息
@echoVic echoVic merged commit 98e9bd5 into main Apr 6, 2026
0 of 3 checks passed
@echoVic echoVic deleted the codex/agent-core-roadmap branch April 6, 2026 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant