feat(research): structured ResearchContext — direct tool calls in the research stage (#3372 increment 1)#3806
Merged
…istic renderer (#3372 increment 1, WIP) Per the 7/7 higher_order vote (Option A): pure module that maps a research-tool ResearchDiscoverResponse + analyze recommendations into structured ResearchContext { text, metadata: { discoveredItems(relevanceScore), recommendations, qualitySignals } }, with the human-readable text DERIVED deterministically from the metadata (single source of truth) and external titles/recommendations escaped + bounded (security-voter condition). 7 tests; typecheck + lint clean. WIP — the consumer (rewire the research stage to call executeDiscovery + analyzeGaps directly and return this text) lands in the next commit; NOT yet a mergeable PR (producer-consumer gate needs the consumer). Branch persists for completion. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… increment 1) Completes increment 1 (the core ResearchContext module landed in the prior commit). Per the 7/7 higher_order vote (Option A):
- research stage (agent-executor.ts) calls executeDiscovery + analyzeGaps DIRECTLY instead of two LLM experts that discard the structure; builds the ResearchContext and returns its deterministic, structure-derived text (signature unchanged, so no DevPipelineStages churn this increment). Fail-safe try/catch. Prior-learnings memory context is appended to the text (preserved from #1716).
- export analyzeGaps from research-analyze.ts (deterministic, registry-derived).
- agent-executor.test.ts: research-stage tests now mock executeDiscovery/analyzeGaps and assert the direct calls + deterministic structured text (was the LLM path).
Full gate set green: 628 pipeline tests, governance (46 tools), description-drift, docs:tools (47), registry-coverage, producer-consumer (module now has its consumer), typecheck + lint. Closes #3372 (increment 1; increment 2, which threads metadata through plan/vote, is tracked).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
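The control flow this commit message describes (direct tool calls, fail-safe try/catch, memory context appended to the text) can be sketched roughly as follows. This is not the code from agent-executor.ts: the dependency signatures, the `DiscoveryResult` shape, the `renderContextText` helper, and the fallback string are all assumptions made for illustration; only the names `executeDiscovery`, `analyzeGaps`, and `research` appear in the PR.

```typescript
// Hypothetical shapes: the real signatures are not shown in this PR text.
interface DiscoveryResult {
  items: { title: string; relevanceScore: number }[];
}

interface ResearchDeps {
  // Assumed async tool call returning structured discovery results.
  executeDiscovery: (query: string) => Promise<DiscoveryResult>;
  // Assumed synchronous, deterministic gap analysis (no LLM).
  analyzeGaps: (result: DiscoveryResult) => string[];
  // Hypothetical renderer standing in for the ResearchContext text derivation.
  renderContextText: (result: DiscoveryResult, recommendations: string[]) => string;
  // Prior-learnings memory context, if any was retrieved.
  memoryContext?: string;
}

// Returns a plain string, mirroring the "signature unchanged" note above.
async function research(query: string, deps: ResearchDeps): Promise<string> {
  let text: string;
  try {
    const discovered = await deps.executeDiscovery(query);
    const recommendations = deps.analyzeGaps(discovered);
    text = deps.renderContextText(discovered, recommendations);
  } catch {
    // Fail-safe: a tool failure degrades to a fixed placeholder (assumed wording)
    // instead of aborting the whole pipeline run.
    text = "Research unavailable: discovery tools failed.";
  }
  // Memory context is appended after the structure-derived text.
  return deps.memoryContext ? `${text}\n\n${deps.memoryContext}` : text;
}
```

Passing the tools in as dependencies is a choice made for this sketch so that a test can substitute mocks and assert the direct calls; the actual tests may mock the modules instead.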
Closes #3372 (increment 1). Unblocks #3234 once increment 2 lands.
Decision (7/7 higher_order consensus vote → Option A)
The research stage routed through two LLM experts that called research_discover/research_analyze and returned only text; the structured DiscoveredItem[] (relevanceScore), recommendations, and quality signals were discarded (agent-executor.ts:455). The vote chose Option A: call the tools directly for structure and derive the text deterministically from that same structure (single source of truth). Rejected B (re-running the query twice → divergent text-vs-metadata) and C (invasive expert-bridge change).
Change (increment 1)
- pipeline/research-context.ts: maps ResearchDiscoverResponse + analyze recommendations → ResearchContext { text, metadata: { discoveredItems(relevanceScore), recommendations, qualitySignals } }. Text is rendered deterministically from the metadata; external titles/recommendations are escaped (backticks/control-chars/newlines neutralized) and the rendered list is bounded (security-voter conditions). 7 unit tests.
- Research stage (agent-executor.ts) now calls executeDiscovery + analyzeGaps directly (no research-stage LLM tokens), builds the context, and returns its structure-derived text. research() signature unchanged this increment (returns string) → zero churn to the ~6 stage-mocking test files. Fail-safe try/catch; prior-learnings memory context still appended (feat(pipeline): wire memory system into dev pipeline: query + write-back #1716).
- analyzeGaps exported from research-analyze.ts (deterministic: registry-derived, no LLM).
Verification (full gate set)
628 pipeline tests · governance (46 tools) · description-drift · docs:tools (47) · registry-coverage · producer-consumer (the new module has its consumer) · typecheck · lint — all green.
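As an illustration of the "single source of truth" idea in the Change (increment 1) notes, the sketch below derives the text purely from the metadata, escapes external strings, and bounds the rendered list. It is a minimal sketch, not the contents of pipeline/research-context.ts: the `title` field, the shape of `qualitySignals`, the limits `MAX_ITEMS` and `MAX_FIELD_LEN`, and the exact escaping rules are assumptions; only the top-level field names come from the PR description.

```typescript
// Hypothetical shapes: only text, metadata, discoveredItems, relevanceScore,
// recommendations and qualitySignals are named in the PR description.
interface DiscoveredItem {
  title: string; // assumed field
  relevanceScore: number;
}

interface ResearchContext {
  text: string;
  metadata: {
    discoveredItems: DiscoveredItem[];
    recommendations: string[];
    qualitySignals: { itemCount: number; topScore: number }; // assumed signals
  };
}

const MAX_ITEMS = 5; // assumed bound on rendered list entries
const MAX_FIELD_LEN = 120; // assumed cap per external string

// Neutralize control characters (including newlines) and backticks, then cap length.
function escapeExternal(raw: string): string {
  const cleaned = raw
    .replace(/[\u0000-\u001f\u007f]+/g, " ")
    .replace(/`/g, "'")
    .trim();
  return cleaned.length > MAX_FIELD_LEN
    ? cleaned.slice(0, MAX_FIELD_LEN - 3) + "..."
    : cleaned;
}

// Pure function of the metadata: the same metadata always yields the same text.
function renderText(metadata: ResearchContext["metadata"]): string {
  const items = [...metadata.discoveredItems]
    // Highest relevance first; ties broken by title so input order does not matter.
    .sort(
      (a, b) =>
        b.relevanceScore - a.relevanceScore ||
        (a.title < b.title ? -1 : a.title > b.title ? 1 : 0),
    )
    .slice(0, MAX_ITEMS);
  const lines = [
    `Discovered items (${items.length} of ${metadata.discoveredItems.length}):`,
    ...items.map(
      (i) => `- ${escapeExternal(i.title)} (relevance ${i.relevanceScore.toFixed(2)})`,
    ),
    "Recommendations:",
    ...metadata.recommendations.slice(0, MAX_ITEMS).map((r) => `- ${escapeExternal(r)}`),
  ];
  return lines.join("\n");
}

function buildResearchContext(
  items: DiscoveredItem[],
  recommendations: string[],
): ResearchContext {
  const metadata = {
    discoveredItems: items,
    recommendations,
    qualitySignals: {
      itemCount: items.length,
      topScore: items.reduce((max, i) => Math.max(max, i.relevanceScore), 0),
    },
  };
  // The text is never authored separately; it is always derived from metadata.
  return { text: renderText(metadata), metadata };
}
```

Because `renderText` reads nothing but its argument, the text and the metadata cannot drift apart, which is the failure mode the vote attributed to Option B.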
Next (increment 2, tracked on #3372/#3234)
Thread the ResearchContext metadata through research() → plan() → vote() (DevPipelineStages signature change) + weight voter-prompts.ts on research maturity (the consumer the vote requires) + instrument vote outcomes. That unblocks #3234.
🤖 Generated with Claude Code