Findings: research-memory-context (8, health: adequate) — system review 2026-05-31

## Findings catalog — research-memory-context

From the [2026-05-31 full-codebase review](../blob/main/docs/archive/system-review-2026-05-31.md) (epic #3143). Domain health: `adequate`. This issue is the durable, individually-trackable list of findings for this domain; thematic work is tracked under epic #3143 (related phase: #3148).

### Findings
- [x] **[HIGH][mission-gap]** Research discoveries not flowing into context retrieval — knowledge substrate remains write-only for most agents — ✅ RESOLVED via #3231/#3372 (structured ResearchContext → plan/vote, PR #3806) + #3234 (→ routing, PR #3816) (#3696 reconciliation 2026-06-09)
  - Evidence: `src/context/context-retriever.ts:90-113 fetches beliefs, agentic, adaptive, mobimem, outcomes, priorStrategies but NO research_synthesize results. src/mcp/tools/research-discover.ts tools emit DiscoveredItem but there's no code path wiring synthesis into UnifiedContext. Plan→vote→implement loop never reads research insights as input.`
  - Fix: Add Phase 6 (#2792) entry point: wire research_synthesize output into either (a) a new `UnifiedContext.researchInsights` field carrying synthesized findings and alignments as structured belief precursors, or (b) auto-distillation of synthesis results into BeliefMemory during research_catalog_review. This enables the closed loop: research feeds beliefs→context→routing→strategy distillation→priorStrategies.
- [ ] **[HIGH][architecture]** Memory persistence layers operate independently without unified ingestion contract — research metadata, quality scores, and alignment mappings not persisted
  - Evidence: `src/research/research-schemas.ts defines quality_score, evidence_tier, venue_tier, related_issues but these are YAML registry only. No pipeline ingests papers.yaml quality assessments into belief/agentic/adaptive backends. src/context/belief-memory.ts and agentic-memory.ts have independent extraction logic (agentic-memory-extraction.ts) but don't consume research metadata as a source. Each memory system writes only from its own tool invocations.`
  - Fix: Create src/context/research-metadata-ingester.ts exporting syncResearchQualityToBelief(paper) and syncTechniqueAlignmentToAgentic(technique) to run on papers.yaml changes (CI hook + on-demand via research_import tool). Wire into memory_write and memory_promotion pipelines so research quality assessments raise memory confidence scores.
- [ ] **[MED][correctness]** Context budgeting does not account for research synthesis output size — no token-aware clipping for synthesized insights
  - Evidence: `src/context/token-budget-tracker.ts tracks session/model tokens, but synthesizeResearch() from research-helpers-synthesize.ts returns arbitrarily large ClusterSynthesis[] with full paper lists, key insights, and gaps. summarizeContextForPrompt() in context-retriever.ts does a crude slice(0, 5) with no token counting. If synthesis output were ever fed into context, token-aware clipping would be missing.`
  - Fix: Add TokenCounterProvider call in research-discover.ts result assembly and context-retriever.ts `fetchResearchInsights()` (when added). Implement SynthesisCompressionStrategy in research-helpers-synthesize.ts: truncate keyInsights to top-3 by quality_score, limit gaps to 2, summarize implementationOpportunities to URL-only links. Store compression stats in memory metadata.
- [ ] **[MED][modularity]** MobiMem routing patterns never see research-derived task categories — experience patterns trained on CLI sequences only, not on task specialization
  - Evidence: `src/context/routing-memory.ts recordExperience() takes workflow + model sequence but ignores research/domain context. src/context/mobimem.ts experience.recordExecution() stores action sequence and outcome but has no facility for research-informed task typing. src/cli-adapters/task-classifier.ts (TaskCategory inference) happens at routing, but routing→experience feedback loop ignores research classification.`
  - Fix: Extend recordExperience signature to include optional `researchContext?: { topic: ResearchTopic; techniques: string[] }`. Store in MobiMem experience metadata so getExperiencePatterns() can filter by research alignment. Wire cli-adapters/task-classifier results → RoutingMemory so learned patterns are tagged with research domain (e.g., 'memory' domain tasks prefer Opus over Sonnet).
- [ ] **[MED][modularity]** Research quality assessment logic is fragmented across 4+ modules with inconsistent scoring — no single source of truth for paper evaluation
  - Evidence: `src/research/research-quality.ts defines computeQualityScore(venue_tier, recency, citationCount); src/research/source-quality.ts defines computeSourceQualityScore(stars, reviewed). src/research/research-index-generator.ts loads papers and computes stats. src/cli/research-helpers-synthesize.ts recomputes QualityDistribution at lines 293-301. No unified IQualityAssessor interface exists; scoring rules are inlined.`
  - Fix: Extract to src/research/quality-assessor.ts, exporting a QualityAssessor interface with methods assessPaper(paper) → QualityAssessment and assessSource(source) → QualityAssessment, plus bulk operations. Use it in research-index-generator.ts, synthesis, and the memory ingester. Document in an ADR that this is the canonical quality authority consumed by MemoryPromoter and belief confidence.
- [x] **[MED][user-journey]** Context-retriever returns 6 independent memory backends' results but no ranking by relevance to task — consumer must implement sorting — ✅ RESOLVED via #3236 (cross-ranked unified memory prefix) (#3696 reconciliation 2026-06-09)
  - Evidence: `src/context/context-retriever.ts UnifiedContext returns beliefs[], similarMemories[], recentLearnings[], experiencePatterns[] as parallel lists. summarizeContextForPrompt() slices each to top-N but never cross-ranks (e.g., a belief from 2 days ago vs. a pattern from 6 months with 95% success). No unified RankedMemory type. #2792 Phase 5 acceptance doesn't define ranking contract.`
  - Fix: Add Phase 6: UnifiedContext.rankedMemories: readonly RankedMemoryItem[] carrying (source: 'belief'|'agentic'|'adaptive'|'experience'|'outcome'|'strategy', relevanceScore: 0-1, item: Belief|AgenticMemoryEntry|...). Implement a unified ranker in context-retriever-helpers.ts: BM25 on free-text match + temporal decay + source confidence weights. Consumers call getContextForTask().rankedMemories for a single, sorted list.
- [ ] **[MED][correctness]** Research schema evolution not coordinated with downstream memory/context consumers — quality_score added but no migration for old papers
  - Evidence: `src/research/research-schemas.ts ResearchPaperSchema.quality_score is optional with default 0 in synthesis (line 244). Existing papers.yaml entries pre-quality_score stay at 0. No backfill script runs via CI. research-quality.ts computeQualityScore() can auto-assign but is never invoked at load time. Imported papers via research_add.ts have no quality assessment step.`
  - Fix: Add scripts/backfill-research-quality.ts exporting backfillPaperQualities(registry): PapersRegistry to recompute missing scores. Wire into research-helpers-io.ts loadPapersRegistry() as an optional post-load transform (gate via NEXUS_BACKFILL_RESEARCH_QUALITY=1). Document schema version in frontmatter; error if consumer model version < schema version (fail-safe).
- [ ] **[LOW][architecture]** ContextRetriever faithfully implements #2792 Phase 2 but phases 3-6 are incomplete — entry-point wiring incomplete across orchestration
  - Evidence: `src/context/context-retriever.ts fully implements getContextForTask(). Pipeline/stage-wrappers.ts at line 217 has TODO: 'getContextForTask once #2795 lands'. Orchestration/graph/graph-executor.ts DOES call getContextForTask at executor start. mcp/tools/orchestrate.ts imports getContextForTask, but its code flow is unclear. Multiple entry points, inconsistent adoption.`
  - Fix: Complete Phase 3 (#2795) by documenting which entry points already call getContextForTask() and, for those that do not, the version in which they will. Create integration map: routing (✓ composite-router), orchestration (✓ graph-executor, ? mcp-orchestrate), skill-creation (?), consensus-voting (?). Drive adoption via fitness audit: penalize entry points that skip context retrieval.
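
The compression rules proposed for the context-budgeting finding above (top-3 key insights by quality score, at most 2 gaps, URL-only implementation opportunities) can be sketched as follows. This is a minimal illustration, not the project's code: `ClusterSynthesisLike`, `compressSynthesis`, `clipToTokenBudget`, and the field names are hypothetical, and `estimateTokens` is a crude character-count stand-in for whatever TokenCounterProvider actually exposes.

```typescript
// Hypothetical shapes: field names follow the finding text, not the real
// ClusterSynthesis type in research-helpers-synthesize.ts.
interface Insight { text: string; qualityScore: number }
interface Opportunity { description: string; url: string }
interface ClusterSynthesisLike {
  topic: string;
  keyInsights: Insight[];
  gaps: string[];
  implementationOpportunities: Opportunity[];
}
interface CompressedSynthesis {
  topic: string;
  keyInsights: string[];
  gaps: string[];
  opportunityUrls: string[];
}
interface CompressionStats { tokensBefore: number; tokensAfter: number }

// Crude stand-in for a real tokenizer (assumption: roughly 4 characters per token).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep the top insights by score, cap the gaps, and reduce opportunities to URLs.
function compressSynthesis(
  cluster: ClusterSynthesisLike,
  maxInsights = 3,
  maxGaps = 2,
): { compressed: CompressedSynthesis; stats: CompressionStats } {
  const keyInsights = cluster.keyInsights
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.qualityScore - a.qualityScore)
    .slice(0, maxInsights)
    .map((insight) => insight.text);
  const compressed: CompressedSynthesis = {
    topic: cluster.topic,
    keyInsights,
    gaps: cluster.gaps.slice(0, maxGaps),
    opportunityUrls: cluster.implementationOpportunities.map((o) => o.url),
  };
  const stats: CompressionStats = {
    tokensBefore: estimateTokens(JSON.stringify(cluster)),
    tokensAfter: estimateTokens(JSON.stringify(compressed)),
  };
  return { compressed, stats };
}

// Add compressed clusters in order until the next one would exceed the budget.
function clipToTokenBudget(
  clusters: ClusterSynthesisLike[],
  tokenBudget: number,
): CompressedSynthesis[] {
  const kept: CompressedSynthesis[] = [];
  let used = 0;
  for (const cluster of clusters) {
    const { compressed, stats } = compressSynthesis(cluster);
    if (used + stats.tokensAfter > tokenBudget) break;
    kept.push(compressed);
    used += stats.tokensAfter;
  }
  return kept;
}
```

Returning the stats next to the compressed value would let a caller store them in memory metadata, as the fix suggests. A real implementation would swap `estimateTokens` for the model-specific counter.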

### Composability notes
The research-memory-context substrate has strong modular separation (research::index, context::memory, context::retrieval), but the composition points are **not** designed for reuse. Research tools are write-only: they emit DiscoveredItem and catalog entries in papers.yaml, but nothing feeds back into memory. Memory backends are callable individually, but ContextRetriever is the only attempt at unified composition, and it is incomplete (the research layer is missing). No public contracts exist for "how to wire research quality into memory" or "how to rank unified context by relevance", so external projects building on nexus-agents' memory system will have to reinvent the integration. For the building-blocks→pipelines vision, this needs explicit composition interfaces: MemoryIngestor (research→memory), ContextAugmenter (memory→retrieval), RankedContextProvider (unified→consumer). Currently, only the last of these exists (as ContextRetriever), and it does not include research.
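
One possible shape for the three composition interfaces named above is sketched here. Only the interface names come from this issue; every method signature, `MemoryRecord`, `PaperLike`, `composeContext`, and the toy implementations are assumptions for illustration. In particular, the sketch assumes `quality_score` lies in the range 0 to 1.

```typescript
// Hypothetical record type shared by all three stages.
interface MemoryRecord { source: string; text: string; confidence: number }

// research→memory: turn an external input into memory records.
interface MemoryIngestor<TInput> {
  ingest(input: TInput): MemoryRecord[];
}
// memory→retrieval: select the records relevant to a task.
interface ContextAugmenter {
  augment(task: string, records: readonly MemoryRecord[]): MemoryRecord[];
}
// unified→consumer: return one sorted list.
interface RankedContextProvider {
  rank(records: readonly MemoryRecord[]): MemoryRecord[];
}

// Minimal stand-in for a papers.yaml entry.
interface PaperLike { title: string; quality_score: number }

function words(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
}

// Toy ingestor: one record per paper, confidence clamped to [0, 1].
const paperIngestor: MemoryIngestor<PaperLike> = {
  ingest: (paper) => [{
    source: "research",
    text: paper.title,
    confidence: Math.min(1, Math.max(0, paper.quality_score)),
  }],
};

// Toy augmenter: keep records that share at least one word with the task.
const keywordAugmenter: ContextAugmenter = {
  augment: (task, records) => {
    const taskWords = words(task);
    return records.filter((r) =>
      words(r.text).some((w) => taskWords.indexOf(w) !== -1));
  },
};

// Toy ranker: highest confidence first.
const confidenceRanker: RankedContextProvider = {
  rank: (records) => records.slice().sort((a, b) => b.confidence - a.confidence),
};

// Compose the three stages into a single research→consumer pipeline.
function composeContext<TInput>(
  inputs: readonly TInput[],
  task: string,
  ingestor: MemoryIngestor<TInput>,
  augmenter: ContextAugmenter,
  provider: RankedContextProvider,
): MemoryRecord[] {
  const records: MemoryRecord[] = [];
  for (const input of inputs) {
    for (const record of ingestor.ingest(input)) records.push(record);
  }
  return provider.rank(augmenter.augment(task, records));
}
```

Keeping each stage behind its own interface means an external project could replace, say, the keyword filter with an embedding-based augmenter without touching ingestion or ranking.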

### Mission gaps
- Research synthesis output does not flow into plan→vote→implement loop: agents never see research findings as input to their decisions. The autonomous loop reads outcomes/beliefs/priors but ignores synthesized research insights, breaking the claim of 'expand beyond coding to accomplish any goal' when the knowledge substrate is write-only.
- No closed-loop tuning for research quality: papers are cataloged and synthesized but their impact on agent performance is not measured. No feedback mechanism to improve quality scoring, re-rank papers by outcome correlation, or deprecate low-signal sources.
- Memory ingestion from research metadata is manual/absent: quality_score, evidence_tier, related_issues, aligned_techniques all live in YAML but aren't persisted to belief/agentic/adaptive backends. The 'knowledge substrate' is partitioned into unconnected silos.
- Task specialization (TaskCategory) is not informed by research domains: routing/orchestration infer category from keywords, but nexus-agents' own research registry (memory, routing, consensus, etc.) is not visible to the routing decisions that rely on those domains.
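
The last gap, together with the MobiMem finding's proposed optional `researchContext` parameter, could look roughly like the sketch below. It is an in-memory stand-in and not the real RoutingMemory: the class name, the single-model signature (the real `recordExperience()` takes a workflow and a model sequence), and the three-member `ResearchTopic` union are all assumptions.

```typescript
// Assumed subset of research registry domains; the real ResearchTopic may differ.
type ResearchTopic = "memory" | "routing" | "consensus";

interface ResearchContext { topic: ResearchTopic; techniques: string[] }

interface Experience {
  workflow: string;
  model: string;
  success: boolean;
  researchContext: ResearchContext | undefined;
}

// Hypothetical in-memory stand-in for RoutingMemory.
class InMemoryRoutingMemory {
  private readonly experiences: Experience[] = [];

  // researchContext is optional so existing call sites keep working.
  recordExperience(
    workflow: string,
    model: string,
    success: boolean,
    researchContext?: ResearchContext,
  ): void {
    this.experiences.push({ workflow, model, success, researchContext });
  }

  // Success rate per model, optionally restricted to one research topic.
  getExperiencePatterns(topic?: ResearchTopic): Record<string, number> {
    const totals: Record<string, { wins: number; runs: number }> = {};
    for (const e of this.experiences) {
      const tagged = e.researchContext !== undefined && e.researchContext.topic === topic;
      if (topic !== undefined && !tagged) continue;
      const t = totals[e.model] || { wins: 0, runs: 0 };
      t.runs += 1;
      if (e.success) t.wins += 1;
      totals[e.model] = t;
    }
    const rates: Record<string, number> = {};
    for (const model of Object.keys(totals)) {
      rates[model] = totals[model].wins / totals[model].runs;
    }
    return rates;
  }
}
```

With experiences tagged this way, a router could ask for `getExperiencePatterns("memory")` and observe, for example, that one model succeeds more often than another on memory-domain tasks, which is the kind of learned preference the finding describes.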

---
Part of epic #3143; tracked as issue #3161. Full review record: `docs/archive/system-review-2026-05-31.md`.
