Findings: research-memory-context (8, health: adequate) — system review 2026-05-31 #3161

@williamzujkowski

Findings catalog — research-memory-context

From the 2026-05-31 full-codebase review (epic #3143). Domain health: adequate. This issue is the durable, individually-trackable list of findings for this domain; thematic work is tracked under epic #3143 (related phase: #3148).

Findings

  • [HIGH][mission-gap] Research discoveries not flowing into context retrieval — knowledge substrate remains write-only for most agents — ✅ RESOLVED via #3231/#3372 (structured ResearchContext → plan/vote, PR #3806) + #3234 (→ routing, PR #3816) (#3696 reconciliation 2026-06-09)
    • Evidence: src/context/context-retriever.ts:90-113 fetches beliefs, agentic, adaptive, mobimem, outcomes, priorStrategies but NO research_synthesize results. src/mcp/tools/research-discover.ts tools emit DiscoveredItem but there's no code path wiring synthesis into UnifiedContext. Plan→vote→implement loop never reads research insights as input.
    • Fix: Add a Phase 6 entry point (epic #2792, cross-cutting memory access): wire research_synthesize output into either (a) a new UnifiedContext.researchInsights field carrying synthesized findings and alignments as structured belief precursors, or (b) auto-distillation of synthesis results into BeliefMemory during research_catalog_review. This enables the closed loop: research feeds beliefs → context → routing → strategy distillation → priorStrategies.
  • [HIGH][architecture] Memory persistence layers operate independently without unified ingestion contract — research metadata, quality scores, and alignment mappings not persisted
    • Evidence: src/research/research-schemas.ts defines quality_score, evidence_tier, venue_tier, related_issues but these are YAML registry only. No pipeline ingests papers.yaml quality assessments into belief/agentic/adaptive backends. src/context/belief-memory.ts and agentic-memory.ts have independent extraction logic (agentic-memory-extraction.ts) but don't consume research metadata as a source. Each memory system writes only from its own tool invocations.
    • Fix: Create src/context/research-metadata-ingester.ts exporting syncResearchQualityToBelief(paper) and syncTechniqueAlignmentToAgentic(technique) to run on papers.yaml changes (CI hook + on-demand via research_import tool). Wire into memory_write and memory_promotion pipelines so research quality assessments raise memory confidence scores.
  • [MED][correctness] Context budgeting does not account for research synthesis output size — no token-aware clipping for synthesized insights
    • Evidence: src/context/token-budget-tracker.ts tracks session/model tokens, but synthesizeResearch() from research-helpers-synthesize.ts returns an arbitrarily large ClusterSynthesis[] with full paper lists, key insights, and gaps. summarizeContextForPrompt() in context-retriever.ts applies a crude slice(0, 5) without counting tokens. If synthesis output were ever fed into context, there would be no token-aware clipping.
    • Fix: Add TokenCounterProvider call in research-discover.ts result assembly and context-retriever.ts fetchResearchInsights() (when added). Implement SynthesisCompressionStrategy in research-helpers-synthesize.ts: truncate keyInsights to top-3 by quality_score, limit gaps to 2, summarize implementationOpportunities to URL-only links. Store compression stats in memory metadata.
  • [MED][modularity] MobiMem routing patterns never see research-derived task categories — experience patterns trained on CLI sequences only, not on task specialization
    • Evidence: src/context/routing-memory.ts recordExperience() takes workflow + model sequence but ignores research/domain context. src/context/mobimem.ts experience.recordExecution() stores action sequence and outcome but has no facility for research-informed task typing. src/cli-adapters/task-classifier.ts (TaskCategory inference) happens at routing, but routing→experience feedback loop ignores research classification.
    • Fix: Extend recordExperience signature to include optional researchContext?: { topic: ResearchTopic; techniques: string[] }. Store in MobiMem experience metadata so getExperiencePatterns() can filter by research alignment. Wire cli-adapters/task-classifier results → RoutingMemory so learned patterns are tagged with research domain (e.g., 'memory' domain tasks prefer Opus over Sonnet).
  • [MED][modularity] Research quality assessment logic is fragmented across 4+ modules with inconsistent scoring — no single source of truth for paper evaluation
    • Evidence: src/research/research-quality.ts defines computeQualityScore(venue_tier, recency, citationCount). src/research/source-quality.ts defines computeSourceQualityScore(stars, reviewed). src/research/research-index-generator.ts loads papers and computes stats. src/cli/research-helpers-synthesize.ts recomputes QualityDistribution at lines 293-301. There is no unified IQualityAssessor interface; scoring rules are inlined in each module.
    • Fix: Extract to src/research/quality-assessor.ts exporting QualityAssessor interface with methods assessPaper(paper) → QualityAssessment, assessSource(source) → QualityAssessment, bulk operations. Use in research-index-generator.ts, synthesis, and memory-ingester. Document via ADR that this is the canonical quality authority consumed by MemoryPromoter and belief confidence.
  • [MED][user-journey] Context-retriever returns 6 independent memory backends' results but no ranking by relevance to task — consumer must implement sorting — ✅ RESOLVED via #3236 (cross-ranked unified memory prefix) (#3696 reconciliation 2026-06-09)
    • Evidence: src/context/context-retriever.ts UnifiedContext returns beliefs[], similarMemories[], recentLearnings[], experiencePatterns[] as parallel lists. summarizeContextForPrompt() slices each to top-N but never cross-ranks (e.g., a belief from 2 days ago vs. a pattern from 6 months with 95% success). No unified RankedMemory type. #2792 Phase 5 acceptance doesn't define ranking contract.
    • Fix: Add Phase 6: UnifiedContext.rankedMemories: readonly RankedMemoryItem[] carrying (source: 'belief'|'agentic'|'adaptive'|'experience'|'outcome'|'strategy', relevanceScore: 0-1, item: Belief|AgenticMemoryEntry|...). Implement a unified ranker in context-retriever-helpers.ts: BM25 on free-text match + temporal decay + source confidence weights. Consumers call getContextForTask().rankedMemories for a single, sorted list.
  • [MED][correctness] Research schema evolution not coordinated with downstream memory/context consumers — quality_score added but no migration for old papers
    • Evidence: src/research/research-schemas.ts ResearchPaperSchema.quality_score is optional with default 0 in synthesis (line 244). Existing papers.yaml entries pre-quality_score stay at 0. No backfill script runs via CI. research-quality.ts computeQualityScore() can auto-assign but is never invoked at load time. Imported papers via research_add.ts have no quality assessment step.
    • Fix: Add scripts/backfill-research-quality.ts exporting backfillPaperQualities(registry): PapersRegistry to recompute missing scores. Wire into research-helpers-io.ts loadPapersRegistry() as an optional post-load transform (gate via NEXUS_BACKFILL_RESEARCH_QUALITY=1). Document schema version in frontmatter; error if consumer model version < schema version (fail-safe).
  • [LOW][architecture] ContextRetriever faithfully implements #2792 (epic: cross-cutting memory access) Phase 2 but phases 3-6 are incomplete — entry-point wiring incomplete across orchestration
    • Evidence: src/context/context-retriever.ts fully implements getContextForTask(). The TODO at line 217 of pipeline/stage-wrappers.ts reads 'getContextForTask once #2795 lands'. orchestration/graph/graph-executor.ts does call getContextForTask at executor start. mcp/tools/orchestrate.ts imports getContextForTask, but whether the call is actually reached is unclear. The result is multiple entry points with inconsistent adoption.
    • Fix: Complete Phase 3 (#2795: wire ContextRetriever into CompositeRouter / orchestrate / graph workflow start) by documenting which entry points call getContextForTask() today and which will adopt it in which version. Create an integration map: routing (✓ composite-router), orchestration (✓ graph-executor, ? mcp-orchestrate), skill-creation (?), consensus-voting (?). Drive adoption via a fitness audit that penalizes entry points that skip context retrieval.
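The cross-ranking fix proposed in the findings above (text match + temporal decay + source confidence weights) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code that resolved #3236: MemoryCandidate, SOURCE_WEIGHT, HALF_LIFE_DAYS, and rankMemories are hypothetical names, the weights are invented for the example, and a simple token-overlap ratio stands in for BM25.

```typescript
// Hypothetical shapes: the real Belief / AgenticMemoryEntry types are not shown in this issue.
type MemorySource = 'belief' | 'agentic' | 'adaptive' | 'experience' | 'outcome' | 'strategy';

interface MemoryCandidate {
  source: MemorySource;
  text: string;       // free-text content matched against the task
  ageDays: number;    // age of the memory in days
  confidence: number; // 0-1, e.g. belief confidence or pattern success rate
}

interface RankedMemoryItem {
  source: MemorySource;
  relevanceScore: number; // 0-1
  item: MemoryCandidate;
}

// Assumed per-source weights and decay half-life (illustrative values only).
const SOURCE_WEIGHT: Record<MemorySource, number> = {
  belief: 1.0, strategy: 0.9, experience: 0.9, outcome: 0.8, agentic: 0.7, adaptive: 0.7,
};
const HALF_LIFE_DAYS = 90;

function tokenize(s: string): string[] {
  return s.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Fraction of distinct task tokens found in the memory text (stand-in for BM25).
function textMatch(task: string, text: string): number {
  const taskTokens = tokenize(task).filter((t, i, all) => all.indexOf(t) === i);
  if (taskTokens.length === 0) return 0;
  const textTokens = tokenize(text);
  const hits = taskTokens.filter((t) => textTokens.indexOf(t) !== -1).length;
  return hits / taskTokens.length;
}

// Every factor lies in [0, 1], so the combined score does too.
function scoreCandidate(task: string, c: MemoryCandidate): number {
  const decay = Math.pow(0.5, c.ageDays / HALF_LIFE_DAYS);
  const freshnessAndTrust = 0.5 * decay + 0.5 * c.confidence;
  return textMatch(task, c.text) * SOURCE_WEIGHT[c.source] * freshnessAndTrust;
}

// Produce one list sorted by descending relevance across all sources.
function rankMemories(task: string, candidates: readonly MemoryCandidate[]): RankedMemoryItem[] {
  return candidates
    .map((c): RankedMemoryItem => ({ source: c.source, relevanceScore: scoreCandidate(task, c), item: c }))
    .sort((a, b) => b.relevanceScore - a.relevanceScore);
}
```

Averaging decay and confidence, rather than multiplying them, is one way to keep an old but highly successful pattern competitive with a fresh but weaker belief, which is the trade-off the Evidence bullet calls out.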

Composability notes

The research-memory-context substrate has strong modular separation (research::index, context::memory, context::retrieval), but the composition points are not designed for reuse. Research tools are write-only: they populate papers.yaml and emit DiscoveredItem, but nothing feeds back into memory. Memory backends are callable individually, and ContextRetriever is the only attempt at unified composition; it is incomplete because the research layer is missing. No public contracts exist for "how to wire research quality into memory" or "how to rank unified context by relevance", so external projects building on nexus-agents' memory system will have to reinvent the integration. For the building-blocks→pipelines vision, this needs explicit composition interfaces: MemoryIngestor (research→memory), ContextAugmenter (memory→retrieval), RankedContextProvider (unified→consumer). Currently, only the last interface (ContextRetriever) exists, and it doesn't include research.
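One possible shape for the three composition interfaces named above is sketched here. Only the interface names (MemoryIngestor, ContextAugmenter, RankedContextProvider) come from this issue; every method name, payload type, and the in-memory implementations are assumptions made for illustration.

```typescript
// Hypothetical payload shapes; the real research and memory types are not shown in this issue.
interface ResearchFinding { paperId: string; insight: string; qualityScore: number; }
interface MemoryRecord { id: string; content: string; confidence: number; origin: 'research' | 'tool'; }

// research → memory
interface MemoryIngestor { ingest(findings: readonly ResearchFinding[]): MemoryRecord[]; }
// memory → retrieval
interface ContextAugmenter { augment(task: string, records: readonly MemoryRecord[]): MemoryRecord[]; }
// unified → consumer
interface RankedContextProvider { getRankedContext(task: string): MemoryRecord[]; }

// Assumed rule: a paper's quality score, clamped to 0-1, becomes the record's confidence.
const qualityIngestor: MemoryIngestor = {
  ingest: (findings) =>
    findings.map((f): MemoryRecord => ({
      id: 'research:' + f.paperId,
      content: f.insight,
      confidence: Math.max(0, Math.min(1, f.qualityScore)),
      origin: 'research',
    })),
};

function sharesToken(task: string, content: string): boolean {
  const words = content.toLowerCase().split(/\W+/);
  return task.toLowerCase().split(/\W+/).some((t) => t.length > 0 && words.indexOf(t) !== -1);
}

// Assumed rule: keep only records that share at least one word with the task.
const keywordAugmenter: ContextAugmenter = {
  augment: (task, records) => records.filter((r) => sharesToken(task, r.content)),
};

// Compose the two stages into the consumer-facing provider.
function createRankedContextProvider(
  ingestor: MemoryIngestor,
  augmenter: ContextAugmenter,
  findings: readonly ResearchFinding[],
): RankedContextProvider {
  const records = ingestor.ingest(findings);
  return {
    getRankedContext: (task) =>
      augmenter.augment(task, records).slice().sort((a, b) => b.confidence - a.confidence),
  };
}
```

Keeping each stage behind its own interface would let an external project swap in a different ingestion or ranking rule without touching the other stages.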

Mission gaps

  • Research synthesis output does not flow into plan→vote→implement loop: agents never see research findings as input to their decisions. The autonomous loop reads outcomes/beliefs/priors but ignores synthesized research insights, breaking the claim of 'expand beyond coding to accomplish any goal' when the knowledge substrate is write-only.
  • No closed-loop tuning for research quality: papers are cataloged and synthesized but their impact on agent performance is not measured. No feedback mechanism to improve quality scoring, re-rank papers by outcome correlation, or deprecate low-signal sources.
  • Memory ingestion from research metadata is manual/absent: quality_score, evidence_tier, related_issues, aligned_techniques all live in YAML but aren't persisted to belief/agentic/adaptive backends. The 'knowledge substrate' is partitioned into unconnected silos.
  • Task specialization (TaskCategory) is not informed by research domains: routing/orchestration infer category from keywords, but nexus-agents' own research registry (memory, routing, consensus, etc.) is not visible to the routing decisions that rely on those domains.
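The last gap, together with the recordExperience extension proposed in the findings, can be illustrated with a small in-memory sketch. RoutingMemorySketch, its storage, and successRate are hypothetical and stand in for the real RoutingMemory/MobiMem code, which is not shown in this issue; the ResearchTopic values are an assumed subset taken from the domains named above.

```typescript
// Assumed subset of research domains (memory, routing, consensus are named in this issue).
type ResearchTopic = 'memory' | 'routing' | 'consensus';

interface ResearchContext { topic: ResearchTopic; techniques: string[]; }

interface Experience {
  workflow: string;
  models: string[];
  success: boolean;
  researchContext?: ResearchContext; // optional, so existing callers keep working
}

// Hypothetical in-memory stand-in for the routing → experience feedback loop.
class RoutingMemorySketch {
  private experiences: Experience[] = [];

  recordExperience(
    workflow: string,
    models: string[],
    success: boolean,
    researchContext?: ResearchContext,
  ): void {
    this.experiences.push({ workflow, models, success, researchContext });
  }

  // Without a topic, return everything; with one, return only experiences tagged with it.
  getExperiencePatterns(topic?: ResearchTopic): Experience[] {
    if (topic === undefined) return this.experiences.slice();
    return this.experiences.filter((e) => e.researchContext?.topic === topic);
  }

  // Success rate of a model within one research topic; undefined when there is no data.
  successRate(model: string, topic: ResearchTopic): number | undefined {
    const relevant = this.getExperiencePatterns(topic).filter((e) => e.models.indexOf(model) !== -1);
    if (relevant.length === 0) return undefined;
    return relevant.filter((e) => e.success).length / relevant.length;
  }
}
```

With experiences tagged this way, a router could compare per-topic success rates before choosing a model, which is the kind of research-informed specialization the gap describes.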

Part of epic #3143. Full review record: docs/archive/system-review-2026-05-31.md.

Labels: enhancement (New feature or request), p2 (Priority 2 - Medium impact, moderate changes needed)
