[routing] Tool co-occurrence tracking for relationship-aware routing #156

@dgenio

Context

contextweaver routes tools based on static metadata (descriptions, tags, names) using TF-IDF and Jaccard scoring. It has no knowledge of which tools are actually used together in practice. This means:

  • A tool frequently called after another gets no routing score boost
  • Common tool sequences (e.g., "search" → "summarize" → "email") are invisible to the routing engine
  • The EpisodicStore exists but only stores individual events, not tool usage relationships

ChainWeaver (#11, #12) captures runtime traces and proposes candidate flows, but contextweaver has no mechanism to receive this relationship data or to use it to improve routing decisions.

Why it matters

Acceptance Criteria

  • CoOccurrenceTracker class that records tool usage pairs
  • record(tool_a: str, tool_b: str) -> None — record that tool_b followed tool_a in a session
  • score(tool_a: str, tool_b: str) -> float — co-occurrence score (0.0–1.0, normalized)
  • top_k(tool_a: str, k: int) -> list[tuple[str, float]] — most frequently paired tools
  • Backed by EpisodicStore or a new lightweight store (append-only, per conventions)
  • Router can optionally use co-occurrence scores as a signal (blended with TF-IDF/BM25 scores)
  • Score blending is configurable: co_occurrence_weight: float = 0.0 (default off, backward-compatible)
  • Unit tests: record pairs, query scores, top-k, integration with Router
  • No new runtime dependencies
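To make the "tool_b followed tool_a in a session" criterion concrete, here is a minimal sketch of how an ordered session of tool calls could be turned into the pairs passed to record(). The helper name consecutive_pairs is hypothetical and not part of contextweaver. The sketch also assumes only directly adjacent calls count as a pair, which the issue does not specify.

```python
from typing import Iterable, Iterator


def consecutive_pairs(session: Iterable[str]) -> Iterator[tuple[str, str]]:
    """Yield (tool_a, tool_b) for each tool_b that directly follows tool_a.

    Hypothetical helper: assumes only adjacent calls form a pair.
    """
    previous = None
    for tool in session:
        if previous is not None:
            yield previous, tool
        previous = tool


# The example sequence from the Context section.
pairs = list(consecutive_pairs(["search", "summarize", "email"]))
print(pairs)
```

Each yielded pair would map to one record(tool_a, tool_b) call. A session with fewer than two tool calls yields no pairs.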

Implementation Notes

class CoOccurrenceTracker:
    """Tracks and queries tool co-occurrence frequency across sessions."""
    
    def record(self, tool_a: str, tool_b: str) -> None:
        """Record that tool_b was used after tool_a."""
        ...
    
    def score(self, tool_a: str, tool_b: str) -> float:
        """Co-occurrence score (0.0-1.0), normalized by total observations."""
        ...
    
    def top_k(self, tool_a: str, k: int = 5) -> list[tuple[str, float]]:
        """Top-k tools most frequently used after tool_a."""
        ...
    
    def export(self) -> dict[str, dict[str, int]]:
        """Export raw counts for use by ChainWeaver's chain analysis."""
        ...
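
One possible way to fill in the stubs above, as a rough in-memory sketch using only the standard library. Several things here are assumptions rather than decisions made in this issue: the class name InMemoryCoOccurrenceTracker, the plain Python list standing in for an append-only store, and the normalization, which divides by the number of transitions observed out of tool_a (so the scores for a given tool_a sum to 1.0).

```python
from collections import Counter, defaultdict


class InMemoryCoOccurrenceTracker:
    """Illustrative in-memory tracker; not the project's implementation."""

    def __init__(self) -> None:
        # Append-only log of observed (tool_a, tool_b) transitions.
        self._log: list[tuple[str, str]] = []
        # Derived counts: tool_a -> Counter of tools that followed it.
        self._counts: dict[str, Counter] = defaultdict(Counter)

    def record(self, tool_a: str, tool_b: str) -> None:
        """Record that tool_b was used after tool_a."""
        self._log.append((tool_a, tool_b))
        self._counts[tool_a][tool_b] += 1

    def score(self, tool_a: str, tool_b: str) -> float:
        """Share of transitions out of tool_a that went to tool_b (assumed normalization)."""
        followers = self._counts.get(tool_a)
        if not followers:
            return 0.0  # No observations for tool_a yet.
        return followers[tool_b] / sum(followers.values())

    def top_k(self, tool_a: str, k: int = 5) -> list[tuple[str, float]]:
        """Up to k tools most frequently used after tool_a, with their scores."""
        followers = self._counts.get(tool_a)
        if not followers:
            return []
        total = sum(followers.values())
        return [(tool, count / total) for tool, count in followers.most_common(k)]

    def export(self) -> dict[str, dict[str, int]]:
        """Raw counts as plain dicts."""
        return {tool_a: dict(counts) for tool_a, counts in self._counts.items()}


tracker = InMemoryCoOccurrenceTracker()
tracker.record("search", "summarize")
tracker.record("search", "summarize")
tracker.record("search", "email")
print(tracker.score("search", "summarize"))
print(tracker.top_k("search", k=1))
```

With this normalization the score is directional: score("search", "summarize") can differ from score("summarize", "search"), which matches the "tool_b followed tool_a" wording. A real implementation backed by EpisodicStore would replace the in-memory log.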

Integration with Router:

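# In Router.route() scoring:
base_score = self.scorer.score(query, candidate_id)
if self.co_occurrence is not None and last_tool_used is not None and co_weight > 0.0:
    # Blend only when a tracker and a previous tool are both available.
    co_score = self.co_occurrence.score(last_tool_used, candidate_id)
    final_score = (1 - co_weight) * base_score + co_weight * co_score
else:
    # With the default co_weight = 0.0, routing scores are unchanged.
    final_score = base_score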
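
The blending step can also be checked in isolation. The sketch below is a standalone function, not the Router's actual code. The name blend_scores, the use of None to signal "no co-occurrence signal available", and the range check on co_weight are all assumptions made for illustration.

```python
from typing import Optional


def blend_scores(
    base_score: float,
    co_score: Optional[float],
    co_weight: float = 0.0,
) -> float:
    """Linearly blend a base routing score with a co-occurrence score.

    Hypothetical helper: co_score=None means no signal is available
    (no tracker configured, or no previous tool in the session).
    """
    if not 0.0 <= co_weight <= 1.0:
        raise ValueError("co_weight must be between 0.0 and 1.0")
    if co_score is None:
        # Fall back to the base score instead of scaling it down.
        return base_score
    return (1.0 - co_weight) * base_score + co_weight * co_score


print(blend_scores(0.8, 0.4))                  # default weight: base score only
print(blend_scores(0.8, 0.4, co_weight=0.25))  # 0.75 * 0.8 + 0.25 * 0.4
```

With the default co_weight of 0.0 the result equals the base score, which is what keeps the feature off and backward-compatible unless a caller opts in.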

Files likely touched:

  • src/contextweaver/store/co_occurrence.py — new module
  • src/contextweaver/routing/router.py — optional co-occurrence signal
  • tests/test_store_co_occurrence.py — new
  • src/contextweaver/store/__init__.py — export

Dependencies

Metadata

    Labels

    • area/routing (Routing engine: catalog, graph, router, cards)
    • area/store (Data stores: event log, artifacts, episodic, facts)
    • complexity/moderate (Multiple modules involved, some design work)
    • enhancement (New feature or request)
    • priority/low (Lower priority: scale & validation)
