
Pre-Phase-7: per-source LLM routing seam #7

Merged
Archibald312 merged 2 commits into main from phase-pre8-model-routing-seam on May 16, 2026

@Archibald312

Owner

Summary

  • Defer original Phase 7 (local inference / Ollama / vLLM) to post-launch, and land the per-source LLM routing seam now so it's in place before Phase 7 connectors. See decisions.md (2026-05-15) and the renumbered build plan in CLAUDE.md.
  • New resolveModelRouting() resolver with document → project → request precedence. Stored preferences are validated against the canonical model set; unknown IDs are rejected and recorded, and the resolver then falls through to the next layer in the chain. Conflicts between documents are captured, but the first non-null preference wins (no policy engine yet).
  • streamChatWithTools accepts an optional routing context, dispatches with the resolved model, and records the resolution into the existing audit_log.routing_policy_applied jsonb column shipped in Phase 6.
  • Main chat path wired to pass the chat's project + the document IDs from docIndex. Today this resolves to the user's requested model unchanged; Phase 7 connectors will populate documents.model_preference at ingest and the same code path will route accordingly without further changes.
  • New model_preference (nullable text) columns on projects and documents. No UI yet — by design.
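
The precedence described in the summary can be sketched roughly as follows. This is an illustration only, not the code in backend/src/lib/llm/routing.ts: the function name resolveRoutingSketch, the type shapes, and the placeholder model IDs are all hypothetical, and the sketch assumes preferences have already been loaded (the real resolver reads them from the database and tolerates db errors). It also assumes "first non-null wins" means the first preference that passes validation.

```typescript
// Hypothetical canonical model set; the real list lives elsewhere in the codebase.
const KNOWN_MODELS = new Set(["model-a", "model-b", "model-c"]);

type RoutingSource = "document" | "project" | "request";

interface Rejection {
  layer: "document" | "project";
  id: string;
  model: string;
}

interface RoutingResolution {
  model: string;
  source: RoutingSource;
  rejected: Rejection[]; // unknown model IDs, recorded for audit
  conflicts: string[];   // valid document preferences that lost to the winner
}

interface RoutingInput {
  requestedModel: string;
  project?: { id: string; modelPreference: string | null };
  documents?: { id: string; modelPreference: string | null }[];
}

function resolveRoutingSketch(input: RoutingInput): RoutingResolution {
  const rejected: Rejection[] = [];
  const conflicts: string[] = [];
  let winner: string | null = null;

  // Document layer: first valid preference wins, later differing ones are conflicts.
  for (const doc of input.documents ?? []) {
    const pref = doc.modelPreference;
    if (pref === null) continue;
    if (!KNOWN_MODELS.has(pref)) {
      rejected.push({ layer: "document", id: doc.id, model: pref });
    } else if (winner === null) {
      winner = pref;
    } else if (pref !== winner && !conflicts.includes(pref)) {
      conflicts.push(pref);
    }
  }
  if (winner !== null) {
    return { model: winner, source: "document", rejected, conflicts };
  }

  // Project layer: used only when no document supplied a valid preference.
  const project = input.project;
  if (project && project.modelPreference !== null) {
    if (KNOWN_MODELS.has(project.modelPreference)) {
      return { model: project.modelPreference, source: "project", rejected, conflicts };
    }
    rejected.push({ layer: "project", id: project.id, model: project.modelPreference });
  }

  // Request layer: the model the user asked for, unchanged.
  return { model: input.requestedModel, source: "request", rejected, conflicts };
}

// Usage: d1 is rejected, d2 wins, d3 is captured as a conflict.
const example = resolveRoutingSketch({
  requestedModel: "model-a",
  project: { id: "p1", modelPreference: "model-b" },
  documents: [
    { id: "d1", modelPreference: "not-a-model" },
    { id: "d2", modelPreference: "model-c" },
    { id: "d3", modelPreference: "model-b" },
  ],
});
```

Keeping rejections and conflicts on the returned object, rather than throwing, matches the PR's stated intent of recording them for audit while still dispatching with a usable model.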

Why now, not in Phase 7

The local-inference adapter is cheap to add later (a fourth file alongside claude.ts/gemini.ts/openai.ts). The routing policy surface is the part that would otherwise get baked into every connector and dispatch site in Phase 7+, and retrofitting it would force a much wider edit later. Landing the seam now keeps Phase 7 connectors free to declare a model_preference at ingest without inventing the surface from scratch.

Test plan

  • In backend/, npx tsc --noEmit is clean
  • In backend/, npx vitest run is 73/73 green (8 new routing tests covering precedence, conflicts, unknown-model rejection at the document and project layers, db-error tolerance, and the missing-document-ids skip path)
  • Manual: with the migration applied, set a documents.model_preference to a known model id on one doc; confirm a chat that includes that doc resolves to the override and writes the policy into audit_log.routing_policy_applied
  • Manual: confirm a chat that includes no documents and no project override still works and audit_log.routing_policy_applied records source: "request"
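
For the two manual checks, it may help to picture the kind of value one would expect to find in audit_log.routing_policy_applied. The sketch below is an assumption-heavy illustration: only the column name and the source: "request" value come from this PR. The helper buildRoutingAuditEntry, the other field names, and the placeholder model IDs are hypothetical and may not match the actual jsonb payload.

```typescript
type RoutingSource = "document" | "project" | "request";

// Hypothetical shape of a resolver result passed along with the dispatch.
interface Resolution {
  model: string;
  source: RoutingSource;
}

// Hypothetical shape of the jsonb value; only `source` is confirmed by the PR text.
interface RoutingPolicyApplied {
  source: RoutingSource;
  requested_model: string;
  resolved_model: string;
  overridden: boolean;
}

// Build the audit payload; with no routing context the request model is used as-is.
function buildRoutingAuditEntry(
  requestedModel: string,
  resolution?: Resolution,
): RoutingPolicyApplied {
  const resolved: Resolution = resolution ?? { model: requestedModel, source: "request" };
  return {
    source: resolved.source,
    requested_model: requestedModel,
    resolved_model: resolved.model,
    overridden: resolved.model !== requestedModel,
  };
}

// Second manual check: no documents, no project override.
const noOverride = buildRoutingAuditEntry("model-a");

// First manual check: one document carries a known model id as its preference.
const docOverride = buildRoutingAuditEntry("model-a", { model: "model-b", source: "document" });
```

Under these assumptions, noOverride.source is "request" with the requested model unchanged, while docOverride records the document-level override alongside the model the user originally asked for.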

🤖 Generated with Claude Code

Archibald312 and others added 2 commits May 15, 2026 21:02
Defer original Phase 7 (local inference / Ollama / vLLM) to post-launch
and land the per-source LLM routing surface ahead of Phase 7 connectors,
so that connectors and the eventual local-inference adapter plug into the
same decision point without touching dispatch sites. See decisions.md
(2026-05-15) and the renumbered build plan in CLAUDE.md.

- model_preference (nullable) columns on projects + documents
- backend/src/lib/llm/routing.ts: resolveModelRouting() with doc → project
  → request precedence, conflicts and unknown-model rejections captured
  for audit
- streamChatWithTools accepts optional routing context, dispatches with
  the resolved model, and records the policy into the existing
  audit_log.routing_policy_applied jsonb column
- main chat path wired to pass project + document IDs
- 8 unit tests covering precedence, conflicts, rejections, and db errors

Backend tsc --noEmit clean; Vitest 73/73 passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Archibald312 merged commit c549ff2 into main on May 16, 2026
4 checks passed
Archibald312 deleted the phase-pre8-model-routing-seam branch on May 16, 2026 02:20