feat(bin): opt-in daemon fast-path for qmd search via QMD_DAEMON_URL#582
Open
lukeboyett wants to merge 11 commits intotobi:mainfrom
Open
feat(bin): opt-in daemon fast-path for qmd search via QMD_DAEMON_URL#582lukeboyett wants to merge 11 commits intotobi:mainfrom
lukeboyett wants to merge 11 commits intotobi:mainfrom
Conversation
added 11 commits
April 14, 2026 08:34
The narrow cross-runtime Database interface in src/db.ts defines the subset of better-sqlite3 / bun:sqlite methods used throughout QMD. Commit fee576b ("fix: migrate legacy lowercase paths on reindex") introduced a db.transaction(...) call in src/store.ts but did not extend the interface, breaking `tsc -p tsconfig.build.json`: src/store.ts(2142,22): error TS2339: Property 'transaction' does not exist on type 'Database'. Both underlying engines expose transaction(fn), so this just makes the type reflect reality.
`qmd search foo` on a large index pays ~500–800 ms of Node + better-
sqlite3 + sqlite-vec bootstrap every call — most of the wall clock is
module initialization, not the query itself. `qmd mcp --http --daemon`
already ships as an always-warm server and already exposes a
`POST /search` REST endpoint (used today by MCP-adjacent tooling), so
the cold-start cost is avoidable for callers willing to keep a daemon
running.
This PR wires `bin/qmd` to that existing endpoint. When `QMD_DAEMON_URL`
is set, the shell wrapper:
- Health-checks `GET /health` with a 1 s timeout. On failure it falls
through to the normal cold-start CLI.
- Translates `qmd search foo -c alma --limit 5` into
`POST /search {"searches":[{"type":"lex","query":"foo"}],
"collections":["alma"],"limit":5}`.
- Does the same for `vsearch` with `type:"vec"`.
- Prints whatever JSON the daemon returns (users that want formatted
text leave `QMD_DAEMON_URL` unset).
- Silently falls through when `--index <name>` is used (one daemon =
one index) or when any curl error occurs.
No new daemon or new endpoints are introduced; this is purely a CLI
routing change over existing server surface. Stays POSIX sh so it
works under the current #!/bin/sh shebang.
Tests:
- `test/mcp.test.ts` "MCP HTTP Transport" suite gains contract tests
for the REST path `bin/qmd` depends on: happy path with
`type="lex"` + collections, happy path with `null` collections
(unset `-c`), happy path with `type="vec"` (`vsearch`), and a 400
for missing `searches`. These pin the payload/response shape so a
future upstream change to that endpoint can't silently break the
fast-path.
Docs:
- README gains a "Fast CLI via daemon (scripts / agents)" section with
the opt-in env var, example payload, and fall-through rules.
- CHANGELOG entry under Changes.
Three linked fixes for the daemon fast-path, addressing Codex round 1 on #5: - P1 "Preserve CLI args when daemon POST falls through": the previous implementation did `shift` in the outer shell and then iterated `$@` destructively. A healthy /health followed by a failed POST would reach the fallback `exec` with mutated/empty argv, so the cold-start CLI would see the wrong command. Moved the whole fast-path into a shell function (`_qmd_try_daemon`) so its parsing operates on a local copy of the args. The outer shell's `$@` is never touched, and on any `return 1` (health fail, POST fail, empty query) control falls through to the lockfile-gated `exec` with the exact argv the user typed. - P1 "Support -n limit flag in daemon fast-path parser": upstream's search limit flag is `-n <num>` (`src/cli/qmd.ts:2486`, documented at `--help` line 2773). The fast-path was parsing `-l|--limit` which does not exist on `search` (`-l` is a `get` option). `qmd search foo -n 3` would discard `-n`, treat `3` as part of the query, and use the default limit of 10. Now parses `-n`, `--n`, `--n=...`. - P2 "Preserve all collection filters": `collection` is declared `multiple: true` in `src/cli/qmd.ts:2496` — `qmd search foo -c a -c b` should hit both collections. The parser stored a single scalar so only the last `-c` survived. Collections are now accumulated into a JSON array (`["a","b"]`) or rendered as `null` when no `-c` is passed. Tests: new `test/bin-qmd-daemon-fastpath.test.ts` spins up a tiny HTTP mock on a random port, exports QMD_DAEMON_URL, and spawns the shell wrapper via async spawn (spawnSync would deadlock the mock's event loop). It covers: - payload shape for `search -c -n`, - `vsearch` → `type:"vec"`, - multiple `-c` flags accumulate, - `--collection=` long form mixes with `-c`, - unset `-c` yields `collections: null`, - `--index` bypasses (zero captures), - non-`search` subcommands don't touch the daemon, - health-check 500 falls through without attempting POST.
Codex P1 on #5 round 2: the fast-path's catch-all for unrecognised dashed flags was `shift` (drop the flag token only). Several upstream `search` / `vsearch` options take a value — `--min-score <N>`, `-C/--candidate-limit`, `--intent <text>`, etc. — while others are booleans. A call like `qmd search foo --min-score 0.8` would drop `--min-score`, then on the next loop the `*)` arm would treat `0.8` as part of the query text. The daemon would receive `foo 0.8` and the min-score semantics would vanish silently. Rather than enumerate every upstream flag (brittle — breaks each time upstream adds one), the unknown-flag branch now `return 1`s out of the fast-path and lets the cold-start CLI handle the invocation. The opt-in fast-path stays a strict subset of interactive CLI semantics: if we can't decode the argument safely, the user gets correct results via the cold-start path instead of wrong results fast. Tests cover both a value-taking unknown flag (`--min-score 0.8`) and a boolean one (`--json`). Both now fall through; the mock server records no `/search` request.
Codex P2 on #5 round 3: `-n`, `--n`, `-c`, `--collection` unconditionally called `shift 2`. If the user omitted the value (e.g. `qmd search foo -n` or the flag happens to be the last token), `/bin/sh` aborted the whole script with "shift: can't shift that many" (exit 2) instead of letting the fast-path `return 1` and fall back to the cold-start CLI. Added `[ $# -ge 2 ] || return 1` guards before every `shift 2` in the fast-path. Malformed input is now indistinguishable from any other reason to fall through — the cold-start CLI handles the error reporting, which keeps the fast-path a strict subset of CLI semantics. Tests: - `dangling value flag (-n with no argument) falls through without aborting shell` - `dangling -c with no argument also falls through cleanly` Both assert stderr does NOT contain "can't shift" and that no `/search` request was made.
Codex P2 on #5 round 4: the fast-path defaulted to `limit: 10` when the user didn't pass `-n`, but the interactive CLI defaults to 5 (src/cli/qmd.ts --help line 2773: "Max results (default 5, or 20 for --files/--json)"). Scripts written against the CLI's default would see different result counts depending on whether QMD_DAEMON_URL was set — a behavior regression. Fix: `_qmd_limit=5`. The `--files` / `--json` special case (default 20) isn't reachable here because those flags are unknown to the fast-path parser and already route through the cold-start branch. Test: new `unset -n uses 5 to match the interactive CLI default`.
Codex P2 on #5 round 5: the fast-path forwarded the raw -n string into the JSON payload, but the interactive CLI resolves the limit with `parseInt(values.n, 10) || defaultLimit` (src/cli/qmd.ts:2550). As a result some inputs produced transport-dependent results: - `qmd search foo -n 0` → CLI resolves to 5 (default), fast-path was sending `"limit":0` (typically zero results). - `-n 1e2` → CLI `parseInt("1e2",10)` = 1 (stops at 'e'), fast-path was forwarding "1e2" which the daemon would fail/misinterpret. - Any non-numeric (`-n abc`, `-n -1`, empty) → CLI → default, fast-path → garbage payload. Fix: after parsing, normalise `_qmd_limit`. Empty, zero, or any value containing a non-digit collapses to the default of 5 — matching the `parseInt || default` semantics as closely as POSIX shell can. Tests: - `-n 0 normalises to the default 5` - `-n <non-numeric> normalises to the default 5`
Codex P1 on #5 round 6: the previous normalisation collapsed any value containing a non-digit to the default, and left zero-padded numbers like "00" intact. Both diverge from the interactive CLI: - `parseInt("1e2", 10)` === 1, not the default. The fast-path was collapsing "1e2" to 5, so a user passing `-n 1e2` would get different result counts depending on QMD_DAEMON_URL. - `"00"` with my all-digits regex stayed "00", producing `"limit":00` — invalid JSON — and causing the daemon request to fail (silently falling through to cold-start, but with user-observable confusion and no daemon speedup). Fix mirrors parseInt more precisely: 1. sed extracts the leading run of digits (`1e2` -> `1`, `5abc` -> `5`, `abc` -> empty). 2. Empty leading digits -> default 5. 3. Otherwise coerce via `$((expr))`, which normalises padded values (`007` -> 7) and emits plain integers. 4. If the coerced value is <= 0, fall back to default — matching `parseInt(...) || defaultLimit`. New tests exercise the specific divergence cases: - `-n 1e2` -> 1 (matches parseInt) - `-n 5abc` -> 5 (matches parseInt) - `-n abc` -> 5 (no leading digits -> default) - `-n 00` -> 5 (zero -> default, emits valid JSON) - `-n 007` -> 7 (strips leading zeros, emits valid JSON) Each JSON payload is round-tripped through JSON.parse to prove it's structurally valid — a payload with leading zeros would have thrown.
Codex rounds 5-7 on #5 kept surfacing new edge cases while trying to match `parseInt(values.n, 10) || defaultLimit` in POSIX shell (`1e2` → 1, `00` as invalid JSON, `-1`/`+7` signed values, zero-padding). Each fix exposed another subtle divergence from JS number parsing. Shifting approach: the fast-path now only accepts a plain positive integer with no sign, no leading zeros, no exponent, no trailing characters. Everything else (empty, `0`, `00`, `007`, `-1`, `+7`, `1e2`, `5abc`, `abc`, …) returns 1 from the fast-path function so the cold-start CLI handles the input with its real `parseInt` logic. This matches the PR's stated philosophy — "fast-path stays a strict subset of CLI semantics" — and eliminates the sh-vs-JS number-parsing minefield entirely. Users passing weird `-n` values get the same results they'd get without `QMD_DAEMON_URL` set, just slower. Tests: - `strict -n validation: non-plain-positive-integer values fall through` covers 0, 00, 007, -1, +7, 1e2, 5abc, abc. - `plain positive integers in -n are forwarded as-is` covers 1, 3, 20, 100.
…fallback Codex P2 on #5 round 8: `curl` was streaming the daemon response straight to stdout. If the daemon started writing then disconnected (connection reset, `--max-time` expiring mid-response), partial JSON would hit stdout before the fast-path's `return 1` triggered the cold-start fallback. Script consumers would then see `{"partial":` + cold-start text concatenated on one invocation, breaking expectations. Fix: capture the response in a shell variable, only `printf` it on success. A curl error (any exit code) short-circuits via `|| return 1` and stdout stays clean. The cold-start CLI then runs with a virgin output stream. Test: `daemon that writes partial bytes then drops the connection leaks no output` stands up a mock that responds 200 to /health, opens /search with a partial `{"partial":` then destroys the socket. Assertion: `stdout` contains no `partial`.
…ough /search Codex P2 on #5 round 9: `qmd vsearch` cold-start sets `minScore: 0.3` (vs 0 for search), runs a vector-only pipeline, and does not rerank. The daemon's `/search` REST endpoint exposes none of those knobs — it always calls `store.search(...)` (hybrid + rerank) with the caller-supplied minScore (default 0). Routing vsearch through the fast-path would silently change result ordering and filtering for scripts that rely on vsearch's cold-start behaviour. Rather than add pseudo-plumbing, vsearch is now *always* handled by the cold-start CLI. The fast-path is search-only. This matches the same strict-subset principle the PR has been applying to unknown flags and weird -n values: if the semantics aren't reliably expressible on the daemon side, cold-start owns it. Changes: - `_qmd_try_daemon` no longer accepts `vsearch`. - Top-level dispatch is `case "${1:-}" in search)` only. - Test `vsearch is intentionally NOT handled by the fast-path` asserts zero captures on the mock when vsearch is invoked with QMD_DAEMON_URL set. - README and CHANGELOG updated: fast-path is search-only, vsearch semantics rationale documented.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Upstream already exposes
POST /search(and/queryalias) on the MCP HTTPdaemon (
src/mcp/server.ts). Spinning up the full Node CLI for everyqmd searchcall — tokenizer load, SQLite open, etc. — is wasted work when adaemon is already running. This PR teaches
bin/qmdan opt-in fast-paththat POSTs to the daemon when
QMD_DAEMON_URLis set, and falls back to thenormal CLI on any error.
Zero changes to server-side code. No call sites changed. Everything is
opt-in via a single env var.
Behavior
QMD_DAEMON_URLunset → unchanged behavior; the shim spawns the fullCLI as before.
QMD_DAEMON_URL=http://127.0.0.1:7433+qmd search "query"→ a1-second
/healthprobe, then a 30-secondPOST /search. Output isbuffered so a failed request can't leak partial bytes before fallback.
curlmissing, health probe fails, non-2xx response, empty query, unknownflag, etc. The user never sees a fast-path-specific error.
vsearchis not routed through the fast-path — its semantics can't befaithfully expressed through
/search(dropped in commitd5b01e9afterreview).
Commits
One feature commit plus nine hardening follow-ups from review rounds:
feat(bin): opt-in daemon fast-path for search and vsearch— initialfast-path with
QMD_DAEMON_URLgating,/healthprobe, POST to/search.fix(bin): preserve argv + accept -n + accumulate multi -c— keep theshim's argv intact on fallback; support
-n(CLI's limit flag) andrepeated
-c <collection>.fix(bin): fall through on unknown flags instead of mangling the query—any flag we don't recognise defers to the full CLI rather than silently
being interpreted as part of the query string.
fix(bin): guard value-taking flags against missing arg before shift—avoid
shifterrors when-cor-nis the last argv token.fix(bin): match CLI default limit of 5 in fast-path— align with thefull CLI's documented default when
-nis absent.fix(bin): normalise -n to match parseInt(n, 10) semantics— threereview rounds iterating on
-n 3abc,-n "",-n -5to behave the wayNode's
parseInt(n, 10)does (or cleanly refuse).fix(bin): buffer daemon response so partial output can't leak before fallback— capture stdout, only emit on 2xx; on error, discard bufferand fall through to the CLI with the user seeing one clean result.
fix(bin): drop vsearch from fast-path — its semantics can't route through /search— vsearch uses vector-only ranking and takes parameters/searchdoesn't accept; better to leave it on the full CLI path than produce subtly
different results.
Test plan
npm run buildpasses (depends on fix(db): declare transaction() on local Database interface #579 forDatabase.transaction()type — see below)npm test— new coverage:test/bin-qmd-daemon-fastpath.test.ts(284 lines): fast-path hits/searchon healthy daemon, falls back on every failure mode above, argv
preservation across fallback,
-n/-cparsing equivalence with CLI,buffered-response invariants.
test/mcp.test.ts(+66 lines):/searchREST handler contract thefast-path relies on.
Dependencies
fix(db): declare transaction() on local Database interface).maincurrently failstsc -p tsconfig.build.jsonon an unrelated call site, so CI on this PR will be red until fix(db): declare transaction() on local Database interface #579 merges; I'll rebase onto post-mergemainthen.