Skip to content

feat(bin): opt-in daemon fast-path for qmd search via QMD_DAEMON_URL#582

Open
lukeboyett wants to merge 11 commits intotobi:mainfrom
lukeboyett:feat/bin-qmd-daemon-fastpath
Open

feat(bin): opt-in daemon fast-path for qmd search via QMD_DAEMON_URL#582
lukeboyett wants to merge 11 commits intotobi:mainfrom
lukeboyett:feat/bin-qmd-daemon-fastpath

Conversation

@lukeboyett
Copy link
Copy Markdown

Summary

Upstream already exposes POST /search (and /query alias) on the MCP HTTP
daemon (src/mcp/server.ts). Spinning up the full Node CLI for every
qmd search call — tokenizer load, SQLite open, etc. — is wasted work when a
daemon is already running. This PR teaches bin/qmd an opt-in fast-path
that POSTs to the daemon when QMD_DAEMON_URL is set, and falls back to the
normal CLI on any error.

Zero changes to server-side code. No call sites changed. Everything is
opt-in via a single env var.

Behavior

  • QMD_DAEMON_URL unset → unchanged behavior; the shim spawns the full
    CLI as before.
  • QMD_DAEMON_URL=http://127.0.0.1:7433 + qmd search "query" → a
    1-second /health probe, then a 30-second POST /search. Output is
    buffered so a failed request can't leak partial bytes before fallback.
  • Any failure mode falls through silently to the normal CLI: env unset,
    curl missing, health probe fails, non-2xx response, empty query, unknown
    flag, etc. The user never sees a fast-path-specific error.
  • vsearch is not routed through the fast-path — its semantics can't be
    faithfully expressed through /search (dropped in commit d5b01e9 after
    review).

Commits

One feature commit plus nine hardening follow-ups from review rounds:

  • feat(bin): opt-in daemon fast-path for search and vsearch — initial
    fast-path with QMD_DAEMON_URL gating, /health probe, POST to /search.
  • fix(bin): preserve argv + accept -n + accumulate multi -c — keep the
    shim's argv intact on fallback; support -n (CLI's limit flag) and
    repeated -c <collection>.
  • fix(bin): fall through on unknown flags instead of mangling the query
    any flag we don't recognise defers to the full CLI rather than silently
    being interpreted as part of the query string.
  • fix(bin): guard value-taking flags against missing arg before shift
    avoid shift errors when -c or -n is the last argv token.
  • fix(bin): match CLI default limit of 5 in fast-path — align with the
    full CLI's documented default when -n is absent.
  • fix(bin): normalise -n to match parseInt(n, 10) semantics — three
    review rounds iterating on -n 3abc, -n "", -n -5 to behave the way
    Node's parseInt(n, 10) does (or cleanly refuse).
  • fix(bin): buffer daemon response so partial output can't leak before fallback — capture stdout, only emit on 2xx; on error, discard buffer
    and fall through to the CLI with the user seeing one clean result.
  • fix(bin): drop vsearch from fast-path — its semantics can't route through /search — vsearch uses vector-only ranking and takes parameters /search
    doesn't accept; better to leave it on the full CLI path than produce subtly
    different results.

Test plan

  • npm run build passes (depends on fix(db): declare transaction() on local Database interface #579 for Database.transaction() type — see below)
  • npm test — new coverage:
    • test/bin-qmd-daemon-fastpath.test.ts (284 lines): fast-path hits /search
      on healthy daemon, falls back on every failure mode above, argv
      preservation across fallback, -n / -c parsing equivalence with CLI,
      buffered-response invariants.
    • test/mcp.test.ts (+66 lines): /search REST handler contract the
      fast-path relies on.

Dependencies

Luke Boyett added 11 commits April 14, 2026 08:34
The narrow cross-runtime Database interface in src/db.ts defines the
subset of better-sqlite3 / bun:sqlite methods used throughout QMD.
Commit fee576b ("fix: migrate legacy lowercase paths on reindex")
introduced a db.transaction(...) call in src/store.ts but did not
extend the interface, breaking `tsc -p tsconfig.build.json`:

  src/store.ts(2142,22): error TS2339: Property 'transaction' does
  not exist on type 'Database'.

Both underlying engines expose transaction(fn), so this just makes
the type reflect reality.
`qmd search foo` on a large index pays ~500–800 ms of Node + better-
sqlite3 + sqlite-vec bootstrap every call — most of the wall clock is
module initialization, not the query itself. `qmd mcp --http --daemon`
already ships as an always-warm server and already exposes a
`POST /search` REST endpoint (used today by MCP-adjacent tooling), so
the cold-start cost is avoidable for callers willing to keep a daemon
running.

This PR wires `bin/qmd` to that existing endpoint. When `QMD_DAEMON_URL`
is set, the shell wrapper:

- Health-checks `GET /health` with a 1 s timeout. On failure it falls
  through to the normal cold-start CLI.
- Translates `qmd search foo -c alma --limit 5` into
  `POST /search {"searches":[{"type":"lex","query":"foo"}],
  "collections":["alma"],"limit":5}`.
- Does the same for `vsearch` with `type:"vec"`.
- Prints whatever JSON the daemon returns (users that want formatted
  text leave `QMD_DAEMON_URL` unset).
- Silently falls through when `--index <name>` is used (one daemon =
  one index) or when any curl error occurs.

No new daemon or new endpoints are introduced; this is purely a CLI
routing change over existing server surface. Stays POSIX sh so it
works under the current #!/bin/sh shebang.

Tests:

- `test/mcp.test.ts` "MCP HTTP Transport" suite gains contract tests
  for the REST path `bin/qmd` depends on: happy path with
  `type="lex"` + collections, happy path with `null` collections
  (unset `-c`), happy path with `type="vec"` (`vsearch`), and a 400
  for missing `searches`. These pin the payload/response shape so a
  future upstream change to that endpoint can't silently break the
  fast-path.

Docs:

- README gains a "Fast CLI via daemon (scripts / agents)" section with
  the opt-in env var, example payload, and fall-through rules.
- CHANGELOG entry under Changes.
Three linked fixes for the daemon fast-path, addressing Codex round 1
on #5:

- P1 "Preserve CLI args when daemon POST falls through": the previous
  implementation did `shift` in the outer shell and then iterated `$@`
  destructively. A healthy /health followed by a failed POST would
  reach the fallback `exec` with mutated/empty argv, so the cold-start
  CLI would see the wrong command. Moved the whole fast-path into a
  shell function (`_qmd_try_daemon`) so its parsing operates on a
  local copy of the args. The outer shell's `$@` is never touched,
  and on any `return 1` (health fail, POST fail, empty query) control
  falls through to the lockfile-gated `exec` with the exact argv the
  user typed.

- P1 "Support -n limit flag in daemon fast-path parser": upstream's
  search limit flag is `-n <num>` (`src/cli/qmd.ts:2486`, documented
  at `--help` line 2773). The fast-path was parsing `-l|--limit` which
  does not exist on `search` (`-l` is a `get` option). `qmd search
  foo -n 3` would discard `-n`, treat `3` as part of the query, and
  use the default limit of 10. Now parses `-n`, `--n`, `--n=...`.

- P2 "Preserve all collection filters": `collection` is declared
  `multiple: true` in `src/cli/qmd.ts:2496` — `qmd search foo -c a -c
  b` should hit both collections. The parser stored a single scalar
  so only the last `-c` survived. Collections are now accumulated
  into a JSON array (`["a","b"]`) or rendered as `null` when no `-c`
  is passed.

Tests: new `test/bin-qmd-daemon-fastpath.test.ts` spins up a tiny HTTP
mock on a random port, exports QMD_DAEMON_URL, and spawns the shell
wrapper via async spawn (spawnSync would deadlock the mock's event
loop). It covers:

  - payload shape for `search -c -n`,
  - `vsearch` → `type:"vec"`,
  - multiple `-c` flags accumulate,
  - `--collection=` long form mixes with `-c`,
  - unset `-c` yields `collections: null`,
  - `--index` bypasses (zero captures),
  - non-`search` subcommands don't touch the daemon,
  - health-check 500 falls through without attempting POST.
Codex P1 on #5 round 2: the fast-path's catch-all for unrecognised
dashed flags was `shift` (drop the flag token only). Several upstream
`search` / `vsearch` options take a value — `--min-score <N>`,
`-C/--candidate-limit`, `--intent <text>`, etc. — while others are
booleans. A call like `qmd search foo --min-score 0.8` would drop
`--min-score`, then on the next loop the `*)` arm would treat `0.8`
as part of the query text. The daemon would receive `foo 0.8` and
the min-score semantics would vanish silently.

Rather than enumerate every upstream flag (brittle — breaks each time
upstream adds one), the unknown-flag branch now `return 1`s out of
the fast-path and lets the cold-start CLI handle the invocation.
The opt-in fast-path stays a strict subset of interactive CLI
semantics: if we can't decode the argument safely, the user gets
correct results via the cold-start path instead of wrong results
fast.

Tests cover both a value-taking unknown flag (`--min-score 0.8`)
and a boolean one (`--json`). Both now fall through; the mock
server records no `/search` request.
Codex P2 on #5 round 3: `-n`, `--n`, `-c`, `--collection` unconditionally
called `shift 2`. If the user omitted the value (e.g. `qmd search foo -n`
or the flag happens to be the last token), `/bin/sh` aborted the whole
script with "shift: can't shift that many" (exit 2) instead of letting
the fast-path `return 1` and fall back to the cold-start CLI.

Added `[ $# -ge 2 ] || return 1` guards before every `shift 2` in the
fast-path. Malformed input is now indistinguishable from any other
reason to fall through — the cold-start CLI handles the error reporting,
which keeps the fast-path a strict subset of CLI semantics.

Tests:
- `dangling value flag (-n with no argument) falls through without
  aborting shell`
- `dangling -c with no argument also falls through cleanly`

Both assert stderr does NOT contain "can't shift" and that no
`/search` request was made.
Codex P2 on #5 round 4: the fast-path defaulted to `limit: 10` when
the user didn't pass `-n`, but the interactive CLI defaults to 5
(src/cli/qmd.ts --help line 2773: "Max results (default 5, or 20
for --files/--json)"). Scripts written against the CLI's default
would see different result counts depending on whether
QMD_DAEMON_URL was set — a behavior regression.

Fix: `_qmd_limit=5`. The `--files` / `--json` special case (default
20) isn't reachable here because those flags are unknown to the
fast-path parser and already route through the cold-start branch.

Test: new `unset -n uses 5 to match the interactive CLI default`.
Codex P2 on #5 round 5: the fast-path forwarded the raw -n string into
the JSON payload, but the interactive CLI resolves the limit with
`parseInt(values.n, 10) || defaultLimit` (src/cli/qmd.ts:2550). As a
result some inputs produced transport-dependent results:

  - `qmd search foo -n 0` → CLI resolves to 5 (default), fast-path was
    sending `"limit":0` (typically zero results).
  - `-n 1e2` → CLI `parseInt("1e2",10)` = 1 (stops at 'e'), fast-path
    was forwarding "1e2" which the daemon would fail/misinterpret.
  - Any non-numeric (`-n abc`, `-n -1`, empty) → CLI → default,
    fast-path → garbage payload.

Fix: after parsing, normalise `_qmd_limit`. Empty, zero, or any value
containing a non-digit collapses to the default of 5 — matching the
`parseInt || default` semantics as closely as POSIX shell can.

Tests:
- `-n 0 normalises to the default 5`
- `-n <non-numeric> normalises to the default 5`
Codex P1 on #5 round 6: the previous normalisation collapsed any
value containing a non-digit to the default, and left zero-padded
numbers like "00" intact. Both diverge from the interactive CLI:

  - `parseInt("1e2", 10)` === 1, not the default. The fast-path was
    collapsing "1e2" to 5, so a user passing `-n 1e2` would get
    different result counts depending on QMD_DAEMON_URL.
  - `"00"` with my all-digits regex stayed "00", producing
    `"limit":00` — invalid JSON — and causing the daemon request
    to fail (silently falling through to cold-start, but with
    user-observable confusion and no daemon speedup).

Fix mirrors parseInt more precisely:

  1. sed extracts the leading run of digits (`1e2` -> `1`, `5abc`
     -> `5`, `abc` -> empty).
  2. Empty leading digits -> default 5.
  3. Otherwise coerce via `$((expr))`, which normalises padded
     values (`007` -> 7) and emits plain integers.
  4. If the coerced value is <= 0, fall back to default — matching
     `parseInt(...) || defaultLimit`.

New tests exercise the specific divergence cases:
  - `-n 1e2` -> 1 (matches parseInt)
  - `-n 5abc` -> 5 (matches parseInt)
  - `-n abc` -> 5 (no leading digits -> default)
  - `-n 00` -> 5 (zero -> default, emits valid JSON)
  - `-n 007` -> 7 (strips leading zeros, emits valid JSON)

Each JSON payload is round-tripped through JSON.parse to prove it's
structurally valid — a payload with leading zeros would have thrown.
Codex rounds 5-7 on #5 kept surfacing new edge cases while trying to
match `parseInt(values.n, 10) || defaultLimit` in POSIX shell
(`1e2` → 1, `00` as invalid JSON, `-1`/`+7` signed values, zero-padding).
Each fix exposed another subtle divergence from JS number parsing.

Shifting approach: the fast-path now only accepts a plain positive
integer with no sign, no leading zeros, no exponent, no trailing
characters. Everything else (empty, `0`, `00`, `007`, `-1`, `+7`,
`1e2`, `5abc`, `abc`, …) returns 1 from the fast-path function so
the cold-start CLI handles the input with its real `parseInt` logic.

This matches the PR's stated philosophy — "fast-path stays a strict
subset of CLI semantics" — and eliminates the sh-vs-JS number-parsing
minefield entirely. Users passing weird `-n` values get the same
results they'd get without `QMD_DAEMON_URL` set, just slower.

Tests:
- `strict -n validation: non-plain-positive-integer values fall
  through` covers 0, 00, 007, -1, +7, 1e2, 5abc, abc.
- `plain positive integers in -n are forwarded as-is` covers 1, 3,
  20, 100.
…fallback

Codex P2 on #5 round 8: `curl` was streaming the daemon response
straight to stdout. If the daemon started writing then disconnected
(connection reset, `--max-time` expiring mid-response), partial
JSON would hit stdout before the fast-path's `return 1` triggered
the cold-start fallback. Script consumers would then see
  `{"partial":` + cold-start text
concatenated on one invocation, breaking expectations.

Fix: capture the response in a shell variable, only `printf` it on
success. A curl error (any exit code) short-circuits via `|| return
1` and stdout stays clean. The cold-start CLI then runs with a
virgin output stream.

Test: `daemon that writes partial bytes then drops the connection
leaks no output` stands up a mock that responds 200 to /health,
opens /search with a partial `{"partial":` then destroys the socket.
Assertion: `stdout` contains no `partial`.
…ough /search

Codex P2 on #5 round 9: `qmd vsearch` cold-start sets `minScore: 0.3`
(vs 0 for search), runs a vector-only pipeline, and does not rerank.
The daemon's `/search` REST endpoint exposes none of those knobs —
it always calls `store.search(...)` (hybrid + rerank) with the
caller-supplied minScore (default 0). Routing vsearch through the
fast-path would silently change result ordering and filtering for
scripts that rely on vsearch's cold-start behaviour.

Rather than add pseudo-plumbing, vsearch is now *always* handled by
the cold-start CLI. The fast-path is search-only. This matches the
same strict-subset principle the PR has been applying to unknown
flags and weird -n values: if the semantics aren't reliably
expressible on the daemon side, cold-start owns it.

Changes:
- `_qmd_try_daemon` no longer accepts `vsearch`.
- Top-level dispatch is `case "${1:-}" in search)` only.
- Test `vsearch is intentionally NOT handled by the fast-path`
  asserts zero captures on the mock when vsearch is invoked with
  QMD_DAEMON_URL set.
- README and CHANGELOG updated: fast-path is search-only, vsearch
  semantics rationale documented.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant