diff --git a/README.md b/README.md index 6f318446..f9cc2353 100644 --- a/README.md +++ b/README.md @@ -475,11 +475,12 @@ The `query` command uses **Reciprocal Rank Fusion (RRF)** with position-aware bl ### System Requirements - **Node.js** >= 22 -- **Bun** >= 1.0.0 +- **Bun** >= 1.0.0 on supported Bun platforms - **macOS**: Homebrew SQLite (for extension support) ```sh brew install sqlite ``` +- **FreeBSD**: Node.js-only core path for now. BM25 and sqlite-vec search are supported immediately; embeddings, reranking, and query expansion follow the upstream node-llama-cpp FreeBSD fix. See [docs/FREEBSD.md](docs/FREEBSD.md) for the verified source-install path and current smoke/certification entrypoints. ### GGUF Models (via node-llama-cpp) @@ -523,6 +524,18 @@ npm install -g @tobilu/qmd bun install -g @tobilu/qmd ``` +### FreeBSD + +The current FreeBSD upstream path is a Node.js source-install flow. + +See [docs/FREEBSD.md](docs/FREEBSD.md) for: + +- the core BM25 + sqlite-vec bring-up path +- `sqlite-vec` / `vec0.so` build steps +- `QMD_SQLITE_VEC_PATH` usage +- the immediate `test/freebsd-smoke.sh --quick` verification path +- the follow-up `--full` path once FreeBSD `node-llama-cpp` support lands upstream + ### Development ```sh diff --git a/docs/FREEBSD.md b/docs/FREEBSD.md new file mode 100644 index 00000000..fbed7d4f --- /dev/null +++ b/docs/FREEBSD.md @@ -0,0 +1,116 @@ +# FreeBSD + +This is the current source-checkout path for running QMD on FreeBSD. 
+ +Current upstream scope: + +- Node.js only +- BM25 and sqlite-vec search supported now +- embeddings, query expansion, and reranking follow the upstream `node-llama-cpp` FreeBSD fix +- Bun is not supported on FreeBSD yet + +## Verified Prerequisites + +For the immediate core path on the current FreeBSD host, the working package set was: + +```sh +doas pkg install node24 corepack git python3 gmake bash sqlite3 gettext-runtime +``` + +If you are validating the follow-up LLM path locally after the `node-llama-cpp` FreeBSD fix lands, also install: + +```sh +doas pkg install cmake ninja +``` + +Notes: + +- `node24` is the validated package name on this host. Any Node.js package providing Node `>=22` should be acceptable. +- `gettext-runtime` provides `envsubst`, which `sqlite-vec` uses while generating `sqlite-vec.h`. +- `python3` is needed for native Node addon builds such as `better-sqlite3`. +- `git` is needed to clone `qmd` and `sqlite-vec`. The later `node-llama-cpp` follow-up may also clone `llama.cpp`. + +## Build `vec0.so` + +QMD's FreeBSD path uses a real SQLite loadable extension, similar in spirit to the macOS Homebrew flow. + +Clone `sqlite-vec` next to `qmd`: + +```sh +git clone https://github.com/asg017/sqlite-vec ../sqlite-vec +cd ../sqlite-vec +gmake sqlite-vec.h loadable +``` + +Verify the extension directly: + +```sh +sqlite3 :memory: ".load $(pwd)/dist/vec0.so" "select vec_version();" +``` + +Expected output: + +```text +v0.1.10-alpha.3 +``` + +## Install QMD From Source + +```sh +cd ../qmd +corepack pnpm install --frozen-lockfile +QMD_SQLITE_VEC_PATH="$(cd ../sqlite-vec && pwd)/dist/vec0.so" corepack pnpm qmd status +``` + +The install is still usable if `pnpm` reports that the optional `node-llama-cpp` dependency was skipped on FreeBSD. 
+ +## Core Smoke Test + +Run the immediate FreeBSD verification path from the QMD repo: + +```sh +test/freebsd-smoke.sh --quick +``` + +The script will: + +- reuse `QMD_SQLITE_VEC_PATH` if you set it +- otherwise use `../sqlite-vec/dist/vec0.so` +- otherwise build `../sqlite-vec/dist/vec0.so` automatically if `../sqlite-vec` exists + +`--quick` validates: + +- collection add/index +- BM25 search +- `qmd ls` +- `qmd get` +- direct `sqlite-vec` loading through `sqlite3` + +## Full Smoke Follow-up + +`test/freebsd-smoke.sh --full` remains in-tree for the later stage where FreeBSD `node-llama-cpp` support is available. + +On this immediate upstream path, `--full` requires either the adjacent `node-llama-cpp` FreeBSD fix or another locally working FreeBSD `node-llama-cpp` build. + +When that backend is available, `--full` validates: + +- collection add/index +- BM25 search +- context add/list/get/remove +- direct `sqlite-vec` loading through `sqlite3` +- `qmd embed` +- `qmd vsearch` +- `qmd query --explain` +- `qmd update` with a collection update hook +- stale-file removal after re-index +- `qmd cleanup` orphan-vector reclamation +- collection rename with preserved context/update settings +- secondary collection add/remove +- optional `qmd status` device probing through `QMD_STATUS_LLM_PROBE=1` + +## Runtime Notes + +- On FreeBSD, `qmd status` skips the LLM device probe by default to keep status fast and resilient. Use `QMD_STATUS_LLM_PROBE=1` to force the probe. +- On this immediate upstream path, FreeBSD LLM commands use a non-building `node-llama-cpp` policy and fail fast until the upstream FreeBSD `node-llama-cpp` fix lands. +- Set `QMD_LLAMA_GPU=false` to force CPU mode explicitly when the later LLM path is enabled. +- `QMD_SQLITE_VEC_PATH` remains the most deterministic way to point QMD at `vec0.so`, even though QMD also probes known FreeBSD locations. 
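The last runtime note says `QMD_SQLITE_VEC_PATH` takes priority over probing. A minimal sketch of that lookup order follows, loosely modelled on `resolveSqliteVecLoadablePath` in `src/platform/sqlite-vec.ts` from this change. It is not the shipped code: the name `resolveVecPath` is hypothetical, only `LOCALBASE` and `/usr/local` are considered as prefixes, only three candidate directories are probed, and the npm package fallback is left out.

```typescript
import { posix } from "node:path";

// Result shape for this sketch only; the real module also reports an "npm" source.
type VecResolution = { path: string | null; source: "env" | "system" | null };

// Hypothetical helper illustrating the lookup order; simplified from the diff.
function resolveVecPath(
  env: Record<string, string | undefined>,
  fileExists: (path: string) => boolean,
  platform: string,
): VecResolution {
  // 1. An explicit override wins. It is returned as-is, without an existence check.
  const override = env.QMD_SQLITE_VEC_PATH?.trim();
  if (override) return { path: override, source: "env" };

  // 2. On FreeBSD, probe a few conventional install locations in order.
  if (platform === "freebsd") {
    const prefix = env.LOCALBASE?.trim() || "/usr/local";
    const candidates = [
      posix.join(prefix, "lib", "sqlite3", "vec0.so"),
      posix.join(prefix, "lib", "sqlite-vec", "vec0.so"),
      posix.join(prefix, "lib", "vec0.so"),
    ];
    for (const candidate of candidates) {
      if (fileExists(candidate)) return { path: candidate, source: "system" };
    }
  }

  // 3. Nothing found: the caller should treat vector search as unavailable.
  return { path: null, source: null };
}
```

Under these assumptions, a call such as `resolveVecPath(process.env, existsSync, process.platform)` never touches the filesystem when the override is set, which is why setting the variable is the more predictable option.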
diff --git a/package.json b/package.json index 0ec04c9c..ceeecf4b 100644 --- a/package.json +++ b/package.json @@ -48,7 +48,6 @@ "@modelcontextprotocol/sdk": "1.29.0", "better-sqlite3": "12.8.0", "fast-glob": "3.3.3", - "node-llama-cpp": "3.18.1", "picomatch": "4.0.4", "sqlite-vec": "0.1.9", "web-tree-sitter": "0.26.7", @@ -56,6 +55,7 @@ "zod": "4.2.1" }, "optionalDependencies": { + "node-llama-cpp": "3.18.1", "sqlite-vec-darwin-arm64": "0.1.9", "sqlite-vec-darwin-x64": "0.1.9", "sqlite-vec-linux-arm64": "0.1.9", diff --git a/pnpm-lock.yaml b/pnpm-lock.yaml index ad7723c1..6fb4dbdd 100644 --- a/pnpm-lock.yaml +++ b/pnpm-lock.yaml @@ -17,9 +17,6 @@ importers: fast-glob: specifier: 3.3.3 version: 3.3.3 - node-llama-cpp: - specifier: 3.18.1 - version: 3.18.1(typescript@5.9.3) picomatch: specifier: 4.0.4 version: 4.0.4 @@ -49,6 +46,9 @@ importers: specifier: 3.2.4 version: 3.2.4(@types/node@25.5.2)(tsx@4.21.0)(yaml@2.8.3) optionalDependencies: + node-llama-cpp: + specifier: 3.18.1 + version: 3.18.1(typescript@5.9.3) sqlite-vec-darwin-arm64: specifier: 0.1.9 version: 0.1.9 @@ -1856,11 +1856,13 @@ snapshots: dependencies: hono: 4.12.10 - '@huggingface/jinja@0.5.6': {} + '@huggingface/jinja@0.5.6': + optional: true '@isaacs/fs-minipass@4.0.1': dependencies: minipass: 7.1.3 + optional: true '@jridgewell/sourcemap-codec@1.5.5': {} @@ -1869,8 +1871,10 @@ snapshots: debug: 4.4.3 transitivePeerDependencies: - supports-color + optional: true - '@kwsites/promise-deferred@1.1.1': {} + '@kwsites/promise-deferred@1.1.1': + optional: true '@modelcontextprotocol/sdk@1.29.0(zod@4.2.1)': dependencies: @@ -2056,7 +2060,8 @@ snapshots: '@rollup/rollup-win32-x64-msvc@4.60.1': optional: true - '@tinyhttp/content-disposition@2.2.4': {} + '@tinyhttp/content-disposition@2.2.4': + optional: true '@types/better-sqlite3@7.6.13': dependencies: @@ -2133,23 +2138,29 @@ snapshots: json-schema-traverse: 1.0.0 require-from-string: 2.0.2 - ansi-escapes@6.2.1: {} + ansi-escapes@6.2.1: + optional: true - 
ansi-regex@5.0.1: {} + ansi-regex@5.0.1: + optional: true - ansi-regex@6.2.2: {} + ansi-regex@6.2.2: + optional: true ansi-styles@4.3.0: dependencies: color-convert: 2.0.1 + optional: true - ansi-styles@6.2.3: {} + ansi-styles@6.2.3: + optional: true assertion-error@2.0.1: {} async-retry@1.3.3: dependencies: retry: 0.13.1 + optional: true base64-js@1.5.1: {} @@ -2213,31 +2224,39 @@ snapshots: loupe: 3.2.1 pathval: 2.0.1 - chalk@5.6.2: {} + chalk@5.6.2: + optional: true check-error@2.1.3: {} - chmodrp@1.0.2: {} + chmodrp@1.0.2: + optional: true chownr@1.1.4: {} - chownr@3.0.0: {} + chownr@3.0.0: + optional: true - ci-info@4.4.0: {} + ci-info@4.4.0: + optional: true cli-cursor@5.0.0: dependencies: restore-cursor: 5.1.0 + optional: true - cli-spinners@2.9.2: {} + cli-spinners@2.9.2: + optional: true - cli-spinners@3.4.0: {} + cli-spinners@3.4.0: + optional: true cliui@8.0.1: dependencies: string-width: 4.2.3 strip-ansi: 6.0.1 wrap-ansi: 7.0.0 + optional: true cmake-js@8.0.0: dependencies: @@ -2252,14 +2271,18 @@ snapshots: yargs: 17.7.2 transitivePeerDependencies: - supports-color + optional: true color-convert@2.0.1: dependencies: color-name: 1.1.4 + optional: true - color-name@1.1.4: {} + color-name@1.1.4: + optional: true - commander@10.0.1: {} + commander@10.0.1: + optional: true content-disposition@1.0.1: {} @@ -2304,9 +2327,11 @@ snapshots: ee-first@1.1.1: {} - emoji-regex@10.6.0: {} + emoji-regex@10.6.0: + optional: true - emoji-regex@8.0.0: {} + emoji-regex@8.0.0: + optional: true encodeurl@2.0.0: {} @@ -2314,7 +2339,8 @@ snapshots: dependencies: once: 1.4.0 - env-var@7.5.0: {} + env-var@7.5.0: + optional: true es-define-property@1.0.1: {} @@ -2355,7 +2381,8 @@ snapshots: '@esbuild/win32-ia32': 0.27.7 '@esbuild/win32-x64': 0.27.7 - escalade@3.2.0: {} + escalade@3.2.0: + optional: true escape-html@1.0.3: {} @@ -2365,7 +2392,8 @@ snapshots: etag@1.8.1: {} - eventemitter3@5.0.4: {} + eventemitter3@5.0.4: + optional: true eventsource-parser@3.0.6: {} @@ -2437,11 
+2465,13 @@ snapshots: file-uri-to-path@1.0.0: {} - filename-reserved-regex@3.0.0: {} + filename-reserved-regex@3.0.0: + optional: true filenamify@6.0.0: dependencies: filename-reserved-regex: 3.0.0 + optional: true fill-range@7.1.1: dependencies: @@ -2469,15 +2499,18 @@ snapshots: graceful-fs: 4.2.11 jsonfile: 6.2.0 universalify: 2.0.1 + optional: true fsevents@2.3.3: optional: true function-bind@1.1.2: {} - get-caller-file@2.0.5: {} + get-caller-file@2.0.5: + optional: true - get-east-asian-width@1.5.0: {} + get-east-asian-width@1.5.0: + optional: true get-intrinsic@1.3.0: dependencies: @@ -2509,7 +2542,8 @@ snapshots: gopd@1.2.0: {} - graceful-fs@4.2.11: {} + graceful-fs@4.2.11: + optional: true has-symbols@1.1.0: {} @@ -2533,7 +2567,8 @@ snapshots: ieee754@1.2.1: {} - ignore@7.0.5: {} + ignore@7.0.5: + optional: true inherits@2.0.4: {} @@ -2566,30 +2601,36 @@ snapshots: strip-ansi: 7.2.0 optionalDependencies: '@reflink/reflink': 0.1.19 + optional: true is-extglob@2.1.1: {} - is-fullwidth-code-point@3.0.0: {} + is-fullwidth-code-point@3.0.0: + optional: true is-fullwidth-code-point@5.1.0: dependencies: get-east-asian-width: 1.5.0 + optional: true is-glob@4.0.3: dependencies: is-extglob: 2.1.1 - is-interactive@2.0.0: {} + is-interactive@2.0.0: + optional: true is-number@7.0.0: {} is-promise@4.0.0: {} - is-unicode-supported@2.1.0: {} + is-unicode-supported@2.1.0: + optional: true isexe@2.0.0: {} - isexe@4.0.0: {} + isexe@4.0.0: + optional: true jose@6.2.2: {} @@ -2604,23 +2645,29 @@ snapshots: universalify: 2.0.1 optionalDependencies: graceful-fs: 4.2.11 + optional: true - lifecycle-utils@2.1.0: {} + lifecycle-utils@2.1.0: + optional: true - lifecycle-utils@3.1.1: {} + lifecycle-utils@3.1.1: + optional: true - lodash.debounce@4.0.8: {} + lodash.debounce@4.0.8: + optional: true log-symbols@7.0.1: dependencies: is-unicode-supported: 2.1.0 yoctocolors: 2.1.2 + optional: true loupe@3.2.1: {} lowdb@7.0.1: dependencies: steno: 4.0.2 + optional: true 
magic-string@0.30.21: dependencies: @@ -2645,17 +2692,20 @@ snapshots: dependencies: mime-db: 1.54.0 - mimic-function@5.0.1: {} + mimic-function@5.0.1: + optional: true mimic-response@3.1.0: {} minimist@1.2.8: {} - minipass@7.1.3: {} + minipass@7.1.3: + optional: true minizlib@3.1.0: dependencies: minipass: 7.1.3 + optional: true mkdirp-classic@0.5.3: {} @@ -2663,7 +2713,8 @@ snapshots: nanoid@3.3.11: {} - nanoid@5.1.7: {} + nanoid@5.1.7: + optional: true napi-build-utils@2.0.0: {} @@ -2673,9 +2724,11 @@ snapshots: dependencies: semver: 7.7.4 - node-addon-api@8.7.0: {} + node-addon-api@8.7.0: + optional: true - node-api-headers@1.8.0: {} + node-api-headers@1.8.0: + optional: true node-gyp-build@4.8.4: optional: true @@ -2727,6 +2780,7 @@ snapshots: typescript: 5.9.3 transitivePeerDependencies: - supports-color + optional: true object-assign@4.1.1: {} @@ -2743,6 +2797,7 @@ snapshots: onetime@7.0.0: dependencies: mimic-function: 5.0.1 + optional: true ora@9.3.0: dependencies: @@ -2754,10 +2809,13 @@ snapshots: log-symbols: 7.0.1 stdin-discarder: 0.3.1 string-width: 8.2.0 + optional: true - parse-ms@3.0.0: {} + parse-ms@3.0.0: + optional: true - parse-ms@4.0.0: {} + parse-ms@4.0.0: + optional: true parseurl@1.3.3: {} @@ -2798,21 +2856,25 @@ snapshots: tar-fs: 2.1.4 tunnel-agent: 0.6.0 - pretty-bytes@6.1.1: {} + pretty-bytes@6.1.1: + optional: true pretty-ms@8.0.0: dependencies: parse-ms: 3.0.0 + optional: true pretty-ms@9.3.0: dependencies: parse-ms: 4.0.0 + optional: true proper-lockfile@4.1.2: dependencies: graceful-fs: 4.2.11 retry: 0.12.0 signal-exit: 3.0.7 + optional: true proxy-addr@2.0.7: dependencies: @@ -2852,7 +2914,8 @@ snapshots: string_decoder: 1.3.0 util-deprecate: 1.0.2 - require-directory@2.1.1: {} + require-directory@2.1.1: + optional: true require-from-string@2.0.2: {} @@ -2862,10 +2925,13 @@ snapshots: dependencies: onetime: 7.0.0 signal-exit: 4.1.0 + optional: true - retry@0.12.0: {} + retry@0.12.0: + optional: true - retry@0.13.1: {} + 
retry@0.13.1: + optional: true reusify@1.1.0: {} @@ -2983,9 +3049,11 @@ snapshots: siginfo@2.0.0: {} - signal-exit@3.0.7: {} + signal-exit@3.0.7: + optional: true - signal-exit@4.1.0: {} + signal-exit@4.1.0: + optional: true simple-concat@1.0.1: {} @@ -3002,18 +3070,22 @@ snapshots: debug: 4.4.3 transitivePeerDependencies: - supports-color + optional: true - sleep-promise@9.1.0: {} + sleep-promise@9.1.0: + optional: true slice-ansi@7.1.2: dependencies: ansi-styles: 6.2.3 is-fullwidth-code-point: 5.1.0 + optional: true slice-ansi@8.0.0: dependencies: ansi-styles: 6.2.3 is-fullwidth-code-point: 5.1.0 + optional: true source-map-js@1.2.1: {} @@ -3046,7 +3118,8 @@ snapshots: std-env@3.10.0: {} - stdin-discarder@0.3.1: {} + stdin-discarder@0.3.1: + optional: true stdout-update@4.0.1: dependencies: @@ -3054,25 +3127,30 @@ snapshots: ansi-styles: 6.2.3 string-width: 7.2.0 strip-ansi: 7.2.0 + optional: true - steno@4.0.2: {} + steno@4.0.2: + optional: true string-width@4.2.3: dependencies: emoji-regex: 8.0.0 is-fullwidth-code-point: 3.0.0 strip-ansi: 6.0.1 + optional: true string-width@7.2.0: dependencies: emoji-regex: 10.6.0 get-east-asian-width: 1.5.0 strip-ansi: 7.2.0 + optional: true string-width@8.2.0: dependencies: get-east-asian-width: 1.5.0 strip-ansi: 7.2.0 + optional: true string_decoder@1.3.0: dependencies: @@ -3081,10 +3159,12 @@ snapshots: strip-ansi@6.0.1: dependencies: ansi-regex: 5.0.1 + optional: true strip-ansi@7.2.0: dependencies: ansi-regex: 6.2.2 + optional: true strip-json-comments@2.0.1: {} @@ -3114,6 +3194,7 @@ snapshots: minipass: 7.1.3 minizlib: 3.1.0 yallist: 5.0.0 + optional: true tinybench@2.9.0: {} @@ -3188,15 +3269,18 @@ snapshots: undici-types@7.18.2: {} - universalify@2.0.1: {} + universalify@2.0.1: + optional: true unpipe@1.0.0: {} - url-join@4.0.1: {} + url-join@4.0.1: + optional: true util-deprecate@1.0.2: {} - validate-npm-package-name@7.0.2: {} + validate-npm-package-name@7.0.2: + optional: true vary@1.1.2: {} @@ -3285,6 +3369,7 @@ 
snapshots: which@6.0.1: dependencies: isexe: 4.0.0 + optional: true why-is-node-running@2.3.0: dependencies: @@ -3296,16 +3381,20 @@ snapshots: ansi-styles: 4.3.0 string-width: 4.2.3 strip-ansi: 6.0.1 + optional: true wrappy@1.0.2: {} - y18n@5.0.8: {} + y18n@5.0.8: + optional: true - yallist@5.0.0: {} + yallist@5.0.0: + optional: true yaml@2.8.3: {} - yargs-parser@21.1.1: {} + yargs-parser@21.1.1: + optional: true yargs@17.7.2: dependencies: @@ -3316,8 +3405,10 @@ snapshots: string-width: 4.2.3 y18n: 5.0.8 yargs-parser: 21.1.1 + optional: true - yoctocolors@2.1.2: {} + yoctocolors@2.1.2: + optional: true zod-to-json-schema@3.25.2(zod@4.2.1): dependencies: diff --git a/src/cli/qmd.ts b/src/cli/qmd.ts index a09ffb33..cd5004d1 100755 --- a/src/cli/qmd.ts +++ b/src/cli/qmd.ts @@ -461,33 +461,41 @@ async function showStatus(): Promise { } // Device / GPU info - try { - const llm = getDefaultLlamaCpp(); - const device = await llm.getDeviceInfo(); - console.log(`\n${c.bold}Device${c.reset}`); - if (device.gpu) { - console.log(` GPU: ${c.green}${device.gpu}${c.reset} (offloading: ${device.gpuOffloading ? 'yes' : 'no'})`); - if (device.gpuDevices.length > 0) { - // Deduplicate and count GPUs - const counts = new Map(); - for (const name of device.gpuDevices) { - counts.set(name, (counts.get(name) || 0) + 1); + const shouldProbeLlmDevice = + process.platform !== "freebsd" || process.env.QMD_STATUS_LLM_PROBE === "1"; + + if (shouldProbeLlmDevice) { + try { + const llm = getDefaultLlamaCpp(); + const device = await llm.getDeviceInfo(); + console.log(`\n${c.bold}Device${c.reset}`); + if (device.gpu) { + console.log(` GPU: ${c.green}${device.gpu}${c.reset} (offloading: ${device.gpuOffloading ? 
'yes' : 'no'})`); + if (device.gpuDevices.length > 0) { + // Deduplicate and count GPUs + const counts = new Map(); + for (const name of device.gpuDevices) { + counts.set(name, (counts.get(name) || 0) + 1); + } + const deviceStr = Array.from(counts.entries()) + .map(([name, count]) => count > 1 ? `${count}× ${name}` : name) + .join(', '); + console.log(` Devices: ${deviceStr}`); } - const deviceStr = Array.from(counts.entries()) - .map(([name, count]) => count > 1 ? `${count}× ${name}` : name) - .join(', '); - console.log(` Devices: ${deviceStr}`); - } - if (device.vram) { - console.log(` VRAM: ${formatBytes(device.vram.free)} free / ${formatBytes(device.vram.total)} total`); + if (device.vram) { + console.log(` VRAM: ${formatBytes(device.vram.free)} free / ${formatBytes(device.vram.total)} total`); + } + } else { + console.log(` GPU: ${c.yellow}none${c.reset} (running on CPU — models will be slow)`); + console.log(` ${c.dim}Tip: Install CUDA, Vulkan, or Metal support for GPU acceleration.${c.reset}`); } - } else { - console.log(` GPU: ${c.yellow}none${c.reset} (running on CPU — models will be slow)`); - console.log(` ${c.dim}Tip: Install CUDA, Vulkan, or Metal support for GPU acceleration.${c.reset}`); + console.log(` CPU: ${device.cpuCores} math cores`); + } catch { + // Don't fail status if LLM init fails } - console.log(` CPU: ${device.cpuCores} math cores`); - } catch { - // Don't fail status if LLM init fails + } else { + console.log(`\n${c.bold}Device${c.reset}`); + console.log(` ${c.dim}LLM device probe skipped on FreeBSD. Set QMD_STATUS_LLM_PROBE=1 to probe node-llama-cpp.${c.reset}`); } // Tips section @@ -3091,21 +3099,26 @@ if (isMain) { break; case "pull": { - const refresh = cli.values.refresh === undefined ? 
false : Boolean(cli.values.refresh); - const models = [ - DEFAULT_EMBED_MODEL_URI, - DEFAULT_GENERATE_MODEL_URI, - DEFAULT_RERANK_MODEL_URI, - ]; - console.log(`${c.bold}Pulling models${c.reset}`); - const results = await pullModels(models, { - refresh, - cacheDir: DEFAULT_MODEL_CACHE_DIR, - }); - for (const result of results) { - const size = formatBytes(result.sizeBytes); - const note = result.refreshed ? "refreshed" : "cached/checked"; - console.log(`- ${result.model} -> ${result.path} (${size}, ${note})`); + try { + const refresh = cli.values.refresh === undefined ? false : Boolean(cli.values.refresh); + const models = [ + DEFAULT_EMBED_MODEL_URI, + DEFAULT_GENERATE_MODEL_URI, + DEFAULT_RERANK_MODEL_URI, + ]; + console.log(`${c.bold}Pulling models${c.reset}`); + const results = await pullModels(models, { + refresh, + cacheDir: DEFAULT_MODEL_CACHE_DIR, + }); + for (const result of results) { + const size = formatBytes(result.sizeBytes); + const note = result.refreshed ? "refreshed" : "cached/checked"; + console.log(`- ${result.model} -> ${result.path} (${size}, ${note})`); + } + } catch (error) { + console.error(error instanceof Error ? error.message : String(error)); + process.exit(1); } break; } diff --git a/src/db.ts b/src/db.ts index 5fe7ab47..da67dfd7 100644 --- a/src/db.ts +++ b/src/db.ts @@ -11,10 +11,24 @@ * SQLite build before creating any database instances. */ -export const isBun = typeof globalThis.Bun !== "undefined"; +import { + createSqliteVecUnavailableError, + resolveSqliteVecLoadablePath, +} from "./platform/sqlite-vec.js"; + +const bunGlobal = globalThis as typeof globalThis & { Bun?: unknown }; +export const isBun = typeof bunGlobal.Bun !== "undefined"; let _Database: any; let _sqliteVecLoad: ((db: any) => void) | null; +let _sqliteVecUnavailableReason: string | null = null; + +function getErrorMessage(err: unknown): string { + return err instanceof Error ? 
err.message : String(err); +} + +const sqliteVec = resolveSqliteVecLoadablePath(); +const sqliteVecPath = sqliteVec.path; if (isBun) { // Dynamic string prevents tsc from resolving bun:sqlite on Node.js builds @@ -38,21 +52,31 @@ if (isBun) { _Database = BunDatabase; // setCustomSQLite may have silently failed — test that extensions actually work. - try { - const { getLoadablePath } = await import("sqlite-vec"); - const vecPath = getLoadablePath(); - const testDb = new BunDatabase(":memory:"); - testDb.loadExtension(vecPath); - testDb.close(); - _sqliteVecLoad = (db: any) => db.loadExtension(vecPath); - } catch { - // Vector search won't work, but BM25 and other operations are unaffected. + if (sqliteVecPath) { + try { + const testDb = new BunDatabase(":memory:"); + testDb.loadExtension(sqliteVecPath); + testDb.close(); + _sqliteVecLoad = (db: any) => db.loadExtension(sqliteVecPath); + _sqliteVecUnavailableReason = null; + } catch (err) { + // Vector search won't work, but BM25 and other operations are unaffected. + _sqliteVecLoad = null; + _sqliteVecUnavailableReason = `sqlite-vec probe failed (${getErrorMessage(err)})`; + } + } else { _sqliteVecLoad = null; + _sqliteVecUnavailableReason = "No loadable sqlite-vec extension was found"; } } else { _Database = (await import("better-sqlite3")).default; - const sqliteVec = await import("sqlite-vec"); - _sqliteVecLoad = (db: any) => sqliteVec.load(db); + if (sqliteVecPath) { + _sqliteVecLoad = (db: any) => db.loadExtension(sqliteVecPath); + _sqliteVecUnavailableReason = null; + } else { + _sqliteVecLoad = null; + _sqliteVecUnavailableReason = "No loadable sqlite-vec extension was found"; + } } /** @@ -86,11 +110,17 @@ export interface Statement { */ export function loadSqliteVec(db: Database): void { if (!_sqliteVecLoad) { - const hint = isBun && process.platform === "darwin" - ? 
"On macOS with Bun, install Homebrew SQLite: brew install sqlite\n" + - "Or install qmd with npm instead: npm install -g @tobilu/qmd" - : "Ensure the sqlite-vec native module is installed correctly."; - throw new Error(`sqlite-vec extension is unavailable. ${hint}`); + throw createSqliteVecUnavailableError( + _sqliteVecUnavailableReason ?? "No loadable sqlite-vec extension was found", + { isBun }, + ); + } + try { + _sqliteVecLoad(db); + } catch (err) { + throw createSqliteVecUnavailableError( + `sqlite-vec load failed (${getErrorMessage(err)})`, + { isBun }, + ); } - _sqliteVecLoad(db); } diff --git a/src/llm.ts b/src/llm.ts index 485ca7b6..454c19cc 100644 --- a/src/llm.ts +++ b/src/llm.ts @@ -4,19 +4,17 @@ * Provides embeddings, text generation, and reranking using local GGUF models. */ -import { - getLlama, - resolveModelFile, - LlamaChatSession, - LlamaLogLevel, - type Llama, - type LlamaModel, - type LlamaEmbeddingContext, - type Token as LlamaToken, -} from "node-llama-cpp"; import { homedir } from "os"; import { join } from "path"; import { existsSync, mkdirSync, statSync, unlinkSync, readdirSync, readFileSync, writeFileSync } from "fs"; +import { + NodeLlamaCppUnavailableError, + loadNodeLlamaCpp, + type Llama, + type LlamaEmbeddingContext, + type LlamaModel, + type LlamaToken, +} from "./platform/node-llama-cpp.js"; // ============================================================================= // Embedding Formatting Functions @@ -252,6 +250,7 @@ export async function pullModels( models: string[], options: { refresh?: boolean; cacheDir?: string } = {} ): Promise { + const nodeLlamaCpp = await loadNodeLlamaCpp(); const cacheDir = options.cacheDir || MODEL_CACHE_DIR; if (!existsSync(cacheDir)) { mkdirSync(cacheDir, { recursive: true }); @@ -292,7 +291,7 @@ export async function pullModels( } } - const path = await resolveModelFile(model, cacheDir); + const path = await nodeLlamaCpp.resolveModelFile(model, cacheDir); const sizeBytes = existsSync(path) ? 
statSync(path).size : 0; if (hfRef && filename) { const remoteEtag = await getRemoteEtag(hfRef); @@ -552,20 +551,26 @@ export class LlamaCpp implements LLM { */ private async ensureLlama(): Promise { if (!this.llama) { + const nodeLlamaCpp = await loadNodeLlamaCpp(); // Allow override via QMD_LLAMA_GPU: "false" | "off" | "none" forces CPU const gpuOverride = (process.env.QMD_LLAMA_GPU ?? "").toLowerCase(); const forceCpu = ["false", "off", "none", "disable", "disabled", "0"].includes(gpuOverride); const loadLlama = async (gpu: "auto" | false) => - await getLlama({ - build: "autoAttempt", - logLevel: LlamaLogLevel.error, + await nodeLlamaCpp.getLlama({ + build: process.platform === "freebsd" ? "never" : "autoAttempt", + logLevel: nodeLlamaCpp.LlamaLogLevel.error, gpu, + skipDownload: process.platform !== "freebsd", }); let llama: Llama; if (forceCpu) { - llama = await loadLlama(false); + try { + llama = await loadLlama(false); + } catch (error) { + throw new NodeLlamaCppUnavailableError(error); + } } else { try { llama = await loadLlama("auto"); @@ -575,7 +580,11 @@ export class LlamaCpp implements LLM { process.stderr.write( `QMD Warning: GPU init failed (${err instanceof Error ? err.message : String(err)}), falling back to CPU.\n` ); - llama = await loadLlama(false); + try { + llama = await loadLlama(false); + } catch (cpuError) { + throw new NodeLlamaCppUnavailableError(cpuError); + } } } @@ -594,8 +603,9 @@ export class LlamaCpp implements LLM { */ private async resolveModel(modelUri: string): Promise { this.ensureModelCacheDir(); + const nodeLlamaCpp = await loadNodeLlamaCpp(); // resolveModelFile handles HF URIs and downloads to the cache dir - return await resolveModelFile(modelUri, this.modelCacheDir); + return await nodeLlamaCpp.resolveModelFile(modelUri, this.modelCacheDir); } /** @@ -914,6 +924,9 @@ export class LlamaCpp implements LLM { model: options.model ?? 
this.embedModelUri, }; } catch (error) { + if (error instanceof NodeLlamaCppUnavailableError) { + throw error; + } console.error("Embedding error:", error); return null; } @@ -985,6 +998,9 @@ export class LlamaCpp implements LLM { return chunkResults.flat(); } catch (error) { + if (error instanceof NodeLlamaCppUnavailableError) { + throw error; + } console.error("Batch embedding error:", error); return texts.map(() => null); } @@ -995,13 +1011,14 @@ export class LlamaCpp implements LLM { // Ping activity at start to keep models alive during this operation this.touchActivity(); + const nodeLlamaCpp = await loadNodeLlamaCpp(); // Ensure model is loaded await this.ensureGenerateModel(); // Create fresh context -> sequence -> session for each call const context = await this.generateModel!.createContext(); const sequence = context.getSequence(); - const session = new LlamaChatSession({ contextSequence: sequence }); + const session = new nodeLlamaCpp.LlamaChatSession({ contextSequence: sequence }); const maxTokens = options.maxTokens ?? 150; // Qwen3 recommends temp=0.7, topP=0.8, topK=20 for non-thinking mode @@ -1055,6 +1072,7 @@ export class LlamaCpp implements LLM { // Ping activity at start to keep models alive during this operation this.touchActivity(); + const nodeLlamaCpp = await loadNodeLlamaCpp(); const llama = await this.ensureLlama(); await this.ensureGenerateModel(); @@ -1080,7 +1098,7 @@ export class LlamaCpp implements LLM { contextSize: this.expandContextSize, }); const sequence = genContext.getSequence(); - const session = new LlamaChatSession({ contextSequence: sequence }); + const session = new nodeLlamaCpp.LlamaChatSession({ contextSequence: sequence }); try { // Qwen3 recommended settings for non-thinking mode: @@ -1129,6 +1147,9 @@ export class LlamaCpp implements LLM { ]; return includeLexical ? 
fallback : fallback.filter(q => q.type !== 'lex'); } catch (error) { + if (error instanceof NodeLlamaCppUnavailableError) { + throw error; + } console.error("Structured query expansion failed:", error); // Fallback to original query const fallback: Queryable[] = [{ type: 'vec', text: query }]; diff --git a/src/platform/node-llama-cpp.ts b/src/platform/node-llama-cpp.ts new file mode 100644 index 00000000..8642f35f --- /dev/null +++ b/src/platform/node-llama-cpp.ts @@ -0,0 +1,174 @@ +const NODE_LLAMA_CPP_MODULE_ID = "node-llama-cpp"; + +type LlamaContextSequence = unknown; +type LlamaGrammar = unknown; +type LlamaLogLevelValue = unknown; + +export type LlamaToken = unknown; + +export type LlamaVramState = { + total: number; + used: number; + free: number; +}; + +export interface LlamaEmbeddingResult { + vector: Iterable | ArrayLike; +} + +export interface LlamaEmbeddingContext { + getEmbeddingFor(text: string): Promise; + dispose(): Promise; +} + +export interface LlamaRankingContext { + rankAll(query: string, docs: string[]): Promise; + dispose(): Promise; +} + +export interface LlamaContext { + getSequence(): LlamaContextSequence; + dispose(): Promise; +} + +export interface LlamaModel { + trainContextSize: number; + tokenize(text: string): readonly LlamaToken[]; + detokenize(tokens: readonly LlamaToken[]): string; + createEmbeddingContext(options: { + contextSize: number; + threads?: number; + }): Promise; + createContext(options?: { + contextSize?: number; + }): Promise; + createRankingContext(options: { + contextSize: number; + flashAttention?: boolean; + threads?: number; + }): Promise; + dispose(): Promise; +} + +export interface Llama { + gpu: string | false; + supportsGpuOffloading: boolean; + cpuMathCores: number; + loadModel(options: { modelPath: string }): Promise; + getGpuDeviceNames(): Promise; + getVramState(): Promise; + createGrammar(options: { grammar: string }): Promise; + dispose(): Promise; +} + +export interface LlamaChatSession { + 
prompt(prompt: string, options?: Record): Promise; +} + +export interface NodeLlamaCppModule { + getLlama(options: { + build: "autoAttempt" | "never"; + logLevel: LlamaLogLevelValue; + gpu: "auto" | false; + skipDownload?: boolean; + }): Promise; + resolveModelFile(modelUri: string, cacheDir: string): Promise; + LlamaChatSession: new (options: { contextSequence: LlamaContextSequence }) => LlamaChatSession; + LlamaLogLevel: { + error: LlamaLogLevelValue; + }; +} + +let nodeLlamaCppModulePromise: Promise | null = null; +let nodeLlamaCppLoader: (() => Promise) | null = null; + +function isNodeLlamaCppModule(value: unknown): value is NodeLlamaCppModule { + if (!value || typeof value !== "object") return false; + const candidate = value as Partial; + return ( + typeof candidate.getLlama === "function" && + typeof candidate.resolveModelFile === "function" && + typeof candidate.LlamaChatSession === "function" && + !!candidate.LlamaLogLevel && + "error" in candidate.LlamaLogLevel + ); +} + +function normalizeNodeLlamaCppModule(moduleValue: unknown): NodeLlamaCppModule { + if (isNodeLlamaCppModule(moduleValue)) { + return moduleValue; + } + + const defaultValue = + moduleValue && + typeof moduleValue === "object" && + "default" in moduleValue + ? (moduleValue as { default?: unknown }).default + : undefined; + + if (isNodeLlamaCppModule(defaultValue)) { + return defaultValue; + } + + throw new Error("Loaded node-llama-cpp module has an unexpected shape."); +} + +function defaultNodeLlamaCppLoader(): Promise { + const moduleId = NODE_LLAMA_CPP_MODULE_ID; + return import(moduleId).then((moduleValue) => normalizeNodeLlamaCppModule(moduleValue)); +} + +export function formatNodeLlamaCppUnavailableMessage(cause?: unknown): string { + const detail = + cause instanceof Error + ? cause.message + : cause !== undefined && cause !== null + ? 
String(cause) + : null; + + const parts = [ + "node-llama-cpp is unavailable.", + "QMD can still use BM25 and sqlite-vec features, but embeddings, query expansion, reranking, and model downloads require a working node-llama-cpp install.", + ]; + + if (process.platform === "freebsd") { + parts.push( + "On FreeBSD this usually means the optional dependency failed to build. Install the required C/C++ toolchain and node-llama-cpp build prerequisites, then reinstall qmd." + ); + } + + if (detail) { + parts.push(`Original error: ${detail}`); + } + + return parts.join(" "); +} + +export class NodeLlamaCppUnavailableError extends Error { + override readonly cause: unknown; + + constructor(cause?: unknown) { + super(formatNodeLlamaCppUnavailableMessage(cause)); + this.name = "NodeLlamaCppUnavailableError"; + this.cause = cause; + } +} + +export async function loadNodeLlamaCpp(): Promise { + if (!nodeLlamaCppModulePromise) { + const loader = nodeLlamaCppLoader ?? defaultNodeLlamaCppLoader; + nodeLlamaCppModulePromise = loader().catch((error) => { + nodeLlamaCppModulePromise = null; + throw new NodeLlamaCppUnavailableError(error); + }); + } + + return await nodeLlamaCppModulePromise; +} + +export function _setNodeLlamaCppLoaderForTesting( + loader: (() => Promise) | null +): void { + nodeLlamaCppLoader = loader; + nodeLlamaCppModulePromise = null; +} diff --git a/src/platform/sqlite-vec.ts b/src/platform/sqlite-vec.ts new file mode 100644 index 00000000..832bca40 --- /dev/null +++ b/src/platform/sqlite-vec.ts @@ -0,0 +1,150 @@ +import { existsSync } from "node:fs"; +import { createRequire } from "node:module"; +import { join } from "node:path"; + +const SQLITE_VEC_ENV_VAR = "QMD_SQLITE_VEC_PATH"; +const SQLITE_VEC_ENTRYPOINT = "vec0"; +const requireForResolve = createRequire(import.meta.url); +const bunGlobal = globalThis as typeof globalThis & { Bun?: unknown }; + +export type SqliteVecLoadSource = "env" | "system" | "npm"; + +export type SqliteVecLoadableResolution = { + 
path: string | null; + source: SqliteVecLoadSource | null; + packageName?: string; +}; + +export type SqliteVecResolveOptions = { + platform?: string; + arch?: string; + env?: NodeJS.ProcessEnv; + fileExists?: (path: string) => boolean; + packageResolve?: (specifier: string) => string; +}; + +function uniqueStrings(values: Array<string | undefined>): string[] { + const deduped = new Set<string>(); + for (const value of values) { + const normalized = value?.trim(); + if (normalized) deduped.add(normalized); + } + return [...deduped]; +} + +export function isSqliteVecNpmPlatformSupported( + platform: string = process.platform, + arch: string = process.arch, +): boolean { + return ( + (platform === "darwin" && (arch === "arm64" || arch === "x64")) || + (platform === "linux" && (arch === "arm64" || arch === "x64")) || + (platform === "win32" && arch === "x64") + ); +} + +export function getSqliteVecNpmPackageName( + platform: string = process.platform, + arch: string = process.arch, +): string | null { + if (!isSqliteVecNpmPlatformSupported(platform, arch)) { + return null; + } + const os = platform === "win32" ? "windows" : platform; + return `sqlite-vec-${os}-${arch}`; +} + +export function getSqliteVecEntrypointFilename(platform: string = process.platform): string { + if (platform === "win32") return `${SQLITE_VEC_ENTRYPOINT}.dll`; + if (platform === "darwin") return `${SQLITE_VEC_ENTRYPOINT}.dylib`; + return `${SQLITE_VEC_ENTRYPOINT}.so`; +} + +export function getFreebsdSqliteVecProbePaths( + options: Pick<SqliteVecResolveOptions, "env"> = {}, +): string[] { + const env = options.env ?? 
process.env; + const prefixes = uniqueStrings([ + env.LOCALBASE, + env.PREFIX, + "/usr/local", + ]); + + const paths: string[] = []; + for (const prefix of prefixes) { + paths.push(join(prefix, "lib", "sqlite3", "vec0.so")); + paths.push(join(prefix, "lib", "sqlite-vec", "vec0.so")); + paths.push(join(prefix, "lib", "vec0.so")); + paths.push(join(prefix, "libexec", "sqlite3", "vec0.so")); + paths.push(join(prefix, "libexec", "sqlite-vec", "vec0.so")); + } + + return uniqueStrings(paths); +} + +export function resolveSqliteVecLoadablePath( + options: SqliteVecResolveOptions = {}, +): SqliteVecLoadableResolution { + const platform = options.platform ?? process.platform; + const arch = options.arch ?? process.arch; + const env = options.env ?? process.env; + const fileExists = options.fileExists ?? existsSync; + const packageResolve = options.packageResolve ?? requireForResolve.resolve.bind(requireForResolve); + + const overridePath = env[SQLITE_VEC_ENV_VAR]?.trim(); + if (overridePath) { + return { path: overridePath, source: "env" }; + } + + if (platform === "freebsd") { + for (const candidate of getFreebsdSqliteVecProbePaths({ env })) { + if (fileExists(candidate)) { + return { path: candidate, source: "system" }; + } + } + } + + const packageName = getSqliteVecNpmPackageName(platform, arch); + if (packageName) { + try { + const path = packageResolve(`${packageName}/${getSqliteVecEntrypointFilename(platform)}`); + return { path, source: "npm", packageName }; + } catch {} + } + + return { path: null, source: null }; +} + +export function getSqliteVecUnavailableHint( + options: { platform?: string; isBun?: boolean } = {}, +): string { + const platform = options.platform ?? process.platform; + const isBun = options.isBun ?? 
(typeof bunGlobal.Bun !== "undefined"); + + if (isBun && platform === "darwin") { + return "On macOS with Bun, install Homebrew SQLite: brew install sqlite\n" + + "Or install qmd with npm instead: npm install -g @tobilu/qmd"; + } + + if (platform === "freebsd") { + return "On FreeBSD, install SQLite with extension loading support and make vec0.so available. " + + "Set QMD_SQLITE_VEC_PATH=/path/to/vec0.so if auto-discovery does not find it."; + } + + if (isSqliteVecNpmPlatformSupported(platform)) { + return "Ensure the sqlite-vec native module is installed correctly."; + } + + return "Install a SQLite build with extension loading support and set " + + "QMD_SQLITE_VEC_PATH=/path/to/vec0.so if needed."; +} + +export function createSqliteVecUnavailableError( + reason: string, + options: { platform?: string; isBun?: boolean } = {}, +): Error { + const cleanedReason = reason.trim().replace(/[.\s]+$/g, ""); + return new Error( + `sqlite-vec extension is unavailable. ${cleanedReason}. ${getSqliteVecUnavailableHint(options)}` + ); +} diff --git a/src/store.ts b/src/store.ts index ab4cbf4e..9fd93524 100644 --- a/src/store.ts +++ b/src/store.ts @@ -11,13 +11,14 @@ * const store = createStore(); */ -import { openDatabase, loadSqliteVec } from "./db.js"; +import { isBun, openDatabase, loadSqliteVec } from "./db.js"; import type { Database } from "./db.js"; import picomatch from "picomatch"; import { createHash } from "crypto"; import { readFileSync, realpathSync, statSync, mkdirSync } from "node:fs"; // Note: node:path resolve is not imported — we export our own cross-platform resolve() import fastGlob from "fast-glob"; +import { createSqliteVecUnavailableError as createPlatformSqliteVecUnavailableError } from "./platform/sqlite-vec.js"; import { LlamaCpp, getDefaultLlamaCpp, @@ -700,12 +701,7 @@ export function toVirtualPath(db: Database, absolutePath: string): string | null function createSqliteVecUnavailableError(reason: string): Error { - return new Error( - "sqlite-vec 
extension is unavailable. " + - `${reason}. ` + - "Install Homebrew SQLite so the sqlite-vec extension can be loaded, " + - "and set BREW_PREFIX if Homebrew is installed in a non-standard location." - ); + return createPlatformSqliteVecUnavailableError(reason, { isBun }); } function getErrorMessage(err: unknown): string { diff --git a/test/freebsd-smoke.sh b/test/freebsd-smoke.sh new file mode 100755 index 00000000..efe5d7f6 --- /dev/null +++ b/test/freebsd-smoke.sh @@ -0,0 +1,356 @@ +#!/bin/sh +set -eu + +cd "$(dirname "$0")/.." + +usage() { + cat <<'EOF' +Usage: test/freebsd-smoke.sh [--quick|--full] [--keep-temp] + + --quick BM25 + sqlite-vec validation only + --full Search + embed + lifecycle/maintenance validation (requires working node-llama-cpp; default) + --keep-temp Keep the temporary corpus and state directory +EOF +} + +MODE="full" +KEEP_TEMP=0 + +while [ $# -gt 0 ]; do + case "$1" in + --quick) + MODE="quick" + ;; + --full) + MODE="full" + ;; + --keep-temp) + KEEP_TEMP=1 + ;; + -h|--help) + usage + exit 0 + ;; + *) + usage >&2 + exit 1 + ;; + esac + shift +done + +if [ "$(uname -s)" != "FreeBSD" ]; then + echo "Error: test/freebsd-smoke.sh is intended for FreeBSD hosts." >&2 + exit 1 +fi + +require_cmd() { + if ! command -v "$1" >/dev/null 2>&1; then + echo "Error: required command '$1' was not found in PATH." >&2 + exit 1 + fi +} + +for cmd in node corepack git cmake ninja python3 gmake bash sqlite3 envsubst; do + require_cmd "$cmd" +done + +if [ ! -d node_modules ]; then + echo "Error: node_modules is missing. Run 'corepack pnpm install --frozen-lockfile' first." >&2 + exit 1 +fi + +resolve_sqlite_vec_path() { + if [ -n "${QMD_SQLITE_VEC_PATH:-}" ]; then + if [ ! 
-f "${QMD_SQLITE_VEC_PATH}" ]; then + echo "Error: QMD_SQLITE_VEC_PATH points to a missing file: ${QMD_SQLITE_VEC_PATH}" >&2 + exit 1 + fi + printf '%s\n' "${QMD_SQLITE_VEC_PATH}" + return 0 + fi + + if [ -f "../sqlite-vec/dist/vec0.so" ]; then + printf '%s\n' "$(cd ../sqlite-vec && pwd)/dist/vec0.so" + return 0 + fi + + if [ -d "../sqlite-vec" ]; then + echo "==> Building ../sqlite-vec/dist/vec0.so" >&2 + ( + cd ../sqlite-vec + gmake sqlite-vec.h loadable + ) >&2 + if [ -f "../sqlite-vec/dist/vec0.so" ]; then + printf '%s\n' "$(cd ../sqlite-vec && pwd)/dist/vec0.so" + return 0 + fi + fi + + cat >&2 <<'EOF' +Error: could not locate vec0.so. + +Set QMD_SQLITE_VEC_PATH=/absolute/path/to/vec0.so, or clone sqlite-vec next to qmd: + + git clone https://github.com/asg017/sqlite-vec ../sqlite-vec + cd ../sqlite-vec + gmake sqlite-vec.h loadable +EOF + exit 1 +} + +SQLITE_VEC_PATH="$(resolve_sqlite_vec_path)" + +TMP_ROOT="$(mktemp -d -t qmd-freebsd-smoke)" +STATE_DIR="${TMP_ROOT}/state" +CORPUS_DIR="${TMP_ROOT}/notes" +mkdir -p "${STATE_DIR}/config" "${CORPUS_DIR}" + +cleanup() { + if [ "${KEEP_TEMP}" -eq 0 ]; then + rm -rf "${TMP_ROOT}" + fi +} + +trap cleanup EXIT INT TERM + +cat > "${CORPUS_DIR}/auth.md" <<'EOF' +# Authentication Flow + +Use bearer tokens for API authentication. +Rotate secrets every 90 days. +EOF + +cat > "${CORPUS_DIR}/db.md" <<'EOF' +# Database Notes + +Use connection pooling to avoid timeout spikes. +Prepared statements prevent SQL injection. +EOF + +mkdir -p "${CORPUS_DIR}/security" +cat > "${CORPUS_DIR}/security/tokens.md" <<'EOF' +# Token Handling + +Store API tokens in a dedicated secrets manager. +Never commit production credentials. +EOF + +ARCHIVE_DIR="${TMP_ROOT}/archive" +mkdir -p "${ARCHIVE_DIR}" +cat > "${ARCHIVE_DIR}/legacy.md" <<'EOF' +# Legacy Notes + +This collection is only used to validate collection removal. 
+EOF + +INDEX_DB="${STATE_DIR}/smoke.sqlite" +CONFIG_DIR="${STATE_DIR}/config" + +run_qmd() { + env \ + INDEX_PATH="${INDEX_DB}" \ + QMD_CONFIG_DIR="${CONFIG_DIR}" \ + QMD_SQLITE_VEC_PATH="${SQLITE_VEC_PATH}" \ + QMD_LLAMA_GPU=false \ + corepack pnpm qmd "$@" +} + +run_qmd_capture() { + set +e + RUN_OUTPUT="$( + env \ + INDEX_PATH="${INDEX_DB}" \ + QMD_CONFIG_DIR="${CONFIG_DIR}" \ + QMD_SQLITE_VEC_PATH="${SQLITE_VEC_PATH}" \ + QMD_LLAMA_GPU=false \ + corepack pnpm qmd "$@" 2>&1 + )" + RUN_STATUS=$? + set -e +} + +run_qmd_expect_success() { + run_qmd_capture "$@" + printf '%s\n' "$RUN_OUTPUT" + if [ "${RUN_STATUS}" -ne 0 ]; then + echo "Command failed: corepack pnpm qmd $*" >&2 + exit "${RUN_STATUS}" + fi +} + +assert_contains() { + haystack="$1" + needle="$2" + if ! printf '%s\n' "$haystack" | grep -F "$needle" >/dev/null 2>&1; then + echo "Expected output to contain: $needle" >&2 + exit 1 + fi +} + +assert_not_contains() { + haystack="$1" + needle="$2" + if printf '%s\n' "$haystack" | grep -F "$needle" >/dev/null 2>&1; then + echo "Expected output not to contain: $needle" >&2 + exit 1 + fi +} + +assert_status() { + actual="$1" + expected="$2" + if [ "$actual" -ne "$expected" ]; then + echo "Expected exit status $expected, got $actual" >&2 + exit 1 + fi +} + +echo "==> Using vec0.so: ${SQLITE_VEC_PATH}" +echo "==> Temporary root: ${TMP_ROOT}" + +run_qmd_expect_success collection add "${CORPUS_DIR}" --name notes +assert_contains "$RUN_OUTPUT" "Collection 'notes' created successfully" + +run_qmd_expect_success status +assert_contains "$RUN_OUTPUT" "Collections" + +run_qmd_expect_success search authentication +assert_contains "$RUN_OUTPUT" "qmd://notes/auth.md" + +run_qmd_expect_success ls notes +assert_contains "$RUN_OUTPUT" "qmd://notes/auth.md" +assert_contains "$RUN_OUTPUT" "qmd://notes/security/tokens.md" + +run_qmd_expect_success get qmd://notes/auth.md +assert_contains "$RUN_OUTPUT" "Authentication Flow" +assert_contains "$RUN_OUTPUT" "Rotate secrets every 90 
days." + +OUT="$(sqlite3 :memory: ".load ${SQLITE_VEC_PATH}" "select vec_version();" 2>&1)" +printf '%s\n' "$OUT" +assert_contains "$OUT" "v0." + +if [ "${MODE}" = "full" ]; then + run_qmd_expect_success context add qmd://notes/ "Engineering knowledge base" + assert_contains "$RUN_OUTPUT" "Added context for: qmd://notes/" + + run_qmd_expect_success context list + assert_contains "$RUN_OUTPUT" "Configured Contexts" + assert_contains "$RUN_OUTPUT" "Engineering knowledge base" + + run_qmd_expect_success get qmd://notes/auth.md + assert_contains "$RUN_OUTPUT" "Folder Context: Engineering knowledge base" + + run_qmd_expect_success embed + assert_contains "$RUN_OUTPUT" "Embedded" + + run_qmd_expect_success vsearch "secure api authentication" + assert_contains "$RUN_OUTPUT" "qmd://notes/auth.md" + + run_qmd_expect_success query "secure api authentication" --explain + assert_contains "$RUN_OUTPUT" "qmd://notes/auth.md" + assert_contains "$RUN_OUTPUT" "Explain:" + + OUT="$( + env \ + INDEX_PATH="${INDEX_DB}" \ + QMD_CONFIG_DIR="${CONFIG_DIR}" \ + QMD_SQLITE_VEC_PATH="${SQLITE_VEC_PATH}" \ + QMD_LLAMA_GPU=false \ + QMD_STATUS_LLM_PROBE=1 \ + corepack pnpm qmd status 2>&1 + )" + printf '%s\n' "$OUT" + assert_contains "$OUT" "Device" + + run_qmd_expect_success collection update-cmd notes "printf 'update-hook-ran\n'" + assert_contains "$RUN_OUTPUT" "Set update command for 'notes'" + + cat > "${CORPUS_DIR}/db.md" <<'EOF' +# Database Notes + +Use connection pooling to avoid timeout spikes. +Prepared statements prevent SQL injection. +Enable WAL mode for better concurrent writes. +EOF + + rm -f "${CORPUS_DIR}/auth.md" + + cat > "${CORPUS_DIR}/ops.md" <<'EOF' +# Operations Notes + +Monitoring alerts should page the on-call engineer. +Runbook links belong next to each alert. 
+EOF + + run_qmd_expect_success update + assert_contains "$RUN_OUTPUT" "Running update command: printf 'update-hook-ran" + assert_contains "$RUN_OUTPUT" "update-hook-ran" + assert_contains "$RUN_OUTPUT" "1 new, 1 updated, 1 unchanged, 1 removed" + assert_contains "$RUN_OUTPUT" "Run 'qmd embed' to update embeddings" + + run_qmd_capture get qmd://notes/auth.md + printf '%s\n' "$RUN_OUTPUT" + assert_status "$RUN_STATUS" 1 + assert_contains "$RUN_OUTPUT" "Document not found" + + run_qmd_expect_success search monitoring + assert_contains "$RUN_OUTPUT" "qmd://notes/ops.md" + assert_not_contains "$RUN_OUTPUT" "qmd://notes/auth.md" + + run_qmd_expect_success cleanup + assert_contains "$RUN_OUTPUT" "Cleared" + assert_contains "$RUN_OUTPUT" "Removed" + assert_contains "$RUN_OUTPUT" "orphaned embedding chunks" + assert_contains "$RUN_OUTPUT" "Database vacuumed" + + run_qmd_expect_success embed + assert_contains "$RUN_OUTPUT" "Embedded" + + run_qmd_expect_success vsearch "on-call monitoring alerts" + assert_contains "$RUN_OUTPUT" "qmd://notes/ops.md" + + run_qmd_expect_success collection rename notes knowledge + assert_contains "$RUN_OUTPUT" "Renamed collection 'notes' to 'knowledge'" + assert_contains "$RUN_OUTPUT" "qmd://knowledge/" + + run_qmd_expect_success collection list + assert_contains "$RUN_OUTPUT" "qmd://knowledge/" + assert_not_contains "$RUN_OUTPUT" "qmd://notes/" + + run_qmd_expect_success get qmd://knowledge/db.md + assert_contains "$RUN_OUTPUT" "Folder Context: Engineering knowledge base" + assert_contains "$RUN_OUTPUT" "Enable WAL mode for better concurrent writes." 
+ + run_qmd_expect_success update + assert_contains "$RUN_OUTPUT" "[1/1]" + assert_contains "$RUN_OUTPUT" "knowledge" + assert_contains "$RUN_OUTPUT" "update-hook-ran" + + run_qmd_expect_success context rm qmd://knowledge/ + assert_contains "$RUN_OUTPUT" "Removed context for: qmd://knowledge/" + + run_qmd_expect_success context list + assert_contains "$RUN_OUTPUT" "No contexts configured" + + run_qmd_expect_success collection add "${ARCHIVE_DIR}" --name archive + assert_contains "$RUN_OUTPUT" "Collection 'archive' created successfully" + + run_qmd_expect_success collection list + assert_contains "$RUN_OUTPUT" "qmd://archive/" + assert_contains "$RUN_OUTPUT" "qmd://knowledge/" + + run_qmd_expect_success collection remove archive + assert_contains "$RUN_OUTPUT" "Removed collection 'archive'" + assert_contains "$RUN_OUTPUT" "Deleted 1 documents" + + run_qmd_expect_success collection list + assert_contains "$RUN_OUTPUT" "qmd://knowledge/" + assert_not_contains "$RUN_OUTPUT" "qmd://archive/" +fi + +echo "==> FreeBSD smoke passed (${MODE})" + +if [ "${KEEP_TEMP}" -eq 1 ]; then + echo "==> Kept temp dir: ${TMP_ROOT}" +fi diff --git a/test/llm.test.ts b/test/llm.test.ts index d3360363..aceb7e42 100644 --- a/test/llm.test.ts +++ b/test/llm.test.ts @@ -7,17 +7,62 @@ * rerank functions first to trigger model downloads. 
*/ -import { describe, test, expect, beforeAll, afterAll, vi } from "vitest"; +import { describe, test, expect, afterAll, afterEach, vi } from "vitest"; import { LlamaCpp, getDefaultLlamaCpp, disposeDefaultLlamaCpp, withLLMSession, + pullModels, canUnloadLLM, SessionReleasedError, type RerankDocument, type ILLMSession, } from "../src/llm.js"; +import { + _setNodeLlamaCppLoaderForTesting, + type NodeLlamaCppModule, + NodeLlamaCppUnavailableError, +} from "../src/platform/node-llama-cpp.js"; + +afterEach(() => { + _setNodeLlamaCppLoaderForTesting(null); +}); + +async function isLlamaCppRuntimeAvailable(): Promise<boolean> { + if (process.env.CI) return false; + + const llm = new LlamaCpp({}); + try { + await llm.getDeviceInfo(); + return true; + } catch { + return false; + } finally { + await llm.dispose(); + } +} + +const llmRuntimeRequested = process.env.QMD_RUN_LLM_INTEGRATION_TESTS === "1"; +const llmRuntimeAvailable = llmRuntimeRequested && await isLlamaCppRuntimeAvailable(); +const runtimeTest = test.runIf(llmRuntimeAvailable); + +function createBrokenNodeLlamaCppModule(message: string): NodeLlamaCppModule { + return { + getLlama: async () => { + throw new Error(message); + }, + resolveModelFile: async (modelUri: string) => modelUri, + LlamaChatSession: class { + async prompt(): Promise<string> { + return ""; + } + }, + LlamaLogLevel: { + error: "error", + }, + }; +} // ============================================================================= // Singleton Tests (no model loading required) @@ -193,11 +238,91 @@ describe("LlamaCpp rerank deduping", () => { }); }); +describe("node-llama-cpp availability", () => { + test("pullModels throws a backend unavailable error when the runtime loader fails", async () => { + _setNodeLlamaCppLoaderForTesting(async () => { + throw new Error("missing backend"); + }); + + await expect(pullModels(["hf:org/repo/model.gguf"])).rejects.toBeInstanceOf( + NodeLlamaCppUnavailableError + ); + await 
expect(pullModels(["hf:org/repo/model.gguf"])).rejects.toThrow( + /node-llama-cpp is unavailable/i + ); + }); + + test("embed surfaces backend unavailable errors instead of returning null", async () => { + _setNodeLlamaCppLoaderForTesting(async () => createBrokenNodeLlamaCppModule("broken native addon")); + + const llm = new LlamaCpp({}) as any; + llm._ciMode = false; + + await expect(llm.embed("hello")).rejects.toBeInstanceOf(NodeLlamaCppUnavailableError); + }); + + test("expandQuery surfaces backend unavailable errors instead of silently falling back", async () => { + _setNodeLlamaCppLoaderForTesting(async () => createBrokenNodeLlamaCppModule("broken native addon")); + + const llm = new LlamaCpp({}) as any; + llm._ciMode = false; + + await expect(llm.expandQuery("hello")).rejects.toBeInstanceOf(NodeLlamaCppUnavailableError); + }); + + test("getDeviceInfo uses the platform build policy for node-llama-cpp", async () => { + const getLlamaCalls: Array<Record<string, unknown>> = []; + + _setNodeLlamaCppLoaderForTesting(async () => ({ + getLlama: async (options: Record<string, unknown>) => { + getLlamaCalls.push(options); + return { + gpu: false, + supportsGpuOffloading: false, + cpuMathCores: 8, + getGpuDeviceNames: async () => [], + getVramState: async () => ({ total: 0, used: 0, free: 0 }), + loadModel: async () => { + throw new Error("not used in this test"); + }, + createGrammar: async () => { + throw new Error("not used in this test"); + }, + dispose: async () => {}, + } as any; + }, + resolveModelFile: async (modelUri: string) => modelUri, + LlamaChatSession: class { + async prompt(): Promise<string> { + return ""; + } + }, + LlamaLogLevel: { + error: "error", + }, + }) as NodeLlamaCppModule); + + const llm = new LlamaCpp({}) as any; + llm._ciMode = false; + + await expect(llm.getDeviceInfo()).resolves.toMatchObject({ + gpu: false, + gpuOffloading: false, + gpuDevices: [], + cpuCores: 8, + }); + expect(getLlamaCalls[0]?.build).toBe(process.platform === "freebsd" ? 
"never" : "autoAttempt"); + expect(getLlamaCalls[0]?.skipDownload).toBe(false); + + await llm.dispose(); + }); +}); + // ============================================================================= // Integration Tests (require actual models) // ============================================================================= -describe.skipIf(!!process.env.CI)("LlamaCpp Integration", () => { +describe.skipIf(!llmRuntimeAvailable)("LlamaCpp Integration", () => { // Use the singleton to avoid multiple Metal contexts const llm = getDefaultLlamaCpp(); @@ -602,7 +727,7 @@ describe.skipIf(!!process.env.CI)("LlamaCpp Integration", () => { describe.skipIf(!!process.env.CI)("LLM Session Management", () => { describe("withLLMSession", () => { - test("session provides access to LLM operations", async () => { + runtimeTest("session provides access to LLM operations", async () => { const result = await withLLMSession(async (session) => { expect(session.isValid).toBe(true); const embedding = await session.embed("test text"); @@ -626,7 +751,7 @@ describe.skipIf(!!process.env.CI)("LLM Session Management", () => { expect(capturedSession!.isValid).toBe(false); }); - test("session prevents idle unload during operations", async () => { + runtimeTest("session prevents idle unload during operations", async () => { await withLLMSession(async (session) => { // While inside a session, canUnloadLLM should return false expect(canUnloadLLM()).toBe(false); @@ -661,7 +786,7 @@ describe.skipIf(!!process.env.CI)("LLM Session Management", () => { expect(canUnloadLLM()).toBe(true); }); - test("session embedBatch works correctly", async () => { + runtimeTest("session embedBatch works correctly", async () => { await withLLMSession(async (session) => { const texts = ["Hello world", "Test text", "Another document"]; const results = await session.embedBatch(texts); @@ -674,7 +799,7 @@ describe.skipIf(!!process.env.CI)("LLM Session Management", () => { }); }); - test("session rerank works correctly", async 
() => { + runtimeTest("session rerank works correctly", async () => { await withLLMSession(async (session) => { const documents: RerankDocument[] = [ { file: "a.txt", text: "The capital of France is Paris." }, @@ -748,7 +873,7 @@ describe.skipIf(!!process.env.CI)("LLM Session Management", () => { test("returns value from callback", async () => { const result = await withLLMSession(async (session) => { - await session.embed("test"); + expect(session.isValid).toBe(true); return { status: "complete", count: 42 }; }); diff --git a/test/sqlite-vec.test.ts b/test/sqlite-vec.test.ts new file mode 100644 index 00000000..02300054 --- /dev/null +++ b/test/sqlite-vec.test.ts @@ -0,0 +1,138 @@ +import { describe, expect, test, vi } from "vitest"; +import { + createSqliteVecUnavailableError, + getFreebsdSqliteVecProbePaths, + getSqliteVecUnavailableHint, + resolveSqliteVecLoadablePath, +} from "../src/platform/sqlite-vec.js"; + +describe("resolveSqliteVecLoadablePath", () => { + test("prefers QMD_SQLITE_VEC_PATH over all other resolution paths", () => { + const fileExists = vi.fn(() => false); + const packageResolve = vi.fn((specifier: string) => `/resolved/${specifier}`); + + const result = resolveSqliteVecLoadablePath({ + platform: "freebsd", + arch: "x64", + env: { + QMD_SQLITE_VEC_PATH: " /custom/vec0.so ", + LOCALBASE: "/opt/local", + }, + fileExists, + packageResolve, + }); + + expect(result).toEqual({ + path: "/custom/vec0.so", + source: "env", + }); + expect(fileExists).not.toHaveBeenCalled(); + expect(packageResolve).not.toHaveBeenCalled(); + }); + + test("uses the first matching FreeBSD system path before npm resolution", () => { + const fileExists = vi.fn((path: string) => path === "/opt/local/lib/sqlite3/vec0.so"); + const packageResolve = vi.fn((specifier: string) => `/resolved/${specifier}`); + + const result = resolveSqliteVecLoadablePath({ + platform: "freebsd", + arch: "x64", + env: { + LOCALBASE: "/opt/local", + }, + fileExists, + packageResolve, + }); + + 
expect(result).toEqual({ + path: "/opt/local/lib/sqlite3/vec0.so", + source: "system", + }); + expect(packageResolve).not.toHaveBeenCalled(); + }); + + test("resolves the packaged sqlite-vec binary on supported npm platforms", () => { + const packageResolve = vi.fn((specifier: string) => `/resolved/${specifier}`); + + const result = resolveSqliteVecLoadablePath({ + platform: "darwin", + arch: "arm64", + env: {}, + fileExists: () => false, + packageResolve, + }); + + expect(packageResolve).toHaveBeenCalledWith("sqlite-vec-darwin-arm64/vec0.dylib"); + expect(result).toEqual({ + path: "/resolved/sqlite-vec-darwin-arm64/vec0.dylib", + source: "npm", + packageName: "sqlite-vec-darwin-arm64", + }); + }); + + test("returns null when no path can be resolved", () => { + const packageResolve = vi.fn(() => { + throw new Error("module not found"); + }); + + const result = resolveSqliteVecLoadablePath({ + platform: "openbsd", + arch: "x64", + env: {}, + fileExists: () => false, + packageResolve, + }); + + expect(result).toEqual({ + path: null, + source: null, + }); + expect(packageResolve).not.toHaveBeenCalled(); + }); +}); + +describe("getFreebsdSqliteVecProbePaths", () => { + test("includes LOCALBASE first and de-duplicates prefixes", () => { + const paths = getFreebsdSqliteVecProbePaths({ + env: { + LOCALBASE: "/usr/local", + PREFIX: "/usr/local", + }, + }); + + expect(paths[0]).toBe("/usr/local/lib/sqlite3/vec0.so"); + expect(new Set(paths).size).toBe(paths.length); + }); +}); + +describe("sqlite-vec diagnostics", () => { + test("macOS Bun hint mentions Homebrew and npm fallback", () => { + const hint = getSqliteVecUnavailableHint({ + platform: "darwin", + isBun: true, + }); + + expect(hint).toContain("brew install sqlite"); + expect(hint).toContain("npm install -g @tobilu/qmd"); + }); + + test("FreeBSD hint mentions QMD_SQLITE_VEC_PATH", () => { + const hint = getSqliteVecUnavailableHint({ + platform: "freebsd", + isBun: false, + }); + + 
expect(hint).toContain("QMD_SQLITE_VEC_PATH"); + expect(hint).toContain("vec0.so"); + }); + + test("error formatting avoids duplicate punctuation", () => { + const err = createSqliteVecUnavailableError("No loadable sqlite-vec extension was found.", { + platform: "freebsd", + isBun: false, + }); + + expect(err.message).toContain("sqlite-vec extension is unavailable."); + expect(err.message).not.toContain("found.."); + }); +});
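
Reviewer note: the lookup order that `resolveSqliteVecLoadablePath` and its tests exercise (explicit `QMD_SQLITE_VEC_PATH` override, then FreeBSD system probe, then npm package) can be sketched in isolation as below. This is a simplified illustration and not the code in this PR: `resolveVecPath`, `Resolution`, and the `npmResolve` callback are hypothetical names, and only a single probe directory is checked, whereas the patch probes several directories under `LOCALBASE`, `PREFIX`, and `/usr/local`.

```typescript
// Standalone sketch of the env -> system -> npm lookup order.
// All names here are illustrative; they are not exported by the PR.
type Resolution = {
  path: string | null;
  source: "env" | "system" | "npm" | null;
};

function resolveVecPath(
  platform: string,
  env: Record<string, string | undefined>,
  fileExists: (path: string) => boolean,
  npmResolve: () => string | null,
): Resolution {
  // 1. An explicit override always wins. It is trimmed but not checked
  //    for existence, mirroring the behaviour the first test asserts.
  const override = env["QMD_SQLITE_VEC_PATH"]?.trim();
  if (override) return { path: override, source: "env" };

  // 2. On FreeBSD, probe a system location before anything else.
  //    Simplification: one prefix and one directory only.
  if (platform === "freebsd") {
    const prefix = env["LOCALBASE"]?.trim() || "/usr/local";
    const candidate = `${prefix}/lib/sqlite3/vec0.so`;
    if (fileExists(candidate)) return { path: candidate, source: "system" };
  }

  // 3. Fall back to a prebuilt npm binary where the caller can resolve one.
  const npmPath = npmResolve();
  if (npmPath) return { path: npmPath, source: "npm" };

  // 4. Nothing found: the caller is expected to raise a platform-specific hint.
  return { path: null, source: null };
}

// Example: a FreeBSD host with vec0.so installed under a custom LOCALBASE.
const example = resolveVecPath(
  "freebsd",
  { LOCALBASE: "/opt/local" },
  (p) => p === "/opt/local/lib/sqlite3/vec0.so",
  () => null,
);
console.log(example.source, example.path);
```

Injecting `fileExists` and `npmResolve` as parameters is what lets the tests in this diff assert ordering (for example, that the npm resolver is never called once a system path matches) without touching the real filesystem.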