feat: self-contained API keys (read from gbrain's own config, not just env) #121

vinsew wants to merge 3 commits into garrytan:master
Conversation
Force-pushed 98cd225 to 83d3851
GBrain stores internal cross-page references in slug form (e.g. `[Alice](./alice)`) because the slug is the canonical identifier in the DB. That works inside GBrain's own resolution layer. But when those pages are exported as `.md` files on disk and opened in standard markdown viewers (Obsidian, VS Code preview, GitHub web view, typical mkdocs/jekyll renderers), the viewers look for a literal file at `./alice` — which doesn't exist. The actual file is `./alice.md`.

Result: every internal link in an exported brain is silently broken on disk. The user clicks `[小龙]` in `龙虾群.md`, sees a 404 / empty page, and cannot navigate the brain outside of GBrain itself. This defeats half the value of having the brain stored as portable markdown.

Fix: Add `normalizeInternalLinks(content)` that runs over each page's serialized markdown right before `writeFileSync` and rewrites slug-form internal links to filename-form by appending `.md`:

[Alice](./alice) -> [Alice](./alice.md)
[Alice](alice) -> [Alice](alice.md)
[Alice](../people/alice) -> [Alice](../people/alice.md)
[小龙](../people/小龙) -> [小龙](../people/小龙.md)

Conservative: leaves untouched anything that looks external or already extended:
- URL schemes (http:, https:, mailto:, ftp:, file:, tel:, ...) — skip
- Anchors (#section) — skip
- Empty targets — skip
- Trailing slash (directory references) — skip
- Already has any extension (.md, .png, .pdf, .MD, ...) — skip
- Preserves query strings and anchors when appending:
  [Section](./alice#bio) -> [Section](./alice.md#bio)
  [Search](./alice?q=t) -> [Search](./alice.md?q=t)

The DB content stays slug-form (GBrain's internal convention is unchanged). Only the on-disk export gets the `.md` annotation, so the exported markdown is viewable as-is by any standard renderer.

Real-world reproduction this fix addresses:

$ gbrain put 龙虾群 < <(echo '[小龙](./小龙)')
$ gbrain export --dir /tmp/out
$ cat /tmp/out/龙虾群.md
# before this PR: contains [小龙](./小龙) — clicking 404s
# after this PR: contains [小龙](./小龙.md) — clicking opens the file

Impact:
- 2 files changed, +149 / -1 lines (1 line of helper invocation + ~40 lines of helper + comment + 26 tests)
- Zero behavior change for external URLs, anchors, or already-extended links
- DB content unchanged — only the on-disk export representation gains the `.md` annotation
- Existing exports remain valid (re-running export on an already-exported brain is idempotent because already-extended links are skipped)

Tests:
- 26 new tests covering: same-dir slug, parent-dir slug, deep nesting, CJK slugs, multiple links per line, multi-line markdown, all 6 external schemes (http/https/mailto/file/ftp/tel), all 4 extension cases (md/png/pdf/uppercase), anchor preservation, query preservation, empty/trailing-slash/no-link edge cases.
- All 26 tests pass.
- Full suite: 612 pass / no new regressions (4 pre-existing PGLiteEngine failures are unrelated and exist on master).

Fifth in a series of practical PRs from a real Chinese-speaking deploy. Companion to:
- garrytan#114 (chunker CJK)
- garrytan#115 (slugify CJK)
- garrytan#119 (sync git quotepath CJK)
- garrytan#121 (self-contained API keys)

Same theme: GBrain is meaningfully more useful when the markdown export is a first-class deliverable, not a half-broken side-effect.
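A minimal sketch of the rewrite described above. The helper name `normalizeInternalLinks` comes from the commit message; the regex and skip-rule internals below are an illustrative reconstruction, not the PR's actual code:

```ts
// Illustrative reconstruction of the export-time link rewrite.
const LINK_RE = /\[([^\]]*)\]\(([^)]*)\)/g;

export function normalizeInternalLinks(content: string): string {
  return content.replace(LINK_RE, (match, text: string, target: string) => {
    if (target === '') return match;                              // empty target: skip
    if (target.startsWith('#')) return match;                     // pure anchor: skip
    if (/^[a-zA-Z][a-zA-Z0-9+.-]*:/.test(target)) return match;   // http:, mailto:, tel:, ...: skip
    // Split off ?query / #anchor so they survive the rewrite.
    const [, path, suffix] = target.match(/^([^?#]*)(.*)$/)!;
    if (path === '' || path.endsWith('/')) return match;          // anchor-only or directory: skip
    const base = path.split('/').pop()!;
    if (/\.[^./]+$/.test(base)) return match;                     // already has an extension: skip
    return `[${text}](${path}.md${suffix})`;                      // ./alice#bio -> ./alice.md#bio
  });
}
```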
Force-pushed 83d3851 to b8410f9
The "no keys when neither env nor file provides them" case became impossible once garrytan#121 (self-contained API keys) ships — loadConfig now reads keys from config.json if env is empty. Test now asserts the stronger invariant: after env deletion, the previous env sentinel value must not leak back via the returned config. File-level keys may legitimately persist and are no longer asserted undefined. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Today embedding.ts and expansion.ts call `new OpenAI()` / `new Anthropic()`
with no arguments, which makes the SDKs read OPENAI_API_KEY /
ANTHROPIC_API_KEY from the process's env. That puts the burden on every
caller — shells, cron jobs, agent subprocesses, daemons — to propagate
those env vars correctly. When a caller's env doesn't have them (e.g.
launchd-spawned daemons, agent terminal tools with sanitized env), the
caller silently gets empty results from `gbrain query` / `gbrain embed`
because the SDK falls back to anonymous API calls that fail.
GBrain already has `openai_api_key` and `anthropic_api_key` fields in
its GBrainConfig schema (src/core/config.ts) and stores them in
~/.gbrain/config.json, but none of the runtime code actually reads
those fields — the config is populated but never consulted. This PR
connects that wiring so gbrain becomes self-contained: callers just
run `gbrain ...` and gbrain finds its own keys.
Changes:
- config.ts: merge ANTHROPIC_API_KEY from env into loaded config
(was silently dropped — only OPENAI_API_KEY was being merged)
- embedding.ts: read openai_api_key from loadConfig() and pass to
`new OpenAI({ apiKey })`. Falls back to SDK's env-default behavior
when config has no key (preserves current behavior for users who
rely on env vars).
- expansion.ts: same pattern for Anthropic (see the sketch after this list).
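A hedged sketch of the embedding.ts wiring, assuming loadConfig() is synchronous, lives at a './config'-style import path, and returns the GBrainConfig shape; the real call site may differ:

```ts
// Sketch of the change described above, not the shipped embedding.ts code.
import OpenAI from 'openai';
import { loadConfig } from './config'; // assumed path to src/core/config.ts

function makeOpenAIClient(): OpenAI {
  const config = loadConfig();
  // Use the config-file key when present; otherwise fall back to the SDK's
  // own env-default lookup (OPENAI_API_KEY), preserving current behavior.
  return config.openai_api_key
    ? new OpenAI({ apiKey: config.openai_api_key })
    : new OpenAI();
}
```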
Usage for callers:
# One-time setup (put keys in gbrain's own config file)
$ cat > ~/.gbrain/config.json <<EOF
{"openai_api_key": "sk-...", "anthropic_api_key": "sk-ant-..."}
EOF
$ chmod 600 ~/.gbrain/config.json
# (or edit config.json directly)
# Then from ANY caller, no env vars needed:
$ gbrain query "..." # just works
$ gbrain embed --stale # just works
This is especially valuable for:
- Cron jobs run under launchd/systemd (which don't inherit shell env)
- Agent terminal tools with env sanitization
- Subprocess calls from Python/Node agents without env passthrough
- Docker containers without explicit env forwarding
Precedence (preserved from existing code):
env var > config file
So users who want to override per-process still can.
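Concretely, the merge order inside loadConfig() plausibly looks like this; field names come from GBrainConfig, and the merge details are assumed:

```ts
// Hedged sketch of the env-over-file precedence; the actual merge code
// in src/core/config.ts may differ.
import { readFileSync } from 'node:fs';

function loadConfigSketch(configPath: string) {
  const fileConfig = JSON.parse(readFileSync(configPath, 'utf8'));
  return {
    ...fileConfig,
    // Env wins when set; the file value survives otherwise.
    openai_api_key: process.env.OPENAI_API_KEY ?? fileConfig.openai_api_key,
    anthropic_api_key: process.env.ANTHROPIC_API_KEY ?? fileConfig.anthropic_api_key,
  };
}
```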
Impact:
- 4 files changed, +67 / -5 lines
- Zero behavior change for users who already have env vars set
- Callers without env vars in their subprocess context now work IF
the keys are written to ~/.gbrain/config.json
Tests:
- 4 new tests in test/config.test.ts cover: OPENAI env merge,
ANTHROPIC env merge (regression — was missing), both together,
and absence-when-neither.
- All 12 config tests pass; no pre-existing regressions.
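For reference, one of the new merge tests plausibly has this shape (bun:test style, which the repo uses; helper names and assertions are illustrative):

```ts
// Illustrative shape of the ANTHROPIC_API_KEY regression test; the real
// test in test/config.test.ts may differ in helpers and cleanup style.
import { expect, test } from 'bun:test';
import { loadConfig } from '../src/core/config';

test('ANTHROPIC_API_KEY env var merges into loaded config', () => {
  const prev = process.env.ANTHROPIC_API_KEY;
  process.env.ANTHROPIC_API_KEY = 'sk-ant-sentinel';
  try {
    expect(loadConfig().anthropic_api_key).toBe('sk-ant-sentinel');
  } finally {
    // Restore the caller's env so the test stays isolation-safe.
    if (prev === undefined) delete process.env.ANTHROPIC_API_KEY;
    else process.env.ANTHROPIC_API_KEY = prev;
  }
});
```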
The "no keys when neither env nor file provides them" case became impossible once garrytan#121 (self-contained API keys) ships — loadConfig now reads keys from config.json if env is empty. Test now asserts the stronger invariant: after env deletion, the previous env sentinel value must not leak back via the returned config. File-level keys may legitimately persist and are no longer asserted undefined. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Force-pushed b8410f9 to 18b0ebd
Hi, embedding.ts/expansion.ts now depend on loadConfig() to supply API keys, but loadConfig() always reads ~/.gbrain/config.json via os.homedir() and ignores the GBRAIN_HOME override used elsewhere. In environments that set GBRAIN_HOME (tests, Docker, multi-tenant), config-file keys won't be found and the SDKs will be instantiated without an apiKey.

Severity: action required | Category: correctness

How to fix: make loadConfig honor GBRAIN_HOME. Agent prompt to fix - you can give this to your LLM of choice:
We noticed a couple of other issues in this PR as well - happy to share if helpful. Qodo code review - free for open-source.
Per qodo-ai review on PR garrytan#121: loadConfig() / saveConfig() were going through a private getConfigDir/getConfigPath that called homedir() directly and ignored GBRAIN_HOME, while configDir/configPath (the public API used by getDbUrlSource and the docs) honored it. That split meant config-file API keys were invisible in tests, Docker containers, and multi-tenant deployments — exactly the contexts that motivated GBRAIN_HOME existing in the first place. The new self-contained API keys feature this PR adds depends on loadConfig() finding the keys, so the split also broke the feature under any GBRAIN_HOME-rooted setup.

Fix collapses to one set: delete getConfigDir/getConfigPath, route loadConfig/saveConfig through configDir()/configPath(). Function declarations hoist, so the forward reference is fine. The homedir() import stays — configDir() still falls back to it when GBRAIN_HOME is unset.

Tests: 5 new cases in test/config.test.ts covering configDir/configPath honoring GBRAIN_HOME, saveConfig writing under the override, loadConfig reading from the override, the two-home isolation invariant (write A, read B sees null), and a full round-trip. All 17 config tests pass; the 20 check-update tests stay green; the e2e/PGLite failures observed are pre-existing and unrelated (garrytan#223 macOS WASM bug per CLAUDE.md memory).
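A minimal sketch of the collapsed shape, assuming Node/Bun path APIs; exact signatures in src/core/config.ts may differ:

```ts
// One set of path helpers, honoring GBRAIN_HOME, with loadConfig/saveConfig
// routed through them. Illustrative reconstruction of the fix above.
import { homedir } from 'node:os';
import { join } from 'node:path';

export function configDir(): string {
  // GBRAIN_HOME override (tests, Docker, multi-tenant); homedir() fallback.
  return process.env.GBRAIN_HOME ?? join(homedir(), '.gbrain');
}

export function configPath(): string {
  return join(configDir(), 'config.json');
}
```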
…17-PR cluster) (#810)

* feat(ai/types): add resolveAuth + probe + user_provided_models fields

Foundation commit for the embedding-provider fix-wave (5 API-key recipes + discoverability pass). Three optional additions to the recipe contract:

- `EmbeddingTouchpoint.user_provided_models?: true` (D8=A): flag for recipes that ship without a fixed model list. Consumed by the contract test (permits empty `models[]`), gateway.ts:223 (replaces hardcoded `recipe.id === 'litellm'` check in a follow-up commit), and init.ts:resolveAIOptions (refuses implicit "first model" pick for shorthand `--model <provider>`).
- `Recipe.resolveAuth?(env): {headerName, token}` (D12=A): unified auth seam across embed / expansion / chat (sketched after this commit log). Default behavior (returns `Authorization: Bearer <env-key>`) covers the existing 9 recipes unchanged. Recipes deviating (Azure with `api-key:`; future OAuth providers) override this single seam instead of adding parallel mechanisms in 3 places. Codex review caught that auth was triplicated at gateway.ts:281/728/931; D12=A unifies all three in one follow-up commit.
- `Recipe.probe?(): Promise<{ready, hint?}>` (D13=A): recipe-owned readiness check for local-server providers (ollama, llama-server). Replaces the hardcoded `recipe.id === 'ollama'` special case in providers.ts. Wrapped in a 200ms timeout at the call sites.

Pure type additions — no behavior change. Typecheck green; the existing 9 recipes work unchanged because all three fields are optional.

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (decisions D8=A, D11=C, D12=A, D13=A).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* feat(ai/gateway): unify openai-compatible auth via Recipe.resolveAuth (D12=A)

Pre-v0.32, openai-compatible auth was duplicated 3 times in gateway.ts at instantiateEmbedding, instantiateExpansion, instantiateChat — with subtle drift (embedding had a `${recipe.id.toUpperCase()}_API_KEY` fallback the other two lacked). Codex outside-voice review caught this during /plan-eng-review.

D12=A: unify all three through `Recipe.resolveAuth?(env)` (declared in the prior commit). Two new module-level helpers:

- `defaultResolveAuth(recipe, env, touchpoint)` — applied when a recipe doesn't declare its own resolver. Returns Authorization Bearer with `auth_env.required[0]`, falling back to the first present `auth_env.optional` env var, or 'unauthenticated' for no-auth recipes like Ollama. Throws AIConfigError with the recipe's setup_hint when required env is missing.
- `applyResolveAuth(recipe, cfg, touchpoint)` — returns `createOpenAICompatible` options. Bearer-via-Authorization paths use the SDK's native `apiKey` field; custom-header paths (Azure: api-key) use `headers` and OMIT apiKey to avoid double-auth leaks.

The 3 `case 'openai-compatible':` branches in instantiateEmbedding (line ~281), instantiateExpansion (line ~728), instantiateChat (line ~931) each collapse from ~10 lines of bespoke auth handling to a single `applyResolveAuth(recipe, cfg, '<touchpoint>')` call.

Also: the litellm-template hardcode at gateway.ts:223 (`recipe.id === 'litellm'`) is replaced with a union check for `EmbeddingTouchpoint.user_provided_models === true` (D8=A wire-through per Codex finding #3). Pre-v0.32 builds keep working via the back-compat `recipe.id === 'litellm'` clause; new recipes declaring user_provided_models pick up the same gating automatically.

The existing 9 recipes (openai, anthropic, google, deepseek, groq, ollama, litellm-proxy, together, voyage) gain zero per-recipe edits — the default resolver covers their existing behavior. Behavior change for ollama expansion/chat only: now reads OLLAMA_API_KEY when set (pre-v0.32 silently passed 'unauthenticated' for those touchpoints; embedding already read it). Ollama servers ignore the header so there is no real-world impact; this aligns the 3 touchpoints.

Tests: bun test test/ai/ — 77/77 pass.

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (D8=A, D12=A; addresses Codex findings #3, #4).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* test(ai): IRON RULE regression test for v0.32 resolveAuth refactor

Pins the contract that the v0.32 D2/D12=A resolveAuth refactor preserves auth behavior for the 9 existing recipes (openai, anthropic, google, deepseek, groq, ollama, litellm-proxy, together, voyage). 10 cases covering:

- the 9 expected recipe ids are still registered
- every recipe with non-empty required[] returns Authorization Bearer <key>
- missing required env throws AIConfigError naming recipe + touchpoint + env-var
- Ollama (empty required, optional set) reads the first present optional env
- Ollama (no env) falls back to "Bearer unauthenticated"
- all 3 touchpoints (embedding/expansion/chat) produce identical auth shape for the same recipe + env (this is the core regression: pre-v0.32, embedding had a fallback the other two lacked)
- applyResolveAuth converts Authorization Bearer to {apiKey} (SDK-native)
- applyResolveAuth respects a custom-header override (Azure preview; the recipe ships in commit 8) and emits {headers} WITHOUT apiKey to avoid double-auth
- native-* recipes (openai, anthropic, google) intentionally have no resolveAuth declared (they use AI-SDK adapters directly)
- all openai-compatible recipes ship without resolveAuth in v0.32 (the default applies); the first override is Azure in commit 8

Also: export `defaultResolveAuth` and `applyResolveAuth` as @internal gateway helpers so tests can pin them directly. Mirrors the pattern of `splitByTokenBudget` and `isTokenLimitError`, already exported with the same @internal annotation.

Tests: bun test test/ai/ — 87/87 pass (10 new + 77 existing). Typecheck: clean.

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (IRON RULE per Section 3 test review).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* feat(ai): add llama-server recipe (#702 reworked)

10th recipe in the registry; the first to ship Recipe.probe (D13=A) and the second user_provided_models recipe (litellm-proxy is the first). llama.cpp's llama-server exposes an OpenAI-compatible /v1/embeddings endpoint. Distinct from Ollama: different default port (8080), different model-management story (you launch it with --model <path>; the server serves whatever was passed). The recipe ships with `models: []`, `user_provided_models: true`, `default_dims: 0` so the wizard refuses implicit defaults and forces explicit --embedding-model + --embedding-dimensions.

Added:
- src/core/ai/recipes/llama-server.ts (61 lines)
- probeLlamaServer() in src/core/ai/probes.ts; reads LLAMA_SERVER_BASE_URL with default http://localhost:8080/v1
- Registered in src/core/ai/recipes/index.ts (10 recipes total now)
- test/ai/recipe-llama-server.test.ts (8 cases): registered + shape, user_provided_models flag, probe declared + reachability fail-with-hint, default-auth covering no-env / API_KEY / URL-shaped-only paths

Hardening: defaultResolveAuth in gateway.ts now skips URL-shaped optional env entries (names ending in _URL or _BASE_URL) when picking a fallback auth token. Pre-fix, OLLAMA_BASE_URL=http://my-ollama would have become the Bearer token; Ollama ignores it, but llama-server (and future local-server recipes) shouldn't depend on the server tolerating garbage auth. The regression test (recipes-existing-regression) gains one case pinning this contract.

The per-recipe test file follows D7=B (per-recipe over DRY for readability).

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (commit 4 of 11). Reworked from #702 because the original PR didn't model the recipe-owned probe pattern (D13=A) or user_provided_models (D8=A).

Tests: bun test test/ai/ — 95/95 pass (8 new + 87 existing).

Co-Authored-By: SiyaoZheng <[email protected]>
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* feat(ai): add MiniMax recipe (#148 reworked)

11th recipe. embo-01 model, 1536 dims, $0.07/1M tokens. OpenAI-compatible at api.minimax.chat.

MiniMax requires a `type: 'db' | 'query'` field for asymmetric retrieval (documents indexed with type='db', queries embedded with type='query'). gbrain has no query/document signal at the embed-call site today, so v1 defaults to type='db' for both indexing and retrieval — same vector space, symmetric similarity. Asymmetric query support is a follow-up TODO that needs the embed seam to thread query/document context.

Plumbed via src/core/ai/dims.ts: dimsProviderOptions returns {openaiCompatible: {type: 'db'}} for modelId === 'embo-01'. A conservative max_batch_tokens=4096 is declared (MiniMax docs don't publish the limit); recursive halving in the gateway catches token-limit errors at runtime.

Tests: bun test test/ai/ — 101/101 (6 new + 95 prior).

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (commit 5 of 11). Reworked from #148.

Co-Authored-By: cacity <[email protected]>
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* feat(ai): add Alibaba DashScope recipe (#59 split, part 1/2)

12th recipe. text-embedding-v3 (current) + text-embedding-v2; 1024 default dims with Matryoshka options [64, 128, 256, 512, 768, 1024]. OpenAI-compatible at dashscope-intl.aliyuncs.com. China-region users override via cfg.base_urls['dashscope']; v0.32 ships with the international default.

Conservative max_batch_tokens=8192 + chars_per_token=2 declared because Alibaba doesn't publish a hard batch limit and text-embedding-v3 mixes English + CJK heavily (CJK density closer to Voyage than OpenAI tiktoken).

Tests: bun test test/ai/ — 106/106 (5 new + 101 prior).

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (commit 6 of 11). Reworked from #59 (DashScope+Zhipu split into 2 commits per the plan; Zhipu lands next).

Co-Authored-By: Magicray1217 <[email protected]>
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* feat(ai): add Zhipu AI (BigModel) recipe (#59 split, part 2/2)

13th recipe. embedding-3 (current) + embedding-2; 1024 default dims with Matryoshka options [256, 512, 1024, 2048]. OpenAI-compatible at open.bigmodel.cn.

embedding-3 at 2048 dims exceeds pgvector's HNSW cap of 2000 — those brains fall back to exact vector scans via the existing chunkEmbeddingIndexSql policy at src/core/vector-index.ts. The default stays at 1024 (HNSW-fast); users who want maximum fidelity opt into 2048 via --embedding-dimensions and accept the slower retrieval. Tests pin the HNSW boundary: 1024 returns the index SQL, 2048 returns the skip-index/exact-scan SQL.

Tests: bun test test/ai/ — 112/112 (6 new + 106 prior).

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (commit 7 of 11). Reworked from #59. Together with DashScope (commit 6), closes the China-region embedding gap users repeatedly reported (DashScope covers Alibaba, Zhipu covers BigModel; both ship with international endpoints by default).

Co-Authored-By: Magicray1217 <[email protected]>
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* feat(ai): add Azure OpenAI recipe (#459 reworked)

14th recipe and the first to exercise both v0.32 architectural seams:

- resolveAuth (D12=A) returns `{headerName: 'api-key', token: <key>}` instead of the default Authorization Bearer. Azure rejects double-auth, so applyResolveAuth puts the key in `headers` and OMITS apiKey.
- A new `Recipe.resolveOpenAICompatConfig?(env)` seam (Recipe.ts) lets the recipe template the baseURL from env (Azure: ENDPOINT + DEPLOYMENT combine into a non-/v1 path) and inject a custom fetch wrapper that splices ?api-version= onto every request URL. The fetch wrapper is type-safe via `as unknown as typeof fetch`; the AI SDK never calls TS's strict `preconnect()` method on the wrapper, so the cast is sound.

`applyOpenAICompatConfig` (new gateway helper) routes through the recipe override or falls back to the pre-v0.32 base_urls/base_url_default behavior — the existing 13 recipes get zero behavior change.

The API version defaults to `2024-10-21` (current stable as of 2026-05); override via AZURE_OPENAI_API_VERSION env. Endpoint trailing slashes get stripped during URL construction so users can copy-paste from the Azure portal.

Tests (12 cases in test/ai/recipe-azure-openai.test.ts):
- resolveAuth returns api-key, NOT Authorization Bearer
- applyResolveAuth puts the key in headers, NOT apiKey (no double-auth)
- baseURL templating from endpoint + deployment, with trailing-slash strip
- AIConfigError on missing endpoint OR deployment
- fetch wrapper splices api-version (default + AZURE_OPENAI_API_VERSION override)
- fetch wrapper does NOT double-add api-version when the caller already set it
- applyOpenAICompatConfig honors the recipe override

IRON RULE regression test updated: it now asserts azure-openai is the documented exception that overrides resolveAuth; any future override needs review.

Tests: bun test test/ai/ — 124/124 (12 new + 112 prior).

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (commit 8 of 11, plus the resolveOpenAICompatConfig seam discovered during fold-in). Reworked from #459. The original PR proposed a hardcoded AzureOpenAI client switch; this implementation routes through the unified seams so future Azure-shaped providers (other custom-URL services) can reuse them.

Co-Authored-By: JamesJZhang <[email protected]>
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* feat(ai): adjacent fixes — no_batch_cap (#779) + config-key fallbacks (#121)

Two small ergonomics fixes folded together (#765 deferred — see the TODOS.md follow-up; the CJK PGLite extraction was bigger than the plan estimated).

#779 reworked (alexandreroumieu-codeapprentice): silence the missing-max_batch_tokens startup warning for recipes with genuinely dynamic batch capacity. New `EmbeddingTouchpoint.no_batch_cap?: true` field. Set on ollama (capacity depends on the locally loaded model + OLLAMA_NUM_PARALLEL), litellm-proxy (depends on the backend), llama-server (set by --ctx-size at server launch). Three fewer stderr warnings on every gateway configure; google still warns (it's a real fixed-cap provider that ought to ship a max_batch_tokens declaration). Bonus: litellm-proxy now declares `user_provided_models: true`, removing the last consumer of the legacy `recipe.id === 'litellm'` hardcode in gateway.ts:223 (D8=A wire-through completion).

#121 reworked (vinsew): self-contained API keys. Two parts:

1. config.ts: the ANTHROPIC_API_KEY env merge was silently missing. loadConfig() merged OPENAI_API_KEY but not ANTHROPIC_API_KEY into the file-config-shape result. One-line addition.
2. cli.ts:buildGatewayConfig: when ~/.gbrain/config.json declares openai_api_key / anthropic_api_key but the process env doesn't have those env vars set (common for launchd-spawned daemons, agent subprocess tools, containers that don't propagate ~/.zshrc), fold the config-file values into the gateway env snapshot. Process env still wins (loaded last) so per-process overrides keep working.

Tests (4 cases in test/ai/no-batch-cap-suppression.test.ts):
- Ollama / LiteLLM / llama-server all declare no_batch_cap: true
- configureGateway does NOT warn for those three
- configureGateway STILL warns for google (regression guard)
- Cross-cutting invariant: empty-models recipes declare user_provided_models

Tests: bun test test/ai/ — 128/128 (4 new + 124 prior).

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (commit 9 of 11). #765 (Hunyuan PGLite + CJK keyword fallback) deferred to a TODOS.md follow-up; the CJK extraction (~150 lines + scoring logic + tests) is larger than the wave's adjacent-fix lane should carry. Closes that PR with a deferral note.

Co-Authored-By: alexandreroumieu-codeapprentice <[email protected]>
Co-Authored-By: vinsew <[email protected]>
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* feat(discoverability): doctor alt-provider advisory + init user_provided_models refusal

Two small but high-leverage changes that address the discoverability problem the v0.32 wave is trying to fix.

src/commands/doctor.ts: new `alternative_providers` check (8c). After the existing embedding-provider smoke test, walks listRecipes() and surfaces any recipe whose required env vars are ALL present in the process env but which is not the currently configured provider. Reports as status: 'ok' with an informational message — never errors. Helps users discover that, e.g., `OPENAI_API_KEY=x DASHSCOPE_API_KEY=y` configured for openai means they have a Chinese-region alternative ready without extra setup.

src/commands/init.ts: user_provided_models recipes (litellm, llama-server) now refuse the implicit "first model" pick from shorthand --model with a structured setup hint pointing the user at the explicit form `--embedding-model <provider>:<your-model-id> --embedding-dimensions <N>`. Pre-fix, shorthand --model litellm threw "no embedding models listed", which was technically correct but unhelpful. The new error includes the recipe's setup_hint when available.

Tests: bun test test/ai/ — 128/128 pass; typecheck clean.

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (commit 10 of 11). The full interactive provider chooser in init.ts (the bigger piece of the discoverability lane) is deferred to a v0.32.x follow-up; this commit ships the doctor advisory + the cleaner refusal that close the 80% case.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* docs(v0.32.0): embedding-providers.md + README callout + CHANGELOG + TODOS.md

Final commit of the v0.32 wave. Closes the discoverability gap that generated the 17-PR community cluster.

- New docs/integrations/embedding-providers.md: capability matrix, decision tree, per-recipe one-pagers, OAuth provider notes, "my provider isn't listed" pointer to the LiteLLM proxy. Voice: capability not marketing, per CLAUDE.md voice rules.
- README.md: embedding-providers callout near the top, naming the count (14 recipes) and pointing at the new doc.
- CHANGELOG.md: v0.32.0 entry following the verdict-headline format from CLAUDE.md voice rules. Lead-with-numbers ("14 providers, 5 new"), what-this-means-for-users closer, "to take advantage" upgrade block, itemized changes, contributor credits, deferred-with-context list.
- VERSION + package.json: 0.31.1 → 0.32.0. The minor bump is justified by the new public Recipe surface (resolveAuth, resolveOpenAICompatConfig, probe, user_provided_models, no_batch_cap fields), the new OAuth subsystem scaffold (deferred to v0.32.x but typed in v0.32.0), and the 5 new recipes.
- TODOS.md: 7 follow-up entries for the v0.32 wave's deferred work (Vertex ADC, Copilot OAuth, Codex OAuth, CJK PGLite, interactive wizard, real-credentials CI matrix, MiniMax asymmetric retrieval, multimodal hardcode un-stuck). Each entry has full context + the exact file paths + the spike work needed so a future contributor can pick up cleanly.

Tests: bun test test/ai/ — 128/128 pass; typecheck clean.

Plan: ~/.claude/plans/ok-lets-turn-this-enumerated-sonnet.md (commit 11 of 11). Wave complete: 11 commits, ~1500 net lines, 5 new recipes, full docs, doctor advisory, IRON RULE regression test, 7 TODOS for the v0.32.x follow-up wave.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* docs: regenerate llms.txt + llms-full.txt for v0.32.0

After commit c384fad added the embedding-providers callout to README.md, the committed llms-full.txt drifted from the generator output and the build-llms test failed. Running `bun run build:llms` regenerates both files. The single-line addition is the README callout pointing at docs/integrations/embedding-providers.md.

Tests: bun test test/build-llms.test.ts — 7/7 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* test: hermetic GBRAIN_HOME for brain-registry serial flake + withEnv on recipe-llama-server

Two test-isolation cleanups uncovered while shipping v0.32.

test/brain-registry.serial.test.ts (the BrainRegistry "empty/null/undefined id routes to host" test): a pre-existing flake on dev machines that have a real ~/.gbrain/config.json. The test asserts getBrain(null) REJECTS, but on those machines the host-init path RESOLVES instead (it found the maintainer's actual brain). The fix pins GBRAIN_HOME to a guaranteed-empty tempdir for the test's duration so host-init has nothing to find and fails loudly with a non-UnknownBrainError — exactly what the assertion wants. The file is .serial.test.ts, so direct process.env mutation is allowed by the test-isolation linter (R1 quarantine).

test/ai/recipe-llama-server.test.ts: rewrites the manual beforeEach/afterEach env save/restore as withEnv() per the canonical pattern in test/helpers/with-env.ts. The original was correct in behavior but tripped the test-isolation linter (R1: process.env mutation). withEnv() is exactly the cross-test-safe save + try/finally + restore the manual code did, just factored out. No behavior change.

Tests: bun run test — 5217 pass / 0 fail (was 5027 / 1 pre-existing).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix: address 5 codex pre-merge findings (dim passthrough + URL routing + MiniMax host)

Codex adversarial review during /ship caught five real production bugs. All five fixed with regression test coverage.

1. **dimsProviderOptions on openai-compatible** (src/core/ai/dims.ts): text-embedding-3-* (Azure), text-embedding-v3 (DashScope), and embedding-3 (Zhipu) now thread `dimensions` to the wire. Without this, Azure-default 3072d hard-fails a 1536d brain on first embed; DashScope and Zhipu Matryoshka requests silently get the provider's default size instead of what the user asked for. New tests in recipe-azure-openai/dashscope/zhipu pin the contract.
2. **`gbrain init --embedding-model llama-server:foo` verbose path** (src/commands/init.ts): now refuses without `--embedding-dimensions` for user_provided_models recipes. Pre-fix, the shorthand `--model` path was guarded but the verbose `--embedding-model` path fell through to configureGateway's 1536d default and silently created the wrong-width schema; the failure surfaced only at the first real embed.
3. **MiniMax host correction** (src/core/ai/recipes/minimax.ts): `api.minimax.chat/v1` → `api.minimaxi.com/v1` matches MiniMax's current OpenAI-compatible docs. Default-config users would have hit the wrong endpoint before auth or model selection mattered.
4. **`LLAMA_SERVER_BASE_URL` reaches the gateway** (src/cli.ts:buildGatewayConfig): env-set local-server URLs (LLAMA_SERVER_BASE_URL, OLLAMA_BASE_URL, LMSTUDIO_BASE_URL, LITELLM_BASE_URL) now thread into `cfg.base_urls` so embed traffic hits the configured port. Pre-fix, the probe would succeed against a custom port while real embed calls went to localhost:8080. Caller-supplied `cfg.provider_base_urls` still wins over env.
5. **Recipe.probe(baseURL?) accepts the resolved URL** (src/core/ai/types.ts, src/core/ai/probes.ts, src/core/ai/recipes/llama-server.ts): when the user configures `provider_base_urls.llama-server` in config but no env var is set, the probe and gateway no longer disagree. Callers with cfg pass the resolved URL; legacy callers fall back to env / recipe default.

CHANGELOG updated; llms-full.txt regenerated.

Tests: bun run test — 5220/5220 pass / 0 fail (was 5217 / 0; +3 new codex-finding regression tests).

Pre-merge codex adversarial: ran during /ship Step 11 against the v0.32 diff. All 5 findings addressed.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

* fix(ci): isolate v0.32 no-batch-cap test from mock.module leak (closes 19 CI fails)

Four CI test-isolation fixes uncovered by yesterday's CI run on PR #810:

1. **`scripts/test-shard.sh` excludes `*.serial.test.ts`** (it was running them in parallel shards). Without this, serial files race with non-serial files in the CI shard process. Mirrors `scripts/run-unit-shard.sh`'s exclusion set; a 1-line `find` filter.
2. **`scripts/run-serial-tests.sh` runs each serial file in its own bun process.** Pre-fix, all serial files ran in ONE bun process with `--max-concurrency=1` — that limits intra-file concurrency but does NOT prevent module-registry leakage across files. When `eval-takes-quality-runner.serial.test.ts` does `mock.module('../src/core/ai/gateway.ts', () => ({chat, configureGateway}))` (a partial mock missing `resetGateway`, `defaultResolveAuth`, etc.), the next file in the same process gets the partial mock on import and `import { resetGateway }` fails with "Export named 'resetGateway' not found." Per-file processes give true isolation; the cost is ~100ms × N files (negligible vs CI walltime).
3. **`test/ai/no-batch-cap-suppression.test.ts` → `.serial.test.ts`.** The test mutates `console.warn` globally (mock spy). When other tests in the same shard process load `src/core/ai/gateway.ts` and call `configureGateway()` first, they populate the module-scoped `_warnedRecipes` Set; the test's `resetGateway()` clears it but races if other gateway-touching code runs concurrently in the same process. Renaming to `.serial.test.ts` quarantines it via fixes #1 + #2.
4. **The CI workflow gains a serial-tests step on shard 1.** Pre-fix, shard 1 ran `bun run verify` + the parallel shard, but no shard ran `*.serial.test.ts` files. After fix #1 excludes them from shards, they need explicit invocation. New step: `bash scripts/run-serial-tests.sh` (shard 1 only).

Tests: bun run test — 5220 / 0 fail (matches the local pre-CI run; CI was showing 19 fails on PR #810 due to fixes #1-#3 missing).

Failure analysis from .context/attachments/test__2__75236697976.log:
- 18 multimodal failures: caused by the mock.module leak from eval-takes-quality-runner.serial.test.ts being run alongside voyage-multimodal.test.ts in the same parallel shard process. After fix #1 + fix #3, eval-takes-quality only runs in the serial pass; after fix #2, its mock.module doesn't leak to subsequent serial files.
- 1 no-batch-cap failure: same root cause; fix #3 quarantines it.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
Co-authored-by: SiyaoZheng <[email protected]>
Co-authored-by: cacity <[email protected]>
Co-authored-by: Magicray1217 <[email protected]>
Co-authored-by: JamesJZhang <[email protected]>
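The resolveAuth seam is the load-bearing piece of the commit log above. A minimal sketch of the default resolver's contract, with field names (auth_env, setup_hint, resolveAuth) taken from the commit messages; the bodies are reconstructed for illustration, not copied from the shipped gateway.ts:

```ts
// Illustrative sketch of the Recipe.resolveAuth seam (D12=A).
type Env = Record<string, string | undefined>;

interface AuthResult {
  headerName: string; // 'Authorization' by default; Azure overrides to 'api-key'
  token: string;
}

interface RecipeAuth {
  id: string;
  auth_env: { required: string[]; optional: string[] };
  setup_hint?: string;
  resolveAuth?(env: Env): AuthResult; // recipe-owned override (Azure, future OAuth)
}

function defaultResolveAuth(recipe: RecipeAuth, env: Env, touchpoint: string): AuthResult {
  if (recipe.resolveAuth) return recipe.resolveAuth(env);
  const [requiredName] = recipe.auth_env.required;
  if (requiredName) {
    const key = env[requiredName];
    if (!key) {
      // The real code throws AIConfigError carrying the recipe's setup_hint.
      throw new Error(`${recipe.id} (${touchpoint}): missing ${requiredName}. ${recipe.setup_hint ?? ''}`);
    }
    return { headerName: 'Authorization', token: `Bearer ${key}` };
  }
  // No required key: take the first present optional env var that isn't
  // URL-shaped (the _URL/_BASE_URL hardening from the llama-server commit),
  // else the 'unauthenticated' placeholder local servers like Ollama tolerate.
  const fallback = recipe.auth_env.optional
    .filter((name) => !name.endsWith('_URL'))
    .map((name) => env[name])
    .find((value): value is string => Boolean(value));
  return { headerName: 'Authorization', token: `Bearer ${fallback ?? 'unauthenticated'}` };
}
```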
Summary
Today, `src/core/embedding.ts` and `src/core/search/expansion.ts` call `new OpenAI()` / `new Anthropic()` with no arguments, which makes the SDKs read `OPENAI_API_KEY` / `ANTHROPIC_API_KEY` from the process's env. That puts the burden on every caller — shells, cron jobs, agent subprocess tools, daemons, containers — to propagate those env vars correctly.

When a caller's subprocess env doesn't have them (common for launchd-spawned daemons, agent terminal tools with env sanitization, or any subprocess without explicit env forwarding), the caller silently gets empty results from `gbrain query` / `gbrain embed` because the SDK falls back to anonymous API calls that fail with 401 (then throw, then get caught by gbrain's retry/fallback logic, then return "No results").

The debugging experience is painful: the user sees "No results" in their agent's output, assumes the brain is empty, and doesn't realize the subprocess env is missing keys. I hit this personally running `gbrain query` from a launchd-managed agent's terminal tool — the .zshrc-sourced keys were in my interactive shell but not in the daemon's env, and there's no surface-level error telling you that.

The existing schema is almost there — just unconnected

`GBrainConfig` (src/core/config.ts:10-16) already defines `openai_api_key` and `anthropic_api_key` fields, and `saveConfig()` writes them to `~/.gbrain/config.json` with 0600 perms. But none of the runtime code actually reads those fields — the config is populated but never consulted. This PR just connects the wiring.

Also fixed: `loadConfig()` was silently dropping `ANTHROPIC_API_KEY` from the env merge (only `OPENAI_API_KEY` was being merged on line 43). A minor but real bug.

Fix
Three small changes:
1. config.ts: merge `ANTHROPIC_API_KEY` env var into loaded config alongside `OPENAI_API_KEY` (was silently dropped).
2. embedding.ts: read `openai_api_key` from `loadConfig()` and pass to `new OpenAI({ apiKey })`. Falls back to `new OpenAI()` (SDK env-default) when config has no key — preserves current behavior for users who rely on env vars.
3. expansion.ts: same pattern for Anthropic.

Precedence preserved from existing code:
`env var > config file` (because `loadConfig()` merges env over file). Users who want to override per-process still can.

Usage for callers (new capability)
One-time setup:

$ cat > ~/.gbrain/config.json <<EOF
{"openai_api_key": "sk-...", "anthropic_api_key": "sk-ant-..."}
EOF
$ chmod 600 ~/.gbrain/config.json

Then from any caller — no env vars needed:

$ gbrain query "..."   # just works
$ gbrain embed --stale # just works
Especially valuable for:
- Cron jobs run under launchd/systemd (which don't inherit shell env)
- Agent terminal tools with env sanitization
- Subprocess calls from Python/Node agents without env passthrough
- Docker containers without explicit `-e OPENAI_API_KEY` passthrough

Impact
- 4 files changed, +67 / -5 lines
- Zero behavior change for users who already have env vars set
- Callers without env vars in their subprocess context now work if the keys are written to `~/.gbrain/config.json` (via `saveConfig`)

Test plan
New tests in `test/config.test.ts`:
- `OPENAI_API_KEY` env merges into config
- `ANTHROPIC_API_KEY` env merges into config (regression — was missing)

Full `bun test` suite: no new regressions (the 4 pre-existing `PGLiteEngine` failures are unrelated and exist on `master`).

Context
Fourth in a series of PRs from a real-user setup on a Chinese-speaking deployment: #114 (chunker CJK), #115 (slugify CJK), #119 (sync CJK via core.quotepath), and now this one (key portability). Each addresses a distinct anti-pattern surfaced by running gbrain outside the "English-speaker, interactive-shell, env-vars-in-.zshrc" default assumption.
This PR is the only one that's a feature, not a bug fix — but it makes gbrain meaningfully more embeddable as the knowledge backbone for cron-driven agents and daemons, which is the direction the SKILLPACK itself advocates.