diff --git a/CHANGELOG.md b/CHANGELOG.md index e18179c3f..c0475f560 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,6 +2,122 @@ All notable changes to GBrain will be documented in this file. +## [0.32.0] - 2026-05-10 + +**5 new embedding providers + the discoverability fix that closes the 17-PR dupe cluster.** +**`gbrain providers list` now shows 14 recipes; `gbrain doctor` tells you which alternatives are already wired.** + +A triage of 197 open issues + 289 open PRs surfaced a cluster of 17 community embedding-provider PRs filed within ~3 weeks (Ollama, Gemini, Voyage, Azure, MiniMax, Copilot, llama-server, Vertex, DashScope, Zhipu, etc.). Most were dupes of work already in master — gbrain has shipped a comprehensive AI SDK gateway + recipe pattern since v0.14, with 9 providers built in. Users just didn't know. + +v0.32.0 ships the recipes the existing pattern didn't yet cover, plus a documentation pass + doctor advisory + improved error hints that close the discoverability gap. Codex outside-voice review during plan-eng-review caught the discoverability framing — without it, the wave would have shipped 8 recipes plus an OAuth subsystem instead of the focused 5-recipe + docs delivery. + +### The numbers that matter + +``` +gbrain providers list → v0.31.1: 9 providers → v0.32.0: 14 providers +gbrain doctor → v0.31.1: 1 advisory → v0.32.0: 2 advisories (+ alternative_providers) +``` + +5 new recipes: + +| Recipe | Auth | Default dims | Notes | |---|---|---|---| | `azure-openai` | `AZURE_OPENAI_API_KEY` + `AZURE_OPENAI_ENDPOINT` + `AZURE_OPENAI_DEPLOYMENT` | 1536 | First recipe with `api-key:` custom header (not Bearer); first with templated URL + `?api-version=` query injection | | `minimax` | `MINIMAX_API_KEY` | 1536 | China-region; embo-01 model; type='db' asymmetric retrieval field plumbed via dims.ts | | `dashscope` | `DASHSCOPE_API_KEY` | 1024 | Alibaba; international endpoint default; CJK-aware batching (chars_per_token=2) | | `zhipu` | `ZHIPUAI_API_KEY` | 1024 | BigModel; embedding-3 with Matryoshka up to 2048 (HNSW falls back to exact-scan past 2000 dims) | | `llama-server` | (none) | user-set | llama.cpp's `llama-server --embeddings`; user_provided_models recipe | + +### What this means for new users + +`gbrain init` keeps OpenAI as the zero-config default. Users with API keys for any of the other 13 providers see them surfaced via `gbrain doctor` ("Detected 2 alternative embedding providers ready to use: voyage, dashscope. Run `gbrain providers list` to switch."). Users on Azure tenancies, China-region, or local-only setups have first-class recipes instead of "find a workaround." Users who need a provider gbrain doesn't ship can route through LiteLLM proxy (the universal escape hatch) without writing custom code. + +For agents: every recipe is registered in the same `listRecipes()` registry, so `gbrain providers list/test/env/explain` automatically picks up new recipes without code changes. The recipe contract test (`test/ai/recipes-contract.test.ts`) keeps the registry honest. + +### To take advantage of v0.32.0 + +`gbrain upgrade` should do this automatically. If it didn't: + +1. **Confirm the new recipes show:** + ```bash + gbrain providers list + ``` + Should show 14 entries including `azure-openai`, `minimax`, `dashscope`, `zhipu`, `llama-server`. + +2. **Try the doctor advisory:** + ```bash + gbrain doctor + ``` + Look for the `alternative_providers` row. If env vars for unconfigured providers are present, it'll name them. + +3. 
**Read the new docs** at [`docs/integrations/embedding-providers.md`](docs/integrations/embedding-providers.md) — capability matrix, decision tree, per-recipe setup, "my provider isn't listed" path. + +4. **No breaking changes**: the existing 9 recipes (openai, anthropic, google, deepseek, groq, ollama, litellm-proxy, together, voyage) keep working unchanged. The internal auth refactor (D12=A unified resolveAuth seam) is pinned by `test/ai/recipes-existing-regression.test.ts` so the next refactor can't silently break them. + +5. **If anything breaks**, file an issue at https://github.com/garrytan/gbrain/issues with `gbrain doctor` output. The only behavior change for existing recipes: Ollama expansion + chat now read `OLLAMA_API_KEY` when set (embedding already did; the unification aligns all three touchpoints). + +### Itemized changes + +#### Architectural foundations + +- **Recipe.resolveAuth(env) seam (D12=A)**: unified the openai-compatible auth path, which was duplicated 3 times across `instantiateEmbedding`, `instantiateExpansion`, `instantiateChat` with subtle drift. Default impl (used by all existing recipes unchanged) returns `{headerName: 'Authorization', token: 'Bearer <key>'}`. Deviating recipes override; Azure is the first. +- **Recipe.resolveOpenAICompatConfig(env) seam**: env-templated baseURL + optional fetch wrapper for recipes whose URL shape doesn't fit a static `base_url_default`. Azure uses both seams. +- **Recipe.probe() seam (D13=A)**: recipe-owned readiness check for local-server providers. Replaces the hardcoded `recipe.id === 'ollama'` special case in `runExplain()`. llama-server declares its own probe; future local providers self-register. +- **EmbeddingTouchpoint.user_provided_models?: true (D8=A)**: explicit signal for recipes that ship without a fixed model list (litellm, llama-server). Replaces the legacy `recipe.id === 'litellm'` hardcode in gateway.ts:223; refusal in `init.ts:resolveAIOptions` for shorthand `--model` with a setup hint pointing at the explicit form. +- **EmbeddingTouchpoint.no_batch_cap?: true**: silences the missing-max_batch_tokens startup warning for recipes with genuinely dynamic batch capacity (Ollama, LiteLLM proxy, llama-server). Pre-fix: 3 stderr warnings on every `configureGateway()` call. Post-fix: only `google` warns. + +#### Discoverability + +- New `docs/integrations/embedding-providers.md` (one-pager: capability table, decision tree, per-recipe setup, "my provider isn't listed" path to LiteLLM). +- README embedding-providers callout near the top of the install section. +- `gbrain doctor` adds an `alternative_providers` check that surfaces recipes whose env vars are already set but aren't the configured provider. +- `gbrain init --model litellm` (or any user_provided_models recipe) now refuses with a structured setup hint instead of throwing "no embedding models listed." + +#### Codex review fixes (pre-merge) + +- **dimsProviderOptions on openai-compatible**: text-embedding-3-* (Azure), text-embedding-v3 (DashScope), and embedding-3 (Zhipu) now thread `dimensions` to the wire. Without this, Azure-default 3072d would mismatch a 1536d brain on the first embed; DashScope/Zhipu Matryoshka requests would be silently ignored. +- **`gbrain init --embedding-model llama-server:foo` (verbose path)**: now refuses without `--embedding-dimensions`. Pre-fix, the verbose path fell through to the gateway's 1536d default and silently created the wrong-width schema (only the shorthand `--model` was guarded). 
+- **MiniMax host correction**: `api.minimax.chat` → `api.minimaxi.com` (matches MiniMax's current OpenAI-compatible docs). +- **`LLAMA_SERVER_BASE_URL` reaches the gateway**: `buildGatewayConfig` now threads `LLAMA_SERVER_BASE_URL`, `OLLAMA_BASE_URL`, `LMSTUDIO_BASE_URL`, `LITELLM_BASE_URL` env into `cfg.base_urls` so embed traffic actually hits the configured port. Pre-fix, the env-only setup let probe pass on a custom port while traffic still hit `localhost:8080`. +- **`Recipe.probe(baseURL?)` accepts the resolved URL**: probe and gateway can no longer disagree when only `provider_base_urls` is set in config (no env). Callers with cfg pass the URL; legacy callers fall back to env. + +#### Adjacent fixes + +- **#779 (alexandreroumieu-codeapprentice) reworked**: `EmbeddingTouchpoint.no_batch_cap?: true` opt-out for dynamic-cap recipes. +- **#121 (vinsew) reworked**: `~/.gbrain/config.json` API keys now propagate to the gateway env. Pre-fix, `openai_api_key` / `anthropic_api_key` config-file values were ignored (the gateway only saw `process.env`). Common bite: launchd-spawned daemons or agent subprocess tools without `~/.zshrc` propagation. Process env still wins on conflict. +- `loadConfig()` now merges `ANTHROPIC_API_KEY` env var into the file-config result (was silently dropped). +- IRON RULE regression test (`test/ai/recipes-existing-regression.test.ts`): pins that the v0.32 resolveAuth refactor preserves auth behavior for the existing 9 recipes. + +### Closed as superseded + +The following community PRs are closed because their work is now covered by the recipe system + LiteLLM proxy escape hatch + the recipes shipped in this wave: + +- #49, #58, #73, #100, #112, #134, #137, #150, #172, #178, #255, #327, #420, #482, #516, #780, #89 — pluggable embedding adapter / Ollama / Gemini / E5 / Azure-via-LiteLLM / etc. + +Each contributor identified a real gap; the patterns they prototyped converged on the recipe system that was shipped in v0.14. Thank you for the early signal. + +### Deferred to v0.32.x (with TODOS.md entries) + +- **#729 Vertex AI ADC** (lucha0404): proper ADC chain (metadata server, gcloud creds, service-account JSON) is a real product surface, not the single-source-JSON path the original PR proposed. +- **#691 GitHub Copilot** (tonyxu-io): outbound OAuth is a new product surface (login flow, browser/device flow, refresh, UX), not a sidecar recipe. Needs its own design pass. +- **#698 OpenAI Codex OAuth** (perlantir): same OAuth-product-surface argument; chat-only. +- **#765 Hunyuan PGLite + CJK keyword fallback** (313094319-sudo): the CJK PGLite branch is ~150 lines of new SQL + scoring logic that deserves its own focused PR rather than being folded into a 9-commit wave. +- **Interactive provider chooser in `gbrain init`**: the wizard piece of the discoverability lane. v0.32.0 ships the doctor advisory + cleaner refusal that close the 80% case; the full wizard is a v0.32.x follow-up. +- **Real-credentials per-recipe smoke fixtures**: opt-in CI matrix gated on API-key budget approval. + +### Contributors + +Reworked from / inspired by: +- @cacity (#148 MiniMax) +- @JamesJZhang (#459 Azure OpenAI) +- @Magicray1217 (#59 DashScope + Zhipu) +- @SiyaoZheng (#702 llama-server) +- @alexandreroumieu-codeapprentice (#779) +- @vinsew (#121) +- @100yenadmin / Eva (Voyage 4 Large 2048d HNSW policy, shipped earlier via 3004a87) + +Codex outside-voice review during plan-eng-review drove the scope reduction (D11=C) from 8 recipes + OAuth subsystem to 5 recipes + docs. 
+ ## [0.31.12] - 2026-05-10 **The chat default no longer 404s, and every Claude call gbrain makes is now one config key away from your preferred model.** diff --git a/README.md b/README.md index 3de35dc52..9b33b3a0c 100644 --- a/README.md +++ b/README.md @@ -16,6 +16,8 @@ GBrain is those patterns, generalized. 34 skills. Install in 30 minutes. Your ag > **LLMs:** fetch [`llms.txt`](llms.txt) for the documentation map, or [`llms-full.txt`](llms-full.txt) for the same map with core docs inlined in one fetch. **Agents:** start with [`AGENTS.md`](AGENTS.md) (or [`CLAUDE.md`](CLAUDE.md) if you're Claude Code). +> **Embedding providers:** OpenAI is the default, but gbrain ships with **14 recipes** covering Voyage, Google Gemini, Azure OpenAI, MiniMax, Alibaba DashScope, Zhipu, Ollama (local), llama.cpp llama-server (local), LiteLLM proxy (universal), and 5 more. Run `gbrain providers list` to see them, or read [`docs/integrations/embedding-providers.md`](docs/integrations/embedding-providers.md) for setup, pricing, and a decision tree. `gbrain doctor` will surface alternative providers whose env vars you already have set. + ## Install ### On an agent platform (recommended) diff --git a/TODOS.md b/TODOS.md index 140b53211..2c2002b39 100644 --- a/TODOS.md +++ b/TODOS.md @@ -1,5 +1,71 @@ # TODOS +## Embedding-provider follow-ups (v0.32.0) + +- [ ] **v0.32.x: Vertex AI ADC embedding provider (#729 originally).** lucha0404 + prototyped this with single-source-JSON via `GOOGLE_APPLICATION_CREDENTIALS`. + Real ADC is the full chain (metadata server, gcloud creds, service-account + JSON). The recipe needs to either use `@ai-sdk/google-vertex` (one new + dep, native fit) or implement the chain via Bun.crypto.subtle for RS256 + JWT signing (zero dep, ~150 lines + RS256 spike). Original Q3 chose + zero-dep; revisit the dep budget when scoping. + +- [ ] **v0.32.x: GitHub Copilot embeddings (#691 originally).** tonyxu-io + proposed adding Copilot's Metis embedding endpoint as a sidecar recipe. + Codex review caught that this is not a recipe-add — it's an outbound OAuth + product surface (login flow, browser/device flow, refresh, UX). Needs its + own design pass: where does the token live? `~/.gbrain/oauth/copilot.json` + mode 0600 was the v0.32 plan; revisit + write `gbrain auth login copilot`. + +- [ ] **v0.32.x: OpenAI Codex OAuth chat provider (#698 originally).** perlantir + proposed a chat-only provider that reuses ChatGPT subscription auth instead + of API keys. Same OAuth-product-surface argument as #691. Same shared + infra: `~/.gbrain/oauth/.json` + `gbrain auth login `. + Build alongside #691 in one OAuth-subsystem wave. + +- [ ] **v0.32.x: CJK PGLite keyword fallback (#765 extracted).** 313094319-sudo + hit a real gap: PGLite's FTS doesn't tokenize CJK well, so Chinese queries + return empty results even with proper embeddings. Their PR added a + hasCJK detection branch in `searchKeyword` that switches to LIKE-based + fuzzy matching with a custom scoring function. ~150 lines of new SQL + + scoring + tests. Worth its own focused PR rather than folded into the + v0.32 wave's adjacent-fix lane. Extract `extractSearchTokens`, + `normalizeSearchText`, `hasCJK` helpers + the CJK branch in + `pglite-engine.ts:searchKeyword`. Includes tests for romaji + Korean + Hangul + traditional/simplified Chinese. + +- [ ] **v0.32.x: interactive provider chooser in `gbrain init`.** The full + wizard piece of the v0.32 discoverability lane was deferred. 
Today + `gbrain init` (no flags, TTY) silently uses OpenAI default. Plan: hook + into `init.ts:resolveAIOptions`, when no `--model` AND TTY AND not + `--non-interactive`, call `runExplain([])` (non-JSON path) from + `providers.ts:233-350` to print the provider matrix, then prompt with + readline (mirror `supabaseWizard()` at `init.ts:108`). Suggest a + recommended provider based on env detection. Refuse `user_provided_models` + shorthand (already done in v0.32.0). Tests: + `test/init-provider-wizard.test.ts` (TTY → prompt fires; non-TTY → + falls through; invalid choice → re-prompts). + +- [ ] **v0.32.x: real-credentials per-recipe smoke-test CI matrix.** Codex + finding #6 noted that unit tests via `__setEmbedTransportForTests` prove + routing but not contract correctness with the actual provider HTTP + shape. Provider APIs change quietly (Voyage encoding-format, MiniMax + type field, Azure header). One real-call per recipe per month catches + drift before users do; <$1/run estimated. Requires API-key budget + approval + repo secrets. + +- [ ] **v0.32.x: MiniMax asymmetric retrieval support.** v0.32 ships + `embo-01` with `type: 'db'` for both indexing and queries (symmetric + retrieval). True asymmetric needs a query/document signal threaded + through the embed seam. Worth it for MiniMax users who care about + retrieval quality on Chinese content; defer until users complain. + +- [ ] **v0.32.x: un-hardcode the multimodal dispatch at gateway.ts:583.** + Currently `recipe.id !== 'voyage'` is hardcoded — harmless until a + second multimodal recipe lands. Make it table-driven via + `Recipe.touchpoints.embedding.supports_multimodal` + + `multimodal_models`. ~10 lines + a contract test. + ## v0.31.2 follow-ups ### Investigate: `gbrain query ` infinite loop diff --git a/VERSION b/VERSION index 3112f8a4c..8a0d6d408 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -0.31.12 +0.32.0 \ No newline at end of file diff --git a/docs/integrations/embedding-providers.md b/docs/integrations/embedding-providers.md new file mode 100644 index 000000000..943728e96 --- /dev/null +++ b/docs/integrations/embedding-providers.md @@ -0,0 +1,130 @@ +# Embedding providers + +GBrain ships with 14 embedding-provider recipes covering OpenAI, the major hosted alternatives, three local options, and a universal escape hatch (LiteLLM proxy). Run `gbrain providers list` to see the live registry; `gbrain providers explain --json` emits a machine-readable matrix for agents. + +This page is the human-readable counterpart: capability per provider, env-var setup, dimensions, cost, and known constraints. + +## Quick start + +``` +gbrain providers list # see all providers +gbrain providers env # see required env vars +gbrain providers test --model openai:text-embedding-3-large # smoke-test +gbrain init --pglite --model voyage # use a non-default provider +```
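+
+For agents, the JSON matrix is the machine-readable route into the same registry. A minimal consumption sketch — the `.[].id` field path is an assumption, so check the actual `--json` output on your install first:
+
+```bash
+# dump the provider matrix, then list recipe ids (field path assumed)
+gbrain providers explain --json | jq -r '.[].id'
+```
+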
+## TL;DR table + +| Provider | env vars | default dims | cost ($/1M tokens) | local? | multimodal? | |---|---|---|---|---|---| | `openai` | `OPENAI_API_KEY` | 1536 | 0.13 | no | no | | `voyage` | `VOYAGE_API_KEY` | 1024 | 0.18 | no | yes (`voyage-multimodal-3`) | | `google` | `GOOGLE_GENERATIVE_AI_API_KEY` | 768 | 0.025 | no | no | | `azure-openai` | `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_DEPLOYMENT` | 1536 | 0.13 | no | no | | `minimax` | `MINIMAX_API_KEY` | 1536 | 0.07 | no | no | | `dashscope` | `DASHSCOPE_API_KEY` | 1024 | varies | no | no | | `zhipu` | `ZHIPUAI_API_KEY` | 1024 | varies | no | no | | `ollama` | (none — runs locally) | 768 | 0 | yes | no | | `llama-server` | (none — runs locally) | user-set | 0 | yes | no | | `litellm` | `LITELLM_API_KEY` (optional) | user-set | varies | yes (proxy) | no | | `together` | `TOGETHER_API_KEY` | 768 | varies | no | no | | `anthropic` | (no embedding model — chat only) | — | — | — | — | | `deepseek` | (no embedding model — chat only) | — | — | — | — | | `groq` | (no embedding model — chat only) | — | — | — | — | + +## Decision tree + +- **Cost-sensitive, English-only**: Ollama (free, local) or Voyage (paid, best quality per dollar). +- **Quality-first**: Voyage `voyage-4-large` (1024-2048 dims; tokenization ~3-4× denser than OpenAI tiktoken). +- **Reranking pair**: Voyage (their reranker `rerank-2.5` pairs cleanly with Voyage embeddings). +- **Enterprise compliance**: Azure OpenAI (data residency + private endpoints) or self-hosted via llama-server / Ollama. +- **China region**: DashScope (Alibaba) or Zhipu (BigModel). DashScope defaults to the international endpoint at `dashscope-intl.aliyuncs.com`; override `provider_base_urls.dashscope` for the China endpoint. +- **OSS local, full control**: llama-server (`llama.cpp`) for any GGUF model; Ollama for the curated catalog. +- **Anything else**: LiteLLM proxy. Run LiteLLM in front of any provider (Bedrock, Vertex, Cohere, Jina, Fireworks, etc.) and point gbrain at it via `LITELLM_BASE_URL`. + +## Per-provider details + +### OpenAI + +Default. Set `OPENAI_API_KEY`. Models: `text-embedding-3-large` (3072 max, 1536 default), `text-embedding-3-small` (1536). Matryoshka via the `dimensions` field — gbrain pins it from `embedding_dimensions` config so existing 1536-dim brains stay aligned across SDK upgrades. + +### Voyage AI + +Best-in-class quality on the Voyage 4 family (Jan 2026 release). Set `VOYAGE_API_KEY`. Models: `voyage-4-large`, `voyage-4`, `voyage-4-lite`, `voyage-4-nano`, `voyage-3.5`, `voyage-code-3` (code-tuned), `voyage-finance-2`, `voyage-law-2`, `voyage-multimodal-3` (text + image). + +Voyage 4 family shares an embedding space across all variants, so you can index with `voyage-4-large` and query with `voyage-4-lite` without reindexing. Dims: 256, 512, 1024, 2048. **2048 exceeds pgvector's HNSW cap of 2000** — those brains fall back to exact vector scans (still correct, just slower). + +### Google Gemini + +Set `GOOGLE_GENERATIVE_AI_API_KEY` (the AI Studio public API key). Model: `gemini-embedding-001`. Default 768 dims; Matryoshka up to 3072. Cheap. + +For GCP service-account / Vertex AI auth (production deployments), see the v0.32.x follow-up — Vertex ADC is on the roadmap. + +### Azure OpenAI + +Enterprise OpenAI behind an Azure tenancy. Required env: `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_ENDPOINT` (e.g. `https://my-resource.openai.azure.com`), `AZURE_OPENAI_DEPLOYMENT` (the deployment name from your Azure portal). Optional: `AZURE_OPENAI_API_VERSION` (defaults to `2024-10-21`). 
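+
+A typical wiring, with placeholder values — the endpoint and deployment name come from your Azure portal:
+
+```bash
+export AZURE_OPENAI_API_KEY="<key from Keys and Endpoint>"
+export AZURE_OPENAI_ENDPOINT="https://my-resource.openai.azure.com"
+export AZURE_OPENAI_DEPLOYMENT="my-embedding-deployment"   # hypothetical deployment name
+gbrain providers test --model azure-openai:text-embedding-3-small   # smoke-test the wiring
+gbrain init --pglite --model azure-openai                           # then init against it
+```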
+ +Unlike vanilla OpenAI, Azure uses an `api-key:` header (not `Authorization: Bearer`) and a templated URL with `?api-version=` query param — gbrain handles both via the recipe's resolveAuth + resolveOpenAICompatConfig overrides. + +Models: `text-embedding-3-large`, `text-embedding-3-small`, `text-embedding-ada-002` (your Azure deployment must serve the requested model). + +### MiniMax (海螺AI) + +Set `MINIMAX_API_KEY`. Optional `MINIMAX_GROUP_ID` for org-scoped accounts. Model: `embo-01` (1536 dims). + +MiniMax's API takes a `type: 'db' | 'query'` field for asymmetric retrieval. v0.32 routes everything as `type='db'` (symmetric retrieval — same vector space for indexing and queries). Asymmetric query support is a v0.32.x follow-up. + +### DashScope (Alibaba) + +Set `DASHSCOPE_API_KEY`. International endpoint at `dashscope-intl.aliyuncs.com` by default; override `provider_base_urls.dashscope` for the China endpoint. Models: `text-embedding-v3` (current; Matryoshka 64-1024 dims), `text-embedding-v2`. + +CJK-dominant content tokenizes denser than OpenAI tiktoken; gbrain declares `chars_per_token: 2` so the batch pre-split leaves headroom. + +### Zhipu AI (BigModel) + +Set `ZHIPUAI_API_KEY`. Models: `embedding-3` (current; Matryoshka 256-2048 dims), `embedding-2`. v0.32 default is 1024 (HNSW-compatible). The 2048-dim option works but falls into the exact-scan branch (see Voyage 4 Large note above). + +### Ollama (local) + +No env required — Ollama runs unauthenticated locally. Optional `OLLAMA_BASE_URL` (default `http://localhost:11434/v1`) and `OLLAMA_API_KEY` (for auth-enabled deployments). + +Recipe ships with `nomic-embed-text` (768d, recommended), `mxbai-embed-large` (1024d), `all-minilm` (384d). `gbrain providers test --model ollama:nomic-embed-text` smoke-tests the local install. + +### llama-server (local, llama.cpp) + +`llama.cpp`'s `llama-server --embeddings` endpoint. No env required. Optional `LLAMA_SERVER_BASE_URL` (default `http://localhost:8080/v1`) and `LLAMA_SERVER_API_KEY`. + +User-driven models: launch llama-server with `--model <path-to-gguf> --embeddings`, then run `gbrain init --embedding-model llama-server:<model> --embedding-dimensions <dims>`. The recipe refuses the implicit shorthand `--model llama-server` because there's no canonical first model. + +### LiteLLM proxy (universal escape hatch) + +Run [LiteLLM](https://docs.litellm.ai/docs/proxy/quick_start) in front of any provider — Bedrock, Vertex, Cohere, Jina, Fireworks, OctoAI, etc. The proxy normalizes everything to the OpenAI-compatible API; gbrain points at the proxy via `LITELLM_BASE_URL` and routes every call through it. + +This is the catch-all for "my provider isn't in the list above." Set up LiteLLM, then `gbrain init --embedding-model litellm:<model> --embedding-dimensions <dims>`. + +## Choosing dimensions + +Three numbers matter: +1. **Provider's native dims**: each model has a "true" output dim (e.g. OpenAI `text-embedding-3-large` is 3072 native). +2. **Matryoshka reductions**: most modern providers let you request a smaller vector via the `dimensions` field. +3. **HNSW cap**: pgvector's HNSW index supports up to 2000 dims. Brains above that fall back to exact vector scans (slower but correct; gbrain handles the SQL automatically via `chunkEmbeddingIndexSql` in `src/core/vector-index.ts`). + +For most users: **stay at 1024 or 1536**. Bigger isn't better below the noise floor; smaller saves disk + RAM with marginal recall loss on Matryoshka providers. 
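+
+For example (hypothetical choices; flags as used throughout this page), pinning Zhipu's `embedding-3` at init time:
+
+```bash
+# 1024 stays under pgvector's 2000-dim HNSW cap
+gbrain init --pglite --embedding-model zhipu:embedding-3 --embedding-dimensions 1024
+# 2048 keeps full Matryoshka fidelity but drops to exact vector scans
+gbrain init --pglite --embedding-model zhipu:embedding-3 --embedding-dimensions 2048
+```
+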
+## My provider isn't listed + +Three options: + +1. **Use LiteLLM proxy** (above) — the universal escape hatch. Works for 100+ providers. +2. **Open a feature request** at [github.com/garrytan/gbrain/issues](https://github.com/garrytan/gbrain/issues) with the provider's API docs URL and a setup snippet. Recipes are ~30-40 lines of TypeScript. +3. **Submit a recipe**: clone, copy `src/core/ai/recipes/voyage.ts` as the gold-standard openai-compat template, register in `src/core/ai/recipes/index.ts`, add a per-recipe smoke test under `test/ai/recipe-<id>.test.ts`. The recipe contract test (`test/ai/recipes-contract.test.ts`) and IRON RULE regression test pin the structural invariants. + +## Switching providers on an existing brain + +Embedding dimensions are baked into the schema at `gbrain init` time. To change providers post-init, you usually need to re-embed: + +1. Update config: `gbrain config set embedding_model <provider>:<model>` and `embedding_dimensions <dims>`. +2. Reindex schema if dims changed: `gbrain doctor` will detect the mismatch and print the exact `ALTER TABLE` recipe. +3. Re-embed: `gbrain embed --all` (or `--stale` for incremental). + +`gbrain doctor`'s `alternative_providers` check surfaces unconfigured providers whose env is already set — useful when you've configured OpenAI but also have e.g. `VOYAGE_API_KEY` exported and want to know you can switch without extra setup. diff --git a/llms-full.txt b/llms-full.txt index e823a5029..bbc1d39d1 100644 --- a/llms-full.txt +++ b/llms-full.txt @@ -1836,6 +1836,8 @@ GBrain is those patterns, generalized. 34 skills. Install in 30 minutes. Your ag > **LLMs:** fetch [`llms.txt`](llms.txt) for the documentation map, or [`llms-full.txt`](llms-full.txt) for the same map with core docs inlined in one fetch. **Agents:** start with [`AGENTS.md`](AGENTS.md) (or [`CLAUDE.md`](CLAUDE.md) if you're Claude Code). +> **Embedding providers:** OpenAI is the default, but gbrain ships with **14 recipes** covering Voyage, Google Gemini, Azure OpenAI, MiniMax, Alibaba DashScope, Zhipu, Ollama (local), llama.cpp llama-server (local), LiteLLM proxy (universal), and 5 more. Run `gbrain providers list` to see them, or read [`docs/integrations/embedding-providers.md`](docs/integrations/embedding-providers.md) for setup, pricing, and a decision tree. `gbrain doctor` will surface alternative providers whose env vars you already have set. + ## Install ### On an agent platform (recommended) diff --git a/package.json b/package.json index 8e2c3fa0b..b4beb23f2 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "gbrain", - "version": "0.31.12", + "version": "0.32.0", "description": "Postgres-native personal knowledge brain with hybrid RAG search", "type": "module", "main": "src/core/index.ts", diff --git a/scripts/run-serial-tests.sh b/scripts/run-serial-tests.sh index f54321827..a9ee62a49 100755 --- a/scripts/run-serial-tests.sh +++ b/scripts/run-serial-tests.sh @@ -30,5 +30,29 @@ if [ "${1:-}" = "--dry-run-list" ]; then exit 0 fi -echo "[serial-tests] running ${#files[@]} file(s) with --max-concurrency=1" -exec bun test --max-concurrency=1 --timeout=60000 "${files[@]}" +echo "[serial-tests] running ${#files[@]} file(s), one bun process per file" + +# Each serial file gets its OWN bun process. `--max-concurrency=1` was not +# enough: files in the same process share the module registry, so a top-level +# `mock.module(...)` in one file leaks into the next file's imports +# (eval-takes-quality-runner mocks gateway.ts and the next file fails on +# `import { resetGateway }` because the mock factory didn't export it). 
+# Per-file processes give true isolation; cost is ~100ms startup × N files. +fail_count=0 +failed_files=() +for f in "${files[@]}"; do + if ! bun test --max-concurrency=1 --timeout=60000 "$f"; then + fail_count=$((fail_count + 1)) + failed_files+=("$f") + fi +done + +if [ "$fail_count" -gt 0 ]; then + echo "" >&2 + echo "[serial-tests] $fail_count file(s) failed:" >&2 + for f in "${failed_files[@]}"; do + echo " - $f" >&2 + done + exit 1 +fi +echo "[serial-tests] all ${#files[@]} file(s) passed" diff --git a/src/cli.ts b/src/cli.ts index ec96a46e9..5b82a5c2f 100755 --- a/src/cli.ts +++ b/src/cli.ts @@ -1219,6 +1219,27 @@ async function handleCliOnly(command: string, args: string[]) { // but not the other previously required remembering to mirror the change; // the helper makes that structural. function buildGatewayConfig(c: GBrainConfig): AIGatewayConfig { + // v0.32 (#121 reworked): when ~/.gbrain/config.json declares + // openai_api_key / anthropic_api_key, fold them into the gateway env so + // recipes that read OPENAI_API_KEY / ANTHROPIC_API_KEY find them. Process + // env still wins (it's loaded last) — this is a fallback for daemons / + // launchd-spawned subprocesses that don't propagate ~/.zshrc-sourced keys. + const envFromConfig: Record<string, string> = {}; + if (c.openai_api_key) envFromConfig.OPENAI_API_KEY = c.openai_api_key; + if (c.anthropic_api_key) envFromConfig.ANTHROPIC_API_KEY = c.anthropic_api_key; + + // v0.32 codex finding #4+#5 fix: thread local-server _BASE_URL env vars + // into base_urls so the gateway hits the user's configured port. Without + // this, `LLAMA_SERVER_BASE_URL=http://localhost:9000` would let the probe + // succeed against :9000 but the actual embed call would still go to the + // recipe's base_url_default (localhost:8080). Same fix applies to + // OLLAMA_BASE_URL. Caller-provided cfg.provider_base_urls wins. + const envBaseUrls: Record<string, string> = {}; + if (process.env.LLAMA_SERVER_BASE_URL) envBaseUrls['llama-server'] = process.env.LLAMA_SERVER_BASE_URL; + if (process.env.OLLAMA_BASE_URL) envBaseUrls['ollama'] = process.env.OLLAMA_BASE_URL; + if (process.env.LMSTUDIO_BASE_URL) envBaseUrls['lmstudio'] = process.env.LMSTUDIO_BASE_URL; + if (process.env.LITELLM_BASE_URL) envBaseUrls['litellm'] = process.env.LITELLM_BASE_URL; + return { embedding_model: c.embedding_model, embedding_dimensions: c.embedding_dimensions, @@ -1226,8 +1247,8 @@ expansion_model: c.expansion_model, chat_model: c.chat_model, chat_fallback_chain: c.chat_fallback_chain, - base_urls: c.provider_base_urls, - env: { ...process.env }, + base_urls: { ...envBaseUrls, ...(c.provider_base_urls ?? {}) }, // config wins over env + env: { ...envFromConfig, ...process.env }, // process.env wins }; } diff --git a/src/commands/doctor.ts b/src/commands/doctor.ts index e94c303ca..d021b35b7 100644 --- a/src/commands/doctor.ts +++ b/src/commands/doctor.ts @@ -1170,6 +1170,39 @@ export async function runDoctor(engine: BrainEngine | null, args: string[], dbSo }); } + // 8c. Alternative provider advisory (v0.32 D11=C / Codex finding #2 wire-through). + // Walks listRecipes() and surfaces any recipe whose required env vars are ALL + // set in the process env and that is not the currently configured provider. Helps + // users discover that, e.g., OPENAI_API_KEY=x DASHSCOPE_API_KEY=y means they + // have a Chinese-region alternative ready to go without setup. 
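+  // Surfaced with status 'ok' — an advisory, not a warning: spare provider
+  // keys are an opportunity to switch, never a fault to fix.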
+ progress.heartbeat('alternative_providers'); + try { + const { listRecipes } = await import('../core/ai/recipes/index.ts'); + const { getEmbeddingModel } = await import('../core/ai/gateway.ts'); + const configuredId = (getEmbeddingModel() || '').split(':')[0]; + const alternatives: string[] = []; + for (const r of listRecipes()) { + if (r.id === configuredId) continue; + const required = r.auth_env?.required ?? []; + // Skip recipes with no required env (they're "always available" — not a + // useful signal) and recipes that require env we don't have. + if (required.length === 0) continue; + const allPresent = required.every(k => !!process.env[k]); + if (!allPresent) continue; + // Skip recipes without an embedding touchpoint (chat-only — not an + // embedding alternative). + if (!r.touchpoints.embedding) continue; + alternatives.push(r.id); + } + if (alternatives.length > 0) { + checks.push({ + name: 'alternative_providers', + status: 'ok', + message: `Detected ${alternatives.length} alternative embedding provider${alternatives.length > 1 ? 's' : ''} ready to use: ${alternatives.join(', ')}. Run \`gbrain providers list\` to switch.`, + }); + } + } catch { /* listRecipes / gateway not available — silent */ } + // 9. Graph health (link + timeline coverage on entity pages). // dead_links removed in v0.10.1: ON DELETE CASCADE on link FKs makes it always 0. // diff --git a/src/commands/init.ts b/src/commands/init.ts index 5fa08bbfa..737b5b0e9 100644 --- a/src/commands/init.ts +++ b/src/commands/init.ts @@ -134,6 +134,18 @@ async function resolveAIOptions( console.error(`Unknown provider: ${shorthand}. Run \`gbrain providers list\` to see known providers.`); process.exit(1); } + // v0.32 D8=A: recipes flagged user_provided_models (litellm, llama-server) + // refuse implicit "first model" pick with a setup hint pointing the user + // at the explicit form. The shorthand --model is meaningless for these + // recipes because there's no canonical first model. + if (recipe.touchpoints.embedding?.user_provided_models === true) { + console.error( + `Provider ${shorthand} requires you to specify the model + dimensions explicitly:\n` + + ` gbrain init --embedding-model ${shorthand}:<model> --embedding-dimensions <dims>\n` + + (recipe.setup_hint ? `\nSetup: ${recipe.setup_hint}` : '') + ); + process.exit(1); + } const firstModel = recipe.touchpoints.embedding?.models[0]; if (!firstModel) { console.error(`Provider ${shorthand} has no embedding models listed. Use --embedding-model provider:model.`); @@ -150,6 +162,20 @@ const { getRecipe } = await import('../core/ai/recipes/index.ts'); const providerId = out.embedding_model.split(':')[0]; const recipe = getRecipe(providerId); + // v0.32: user_provided_models recipes (litellm, llama-server) have + // default_dims=0 and ship with `models: []` — there's no sensible + // fallback. Refuse explicitly here too. Without this, the verbose path + // `--embedding-model llama-server:foo` (no --embedding-dimensions) would + // fall through to configureGateway's default (1536), creating a + // wrong-width schema that explodes only at first embed. + if (recipe?.touchpoints.embedding?.user_provided_models === true) { + console.error( + `Provider ${providerId} requires --embedding-dimensions when using --embedding-model ${out.embedding_model}.\n` + + `User-driven-model recipes (litellm, llama-server) have no default dimension.\n` + + (recipe.setup_hint ? 
`\nSetup: ${recipe.setup_hint}` : '') + ); + process.exit(1); + } if (recipe?.touchpoints.embedding?.default_dims) { out.embedding_dimensions = recipe.touchpoints.embedding.default_dims; } diff --git a/src/core/ai/dims.ts b/src/core/ai/dims.ts index d08c46053..2b3bb030a 100644 --- a/src/core/ai/dims.ts +++ b/src/core/ai/dims.ts @@ -58,6 +58,29 @@ export function dimsProviderOptions( if (VOYAGE_OUTPUT_DIMENSION_MODELS.has(modelId)) { return { openaiCompatible: { output_dimension: dims } }; } + // OpenAI text-embedding-3 family on the openai-compatible adapter + // (Azure OpenAI hosts these via its OpenAI-compatible /embeddings + // endpoint). The provider defaults to the model's native size (3072 + // for `-large`, 1536 for `-small`); without `dimensions`, brains + // configured for a smaller width (e.g. 1536) hard-fail at first embed. + if (modelId.startsWith('text-embedding-3')) { + return { openaiCompatible: { dimensions: dims } }; + } + // DashScope text-embedding-v3 (Matryoshka 64-1024) and Zhipu + // embedding-3 (Matryoshka 256-2048) both accept `dimensions` on the + // OpenAI-compat path. Without this, user-selected non-default dims are + // silently ignored and the provider returns its default size. + if (modelId === 'text-embedding-v3' || modelId === 'embedding-3') { + return { openaiCompatible: { dimensions: dims } }; + } + // MiniMax embo-01 takes a `type: 'db' | 'query'` field for asymmetric + // retrieval. Default to 'db' (the indexing path) so embed() works for + // import. Queries also embed with type:'db', making retrieval + // symmetric. Asymmetric query support is a follow-up TODO that needs + // a query/document signal threaded through the embed seam. + if (modelId === 'embo-01') { + return { openaiCompatible: { type: 'db' } }; + } return undefined; } } diff --git a/src/core/ai/gateway.ts b/src/core/ai/gateway.ts index c7b9df419..619932c7e 100644 --- a/src/core/ai/gateway.ts +++ b/src/core/ai/gateway.ts @@ -139,6 +139,116 @@ const DEFAULT_SAFETY_FACTOR = 0.8; */ const MAX_VOYAGE_RESPONSE_BYTES = 256 * 1024 * 1024; +// ---- Unified auth resolution (D12=A) ---- +// +// Pre-v0.32, openai-compatible auth was duplicated across instantiateEmbedding, +// instantiateExpansion, and instantiateChat with subtle drift (embedding had a +// `${recipe.id.toUpperCase()}_API_KEY` fallback the other two lacked). D12=A +// unifies all three through `Recipe.resolveAuth?(env)` with a sensible default +// so existing recipes need zero code changes; only deviating recipes (Azure +// with `api-key:` instead of `Authorization: Bearer`) override. + +/** + * Default auth resolver: returns `{headerName: 'Authorization', token: 'Bearer + * <key>'}` where `<key>` is the first present env var from `auth_env.required`, + * falling back to the first `auth_env.optional` entry, or 'unauthenticated' + * for fully no-auth recipes (Ollama). Throws AIConfigError when required env + * is missing. + * + * `touchpoint` is included in the error message so users know which call path + * triggered the missing-env error. + * + * @internal exported for tests; not part of the public gateway API. + */ +export function defaultResolveAuth( + recipe: Recipe, + env: Record<string, string>, + touchpoint: 'embedding' | 'expansion' | 'chat', +): { headerName: string; token: string } { + const required = recipe.auth_env?.required ?? []; + const optional = recipe.auth_env?.optional ?? []; + + if (required.length === 0) { + // No-auth or optional-auth recipe (e.g. Ollama, llama-server). 
Read first + // present optional API-key env (ignoring URL-shaped names like + // OLLAMA_BASE_URL, which belong in cfg.base_urls, not auth). If none + // present, use 'unauthenticated' so createOpenAICompatible has something + // to put in Authorization (servers like Ollama / llama-server ignore it). + const optKey = optional.find( + k => !!env[k] && !/_(BASE_)?URL$/.test(k), + ); + const token = optKey ? env[optKey]! : 'unauthenticated'; + return { headerName: 'Authorization', token: `Bearer ${token}` }; + } + + const key = env[required[0]]; + if (!key) { + throw new AIConfigError( + `${recipe.name} ${touchpoint} requires ${required[0]}.`, + recipe.setup_hint, + ); + } + return { headerName: 'Authorization', token: `Bearer ${key}` }; +} + +/** + * Apply the recipe's auth resolver (or default) and translate the result into + * `createOpenAICompatible` options. Authorization-Bearer style returns + * `{apiKey}` (the SDK's native path); custom-header style returns `{headers}` + * with NO apiKey to avoid double-auth. + * + * @internal exported for tests; not part of the public gateway API. + */ +export function applyResolveAuth( + recipe: Recipe, + cfg: AIGatewayConfig, + touchpoint: 'embedding' | 'expansion' | 'chat', +): { apiKey?: string; headers?: Record<string, string> } { + const resolved = recipe.resolveAuth + ? recipe.resolveAuth(cfg.env) + : defaultResolveAuth(recipe, cfg.env, touchpoint); + + // Bearer-via-Authorization: use the SDK's native apiKey path (which sets + // Authorization: Bearer internally). Strip the 'Bearer ' prefix the + // resolver returned. + if ( + resolved.headerName === 'Authorization' && + resolved.token.startsWith('Bearer ') + ) { + return { apiKey: resolved.token.slice('Bearer '.length) }; + } + + // Custom header (Azure: api-key). Use headers; do NOT pass apiKey, or the + // SDK will also set Authorization and the server may reject double-auth. + return { headers: { [resolved.headerName]: resolved.token } }; +} + +/** + * Resolve the openai-compatible URL + optional fetch wrapper. Defaults to + * `cfg.base_urls?.[recipe.id] ?? recipe.base_url_default` (the pre-v0.32 + * behavior). Recipes whose URL is env-templated (Azure: needs endpoint + + * deployment + api-version) override `recipe.resolveOpenAICompatConfig` to + * build the URL and inject custom fetch behavior. + * + * @internal exported for tests. + */ +export function applyOpenAICompatConfig( + recipe: Recipe, + cfg: AIGatewayConfig, +): { baseURL: string; fetch?: typeof fetch } { + if (recipe.resolveOpenAICompatConfig) { + return recipe.resolveOpenAICompatConfig(cfg.env); + } + const baseURL = cfg.base_urls?.[recipe.id] ?? recipe.base_url_default; + if (!baseURL) { + throw new AIConfigError( + `${recipe.name} requires a base URL.`, + recipe.setup_hint, + ); + } + return { baseURL }; +} + /** Configure the gateway. Called by cli.ts#connectEngine. Clears cached models. */ export function configureGateway(config: AIGatewayConfig): void { _config = { @@ -259,6 +369,10 @@ function warnRecipesMissingBatchTokens(): void { // recipe; suppress the warning for it. Every other recipe missing the // field is suspicious. if (recipe.id === 'openai') continue; + // v0.32 (#779): explicit opt-out for dynamic-cap recipes (Ollama, + // LiteLLM proxy, llama-server) — they ship without a static cap because + // the cap depends on a user-launched server. Warning is noise for them. 
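+    // (Opting out skips only the startup warning; oversized batches still get
+    // recursively halved at runtime when the server rejects them.)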
+ if (embedding.no_batch_cap === true) continue; if (_warnedRecipes.has(recipe.id)) continue; _warnedRecipes.add(recipe.id); // eslint-disable-next-line no-console @@ -380,8 +494,19 @@ export function isAvailable(touchpoint: TouchpointKind): boolean { // embedding from an anthropic-configured brain is unavailable regardless of auth. const touchpointConfig = recipe.touchpoints[touchpoint as 'embedding' | 'expansion' | 'chat']; if (!touchpointConfig) return false; - // Openai-compat recipes with empty models list (e.g. litellm template) require user-provided model - if (Array.isArray(touchpointConfig.models) && touchpointConfig.models.length === 0 && recipe.id === 'litellm') return false; + // Openai-compat recipes with empty models list require a user-provided + // model. Either the recipe explicitly opts in via + // EmbeddingTouchpoint.user_provided_models (D8=A), or the legacy + // `recipe.id === 'litellm'` heuristic (back-compat for pre-v0.32 builds + // where the field hadn't been declared yet). + const isUserProvided = + touchpoint === 'embedding' && + (touchpointConfig as any).user_provided_models === true; + if ( + Array.isArray(touchpointConfig.models) && + touchpointConfig.models.length === 0 && + (recipe.id === 'litellm' || isUserProvided) + ) return false; // For openai-compatible without auth requirements (Ollama local), treat as always-available. const required = recipe.auth_env?.required ?? []; @@ -571,32 +696,20 @@ function instantiateEmbedding(recipe: Recipe, modelId: string, cfg: AIGatewayCon `Anthropic has no embedding model. Use openai or google for embeddings.`, ); case 'openai-compatible': { - const baseUrl = cfg.base_urls?.[recipe.id] ?? recipe.base_url_default; - if (!baseUrl) throw new AIConfigError( - `${recipe.name} requires a base URL.`, - recipe.setup_hint, - ); - // For openai-compatible, auth is optional (ollama local) but pass a dummy key if unauthenticated. - const apiKey = recipe.auth_env?.required[0] - ? cfg.env[recipe.auth_env.required[0]] - : (cfg.env[`${recipe.id.toUpperCase()}_API_KEY`] ?? 'unauthenticated'); - if (recipe.auth_env?.required.length && !apiKey) { - throw new AIConfigError( - `${recipe.name} requires ${recipe.auth_env.required[0]}.`, - recipe.setup_hint, - ); - } + // D12=A: unified auth via Recipe.resolveAuth (or default). + const auth = applyResolveAuth(recipe, cfg, 'embedding'); + // v0.32: env-templated base URL + optional fetch wrapper for Azure. + const compat = applyOpenAICompatConfig(recipe, cfg); + // Voyage's openai-compat path needs voyageCompatFetch (translates + // request/response shape) when the recipe doesn't ship its own fetch + // wrapper via resolveOpenAICompatConfig. Azure recipes ship their own + // fetch (api-version splice); voyage doesn't — use voyageCompatFetch. + const fetchWrapper = compat.fetch ?? (recipe.id === 'voyage' ? voyageCompatFetch : undefined); const client = createOpenAICompatible({ name: recipe.id, - baseURL: baseUrl, - apiKey: apiKey ?? 'unauthenticated', - // Voyage AI's `/v1/embeddings` endpoint is "OpenAI-compatible" only in URL - // shape; it rejects `encoding_format=float` (only `base64` is accepted) and - // ignores OpenAI's `dimensions` parameter (Voyage uses `output_dimension`). - // The default openai-compatible client sends `encoding_format=float`, which - // makes Voyage respond with HTTP 400 "Bad Request". Strip those fields - // before forwarding when targeting Voyage. - fetch: recipe.id === 'voyage' ? voyageCompatFetch : undefined, + baseURL: compat.baseURL, + ...(fetchWrapper ? 
{ fetch: fetchWrapper } : {}), + ...auth, + }); return client.textEmbeddingModel(modelId); } @@ -1026,15 +1139,15 @@ function instantiateExpansion(recipe: Recipe, modelId: string, cfg: AIGatewayCon return createAnthropic({ apiKey }).languageModel(modelId); } case 'openai-compatible': { - const baseUrl = cfg.base_urls?.[recipe.id] ?? recipe.base_url_default; - if (!baseUrl) throw new AIConfigError(`${recipe.name} requires a base URL.`, recipe.setup_hint); - const apiKey = recipe.auth_env?.required[0] - ? cfg.env[recipe.auth_env.required[0]] - : 'unauthenticated'; + // D12=A: unified auth via Recipe.resolveAuth (or default). + const auth = applyResolveAuth(recipe, cfg, 'expansion'); + // v0.32: env-templated base URL + optional fetch wrapper. + const compat = applyOpenAICompatConfig(recipe, cfg); return createOpenAICompatible({ name: recipe.id, - baseURL: baseUrl, - apiKey: apiKey ?? 'unauthenticated', + baseURL: compat.baseURL, + ...(compat.fetch ? { fetch: compat.fetch } : {}), + ...auth, }).languageModel(modelId); } } @@ -1229,17 +1342,15 @@ function instantiateChat(recipe: Recipe, modelId: string, cfg: AIGatewayConfig): return createAnthropic({ apiKey }).languageModel(modelId); } case 'openai-compatible': { - const baseUrl = cfg.base_urls?.[recipe.id] ?? recipe.base_url_default; - if (!baseUrl) throw new AIConfigError(`${recipe.name} requires a base URL.`, recipe.setup_hint); - const required = recipe.auth_env?.required ?? []; - const apiKey = required[0] ? cfg.env[required[0]] : 'unauthenticated'; - if (required.length > 0 && !apiKey) { - throw new AIConfigError(`${recipe.name} requires ${required[0]}.`, recipe.setup_hint); - } + // D12=A: unified auth via Recipe.resolveAuth (or default). + const auth = applyResolveAuth(recipe, cfg, 'chat'); + // v0.32: env-templated base URL + optional fetch wrapper. + const compat = applyOpenAICompatConfig(recipe, cfg); return createOpenAICompatible({ name: recipe.id, - baseURL: baseUrl, - apiKey: apiKey ?? 'unauthenticated', + baseURL: compat.baseURL, + ...(compat.fetch ? { fetch: compat.fetch } : {}), + ...auth, }).languageModel(modelId); } default: diff --git a/src/core/ai/probes.ts b/src/core/ai/probes.ts index 81097517b..ba17b6769 100644 --- a/src/core/ai/probes.ts +++ b/src/core/ai/probes.ts @@ -45,3 +45,15 @@ export async function probeLMStudio(): Promise<ProbeResult> { const url = process.env.LMSTUDIO_BASE_URL ?? 'http://localhost:1234/v1'; return probeOpenAICompat(url); } + +/** + * Probe llama.cpp's `llama-server --embeddings` endpoint. Defaults to port + * 8080 (llama-server's default; distinct from Ollama's 11434 and LM Studio's + * 1234). Override via `LLAMA_SERVER_BASE_URL` env, or pass `baseURL` directly + * (callers with access to `cfg.base_urls['llama-server']` should pass it so + * probe agrees with what the gateway will actually call). + */ +export async function probeLlamaServer(baseURL?: string): Promise<ProbeResult> { + const url = baseURL ?? process.env.LLAMA_SERVER_BASE_URL ?? 'http://localhost:8080/v1'; + return probeOpenAICompat(url); +} diff --git a/src/core/ai/recipes/azure-openai.ts b/src/core/ai/recipes/azure-openai.ts new file mode 100644 index 000000000..23aa7ab47 --- /dev/null +++ b/src/core/ai/recipes/azure-openai.ts @@ -0,0 +1,111 @@ +import type { Recipe } from '../types.ts'; +import { AIConfigError } from '../errors.ts'; + +const DEFAULT_API_VERSION = '2024-10-21'; // stable Azure OpenAI version as of 2026-05 + +/** + * Azure OpenAI. 
The first recipe in v0.32 to exercise both seams: + * - resolveAuth returns `{headerName: 'api-key', token: }` instead of + * Authorization: Bearer (Azure's API explicitly requires `api-key:` and + * rejects double-auth). + * - resolveOpenAICompatConfig templates the URL from env + injects an + * `?api-version=` query param via a custom fetch wrapper. + * + * Azure's URL shape: + * {ENDPOINT}/openai/deployments/{DEPLOYMENT}/embeddings?api-version=... + * + * The AI SDK's openai-compatible adapter appends `/embeddings` to the + * baseURL, so we set baseURL to `{ENDPOINT}/openai/deployments/{DEPLOYMENT}` + * and let the SDK's path-suffix handle the rest. The api-version query is + * spliced via the fetch wrapper because the SDK has no native query-param + * option. + * + * Reference: https://learn.microsoft.com/en-us/azure/ai-services/openai/ + */ +export const azureOpenAI: Recipe = { + id: 'azure-openai', + name: 'Azure OpenAI', + tier: 'openai-compat', + implementation: 'openai-compatible', + // base_url_default omitted: Azure URLs are env-templated only. + auth_env: { + required: [ + 'AZURE_OPENAI_API_KEY', + 'AZURE_OPENAI_ENDPOINT', + 'AZURE_OPENAI_DEPLOYMENT', + ], + optional: ['AZURE_OPENAI_API_VERSION'], + setup_url: + 'https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart', + }, + touchpoints: { + embedding: { + models: [ + 'text-embedding-3-large', + 'text-embedding-3-small', + 'text-embedding-ada-002', + ], + default_dims: 1536, + // Matryoshka via text-embedding-3-*; ada-002 is fixed at 1536. + dims_options: [256, 512, 768, 1024, 1536, 3072], + cost_per_1m_tokens_usd: 0.13, + price_last_verified: '2026-05-10', + max_batch_tokens: 8192, + }, + }, + resolveAuth(env) { + const key = env.AZURE_OPENAI_API_KEY; + if (!key) { + throw new AIConfigError( + `Azure OpenAI requires AZURE_OPENAI_API_KEY.`, + 'Get a key from your Azure portal: https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart', + ); + } + // Azure uses `api-key:` (no Bearer); the unified seam routes this + // through `headers` instead of the SDK's apiKey field to avoid any + // double-auth Authorization header sneaking in. + return { headerName: 'api-key', token: key }; + }, + resolveOpenAICompatConfig(env) { + const endpoint = env.AZURE_OPENAI_ENDPOINT?.replace(/\/+$/, ''); + const deployment = env.AZURE_OPENAI_DEPLOYMENT; + if (!endpoint) { + throw new AIConfigError( + `Azure OpenAI requires AZURE_OPENAI_ENDPOINT.`, + 'Find your endpoint at portal.azure.com → Azure OpenAI resource → Keys and Endpoint.', + ); + } + if (!deployment) { + throw new AIConfigError( + `Azure OpenAI requires AZURE_OPENAI_DEPLOYMENT.`, + 'Each Azure OpenAI deployment has its own URL path. Set AZURE_OPENAI_DEPLOYMENT to the deployment name from your Azure portal.', + ); + } + const apiVersion = env.AZURE_OPENAI_API_VERSION ?? DEFAULT_API_VERSION; + const baseURL = `${endpoint}/openai/deployments/${deployment}`; + // Custom fetch wrapper splices ?api-version=... onto every request. + // Azure rejects requests without it. + // Cast through `any` because TS's `typeof fetch` includes a `preconnect` + // method that wrappers don't need (the AI SDK never calls it). + const wrappedFetch = (async (input: any, init: any) => { + const url = + typeof input === 'string' + ? input + : input instanceof URL + ? input.toString() + : (input as Request).url; + const sep = url.includes('?') ? '&' : '?'; + const finalUrl = url.includes('api-version=') + ? 
url + : `${url}${sep}api-version=${encodeURIComponent(apiVersion)}`; + const finalInput = + typeof input === 'string' || input instanceof URL + ? finalUrl + : new Request(finalUrl, input as Request); + return fetch(finalInput, init); + }) as unknown as typeof fetch; + return { baseURL, fetch: wrappedFetch }; + }, + setup_hint: + 'Azure portal → Azure OpenAI resource. Set AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_DEPLOYMENT. Optionally AZURE_OPENAI_API_VERSION (default 2024-10-21).', +}; diff --git a/src/core/ai/recipes/dashscope.ts b/src/core/ai/recipes/dashscope.ts new file mode 100644 index 000000000..7aa2c273f --- /dev/null +++ b/src/core/ai/recipes/dashscope.ts @@ -0,0 +1,42 @@ +import type { Recipe } from '../types.ts'; + +/** + * Alibaba DashScope (灵积). OpenAI-compatible /embeddings endpoint at + * dashscope-intl.aliyuncs.com. Hosts text-embedding-v2 (older) and + * text-embedding-v3 (current; Matryoshka-aware up to 1024 dims). + * + * Reference: https://help.aliyun.com/zh/model-studio/getting-started/ + * + * Note: the international endpoint requires a region-aware DASHSCOPE_API_KEY. + * China-region users typically point at https://dashscope.aliyuncs.com/... + * via cfg.base_urls['dashscope']. v0.32 ships with the international + * default; users override per the recipe convention. + */ +export const dashscope: Recipe = { + id: 'dashscope', + name: 'Alibaba DashScope (灵积)', + tier: 'openai-compat', + implementation: 'openai-compatible', + base_url_default: 'https://dashscope-intl.aliyuncs.com/compatible-mode/v1', + auth_env: { + required: ['DASHSCOPE_API_KEY'], + setup_url: 'https://help.aliyun.com/zh/model-studio/getting-started/', + }, + touchpoints: { + embedding: { + models: ['text-embedding-v3', 'text-embedding-v2'], + default_dims: 1024, + dims_options: [64, 128, 256, 512, 768, 1024], + // Alibaba doesn't publish a hard batch-token cap for the OpenAI-compat + // path. Conservative declaration so the gateway pre-splits before + // hitting whatever undocumented server-side limit exists. + max_batch_tokens: 8192, + // text-embedding-v3 mixes English + CJK heavily; the tokenizer is + // closer to Voyage density than OpenAI tiktoken for CJK-dominant + // content. Conservative chars_per_token=2 leaves headroom. + chars_per_token: 2, + }, + }, + setup_hint: + 'Get an API key at https://help.aliyun.com/zh/model-studio/getting-started/, then `export DASHSCOPE_API_KEY=...`', +}; diff --git a/src/core/ai/recipes/index.ts b/src/core/ai/recipes/index.ts index 21a1506d6..1682fa115 100644 --- a/src/core/ai/recipes/index.ts +++ b/src/core/ai/recipes/index.ts @@ -15,6 +15,11 @@ import { litellmProxy } from './litellm-proxy.ts'; import { deepseek } from './deepseek.ts'; import { groq } from './groq.ts'; import { together } from './together.ts'; +import { llamaServer } from './llama-server.ts'; +import { minimax } from './minimax.ts'; +import { dashscope } from './dashscope.ts'; +import { zhipu } from './zhipu.ts'; +import { azureOpenAI } from './azure-openai.ts'; const ALL: Recipe[] = [ openai, @@ -26,6 +31,11 @@ const ALL: Recipe[] = [ deepseek, groq, together, + llamaServer, + minimax, + dashscope, + zhipu, + azureOpenAI, ]; /** Map from `provider:id` key to recipe. 
*/ diff --git a/src/core/ai/recipes/litellm-proxy.ts b/src/core/ai/recipes/litellm-proxy.ts index 8f7da2dea..d7ee4e946 100644 --- a/src/core/ai/recipes/litellm-proxy.ts +++ b/src/core/ai/recipes/litellm-proxy.ts @@ -23,9 +23,13 @@ export const litellmProxy: Recipe = { embedding: { // Models depend on the proxy's config; declare empties so wizard prompts user. models: [], + user_provided_models: true, // v0.32 D8=A wire-through for the litellm hardcode default_dims: 0, // user must declare --embedding-dimensions explicitly cost_per_1m_tokens_usd: undefined, price_last_verified: '2026-04-20', + // LiteLLM's batch capacity is determined by the backend it proxies; + // no static cap to declare here. v0.32 (#779). + no_batch_cap: true, }, }, setup_hint: 'Run LiteLLM (https://docs.litellm.ai) in front of any provider; set LITELLM_BASE_URL + pass --embedding-model litellm:<model> and --embedding-dimensions <dims>.', diff --git a/src/core/ai/recipes/llama-server.ts b/src/core/ai/recipes/llama-server.ts new file mode 100644 index 000000000..1bc8b97cc --- /dev/null +++ b/src/core/ai/recipes/llama-server.ts @@ -0,0 +1,67 @@ +import type { Recipe } from '../types.ts'; +import { probeLlamaServer } from '../probes.ts'; + +/** + * llama.cpp's `llama-server --embeddings` (also published as + * `@llama.cpp/llama-server`). Exposes an OpenAI-compatible /v1/embeddings + * endpoint. Distinct from Ollama: different default port (8080), different + * model-management story (you launch it with `--model <path>`; the server + * serves whatever model was passed). + * + * Like LiteLLM, this recipe ships with `models: []` because the model + * identity is whatever the user launched llama-server with. They MUST + * pass `--embedding-model llama-server:<model>` and `--embedding-dimensions + * <dims>`. The wizard refuses to pick implicit defaults. + * + * Reference: https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md + */ +export const llamaServer: Recipe = { + id: 'llama-server', + name: 'llama.cpp llama-server (local)', + tier: 'openai-compat', + implementation: 'openai-compatible', + base_url_default: 'http://localhost:8080/v1', + auth_env: { + required: [], + optional: ['LLAMA_SERVER_BASE_URL', 'LLAMA_SERVER_API_KEY'], + setup_url: + 'https://github.com/ggml-org/llama.cpp/blob/master/tools/server/README.md', + }, + touchpoints: { + embedding: { + models: [], // user-driven; whatever model the server was launched with + user_provided_models: true, + default_dims: 0, // forces explicit --embedding-dimensions + cost_per_1m_tokens_usd: 0, + price_last_verified: '2026-05-10', + // llama-server's batch capacity is set by `--ctx-size` at launch + // time; no static cap to declare. v0.32 (#779). + no_batch_cap: true, + }, + }, + /** + * Probe via the OpenAI-compatible /v1/models endpoint. Caller passes the + * resolved baseURL (from cfg.base_urls['llama-server'] or env), so the + * probe agrees with what the gateway will actually call. Falls back to + * env / localhost:8080 when called without an argument. + */ + async probe(baseURL?: string) { + const url = baseURL ?? process.env.LLAMA_SERVER_BASE_URL ?? 'http://localhost:8080/v1'; + const result = await probeLlamaServer(url); + if (!result.reachable) { + return { + ready: false, + hint: `llama-server not reachable at ${url}. Start it with \`./llama-server --model <path-to-gguf> --embeddings\` or set LLAMA_SERVER_BASE_URL.`, + }; + } + if (!result.models_endpoint_valid) { + return { + ready: false, + hint: `llama-server reached but /v1/models returned an unexpected shape: ${result.error ?? 
diff --git a/src/core/ai/recipes/minimax.ts b/src/core/ai/recipes/minimax.ts
new file mode 100644
index 000000000..49a756057
--- /dev/null
+++ b/src/core/ai/recipes/minimax.ts
@@ -0,0 +1,44 @@
+import type { Recipe } from '../types.ts';
+
+/**
+ * MiniMax (海螺AI). OpenAI-compatible /embeddings endpoint at
+ * api.minimaxi.com. The flagship embedding model is `embo-01` (1536 dims).
+ *
+ * MiniMax's API takes an extra `type: 'db' | 'query'` field for asymmetric
+ * retrieval. gbrain currently has no notion of "this is a document vs. a
+ * query" at the embed-call site (embed() takes only texts), so we default
+ * to `type: 'db'` for the indexing path. Queries also embed with `type:
+ * 'db'`, making retrieval symmetric. This sacrifices some retrieval
+ * quality vs. a true asymmetric setup but works correctly. A follow-up
+ * TODO will thread query/document context through the embed seam for
+ * full asymmetric support.
+ *
+ * Reference: https://www.minimaxi.com/document/guides/embeddings
+ */
+export const minimax: Recipe = {
+  id: 'minimax',
+  name: 'MiniMax (海螺AI)',
+  tier: 'openai-compat',
+  implementation: 'openai-compatible',
+  base_url_default: 'https://api.minimaxi.com/v1',
+  auth_env: {
+    required: ['MINIMAX_API_KEY'],
+    optional: ['MINIMAX_GROUP_ID'],
+    setup_url: 'https://www.minimaxi.com/document/guides/embeddings',
+  },
+  touchpoints: {
+    embedding: {
+      models: ['embo-01'],
+      default_dims: 1536,
+      cost_per_1m_tokens_usd: 0.07,
+      price_last_verified: '2026-05-09',
+      // MiniMax docs don't publish a hard batch-token cap; declare a
+      // conservative 4096-token budget so the gateway pre-splits before
+      // hitting whatever undocumented server-side limit exists. Recursive
+      // halving in the gateway catches token-limit errors at runtime.
+      max_batch_tokens: 4096,
+    },
+  },
+  setup_hint:
+    'Get an API key at https://www.minimaxi.com, then `export MINIMAX_API_KEY=...`',
+};
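The `type: 'db'` default described in the minimax docstring is enforced in dims.ts, not here. The sketch below is NOT the real dims.ts source — it restates only the observable contract that the recipe tests later in this diff pin:

```ts
// Sketch of the per-model providerOptions branching the tests pin:
// embo-01 gets MiniMax's asymmetric-retrieval field; Matryoshka models
// get a `dimensions` passthrough; fixed-dim models get nothing.
type ProviderOpts = { openaiCompatible: Record<string, unknown> } | undefined;

function dimsProviderOptionsSketch(impl: string, model: string, dims: number): ProviderOpts {
  if (impl !== 'openai-compatible') return undefined;
  if (model === 'embo-01') return { openaiCompatible: { type: 'db' } };
  const matryoshka = [
    'text-embedding-3-large', 'text-embedding-3-small', // Azure
    'text-embedding-v3',                                // DashScope
    'embedding-3',                                      // Zhipu
  ];
  if (matryoshka.includes(model)) return { openaiCompatible: { dimensions: dims } };
  return undefined; // ada-002, text-embedding-v2, embedding-2: fixed-dim
}
```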
diff --git a/src/core/ai/recipes/ollama.ts b/src/core/ai/recipes/ollama.ts
index 31fac0ae6..0f355cdae 100644
--- a/src/core/ai/recipes/ollama.ts
+++ b/src/core/ai/recipes/ollama.ts
@@ -17,6 +17,9 @@ export const ollama: Recipe = {
       default_dims: 768, // nomic-embed-text native dim
       cost_per_1m_tokens_usd: 0,
       price_last_verified: '2026-04-20',
+      // Ollama's batch capacity depends on the locally loaded model + the
+      // OLLAMA_NUM_PARALLEL config; no static cap to declare. v0.32 (#779).
+      no_batch_cap: true,
     },
   },
   setup_hint: 'Install Ollama from https://ollama.ai, then `ollama pull nomic-embed-text` and `ollama serve`.',
diff --git a/src/core/ai/recipes/zhipu.ts b/src/core/ai/recipes/zhipu.ts
new file mode 100644
index 000000000..758c5c9cb
--- /dev/null
+++ b/src/core/ai/recipes/zhipu.ts
@@ -0,0 +1,40 @@
+import type { Recipe } from '../types.ts';
+
+/**
+ * Zhipu AI (智谱AI) BigModel Open Platform. OpenAI-compatible /embeddings
+ * endpoint at open.bigmodel.cn. Hosts embedding-2 (1024d) and embedding-3
+ * (Matryoshka up to 2048d).
+ *
+ * embedding-3 at 2048 dims exceeds pgvector's HNSW cap of 2000 — those
+ * brains fall back to exact vector scans (see
+ * src/core/vector-index.ts:PGVECTOR_HNSW_VECTOR_MAX_DIMS). v0.32 ships
+ * with `default_dims: 1024` (HNSW-compatible) and exposes 2048 via
+ * dims_options for users who want the full embedding fidelity at the
+ * cost of slower retrieval.
+ *
+ * Reference: https://open.bigmodel.cn/
+ */
+export const zhipu: Recipe = {
+  id: 'zhipu',
+  name: 'Zhipu AI (智谱AI BigModel)',
+  tier: 'openai-compat',
+  implementation: 'openai-compatible',
+  base_url_default: 'https://open.bigmodel.cn/api/paas/v4',
+  auth_env: {
+    required: ['ZHIPUAI_API_KEY'],
+    setup_url: 'https://open.bigmodel.cn/',
+  },
+  touchpoints: {
+    embedding: {
+      models: ['embedding-3', 'embedding-2'],
+      default_dims: 1024,
+      // 2048 exposed but breaks HNSW (exact-scan fallback). 1024/512/256
+      // stay HNSW-compatible.
+      dims_options: [256, 512, 1024, 2048],
+      max_batch_tokens: 8192,
+      chars_per_token: 2,
+    },
+  },
+  setup_hint:
+    'Get an API key at https://open.bigmodel.cn/, then `export ZHIPUAI_API_KEY=...`',
+};
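For context on the HNSW remarks above: pgvector refuses HNSW indexes above 2000 dims, so the index DDL has to branch on the configured width. An illustrative sketch of that branch (not the real `src/core/vector-index.ts`; only the contract the zhipu tests later in this diff pin):

```ts
// Sketch: ≤ 2000 dims → HNSW index DDL; > 2000 → a comment marking the
// index as skipped, so retrieval falls back to exact vector scans.
const PGVECTOR_HNSW_VECTOR_MAX_DIMS_SKETCH = 2000;

function chunkEmbeddingIndexSqlSketch(dims: number): string {
  if (dims > PGVECTOR_HNSW_VECTOR_MAX_DIMS_SKETCH) {
    return `-- hnsw index skipped: ${dims} dims exceeds pgvector's ` +
      `${PGVECTOR_HNSW_VECTOR_MAX_DIMS_SKETCH}-dim cap (exact scan)`;
  }
  return 'CREATE INDEX IF NOT EXISTS chunks_embedding_hnsw ' +
    'ON chunks USING hnsw (embedding vector_cosine_ops);';
}
```

Users picking 2048 for embedding-3 trade ANN speed for full embedding fidelity; the default of 1024 keeps the fast path.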
diff --git a/src/core/ai/types.ts b/src/core/ai/types.ts
index b9125e5f0..e88867993 100644
--- a/src/core/ai/types.ts
+++ b/src/core/ai/types.ts
@@ -72,6 +72,33 @@ export interface EmbeddingTouchpoint {
    * text embedding paths ignore it.
    */
   multimodal_models?: string[];
+  /**
+   * v0.32: when true, the recipe ships without a fixed model list and users
+   * MUST provide `--embedding-model provider:model` and
+   * `--embedding-dimensions N` explicitly. Used by litellm-proxy and
+   * llama-server (and any future "bring your own backend" recipe).
+   *
+   * Consumers:
+   *   - `recipes-contract.test.ts` permits `models.length === 0` only when
+   *     this flag is true.
+   *   - `gateway.ts` skips the model-list-must-include-modelId check.
+   *   - `init.ts:resolveAIOptions` refuses the implicit "first model" pick
+   *     for shorthand `--model <provider>` and prints a setup hint.
+   */
+  user_provided_models?: true;
+  /**
+   * v0.32 (#779 reworked): explicit opt-out of the missing-max_batch_tokens
+   * startup warning. Set to `true` for recipes whose batch capacity is
+   * genuinely dynamic (Ollama: depends on user-loaded model; LiteLLM proxy:
+   * depends on backend; llama.cpp: depends on `--ctx-size` at server launch).
+   *
+   * Without this flag, missing `max_batch_tokens` triggers a once-per-process
+   * stderr warning so future recipes that forget the cap (and would
+   * silently rely on recursive-halving) don't ship unnoticed. Recipes that
+   * declare `no_batch_cap: true` are explicitly opting out — the warning is
+   * noise for them.
+   */
+  no_batch_cap?: true;
 }
 
 /**
@@ -150,6 +177,63 @@ export interface Recipe {
   aliases?: Record<string, string>;
   /** One-line description of setup (shown in wizard + env subcommand). */
   setup_hint?: string;
+  /**
+   * v0.32 (D12=A): unified auth resolver across embed / expansion / chat
+   * touchpoints. Returns the header name (`Authorization`, `api-key`, etc.)
+   * and the full header value (for Bearer-style providers, include the
+   * `Bearer ` prefix). Throws AIConfigError when required env is missing,
+   * with a hint pointing at the recipe's setup_url.
+   *
+   * When omitted, the gateway applies a default that returns
+   * `{headerName: 'Authorization', token: 'Bearer ' + env[auth_env.required[0]]}`.
+   * The default is the right behavior for OpenAI-compatible providers with a
+   * single API key. Recipes deviating (Azure uses `api-key`; future OAuth
+   * providers fetch dynamic tokens) override this.
+   *
+   * IMPORTANT: this runs at gateway-configure time (NOT at embed-call time),
+   * so the env snapshot in `cfg.env` is consulted, never `process.env`.
+   */
+  resolveAuth?(env: Record<string, string | undefined>): {
+    headerName: string;
+    token: string;
+  };
+  /**
+   * v0.32: templated openai-compatible config for recipes whose URL shape
+   * doesn't fit a static `base_url_default`. Returns the resolved baseURL
+   * and an optional fetch wrapper for cases like Azure OpenAI that need a
+   * query parameter (?api-version=) injected on every request.
+   *
+   * Default behavior (when undefined): use `base_urls[recipe.id]` from
+   * config or `recipe.base_url_default`. Throws `AIConfigError` when both
+   * are missing.
+   *
+   * Currently only Azure OpenAI overrides this — the URL is templated
+   * from `AZURE_OPENAI_ENDPOINT` + `AZURE_OPENAI_DEPLOYMENT` and the fetch
+   * wrapper splices `api-version` into every request URL.
+   */
+  resolveOpenAICompatConfig?(env: Record<string, string | undefined>): {
+    baseURL: string;
+    fetch?: typeof fetch;
+  };
+  /**
+   * v0.32 (D13=A): optional runtime readiness check for local-server
+   * recipes (ollama, llama-server, future lmstudio-recipe). Returns
+   * `ready: false` when the local endpoint isn't reachable, with a `hint`
+   * the wizard / doctor can surface.
+   *
+   * Defaults to env-only readiness (`auth_env.required` all set) when
+   * absent. Consumed by `runExplain()` in `src/commands/providers.ts` and
+   * by the doctor's embedding probe; both wrap the call in
+   * `Promise.allSettled` with a 200ms timeout so a hung local server does
+   * not block the provider matrix.
+   *
+   * `baseURL`: optional resolved URL the gateway will actually call (from
+   * `cfg.base_urls[recipe.id]` or recipe defaults). Pass it so the probe
+   * checks the same endpoint as live traffic. Without it, the probe falls
+   * back to recipe defaults / env, which can disagree with config-only
+   * URL overrides (codex finding #5).
+   */
+  probe?(baseURL?: string): Promise<{ ready: boolean; hint?: string }>;
 }
 
 export interface AIGatewayConfig {
diff --git a/src/core/config.ts b/src/core/config.ts
index ae84916fa..9b920c409 100644
--- a/src/core/config.ts
+++ b/src/core/config.ts
@@ -151,6 +151,7 @@ export function loadConfig(): GBrainConfig | null {
     ...(dbUrl ? { database_url: dbUrl } : {}),
     ...(dbUrl ? { database_path: undefined } : {}),
     ...(process.env.OPENAI_API_KEY ? { openai_api_key: process.env.OPENAI_API_KEY } : {}),
+    ...(process.env.ANTHROPIC_API_KEY ? { anthropic_api_key: process.env.ANTHROPIC_API_KEY } : {}),
     ...(process.env.GBRAIN_EMBEDDING_MODEL ? { embedding_model: process.env.GBRAIN_EMBEDDING_MODEL } : {}),
     ...(process.env.GBRAIN_EMBEDDING_DIMENSIONS ? { embedding_dimensions: parseInt(process.env.GBRAIN_EMBEDDING_DIMENSIONS, 10) } : {}),
     ...(process.env.GBRAIN_EXPANSION_MODEL ? { expansion_model: process.env.GBRAIN_EXPANSION_MODEL } : {}),
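To make the `resolveAuth` seam concrete: a hypothetical recipe (NOT shipped in v0.32 — the provider name, env var, and header are invented for illustration) that overrides the default for a non-Bearer header, the same pattern Azure uses with `api-key`:

```ts
import type { Recipe } from './src/core/ai/types.ts';
import { AIConfigError } from './src/core/ai/errors.ts';

// Hypothetical custom-header provider. The override replaces the default
// `Authorization: Bearer <key>` with a bare token in a vendor header.
export const exampleCustomHeader: Recipe = {
  id: 'example-custom-header',
  name: 'Example (custom header)',
  tier: 'openai-compat',
  implementation: 'openai-compatible',
  base_url_default: 'https://api.example.invalid/v1',
  auth_env: { required: ['EXAMPLE_API_KEY'] },
  touchpoints: {},
  resolveAuth(env) {
    const key = env.EXAMPLE_API_KEY;
    if (!key) throw new AIConfigError('Example requires EXAMPLE_API_KEY.');
    return { headerName: 'x-api-token', token: key }; // no Bearer prefix
  },
};
```

Per the IRON RULE test at the end of this diff, any such override beyond Azure should be reviewed for double-auth and back-compat regressions.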
diff --git a/test/ai/no-batch-cap-suppression.serial.test.ts b/test/ai/no-batch-cap-suppression.serial.test.ts
new file mode 100644
index 000000000..9bd3e69b7
--- /dev/null
+++ b/test/ai/no-batch-cap-suppression.serial.test.ts
@@ -0,0 +1,79 @@
+/**
+ * #779 + #121 adjacent fixes (Commit 9 of the v0.32 wave).
+ *
+ * Coverage:
+ *   - Recipes with `embedding.no_batch_cap: true` suppress the
+ *     missing-max_batch_tokens startup warning (#779)
+ *   - Real-provider recipes without the flag still warn (regression guard)
+ *   - listRecipes returns expected dynamic-cap recipes (ollama, litellm,
+ *     llama-server) all flagged
+ */
+
+import { afterAll, beforeAll, describe, expect, mock, test } from 'bun:test';
+import { configureGateway, resetGateway } from '../../src/core/ai/gateway.ts';
+import { listRecipes, getRecipe } from '../../src/core/ai/recipes/index.ts';
+
+describe('v0.32 #779: no_batch_cap suppresses the missing-max_batch_tokens warning', () => {
+  let warnSpy: ReturnType<typeof mock>;
+  let realWarn: typeof console.warn;
+
+  beforeAll(() => {
+    realWarn = console.warn;
+    warnSpy = mock(() => {});
+    console.warn = warnSpy as any;
+  });
+
+  afterAll(() => {
+    console.warn = realWarn;
+    resetGateway();
+  });
+
+  test('Ollama, LiteLLM, llama-server all declare no_batch_cap: true', () => {
+    for (const id of ['ollama', 'litellm', 'llama-server']) {
+      const r = getRecipe(id);
+      expect(r, `${id} not registered`).toBeDefined();
+      expect(
+        r!.touchpoints.embedding?.no_batch_cap,
+        `${id} should declare no_batch_cap: true`,
+      ).toBe(true);
+    }
+  });
+
+  test('configureGateway does NOT warn for ollama/litellm/llama-server', () => {
+    warnSpy.mockClear();
+    resetGateway();
+    configureGateway({ env: {} });
+    const messages = warnSpy.mock.calls.map(c => String(c[0] ?? ''));
+    for (const id of ['ollama', 'litellm', 'llama-server']) {
+      expect(
+        messages.some(m => m.includes(`"${id}"`)),
+        `should NOT warn for ${id}`,
+      ).toBe(false);
+    }
+  });
+
+  test('configureGateway STILL warns for google (real provider, no cap declared)', () => {
+    warnSpy.mockClear();
+    resetGateway();
+    configureGateway({ env: {} });
+    const messages = warnSpy.mock.calls.map(c => String(c[0] ?? ''));
+    expect(
+      messages.some(m => m.includes('"google"') && m.includes('without max_batch_tokens')),
+      'google should warn (it has fixed-cap models)',
+    ).toBe(true);
+  });
+
+  test('every recipe with empty models[] declares user_provided_models OR has openai-fast-path', () => {
+    // Cross-cutting invariant: contracts should not silently disagree.
+    for (const r of listRecipes()) {
+      const e = r.touchpoints.embedding;
+      if (!e) continue;
+      if (e.models.length === 0) {
+        expect(
+          e.user_provided_models === true || r.id === 'litellm',
+          `${r.id} has empty models[] — must declare user_provided_models: true`,
+        ).toBe(true);
+      }
+    }
+  });
+});
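The suppression test above pins behavior, not implementation. A sketch of a once-per-process guard consistent with what it observes — NOT the real gateway.ts, just one plausible shape of the rule:

```ts
// Warn once per process for recipes that declare embedding models but
// neither max_batch_tokens nor the explicit no_batch_cap opt-out.
const warned = new Set<string>();

interface EmbeddingCapShape {
  max_batch_tokens?: number;
  no_batch_cap?: true;
}

function maybeWarnMissingBatchCap(id: string, embedding?: EmbeddingCapShape): void {
  if (!embedding || embedding.max_batch_tokens || embedding.no_batch_cap) return;
  if (warned.has(id)) return; // once-per-process
  warned.add(id);
  console.warn(`recipe "${id}" ships without max_batch_tokens; relying on recursive halving`);
}
```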
diff --git a/test/ai/recipe-azure-openai.test.ts b/test/ai/recipe-azure-openai.test.ts
new file mode 100644
index 000000000..1bd4dcd07
--- /dev/null
+++ b/test/ai/recipe-azure-openai.test.ts
@@ -0,0 +1,202 @@
+/**
+ * Azure OpenAI recipe smoke (Commit 8 of the v0.32 wave).
+ *
+ * Azure is the first recipe to exercise BOTH new seams:
+ *   - resolveAuth → custom header (api-key, NOT Authorization Bearer)
+ *   - resolveOpenAICompatConfig → templated baseURL + fetch wrapper that
+ *     splices `?api-version=` onto every request
+ *
+ * Coverage:
+ *   - Recipe registered with expected shape
+ *   - resolveAuth returns api-key header; missing key → AIConfigError
+ *   - resolveOpenAICompatConfig templates baseURL from endpoint + deployment
+ *   - resolveOpenAICompatConfig throws when endpoint or deployment missing
+ *   - fetch wrapper splices api-version query param (default + override)
+ *   - applyResolveAuth puts the key in headers (NOT apiKey, no double-auth)
+ *   - applyOpenAICompatConfig honors the recipe override
+ */
+
+import { describe, expect, test } from 'bun:test';
+import { getRecipe } from '../../src/core/ai/recipes/index.ts';
+import {
+  applyResolveAuth,
+  applyOpenAICompatConfig,
+} from '../../src/core/ai/gateway.ts';
+import { AIConfigError } from '../../src/core/ai/errors.ts';
+
+const FULL_ENV = {
+  AZURE_OPENAI_API_KEY: 'az-fake-key',
+  AZURE_OPENAI_ENDPOINT: 'https://my-resource.openai.azure.com',
+  AZURE_OPENAI_DEPLOYMENT: 'embed-deployment',
+};
+
+describe('recipe: azure-openai', () => {
+  test('registered with expected shape', () => {
+    const r = getRecipe('azure-openai');
+    expect(r).toBeDefined();
+    expect(r!.id).toBe('azure-openai');
+    expect(r!.tier).toBe('openai-compat');
+    expect(r!.implementation).toBe('openai-compatible');
+    expect(r!.base_url_default).toBeUndefined(); // env-templated only
+    expect(r!.auth_env?.required).toEqual([
+      'AZURE_OPENAI_API_KEY',
+      'AZURE_OPENAI_ENDPOINT',
+      'AZURE_OPENAI_DEPLOYMENT',
+    ]);
+    expect(r!.auth_env?.optional).toContain('AZURE_OPENAI_API_VERSION');
+  });
+
+  test('embedding touchpoint declares 3 models + 1536 default + Matryoshka options', () => {
+    const r = getRecipe('azure-openai')!;
+    expect(r.touchpoints.embedding).toBeDefined();
+    expect(r.touchpoints.embedding!.models).toEqual([
+      'text-embedding-3-large',
+      'text-embedding-3-small',
+      'text-embedding-ada-002',
+    ]);
+    expect(r.touchpoints.embedding!.default_dims).toBe(1536);
+    expect(r.touchpoints.embedding!.dims_options).toContain(3072);
+  });
+
+  test('resolveAuth returns api-key header (NOT Authorization Bearer)', () => {
+    const r = getRecipe('azure-openai')!;
+    const auth = r.resolveAuth!({ AZURE_OPENAI_API_KEY: 'az-fake-key' });
+    expect(auth.headerName).toBe('api-key');
+    expect(auth.token).toBe('az-fake-key');
+    expect(auth.token).not.toContain('Bearer'); // critical: no Bearer prefix
+  });
+
+  test('resolveAuth throws AIConfigError when AZURE_OPENAI_API_KEY missing', () => {
+    const r = getRecipe('azure-openai')!;
+    expect(() => r.resolveAuth!({})).toThrow(AIConfigError);
+  });
+
+  test('applyResolveAuth puts the key in headers (NOT apiKey) — no double-auth', () => {
+    const r = getRecipe('azure-openai')!;
+    const result = applyResolveAuth(r, { env: FULL_ENV } as any, 'embedding');
+    expect(result.apiKey, 'apiKey must be undefined to avoid double-auth').toBeUndefined();
+    expect(result.headers).toEqual({ 'api-key': 'az-fake-key' });
+  });
+
+  test('resolveOpenAICompatConfig templates baseURL from endpoint + deployment', () => {
+    const r = getRecipe('azure-openai')!;
+    const cfg = r.resolveOpenAICompatConfig!(FULL_ENV);
+    expect(cfg.baseURL).toBe(
+      'https://my-resource.openai.azure.com/openai/deployments/embed-deployment',
+    );
+    expect(typeof cfg.fetch).toBe('function');
+  });
+
+  test('resolveOpenAICompatConfig strips trailing slash from endpoint', () => {
+    const r = getRecipe('azure-openai')!;
+    const cfg = r.resolveOpenAICompatConfig!({
+      ...FULL_ENV,
+      AZURE_OPENAI_ENDPOINT: 'https://my-resource.openai.azure.com/',
+    });
+    expect(cfg.baseURL).toBe(
+      'https://my-resource.openai.azure.com/openai/deployments/embed-deployment',
+    );
+  });
+
+  test('resolveOpenAICompatConfig throws when endpoint or deployment missing', () => {
+    const r = getRecipe('azure-openai')!;
+    expect(() =>
+      r.resolveOpenAICompatConfig!({
+        AZURE_OPENAI_API_KEY: 'k',
+        AZURE_OPENAI_DEPLOYMENT: 'd',
+      }),
+    ).toThrow(AIConfigError);
+    expect(() =>
+      r.resolveOpenAICompatConfig!({
+        AZURE_OPENAI_API_KEY: 'k',
+        AZURE_OPENAI_ENDPOINT: 'https://x.openai.azure.com',
+      }),
+    ).toThrow(AIConfigError);
+  });
+
+  test('fetch wrapper splices ?api-version=... onto every request URL (default version)', async () => {
+    const r = getRecipe('azure-openai')!;
+    const cfg = r.resolveOpenAICompatConfig!(FULL_ENV);
+    const wrapped = cfg.fetch!;
+    // Stub global fetch to capture the URL the wrapper hands off.
+    const captured: string[] = [];
+    const realFetch = globalThis.fetch;
+    globalThis.fetch = ((input: any, _init?: any) => {
+      const url = typeof input === 'string' ? input : input instanceof URL ? input.toString() : input.url;
+      captured.push(url);
+      return Promise.resolve(new Response('{}', { status: 200 }));
+    }) as typeof fetch;
+    try {
+      await wrapped('https://my-resource.openai.azure.com/openai/deployments/embed-deployment/embeddings');
+      expect(captured).toHaveLength(1);
+      expect(captured[0]).toContain('api-version=');
+      expect(captured[0]).toContain('2024-10-21'); // DEFAULT_API_VERSION
+    } finally {
+      globalThis.fetch = realFetch;
+    }
+  });
+
+  test('fetch wrapper honors AZURE_OPENAI_API_VERSION override', async () => {
+    const r = getRecipe('azure-openai')!;
+    const cfg = r.resolveOpenAICompatConfig!({
+      ...FULL_ENV,
+      AZURE_OPENAI_API_VERSION: '2025-04-01',
+    });
+    const captured: string[] = [];
+    const realFetch = globalThis.fetch;
+    globalThis.fetch = ((input: any) => {
+      const url = typeof input === 'string' ? input : input instanceof URL ? input.toString() : input.url;
+      captured.push(url);
+      return Promise.resolve(new Response('{}', { status: 200 }));
+    }) as typeof fetch;
+    try {
+      await cfg.fetch!('https://my-resource.openai.azure.com/openai/deployments/embed-deployment/embeddings');
+      expect(captured[0]).toContain('api-version=2025-04-01');
+    } finally {
+      globalThis.fetch = realFetch;
+    }
+  });
+
+  test('fetch wrapper does NOT double-add api-version when caller already set it', async () => {
+    const r = getRecipe('azure-openai')!;
+    const cfg = r.resolveOpenAICompatConfig!(FULL_ENV);
+    const captured: string[] = [];
+    const realFetch = globalThis.fetch;
+    globalThis.fetch = ((input: any) => {
+      const url = typeof input === 'string' ? input : input instanceof URL ? input.toString() : input.url;
+      captured.push(url);
+      return Promise.resolve(new Response('{}', { status: 200 }));
+    }) as typeof fetch;
+    try {
+      await cfg.fetch!('https://my-resource.openai.azure.com/openai/deployments/embed-deployment/embeddings?api-version=2025-01-01');
+      expect(captured[0]).toBe(
+        'https://my-resource.openai.azure.com/openai/deployments/embed-deployment/embeddings?api-version=2025-01-01',
+      );
+    } finally {
+      globalThis.fetch = realFetch;
+    }
+  });
+
+  test('applyOpenAICompatConfig honors the recipe override (templated URL)', () => {
+    const r = getRecipe('azure-openai')!;
+    const result = applyOpenAICompatConfig(r, { env: FULL_ENV } as any);
+    expect(result.baseURL).toBe(
+      'https://my-resource.openai.azure.com/openai/deployments/embed-deployment',
+    );
+    expect(typeof result.fetch).toBe('function');
+  });
+
+  test('dimsProviderOptions threads dimensions for text-embedding-3-* via openai-compat', async () => {
+    // Codex finding #1: Azure (openai-compatible) was missing dim
+    // passthrough for text-embedding-3-large. Without `dimensions`, Azure
+    // returns 3072d; gbrain config expects 1536d → first embed hard-fails.
+    const { dimsProviderOptions } = await import('../../src/core/ai/dims.ts');
+    expect(dimsProviderOptions('openai-compatible', 'text-embedding-3-large', 1536))
+      .toEqual({ openaiCompatible: { dimensions: 1536 } });
+    expect(dimsProviderOptions('openai-compatible', 'text-embedding-3-small', 512))
+      .toEqual({ openaiCompatible: { dimensions: 512 } });
+    // ada-002 has no dimensions knob; recipe must accept the native 1536.
+    expect(dimsProviderOptions('openai-compatible', 'text-embedding-ada-002', 1536))
+      .toBeUndefined();
+  });
+});
diff --git a/test/ai/recipe-dashscope.test.ts b/test/ai/recipe-dashscope.test.ts
new file mode 100644
index 000000000..2d1dac288
--- /dev/null
+++ b/test/ai/recipe-dashscope.test.ts
@@ -0,0 +1,71 @@
+/**
+ * DashScope (Alibaba) recipe smoke (Commit 6 of the v0.32 wave).
+ */
+
+import { describe, expect, test } from 'bun:test';
+import { getRecipe } from '../../src/core/ai/recipes/index.ts';
+import { defaultResolveAuth } from '../../src/core/ai/gateway.ts';
+import { AIConfigError } from '../../src/core/ai/errors.ts';
+
+describe('recipe: dashscope', () => {
+  test('registered with expected shape', () => {
+    const r = getRecipe('dashscope');
+    expect(r).toBeDefined();
+    expect(r!.id).toBe('dashscope');
+    expect(r!.tier).toBe('openai-compat');
+    expect(r!.implementation).toBe('openai-compatible');
+    expect(r!.base_url_default).toBe(
+      'https://dashscope-intl.aliyuncs.com/compatible-mode/v1',
+    );
+    expect(r!.auth_env?.required).toEqual(['DASHSCOPE_API_KEY']);
+  });
+
+  test('embedding touchpoint declares text-embedding-v3 first + 1024 dims', () => {
+    const r = getRecipe('dashscope')!;
+    expect(r.touchpoints.embedding).toBeDefined();
+    expect(r.touchpoints.embedding!.models[0]).toBe('text-embedding-v3');
+    expect(r.touchpoints.embedding!.models).toContain('text-embedding-v2');
+    expect(r.touchpoints.embedding!.default_dims).toBe(1024);
+    expect(r.touchpoints.embedding!.dims_options).toEqual([64, 128, 256, 512, 768, 1024]);
+    // Matryoshka: every dims option ≤ 2000 (HNSW-compatible).
+    for (const d of r.touchpoints.embedding!.dims_options ?? []) {
+      expect(d).toBeLessThanOrEqual(2000);
+    }
+  });
+
+  test('default auth: DASHSCOPE_API_KEY set → "Bearer <key>"', () => {
+    const r = getRecipe('dashscope')!;
+    const auth = defaultResolveAuth(
+      r,
+      { DASHSCOPE_API_KEY: 'sk-dashscope-fake' },
+      'embedding',
+    );
+    expect(auth.headerName).toBe('Authorization');
+    expect(auth.token).toBe('Bearer sk-dashscope-fake');
+  });
+
+  test('default auth: missing DASHSCOPE_API_KEY → AIConfigError', () => {
+    const r = getRecipe('dashscope')!;
+    expect(() => defaultResolveAuth(r, {}, 'embedding')).toThrow(AIConfigError);
+  });
+
+  test('declares chars_per_token + max_batch_tokens for safer batching', () => {
+    const r = getRecipe('dashscope')!;
+    expect(r.touchpoints.embedding!.max_batch_tokens).toBeGreaterThan(0);
+    expect(r.touchpoints.embedding!.chars_per_token).toBeGreaterThan(0);
+  });
+
+  test('dimsProviderOptions threads dimensions for text-embedding-v3 (Matryoshka)', async () => {
+    // Codex finding #1: DashScope text-embedding-v3 is Matryoshka 64-1024.
+    // Without `dimensions` on the wire, user-selected non-default dims are
+    // silently ignored and the provider returns its default size.
+    const { dimsProviderOptions } = await import('../../src/core/ai/dims.ts');
+    expect(dimsProviderOptions('openai-compatible', 'text-embedding-v3', 512))
+      .toEqual({ openaiCompatible: { dimensions: 512 } });
+    expect(dimsProviderOptions('openai-compatible', 'text-embedding-v3', 1024))
+      .toEqual({ openaiCompatible: { dimensions: 1024 } });
+    // text-embedding-v2 is fixed-dim; no passthrough.
+    expect(dimsProviderOptions('openai-compatible', 'text-embedding-v2', 1024))
+      .toBeUndefined();
+  });
+});
diff --git a/test/ai/recipe-llama-server.test.ts b/test/ai/recipe-llama-server.test.ts
new file mode 100644
index 000000000..58775142c
--- /dev/null
+++ b/test/ai/recipe-llama-server.test.ts
@@ -0,0 +1,82 @@
+/**
+ * llama-server recipe smoke (Commit 4 of the v0.32 wave).
+ *
+ * llama-server is the second user-driven-models recipe (alongside
+ * litellm-proxy). It declares `models: []`, `user_provided_models: true`,
+ * and a `probe()` that consults LLAMA_SERVER_BASE_URL.
+ *
+ * Coverage:
+ *   - Recipe registered + has expected fields
+ *   - user_provided_models is the explicit signal (not the legacy id heuristic)
+ *   - probe is callable and reports `ready: false` with a setup hint when no server is listening
+ *   - default auth resolves to "Bearer unauthenticated" (or the API key if set)
+ */
+
+import { describe, expect, test } from 'bun:test';
+import { getRecipe } from '../../src/core/ai/recipes/index.ts';
+import { defaultResolveAuth } from '../../src/core/ai/gateway.ts';
+import { withEnv } from '../helpers/with-env.ts';
+
+describe('recipe: llama-server', () => {
+  test('registered with expected shape', () => {
+    const r = getRecipe('llama-server');
+    expect(r).toBeDefined();
+    expect(r!.id).toBe('llama-server');
+    expect(r!.tier).toBe('openai-compat');
+    expect(r!.implementation).toBe('openai-compatible');
+    expect(r!.base_url_default).toBe('http://localhost:8080/v1');
+    expect(r!.auth_env?.required ?? []).toEqual([]);
+    expect(r!.auth_env?.optional ?? []).toContain('LLAMA_SERVER_BASE_URL');
+    expect(r!.auth_env?.optional ?? []).toContain('LLAMA_SERVER_API_KEY');
+  });
+
+  test('embedding touchpoint declares user_provided_models', () => {
+    const r = getRecipe('llama-server')!;
+    expect(r.touchpoints.embedding).toBeDefined();
+    expect(r.touchpoints.embedding!.models).toEqual([]);
+    expect(r.touchpoints.embedding!.user_provided_models).toBe(true);
+    expect(r.touchpoints.embedding!.default_dims).toBe(0);
+  });
+
+  test('declares a probe function', () => {
+    const r = getRecipe('llama-server')!;
+    expect(typeof r.probe).toBe('function');
+  });
+
+  test('probe returns ready=false with hint when no server listening on default port', async () => {
+    // Use a guaranteed-unreachable port. withEnv ensures the prior value
+    // (if any) is restored after the test, including across the
+    // shared-process parallel test runner.
+    await withEnv({ LLAMA_SERVER_BASE_URL: 'http://127.0.0.1:1/v1' }, async () => {
+      const r = getRecipe('llama-server')!;
+      const result = await r.probe!();
+      expect(result.ready).toBe(false);
+      expect(result.hint).toBeDefined();
+      expect(result.hint!.toLowerCase()).toContain('llama-server');
+    });
+  });
+
+  test('default auth: no env → "Bearer unauthenticated"', () => {
+    const r = getRecipe('llama-server')!;
+    const auth = defaultResolveAuth(r, {}, 'embedding');
+    expect(auth.headerName).toBe('Authorization');
+    expect(auth.token).toBe('Bearer unauthenticated');
+  });
+
+  test('default auth: LLAMA_SERVER_API_KEY set → "Bearer <key>"', () => {
+    const r = getRecipe('llama-server')!;
+    const auth = defaultResolveAuth(r, { LLAMA_SERVER_API_KEY: 'sk-llama-fake' }, 'embedding');
+    expect(auth.headerName).toBe('Authorization');
+    expect(auth.token).toBe('Bearer sk-llama-fake');
+  });
+
+  test('default auth: LLAMA_SERVER_BASE_URL alone does NOT become the Bearer (URL-shaped optional)', () => {
+    const r = getRecipe('llama-server')!;
+    const auth = defaultResolveAuth(
+      r,
+      { LLAMA_SERVER_BASE_URL: 'http://my-llama:8080/v1' },
+      'embedding',
+    );
+    expect(auth.token).toBe('Bearer unauthenticated');
+  });
+});
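The `withEnv` helper imported above lives in `test/helpers/with-env.ts`, which isn't part of this diff. A minimal sketch consistent with its call site (an assumed implementation, shown for readers following along, not the shipped helper):

```ts
// Set vars, run the body, then restore prior values — including deleting
// vars that were previously unset — even if the body throws.
export async function withEnv(
  vars: Record<string, string>,
  body: () => Promise<void>,
): Promise<void> {
  const saved = new Map<string, string | undefined>();
  for (const [k, v] of Object.entries(vars)) {
    saved.set(k, process.env[k]);
    process.env[k] = v;
  }
  try {
    await body();
  } finally {
    for (const [k, v] of saved) {
      if (v === undefined) delete process.env[k];
      else process.env[k] = v;
    }
  }
}
```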
diff --git a/test/ai/recipe-minimax.test.ts b/test/ai/recipe-minimax.test.ts
new file mode 100644
index 000000000..96f6c9eda
--- /dev/null
+++ b/test/ai/recipe-minimax.test.ts
@@ -0,0 +1,59 @@
+/**
+ * MiniMax recipe smoke (Commit 5 of the v0.32 wave).
+ *
+ * Coverage:
+ *   - Recipe registered with expected shape
+ *   - default auth: MINIMAX_API_KEY → "Bearer <key>"; missing → AIConfigError
+ *   - dimsProviderOptions threads `type: 'db'` for embo-01 (the asymmetric
+ *     retrieval field default) — pins the v1 indexing-only behavior
+ */
+
+import { describe, expect, test } from 'bun:test';
+import { getRecipe } from '../../src/core/ai/recipes/index.ts';
+import { defaultResolveAuth } from '../../src/core/ai/gateway.ts';
+import { dimsProviderOptions } from '../../src/core/ai/dims.ts';
+import { AIConfigError } from '../../src/core/ai/errors.ts';
+
+describe('recipe: minimax', () => {
+  test('registered with expected shape', () => {
+    const r = getRecipe('minimax');
+    expect(r).toBeDefined();
+    expect(r!.id).toBe('minimax');
+    expect(r!.tier).toBe('openai-compat');
+    expect(r!.implementation).toBe('openai-compatible');
+    expect(r!.base_url_default).toBe('https://api.minimaxi.com/v1');
+    expect(r!.auth_env?.required).toEqual(['MINIMAX_API_KEY']);
+    expect(r!.auth_env?.optional).toContain('MINIMAX_GROUP_ID');
+  });
+
+  test('embedding touchpoint declares embo-01 + 1536 dims', () => {
+    const r = getRecipe('minimax')!;
+    expect(r.touchpoints.embedding).toBeDefined();
+    expect(r.touchpoints.embedding!.models).toEqual(['embo-01']);
+    expect(r.touchpoints.embedding!.default_dims).toBe(1536);
+    expect(r.touchpoints.embedding!.user_provided_models ?? false).toBe(false);
+    expect(r.touchpoints.embedding!.max_batch_tokens).toBe(4096);
+  });
+
+  test('default auth: MINIMAX_API_KEY set → "Bearer <key>"', () => {
+    const r = getRecipe('minimax')!;
+    const auth = defaultResolveAuth(r, { MINIMAX_API_KEY: 'fake-mm-key' }, 'embedding');
+    expect(auth.headerName).toBe('Authorization');
+    expect(auth.token).toBe('Bearer fake-mm-key');
+  });
+
+  test('default auth: missing MINIMAX_API_KEY → AIConfigError', () => {
+    const r = getRecipe('minimax')!;
+    expect(() => defaultResolveAuth(r, {}, 'embedding')).toThrow(AIConfigError);
+  });
+
+  test('dimsProviderOptions threads type:db for embo-01', () => {
+    const opts = dimsProviderOptions('openai-compatible', 'embo-01', 1536);
+    expect(opts).toEqual({ openaiCompatible: { type: 'db' } });
+  });
+
+  test('dimsProviderOptions returns undefined for non-MiniMax openai-compat models', () => {
+    expect(dimsProviderOptions('openai-compatible', 'voyage-3-lite', 512)).toBeUndefined();
+    expect(dimsProviderOptions('openai-compatible', 'nomic-embed-text', 768)).toBeUndefined();
+  });
+});
diff --git a/test/ai/recipe-zhipu.test.ts b/test/ai/recipe-zhipu.test.ts
new file mode 100644
index 000000000..dfb5f112a
--- /dev/null
+++ b/test/ai/recipe-zhipu.test.ts
@@ -0,0 +1,85 @@
+/**
+ * Zhipu AI (BigModel) recipe smoke (Commit 7 of the v0.32 wave).
+ *
+ * Coverage:
+ *   - Recipe registered with expected shape
+ *   - default auth: ZHIPUAI_API_KEY → "Bearer <key>"; missing → AIConfigError
+ *   - dims_options exposes [256, 512, 1024, 2048]; default 1024 (HNSW-compatible)
+ *   - 2048-dim path falls into exact-scan branch via chunkEmbeddingIndexSql
+ *     from src/core/vector-index.ts
+ */
+
+import { describe, expect, test } from 'bun:test';
+import { getRecipe } from '../../src/core/ai/recipes/index.ts';
+import { defaultResolveAuth } from '../../src/core/ai/gateway.ts';
+import { AIConfigError } from '../../src/core/ai/errors.ts';
+import {
+  PGVECTOR_HNSW_VECTOR_MAX_DIMS,
+  chunkEmbeddingIndexSql,
+} from '../../src/core/vector-index.ts';
+
+describe('recipe: zhipu', () => {
+  test('registered with expected shape', () => {
+    const r = getRecipe('zhipu');
+    expect(r).toBeDefined();
+    expect(r!.id).toBe('zhipu');
+    expect(r!.tier).toBe('openai-compat');
+    expect(r!.implementation).toBe('openai-compatible');
+    expect(r!.base_url_default).toBe('https://open.bigmodel.cn/api/paas/v4');
+    expect(r!.auth_env?.required).toEqual(['ZHIPUAI_API_KEY']);
+  });
+
+  test('embedding touchpoint declares embedding-3 first + 1024 dims (HNSW-compatible default)', () => {
+    const r = getRecipe('zhipu')!;
+    expect(r.touchpoints.embedding).toBeDefined();
+    expect(r.touchpoints.embedding!.models[0]).toBe('embedding-3');
+    expect(r.touchpoints.embedding!.models).toContain('embedding-2');
+    expect(r.touchpoints.embedding!.default_dims).toBe(1024);
+    expect(r.touchpoints.embedding!.dims_options).toEqual([256, 512, 1024, 2048]);
+    // The default must stay HNSW-compatible.
+    expect(r.touchpoints.embedding!.default_dims).toBeLessThanOrEqual(
+      PGVECTOR_HNSW_VECTOR_MAX_DIMS,
+    );
+  });
+
+  test('default auth: ZHIPUAI_API_KEY set → "Bearer <key>"', () => {
+    const r = getRecipe('zhipu')!;
+    const auth = defaultResolveAuth(r, { ZHIPUAI_API_KEY: 'fake-zhipu-key' }, 'embedding');
+    expect(auth.headerName).toBe('Authorization');
+    expect(auth.token).toBe('Bearer fake-zhipu-key');
+  });
+
+  test('default auth: missing ZHIPUAI_API_KEY → AIConfigError', () => {
+    const r = getRecipe('zhipu')!;
+    expect(() => defaultResolveAuth(r, {}, 'embedding')).toThrow(AIConfigError);
+  });
+
+  test('2048-dim option from dims_options falls into exact-scan branch', () => {
+    // 2048d exceeds the HNSW cap, so chunkEmbeddingIndexSql returns the
+    // exact-scan-skip-index path. Users picking 2048 trade ANN speed for
+    // full embedding fidelity.
+    const sql = chunkEmbeddingIndexSql(2048);
+    expect(sql.toLowerCase()).toContain('skipped');
+    expect(sql.toLowerCase()).toContain('hnsw');
+  });
+
+  test('1024-dim default returns the HNSW index SQL (fast path)', () => {
+    const sql = chunkEmbeddingIndexSql(1024);
+    expect(sql.toLowerCase()).toContain('create index');
+    expect(sql.toLowerCase()).toContain('hnsw');
+  });
+
+  test('dimsProviderOptions threads dimensions for embedding-3 (Matryoshka)', async () => {
+    // Codex finding #1: Zhipu embedding-3 is Matryoshka 256-2048. Without
+    // `dimensions` on the wire, user-selected non-default dims are
+    // silently ignored.
+    const { dimsProviderOptions } = await import('../../src/core/ai/dims.ts');
+    expect(dimsProviderOptions('openai-compatible', 'embedding-3', 1024))
+      .toEqual({ openaiCompatible: { dimensions: 1024 } });
+    expect(dimsProviderOptions('openai-compatible', 'embedding-3', 2048))
+      .toEqual({ openaiCompatible: { dimensions: 2048 } });
+    // embedding-2 is fixed-dim; no passthrough.
+    expect(dimsProviderOptions('openai-compatible', 'embedding-2', 1024))
+      .toBeUndefined();
+  });
+});
diff --git a/test/ai/recipes-existing-regression.test.ts b/test/ai/recipes-existing-regression.test.ts
new file mode 100644
index 000000000..567c1c94e
--- /dev/null
+++ b/test/ai/recipes-existing-regression.test.ts
@@ -0,0 +1,196 @@
+/**
+ * IRON RULE regression test (D2/D12=A): the v0.32 resolveAuth refactor
+ * MUST NOT change auth behavior for any of the 9 existing recipes
+ * (openai, anthropic, google, deepseek, groq, ollama, litellm-proxy,
+ * together, voyage).
+ *
+ * Pre-v0.32, openai-compatible auth was duplicated 3 times in gateway.ts
+ * with subtle drift; D12=A unified all three through Recipe.resolveAuth?
+ * with a default that covers existing recipes unchanged. This test pins
+ * the contract so the next refactor can't silently regress it.
+ *
+ * Coverage:
+ *   - defaultResolveAuth returns Authorization Bearer <key> when required[0] is set
+ *   - throws AIConfigError when required env is missing (with recipe name + touchpoint in message)
+ *   - falls back to first present optional env when required is empty (Ollama-style)
+ *   - falls back to 'unauthenticated' when neither required nor optional present
+ *   - applyResolveAuth converts Authorization Bearer <key> to {apiKey} (SDK native)
+ *   - applyResolveAuth converts custom headers to {headers} WITHOUT apiKey (no double-auth)
+ *   - all 3 touchpoints (embedding, expansion, chat) produce identical auth shape for the same recipe+env
+ *   - native recipes (openai, anthropic, google) are not consulted via resolveAuth (they use their AI-SDK adapters directly)
+ */
+
+import { describe, expect, test } from 'bun:test';
+import { defaultResolveAuth, applyResolveAuth } from '../../src/core/ai/gateway.ts';
+import { listRecipes, getRecipe } from '../../src/core/ai/recipes/index.ts';
+import { AIConfigError } from '../../src/core/ai/errors.ts';
+import type { Recipe } from '../../src/core/ai/types.ts';
+
+const TOUCHPOINTS: Array<'embedding' | 'expansion' | 'chat'> = ['embedding', 'expansion', 'chat'];
+
+describe('IRON RULE: existing 9 recipes survive the v0.32 resolveAuth refactor', () => {
+  test('all 9 baseline recipes are still registered (subset, allows post-v0.32 additions)', () => {
+    const ids = new Set(listRecipes().map(r => r.id));
+    for (const baseline of [
+      'anthropic',
+      'deepseek',
+      'google',
+      'groq',
+      'litellm',
+      'ollama',
+      'openai',
+      'together',
+      'voyage',
+    ]) {
+      expect(ids.has(baseline), `baseline recipe ${baseline} missing post-refactor`).toBe(true);
+    }
+  });
+
+  test('every recipe with a non-empty required[] returns Authorization Bearer <key>', () => {
+    for (const r of listRecipes()) {
+      const required = r.auth_env?.required ?? [];
+      if (required.length === 0) continue;
+      const env = { [required[0]]: `fake-${r.id}-key` };
+      const auth = defaultResolveAuth(r, env, 'embedding');
+      expect(auth.headerName).toBe('Authorization');
+      expect(auth.token).toBe(`Bearer fake-${r.id}-key`);
+    }
+  });
+
+  test('missing required env throws AIConfigError naming the recipe + touchpoint', () => {
+    const recipesWithRequired = listRecipes().filter(r => (r.auth_env?.required ?? []).length > 0);
+    expect(recipesWithRequired.length).toBeGreaterThan(0);
+    for (const r of recipesWithRequired) {
+      for (const tp of TOUCHPOINTS) {
+        let caught: unknown;
+        try {
+          defaultResolveAuth(r, {}, tp);
+        } catch (e) {
+          caught = e;
+        }
+        expect(caught, `${r.id} ${tp} should throw on missing env`).toBeInstanceOf(AIConfigError);
+        const msg = (caught as Error).message;
+        expect(msg).toContain(r.name);
+        expect(msg).toContain(tp);
+        expect(msg).toContain(r.auth_env!.required[0]);
+      }
+    }
+  });
+
+  test('Ollama (empty required, OLLAMA_API_KEY set) reads it as the Bearer token', () => {
+    const ollama = getRecipe('ollama');
+    expect(ollama).toBeDefined();
+    expect(ollama!.auth_env?.required ?? []).toEqual([]);
+    const optional = ollama!.auth_env?.optional ?? [];
+    expect(optional).toContain('OLLAMA_API_KEY');
+    // OLLAMA_API_KEY (a non-URL-shaped optional) becomes the Bearer.
+    const auth = defaultResolveAuth(ollama!, { OLLAMA_API_KEY: 'fake-token' }, 'embedding');
+    expect(auth.headerName).toBe('Authorization');
+    expect(auth.token).toBe('Bearer fake-token');
+  });
+
+  test('Ollama (no env at all) falls back to "Bearer unauthenticated"', () => {
+    const ollama = getRecipe('ollama');
+    const auth = defaultResolveAuth(ollama!, {}, 'embedding');
+    expect(auth.headerName).toBe('Authorization');
+    expect(auth.token).toBe('Bearer unauthenticated');
+  });
+
+  test('URL-shaped optional env (OLLAMA_BASE_URL, LLAMA_SERVER_BASE_URL) does NOT become the Bearer token', () => {
+    // Regression for the v0.32 default-fallback design: optional entries
+    // ending in _URL or _BASE_URL are config (cfg.base_urls), not auth.
+    // The fallback must skip them and consult the next optional API-key entry.
+    const ollama = getRecipe('ollama');
+    const auth1 = defaultResolveAuth(
+      ollama!,
+      { OLLAMA_BASE_URL: 'http://my-ollama/v1' },
+      'embedding',
+    );
+    expect(auth1.token, 'OLLAMA_BASE_URL must not become Bearer token').toBe('Bearer unauthenticated');
+
+    // When BOTH BASE_URL and API_KEY are set, the API_KEY wins.
+    const auth2 = defaultResolveAuth(
+      ollama!,
+      { OLLAMA_BASE_URL: 'http://my-ollama/v1', OLLAMA_API_KEY: 'real-key' },
+      'embedding',
+    );
+    expect(auth2.token).toBe('Bearer real-key');
+  });
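The URL-shaped-optional rule those two Ollama tests pin can be summarized in a few lines. A sketch of the selection rule only (NOT the real `defaultResolveAuth`; missing-required error handling and the Bearer/header assembly are elided):

```ts
// required[0] wins when present; otherwise the first optional entry that
// is present AND not URL-shaped (…_URL / …_BASE_URL) supplies the token;
// otherwise fall back to the 'unauthenticated' placeholder.
function pickBearerSource(
  required: string[],
  optional: string[],
  env: Record<string, string | undefined>,
): string {
  if (required[0] && env[required[0]]) return env[required[0]]!;
  const key = optional.find(k => !/_(BASE_)?URL$/.test(k) && env[k]);
  return key ? env[key]! : 'unauthenticated';
}

// pickBearerSource([], ['OLLAMA_BASE_URL', 'OLLAMA_API_KEY'],
//   { OLLAMA_BASE_URL: 'http://my-ollama/v1' }) → 'unauthenticated'
```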
+  test('all 3 touchpoints produce identical auth for the same recipe + env', () => {
+    // Critical regression: pre-v0.32, embedding had a fallback to
+    // ${recipe.id.toUpperCase()}_API_KEY that expansion and chat lacked.
+    // Post-D12=A unification, all 3 touchpoints go through the same
+    // resolver, so the auth shape MUST match.
+    for (const r of listRecipes()) {
+      if (r.implementation !== 'openai-compatible') continue;
+      const required = r.auth_env?.required ?? [];
+      const env: Record<string, string> = {};
+      if (required.length > 0) env[required[0]] = `fake-${r.id}-key`;
+
+      const embeddingAuth = applyResolveAuth(r, { env } as any, 'embedding');
+      const expansionAuth = applyResolveAuth(r, { env } as any, 'expansion');
+      const chatAuth = applyResolveAuth(r, { env } as any, 'chat');
+
+      expect(embeddingAuth, `${r.id} embed=expand`).toEqual(expansionAuth);
+      expect(expansionAuth, `${r.id} expand=chat`).toEqual(chatAuth);
+    }
+  });
+
+  test('applyResolveAuth converts Authorization Bearer <key> to {apiKey} (SDK-native path)', () => {
+    const voyage = getRecipe('voyage')!;
+    const env = { VOYAGE_API_KEY: 'fake-voyage-key' };
+    const auth = applyResolveAuth(voyage, { env } as any, 'embedding');
+    expect(auth.apiKey).toBe('fake-voyage-key');
+    expect(auth.headers).toBeUndefined();
+  });
+
+  test('applyResolveAuth respects a recipe.resolveAuth override that returns a custom header', () => {
+    // Synthetic recipe with a custom-header resolveAuth (Azure-style preview;
+    // the actual Azure recipe lands in commit 8). Ensures the seam works.
+    const fakeAzure: Recipe = {
+      id: 'fake-azure',
+      name: 'Fake Azure',
+      tier: 'openai-compat',
+      implementation: 'openai-compatible',
+      auth_env: { required: ['FAKE_AZURE_API_KEY'] },
+      touchpoints: {},
+      resolveAuth(env) {
+        const k = env.FAKE_AZURE_API_KEY;
+        if (!k) throw new AIConfigError('Fake Azure requires FAKE_AZURE_API_KEY.');
+        return { headerName: 'api-key', token: k };
+      },
+    };
+    const env = { FAKE_AZURE_API_KEY: 'fake-key' };
+    const auth = applyResolveAuth(fakeAzure, { env } as any, 'embedding');
+    expect(auth.apiKey, 'custom-header path must NOT set apiKey').toBeUndefined();
+    expect(auth.headers).toEqual({ 'api-key': 'fake-key' });
+  });
+
+  test('native recipes have no resolveAuth declared; they take native SDK paths', () => {
+    // Confirms the architectural invariant: resolveAuth is only consulted by
+    // the openai-compatible branches in instantiate{Embedding,Expansion,Chat}.
+    // Native recipes (openai, anthropic, google) use createOpenAI /
+    // createAnthropic / createGoogleGenerativeAI directly with the SDK's
+    // own apiKey field. This test pins that resolveAuth is intentionally
+    // absent on the native recipes — a future drift that adds it without
+    // wiring it through the native branches would silently fail this assert.
+    for (const id of ['openai', 'anthropic', 'google']) {
+      const r = getRecipe(id);
+      expect(r, `recipe ${id} missing`).toBeDefined();
+      expect(r!.tier).toBe('native');
+      expect(r!.resolveAuth, `${id} should NOT declare resolveAuth in v0.32`).toBeUndefined();
+    }
+  });
+
+  test('only Azure overrides resolveAuth in v0.32 (default applies elsewhere)', () => {
+    // The default resolver covers every openai-compatible recipe except
+    // Azure, which uses the api-key custom-header path. The IRON RULE
+    // contract: any new override beyond Azure must be reviewed for
+    // double-auth + back-compat regression.
+    const overrides = listRecipes().filter(
+      r => r.implementation === 'openai-compatible' && r.resolveAuth,
+    );
+    expect(overrides.map(r => r.id).sort()).toEqual(['azure-openai']);
+  });
+});
diff --git a/test/brain-registry.serial.test.ts b/test/brain-registry.serial.test.ts
index 83eff6b60..fae4e6154 100644
--- a/test/brain-registry.serial.test.ts
+++ b/test/brain-registry.serial.test.ts
@@ -270,12 +270,29 @@ describe('BrainRegistry — lazy init', () => {
     // verify the routing logic by observing the default-branch path. This
     // test proves the fall-through to HOST_BRAIN_ID happens before any
     // lookup, not that host init actually succeeds.
-    const reg = new BrainRegistry([]);
-    // Expect the host-init path to be attempted (it'll fail on missing
-    // ~/.gbrain/config.json in test env, but the error will come from
-    // initHostBrain, not UnknownBrainError — proving routing hit host).
-    await expect(reg.getBrain(null)).rejects.not.toBeInstanceOf(UnknownBrainError);
-    await expect(reg.getBrain(undefined)).rejects.not.toBeInstanceOf(UnknownBrainError);
-    await expect(reg.getBrain('')).rejects.not.toBeInstanceOf(UnknownBrainError);
+    //
+    // Hermeticity: dev machines often have a real ~/.gbrain/config.json
+    // (the maintainer's own brain). Without GBRAIN_HOME isolation, the
+    // host-init path RESOLVES successfully on those machines instead of
+    // rejecting, breaking the `rejects.not.toBeInstanceOf` assertion. Pin
+    // GBRAIN_HOME to a guaranteed-empty tempdir so host-init has nothing
+    // to find and fails loudly (which is exactly the error the assertion
+    // wants — not UnknownBrainError, but ALSO not a successful resolve).
+    const isolatedHome = mkdtempSync(join(tmpdir(), 'brain-registry-home-'));
+    track(isolatedHome);
+    const savedHome = process.env.GBRAIN_HOME;
+    process.env.GBRAIN_HOME = isolatedHome;
+    try {
+      const reg = new BrainRegistry([]);
+      // Expect the host-init path to be attempted (it'll fail on missing
+      // <GBRAIN_HOME>/.gbrain/config.json, but the error will come from
+      // initHostBrain, not UnknownBrainError — proving routing hit host).
+      await expect(reg.getBrain(null)).rejects.not.toBeInstanceOf(UnknownBrainError);
+      await expect(reg.getBrain(undefined)).rejects.not.toBeInstanceOf(UnknownBrainError);
+      await expect(reg.getBrain('')).rejects.not.toBeInstanceOf(UnknownBrainError);
+    } finally {
+      if (savedHome !== undefined) process.env.GBRAIN_HOME = savedHome;
+      else delete process.env.GBRAIN_HOME;
+    }
   });
 });