fix(inference): support request-level provider api key by Vaibhav701161 · Pull Request #3586 · maximhq/bifrost

Vaibhav701161 · 2026-05-19T08:16:53Z

Summary

Adds request-level BYOK support for provider API keys through x-bf-provider-api-key.

The goal is to let a request use a customer-provided provider key without storing it in Bifrost and without bypassing the normal provider/model/key selection flow. Bifrost still selects an eligible configured key first, then only overrides the selected key value for that single request.

Changes

Added x-bf-provider-api-key as a request context key.
Read the header from HTTP requests and store it only in the request context.
Override only the selected key value after normal key selection is complete.
Preserved selected key metadata like ID, name, model filtering, and key eligibility.
Added x-bf-provider-api-key to security deny lists so it cannot be forwarded through extra headers.
Added x-bf-provider-api-key to the security header filtering paths as well.

Main trade-off: usage/key attribution still points to the selected configured key, since only the secret value is replaced for the request. This keeps the change small and avoids introducing customer key storage.
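The select-then-override flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Key` struct, `selectKey`, and `resolveKey` are hypothetical names, and the selection logic is reduced to a first-match model filter.

```go
package main

import "fmt"

// Key is a hypothetical stand-in for a configured provider key.
// The field names are assumptions, not Bifrost's actual schema.
type Key struct {
	ID     string
	Name   string
	Models []string // models this key may serve; empty means all models
	Value  string   // the secret sent to the provider
}

// selectKey is a simplified stand-in for normal key selection:
// it returns the first configured key that is eligible for the model.
func selectKey(keys []Key, model string) (Key, error) {
	for _, k := range keys {
		if len(k.Models) == 0 {
			return k, nil
		}
		for _, m := range k.Models {
			if m == model {
				return k, nil
			}
		}
	}
	return Key{}, fmt.Errorf("no configured key supports model %q", model)
}

// resolveKey runs selection first and only then swaps in the
// request-scoped secret. The selected key is a copy, so the configured
// key keeps its value and the override lasts for this call only.
func resolveKey(keys []Key, model, requestKey string) (Key, error) {
	selected, err := selectKey(keys, model)
	if err != nil {
		return Key{}, err // a request key cannot bypass eligibility checks
	}
	if requestKey != "" {
		selected.Value = requestKey // ID, Name, and Models stay untouched
	}
	return selected, nil
}

func main() {
	keys := []Key{{ID: "key-1", Name: "vendor", Models: []string{"gpt-4o-mini"}, Value: "CONFIGURED_KEY"}}

	k, err := resolveKey(keys, "gpt-4o-mini", "CUSTOMER_KEY_SHOULD_BE_USED")
	fmt.Println(k.ID, k.Value, err)

	_, err = resolveKey(keys, "unsupported-model", "CUSTOMER_KEY_SHOULD_BE_USED")
	fmt.Println(err)
}
```

Because the metadata of the selected key is preserved, attribution in this sketch would still point at `key-1`, which matches the trade-off noted above.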

Type of change

Affected areas

How to test

Build:

cd transports/bifrost-http
CGO_ENABLED=1 go build -ldflags="-w -s -X main.Version=v0.0.0-dev" -tags "sqlite_static" -o ../../tmp/bifrost-http .

Runtime validation was done with Bifrost running on 127.0.0.1:9090 and a local capture provider on 127.0.0.1:19091.

Without BYOK header:

curl -sS -X POST http://127.0.0.1:9090/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"hello"}]}' >/dev/null

cat /tmp/bifrost_byok_capture.jsonl

Expected: downstream provider receives the configured key.

With BYOK header:

curl -sS -X POST http://127.0.0.1:9090/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'x-bf-provider-api-key: CUSTOMER_KEY_SHOULD_BE_USED' \
  -d '{"model":"openai/gpt-4o-mini","messages":[{"role":"user","content":"hello"}]}' >/dev/null

cat /tmp/bifrost_byok_capture.jsonl

Expected: downstream provider receives Bearer CUSTOMER_KEY_SHOULD_BE_USED.

Removing the header again switches the request back to the configured key.

Also verified:

BYOK does not allow unconfigured providers.
BYOK does not allow unsupported models.
Customer key was not present in Bifrost logs.
Customer key was not present in response bodies.

Screenshots/Recordings

Added terminal screenshot showing:

request without x-bf-provider-api-key uses the configured vendor key
request with x-bf-provider-api-key uses the request-scoped customer key
request without the header again goes back to the configured vendor key
unsupported provider is still blocked

Breaking changes

Yes
No

Related issues

closes BF-557

Security considerations

This touches provider API key handling, so I kept the scope request-only.

Raw customer provider key is not stored.
Raw customer provider key is not written to provider config.
Normal provider/model/key eligibility still runs before the override.
x-bf-provider-api-key is blocked from extra-header forwarding paths.
Runtime check confirmed the dummy customer key did not appear in Bifrost logs or response bodies.

Checklist

I read docs/contributing/README.md and followed the guidelines
I added/updated tests where appropriate
I updated documentation where needed
I verified builds succeed (Go and UI)
I verified the CI pipeline passes locally if applicable

…filter inaccessible sidebar items (#3295) This PR improves RBAC granularity in the sidebar by introducing dedicated resource types for `APIKeys`, `Inference`, and `Metrics`, and fixes sidebar visibility logic so that items and groups are hidden when the user lacks access rather than relying on broader, less specific permissions. - Added three new `RbacResource` enum values: `APIKeys`, `Inference`, and `Metrics` to the fallback RBAC context. - The API Keys sidebar item now gates access via the new `hasAPIKeyAccess` (`RbacResource.APIKeys`) check instead of the generic `hasSettingsAccess`. - The MCP Logs sidebar item now correctly gates access via `hasMCPGatewayAccess` instead of the unrelated `hasLogsAccess`. - Introduced an `accessibleItems` memoized computation that filters out sidebar items and entire groups whose sub-items are all inaccessible, ensuring users never see empty navigation sections. Previously, access filtering only happened during search. - Removed unused imports (`PanelLeft`, `PanelRight`, `cn`). - [ ] Bug fix - [x] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs 1. Log in as a user with restricted RBAC permissions that exclude `APIKeys` and/or `Settings`. 2. Verify the API Keys entry under the Config section is hidden for users without `APIKeys` view permission. 3. Verify the MCP Logs entry is hidden for users without `MCPGateway` view permission. 4. Verify that sidebar groups with no accessible sub-items are hidden entirely rather than showing an empty group. 5. Verify that users with full access see no change in sidebar behavior. 
```sh cd ui pnpm i || npm i pnpm build || npm run build ``` _Add before/after screenshots showing sidebar items hidden for restricted users._ - [ ] Yes - [x] No _Link related issues here._ Access control checks for API Keys management are now scoped to a dedicated `APIKeys` RBAC resource rather than the broader `Settings` resource, reducing the risk of unintended access to key management for users who have settings visibility but should not manage API keys. - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…elete access (#3314) The delete button in log tables was always rendered (just disabled) for users without delete access. This PR hides the actions column entirely when the user lacks delete permissions, and fixes the RBAC resource check for MCP logs to use the correct `MCPGateway` resource instead of `Logs`. - The actions column in both the workspace logs and MCP logs tables is now conditionally included in the column definitions only when `hasDeleteAccess` is `true`, rather than always rendering a disabled button. - The delete button styling was updated to use more visible destructive colors (`text-destructive/60 border-destructive/60`) instead of the previous muted secondary foreground styles. - The RBAC resource used to gate delete access on the MCP logs page was corrected from `RbacResource.Logs` to `RbacResource.MCPGateway`. - [x] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs 1. Log in as a user **without** delete access on Logs or MCPGateway resources. 2. Navigate to the workspace logs page and the MCP logs page. 3. Verify the delete button/column is not visible. 4. Log in as a user **with** delete access. 5. Verify the delete button appears and is functional. ```sh cd ui pnpm i pnpm test pnpm build ``` Before: Delete button rendered but disabled for users without access. After: Delete column is hidden entirely for users without delete access. - [ ] Yes - [x] No The RBAC fix ensures MCP log deletion is gated on the correct `MCPGateway` resource permission, preventing users with only `Logs` delete access from incorrectly being granted delete access to MCP logs. - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…ogs route and sidebar (#3316) Introduces a dedicated `MCPLogs` RBAC resource, decoupling MCP log access control from the `MCPGateway` resource. This allows permissions for viewing and deleting MCP logs to be managed independently from gateway-level permissions. - Added `MCPLogs` as a new `RbacResource` enum value in the fallback RBAC context. - The MCP Logs route now checks `MCPLogs` view permission and renders a `NoPermissionView` when access is denied, rather than rendering the page unconditionally. - Delete access on the MCP Logs page now checks `RbacResource.MCPLogs` instead of `RbacResource.MCPGateway`. - The sidebar MCP Logs entry now uses `hasMCPLogsAccess` (derived from `RbacResource.MCPLogs`) to control visibility, rather than reusing `hasMCPGatewayAccess`. - [ ] Bug fix - [x] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs 1. Configure a role that has `MCPGateway` access but **no** `MCPLogs` access. 2. Log in as a user with that role and navigate to the MCP Logs page — the `NoPermissionView` should be displayed and the sidebar entry should be hidden. 3. Grant the role `MCPLogs` view access and confirm the page and sidebar entry become accessible. 4. Verify that delete functionality on the MCP Logs page is gated by `MCPLogs` delete permission independently of `MCPGateway` delete permission. ```sh cd ui pnpm i || npm i pnpm test || npm test pnpm build || npm run build ``` N/A - [x] Yes - [ ] No Any role configuration that previously relied on `MCPGateway` permissions to grant access to MCP Logs will need to be updated to explicitly grant `MCPLogs` permissions. N/A Access to MCP log data (which may contain sensitive tool execution details) is now enforced by a dedicated RBAC resource, reducing the risk of unintended access through overly broad `MCPGateway` permissions. 
- [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…SQL helper to prevent malformed JSON from aborting list queries (#3407)

## Summary

The `/api/logs` list query was aborting entirely when a single row contained malformed JSON in `input_history` or `responses_input_history`. The previous inline guard only checked the first character before casting to `jsonb`, so rows that appeared array-shaped but contained malformed JSON (unterminated structures, trailing commas, unpaired UTF-16 surrogates, `\u0000` escapes, etc.) would trigger a `22P02`/`22P05` error and kill the entire response. This PR fixes that by introducing a PL/pgSQL helper function (`bifrost_safe_jsonb`) that wraps the cast in an `EXCEPTION` block and falls back to returning the raw text on any parse failure.

## Changes

- Added a new migration `migrationAddSafeJsonbFunction` that installs the `bifrost_safe_jsonb(text)` PL/pgSQL function on Postgres. The function validates the input, attempts the `jsonb` cast inside an `EXCEPTION` block, and returns the last array element on success or the raw text on any failure.
- Replaced the multi-condition inline `CASE` guards in `listSelectColumns` for Postgres with calls to `bifrost_safe_jsonb`, simplifying the SQL and correctly handling all malformed-JSON edge cases that the previous character-check approach missed.
- For SQLite, added `json_valid()`, `json_type()`, and `json_array_length()` guards to the `CASE` expressions to prevent extraction attempts on invalid or empty arrays.
- Added `safe_jsonb_test.go` covering both the SQLite and Postgres dialect branches of `listSelectColumns`, as well as direct invocation of `bifrost_safe_jsonb` across all relevant edge cases (malformed structures, surrogate pairs, `\u0000` escapes, non-array values, SQL `NULL`).

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
cd framework && docker compose up -d postgres
go test ./framework/logstore/ -run 'MalformedInputHistory|BifrostSafeJsonb' -count=1 -v
```

Insert a row into the logs table with a malformed JSON value in `input_history` (e.g., `[{"key": "val"`, which is unterminated) and verify that a call to the list endpoint returns successfully without a 500 error, with the malformed row's `input_history` returned as raw text rather than aborting the query.

## Test Coverage

### `TestSearchLogs_MalformedInputHistory_{SQLite,Postgres}` (end-to-end list query)

| # | Case | Column | Payload shape | Pre-fix behavior | Path exercised |
| --- | --- | --- | --- | --- | --- |
| 1 | `unterminated_object_in_array` | `input_history` | `[{"role":"user","content":"hi"` | 22P02 aborts query | EXCEPTION fallback |
| 2 | `garbage_after_bracket` | `input_history` | `[abc, not json]` | 22P02 aborts query | EXCEPTION fallback |
| 3 | `trailing_comma` | `input_history` | `[{"role":"user","content":"hi"},]` | 22P02 aborts query | EXCEPTION fallback |
| 4 | `unclosed_array_only` | `input_history` | `[` | 22P02 aborts query | EXCEPTION fallback |
| 5 | `open_bracket_then_brace_unclosed` | `input_history` | `[{` | 22P02 aborts query | EXCEPTION fallback |
| 6 | `nan_value_not_valid_json` | `input_history` | `[NaN]` | 22P02 aborts query | EXCEPTION fallback |
| 7 | `infinity_value_not_valid_json` | `input_history` | `[Infinity]` | 22P02 aborts query | EXCEPTION fallback |
| 8 | `unpaired_high_surrogate` | `input_history` | `[{"...":"bad \uD800 surrogate"}]` | 22P05 aborts query | EXCEPTION fallback |
| 9 | `unpaired_low_surrogate` | `input_history` | `[{"...":"bad \uDC00 low"}]` | 22P05 aborts query | EXCEPTION fallback |
| 10 | `bad_surrogate_pair_high_then_ascii` | `input_history` | `[{"c":"\uD800A"}]` | 22P05 aborts query | EXCEPTION fallback |
| 11 | `u0000_escape_inside_string` | `input_history` | `[{"...":"null byte \u0000 here"}]` | 22P05 aborts query | EXCEPTION fallback |
| 12 | `literal_backslash_u0000_valid_jsonb` | `input_history` | `[{"...":"... \\u0000 literal"}]` | OK (degraded to raw by old guard) | Fast path, last-element extraction |
| 13 | `single_element_array` | `input_history` | `[{"role":"user","content":"only one"}]` | OK | Fast path |
| 14 | `array_of_primitives` | `input_history` | `[1,2,3]` | OK | Fast path |
| 15 | `array_with_null_last_element` | `input_history` | `[{...}, null]` | OK | Fast path |
| 16 | `deeply_nested_valid` | `input_history` | `[{"role":"user","content":{"nested":{"deep":{"value":42}}}}]` | OK | Fast path |
| 17 | `unicode_emoji_content` | `input_history` | `[{"...":"hello 🎉 world ✨"}]` | OK | Fast path |
| 18 | `large_valid_array` | `input_history` | 1001-element array | OK | Fast path at scale |
| 19 | `leading_whitespace_then_array` | `input_history` | ` [\t{...}]` | OK | `btrim` + fast path |
| 20 | `top_level_object_not_array` | `input_history` | `{"not":"an array"}` | OK | Non-array fall-through |
| 21 | `null_literal` | `input_history` | `null` | OK | Non-array fall-through |
| 22 | `whitespace_only` | `input_history` | `" \t "` | OK | Empty-after-btrim fall-through |
| 23 | `realtime_turn_malformed_passthrough` | `input_history` (object_type=`realtime.turn`) | `[{"role":"user"` | OK (outer CASE bypassed safe fn) | Realtime-turn bypass branch |
| 24 | `malformed_responses_input_history` | `responses_input_history` | `[{"role":"user"` | 22P02 aborts query | Mirror column, EXCEPTION fallback |
| 25 | `valid_responses_input_history` | `responses_input_history` | `[{...},{...}]` | OK | Mirror column, fast path |

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

[https://github.com/maximhq/bifrost/issues/3255](https://github.com/maximhq/bifrost/issues/3255#issuecomment-4427506449)

## Security considerations

None. The function is `IMMUTABLE` and operates only on text values already stored in the database. No new inputs are exposed.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
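The parse-or-fall-back behavior described for `bifrost_safe_jsonb` can be pictured with a Go analogue. This is only an illustration of the control flow, not the PL/pgSQL function itself: `lastElementOrRaw` is a hypothetical name, and Go's `encoding/json` accepts some inputs that Postgres `jsonb` rejects (such as `\u0000` escapes and unpaired surrogate escapes), so the `22P05` rows in the table are not reproduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lastElementOrRaw returns the last element of raw when it parses as a
// non-empty JSON array. In every other case it returns raw unchanged
// instead of reporting an error, so one bad row cannot fail a whole list.
func lastElementOrRaw(raw string) string {
	trimmed := strings.TrimSpace(raw)
	if !strings.HasPrefix(trimmed, "[") {
		return raw // not array-shaped: fall through untouched
	}
	var elems []json.RawMessage
	if err := json.Unmarshal([]byte(trimmed), &elems); err != nil {
		return raw // malformed JSON: plays the role of the EXCEPTION branch
	}
	if len(elems) == 0 {
		return raw // nothing to extract from an empty array
	}
	return string(elems[len(elems)-1])
}

func main() {
	inputs := []string{
		`[{"role":"user","content":"hi"},{"role":"assistant","content":"hello"}]`,
		`[{"role":"user","content":"hi"`,
		`[{"role":"user","content":"hi"},]`,
		`{"not":"an array"}`,
	}
	for _, in := range inputs {
		fmt.Println(lastElementOrRaw(in))
	}
}
```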

…3412)

## Summary

Adds support for server-configured required request headers in the prompt playground. When the server specifies `required_headers` in its client config, users can now provide values for those headers directly in the settings panel, and they are forwarded with every chat completion request.

## Changes

- Added `customHeaders` state and `requiredHeaders` derived from the core config's `client_config.required_headers` to the `PromptContext`, keeping header keys in sync with the server config while preserving user-entered values.
- Exposed a "Required Headers" section in the settings panel that renders an input field for each required header name when any are configured.
- Extended `ExecutionConfig` in the executor to accept `customHeaders`, which are merged into the fetch request headers (skipping any entries with empty names or values).
- Passed `customHeaders` through both `handleSubmit` and `handleSubmitToolResult` execution paths and included it in their respective `useCallback` dependency arrays.

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

## How to test

1. Configure `required_headers` in the server's client config (e.g., `["X-My-Custom-Header"]`).
2. Open the prompt playground and navigate to the settings panel.
3. Verify a "Required Headers" section appears with an input for each configured header name.
4. Enter a value for each header and send a chat completion request.
5. Confirm the header is present in the outgoing request.
6. Remove a header from the server config and verify it disappears from the UI without affecting other header values.

```sh
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

## Screenshots/Recordings

_Add before/after screenshots of the settings panel showing the new Required Headers section._

## Breaking changes

- [x] No

## Related issues

## Security considerations

Header values are entered by the user and sent only to the configured backend endpoint. Empty header names or values are explicitly skipped before being added to the request, preventing accidental forwarding of blank headers. Users should be cautious not to enter sensitive credentials unless the connection to the server is secured.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary

When exporting virtual keys, pagination clamping was being applied unnecessarily, which could interfere with retrieving the full dataset. This PR skips the pagination limit/offset clamping when the export flag is set, while still ensuring the offset is non-negative.

## Changes

- Pagination clamping via `ClampPaginationParams` is now bypassed when `params.Export` is `true`, allowing exports to retrieve data without artificially constrained limits
- A minimal guard ensures `params.Offset` is still set to `0` if negative during an export request

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

Trigger a virtual keys export request and verify that all keys are returned without being truncated by pagination limits. Compare the export result count against the total number of virtual keys in the system.

```sh
go test ./...
```

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

#3414

## Security considerations

No additional security implications. Export access is still gated by existing authentication and authorization checks.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
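The export bypass described in this commit could be sketched as below. Only `ClampPaginationParams`, `Export`, and `Offset` are named in the text. The struct layout, the limit constants, the body of the clamp function, and the `normalizePagination` wrapper are assumptions made for illustration.

```go
package main

import "fmt"

// PaginationParams is a hypothetical shape for the query parameters.
type PaginationParams struct {
	Limit  int
	Offset int
	Export bool
}

const (
	defaultLimit = 25  // assumed default page size
	maxLimit     = 100 // assumed upper bound for regular list calls
)

// ClampPaginationParams is an assumed implementation of the clamp:
// it forces Limit into [1, maxLimit] and Offset to be non-negative.
func ClampPaginationParams(p *PaginationParams) {
	if p.Limit <= 0 {
		p.Limit = defaultLimit
	}
	if p.Limit > maxLimit {
		p.Limit = maxLimit
	}
	if p.Offset < 0 {
		p.Offset = 0
	}
}

// normalizePagination skips clamping for exports so the full dataset can
// be returned, keeping only the non-negative offset guard.
func normalizePagination(p *PaginationParams) {
	if p.Export {
		if p.Offset < 0 {
			p.Offset = 0
		}
		return
	}
	ClampPaginationParams(p)
}

func main() {
	list := PaginationParams{Limit: 5000, Offset: -3}
	export := PaginationParams{Limit: 5000, Offset: -3, Export: true}

	normalizePagination(&list)
	normalizePagination(&export)

	fmt.Printf("list:   %+v\n", list)
	fmt.Printf("export: %+v\n", export)
}
```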

## Summary

Bumps the `@maximhq/bifrost` NPX package version to `1.6.3` to align the `package.json` and `package-lock.json` version fields, which were previously out of sync.

## Changes

- Updated `package.json` version from `1.6.2` to `1.6.3`
- Corrected `package-lock.json` to reflect `1.6.3` consistently across both the lockfile root and the package entry (previously mismatched at `1.0.6` and `1.0.4`)

## Type of change

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
cd npx/bifrost
npm install
npm pack --dry-run
# Verify the reported version is 1.6.3
```

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

No security implications.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

… bar click suppression (#3431) Adds a log volume histogram chart to the MCP Logs page, matching the existing chart behavior on the main Logs page. Also fixes a bug where clicking a bar immediately after a drag-select would overwrite the dragged time range with a single-bucket zoom. - Added `useGetMCPHistogramQuery` to the MCP Logs page to fetch histogram data with optional polling, and rendered the `LogsVolumeChart` component in the MCP Logs view. - Added `handleTimeRangeChange`, `handleResetZoom`, and `isZoomed` logic to the MCP Logs page, mirroring the behavior already present on the main Logs page. - Fixed `isZoomed` on the main Logs page to return `false` when a named `period` (e.g. `"1h"`) is active, so resetting zoom correctly clears the zoomed state. - When resetting zoom, `period: "1h"` and `polling: true` are now explicitly set in URL state to ensure the page returns to a live-polling relative range. - Fixed a race condition in `LogsVolumeChart` where Recharts fires a Bar `onClick` event immediately after a drag-select `mouseUp`, which was overwriting the dragged range with a single-bucket zoom. A `suppressNextBarClickRef` ref is set during drag completion and cleared on the next bar click to suppress the spurious event. - [x] Bug fix - [x] Feature - [x] UI (React) 1. Navigate to the MCP Logs page and confirm the log volume histogram chart renders and updates with polling. 2. Click a bar in the histogram and confirm the time range zooms into that bucket. 3. Drag-select a range on the histogram and confirm the time range updates to the dragged selection without immediately snapping to a single bucket. 4. Click "Reset Zoom" and confirm the chart returns to the default 1-hour live-polling view. ```sh cd ui pnpm i pnpm build ``` Before: MCP Logs page had no histogram chart. After: MCP Logs page displays the log volume histogram with zoom, drag-select, and reset zoom functionality identical to the main Logs page. - [x] No None. 
- [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

## Summary This PR refactors the semantic cache plugin to simplify its internal state management, improves cache lookup correctness, and adds a new `cache_hit_types` filter to the logs API and UI. The direct cache lookup path is now a single deterministic point-fetch by a UUIDv5 `directCacheID` (replacing the previous dual-path of chunk lookup + legacy metadata scan), and several context keys are consolidated. The UI gains a "Local Caching" filter sidebar section and cache hit type badges in the log detail view. ## Changes - **Semantic cache plugin refactor:** - Replaced the dual direct-search path (`performDirectChunkLookup` + `performLegacyDirectSearch`) with a single `performDirectSearch` that does an O(1) `GetChunk` by deterministic `directCacheID` (UUIDv5 derived from provider, model, cacheKey, requestHash, paramsHash). - `generateDirectCacheID` now returns an error instead of silently falling back to a string concatenation, making failures explicit. - `request_hash` is no longer stored as a top-level metadata field; it is encoded into the `directCacheID` instead. - Reduced context keys from ~10 to 4 (`directCacheIDKey`, `paramsHashKey`, `embeddingsKey`, `embeddingsInputTokensKey`), removing stale keys like `requestIDKey`, `requestHashKey`, `isCacheHitKey`, and `cacheHitTypeKey`. - `shouldSkipCaching` is extracted into its own method; cache-hit detection now reads `CacheDebug.CacheHit` from the response rather than a context flag. - `buildUnifiedMetadata` no longer accepts `requestHash` as a parameter. - `addSingleResponse` renamed to `addNonStreamingResponse`. - `StreamAccumulator` fields `HasError`, `FinalTimestamp`, and `FinishReason` on `StreamChunk` are removed; error streams are handled by early return in `PostLLMHook`. - Streaming replay goroutine now guards every send with `ctx.Done()` to prevent goroutine leaks on dropped consumers. 
- A background `runStreamCleanupLoop` goroutine (started by `Init`, stopped by `Cleanup` via `stopCh`) replaces the one-shot cleanup call, periodically reaping stale stream accumulators. - `buildResponseFromResult` now accepts `threshold`, `similarity`, and `inputTokens` as pointers, and `attachCacheDebug` is extracted as a shared helper for both streaming and non-streaming paths. - `isExpiredEntry` is extracted as a standalone function. - `chunkSortKey` replaces the large inline sort comparator in `processAccumulatedStream`. - Tools, stop sequences, modalities, include lists, and other order-insensitive set fields are now hashed with `hashSortedSet` / `sortedStringSet` to prevent MCP's randomized map iteration from perturbing the request hash. - `extractAttachmentsForCaching` is extracted so attachment URLs are included in the cache key metadata rather than the embedding text. - `extractTextForEmbedding` no longer returns a `paramsHash`; callers compute it once via `buildRequestMetadataForCaching` + `hashMap`. - `generateEmbedding` moved from `utils.go` to `search.go`. - `generateRequestHash` now accepts prebuilt metadata to avoid recomputing it. - `removeField` no longer mutates the input slice's backing array. - Added `PronunciationDictionaryLocators`, `TimestampGranularities`, `Include`, `AdditionalFormats`, and `InputImages` to their respective parameter metadata extractors. - Public context key names changed from `semantic_cache_*` to `semantic_cache-*` (underscore → hyphen separator after the plugin prefix). - `SelectFields` no longer includes `request_hash`. - `VectorStoreProperties` no longer includes a `request_hash` entry. - `CacheByModel` and `CacheByProvider` default-value log messages added. - **Log filtering — `cache_hit_types`:** - Added `CacheHitTypes []string` to `SearchFilters` in `framework/logstore/tables.go`. 
- `applyFilters` in `rdb.go` applies a JSON path filter on `cache_debug` for both SQLite (`json_extract`) and PostgreSQL (`substring` regex) dialects, restricted to the allowlist `["direct", "semantic"]`. - `canUseMatViewFilters` excludes queries with `CacheHitTypes` set from the materialized-view fast path. - HTTP handlers (`getLogs`, `getLogsStats`, `parseHistogramFilters`) parse a `cache_hit_types` comma-separated query parameter. - **UI:** - Added a "Local Caching" filter section to `LogsFilterSidebar` with checkboxes for "Direct cache" and "Semantic cache". - `cache_hit_types` is added to URL state, filter state, and the `buildFilterParams` API helper. - Log detail view shows "Direct Cache" (indigo) and "Semantic Cache" (rose) badges based on `cache_debug.hit_type`. - Plugins form now filters the provider dropdown to embedding-capable providers only (`EmbeddingSupportedProviders` for built-ins; `custom_provider_config.allowed_requests.embedding` for custom providers), shows an error message when no embedding provider is configured, and disables the toggle accordingly. - Embedding model input replaced with `ModelMultiselect` (single-select mode) scoped to the selected provider. - Provider dropdown clears the embedding model when the provider changes. - Provider icons rendered in the provider dropdown. - `EmbeddingSupportedProviders` constant added to `ui/lib/constants/logs.ts`. - **Misc:** - HTTP request logging in `CorsMiddleware` and an auth debug log are commented out. - `transports/bifrost-http/v1.5.x` added to `.gitignore`. - Minor formatting fixes in `core/schemas/bifrost.go` and `framework/modelcatalog/sync.go`. - Missing newline at end of `sync.go` added. ## Type of change - [ ] Bug fix - [x] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [x] Core (Go) - [x] Transports (HTTP) - [ ] Providers/Integrations - [x] Plugins - [x] UI (React) - [ ] Docs ## How to test ```sh # Core/Transports go test ./plugins/semanticcache/... 
go test ./framework/logstore/... go test ./transports/bifrost-http/... # UI cd ui pnpm i pnpm build ``` - Configure the semantic cache plugin with a direct and/or semantic cache type and verify that cache hits are recorded with the correct `hit_type` in `cache_debug`. - Query `/logs?cache_hit_types=direct` and `/logs?cache_hit_types=semantic` and confirm only matching entries are returned. - In the UI, open the logs filter sidebar and verify the "Local Caching" section appears with "Direct cache" and "Semantic cache" checkboxes that correctly filter the log list. - Open a log detail for a cache hit and confirm the appropriate badge ("Direct Cache" or "Semantic Cache") is displayed. - In the plugins form, verify that only embedding-capable providers appear in the provider dropdown and that the embedding model field uses the model multiselect. ## Breaking changes - [x] Yes The public semantic cache context key names have changed from `semantic_cache_*` to `semantic_cache-*`. Any caller setting `CacheKey`, `CacheTTLKey`, `CacheThresholdKey`, `CacheTypeKey`, or `CacheNoStoreKey` via the old string values will no longer be recognized by the plugin. Update all call sites to use the exported constants from the plugin package rather than raw string literals. `request_hash` is no longer stored as a top-level metadata field in the vector store. Existing cache entries written by prior versions will not be found by the new direct-search path (they will be treated as misses and re-populated). `ClearCacheForRequestID` is documented as currently broken for entries written by the new direct-search path; callers should not rely on it until the TODO is resolved. ## Related issues N/A ## Security considerations The `CacheHitTypes` filter allowlists values to `"direct"` and `"semantic"` before interpolating them into SQL, preventing arbitrary input from reaching the JSON path expression. 
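The allowlisting step mentioned in the security considerations might be sketched as follows. `parseCacheHitTypes` and `allowedCacheHitTypes` are hypothetical names, and the real handlers may reject unknown values rather than silently drop them.

```go
package main

import (
	"fmt"
	"strings"
)

// allowedCacheHitTypes mirrors the allowlist named in the commit message.
var allowedCacheHitTypes = map[string]struct{}{
	"direct":   {},
	"semantic": {},
}

// parseCacheHitTypes splits a comma-separated query value and keeps only
// allowlisted, de-duplicated entries, so nothing user-controlled can reach
// the SQL that is built from the result.
func parseCacheHitTypes(raw string) []string {
	var out []string
	seen := make(map[string]struct{})
	for _, part := range strings.Split(raw, ",") {
		v := strings.TrimSpace(part)
		if _, ok := allowedCacheHitTypes[v]; !ok {
			continue // unknown or empty value: drop it
		}
		if _, dup := seen[v]; dup {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	return out
}

func main() {
	fmt.Println(parseCacheHitTypes("direct, semantic,direct"))
	fmt.Println(parseCacheHitTypes("x' OR '1'='1"))
}
```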
## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [ ] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable
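To make the deterministic `directCacheID` idea from this commit concrete, here is a stdlib-only sketch of a UUIDv5 derived from provider, model, cache key, request hash, and params hash, combined with an order-insensitive hash for set-like fields. The function names, the `\x00` separator, and the choice of the RFC 4122 DNS namespace are assumptions, and unlike the `generateDirectCacheID` described above this sketch does not return an error.

```go
package main

import (
	"crypto/sha1"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// namespace is the RFC 4122 DNS namespace UUID, used here only as an
// arbitrary fixed namespace for the sketch.
var namespace = [16]byte{
	0x6b, 0xa7, 0xb8, 0x10, 0x9d, 0xad, 0x11, 0xd1,
	0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8,
}

// uuidV5 builds a name-based UUID: SHA-1 over namespace plus name, with
// the version and variant bits set as RFC 4122 requires.
func uuidV5(ns [16]byte, name string) string {
	h := sha1.New()
	h.Write(ns[:])
	h.Write([]byte(name))
	sum := h.Sum(nil)

	var u [16]byte
	copy(u[:], sum[:16])
	u[6] = (u[6] & 0x0f) | 0x50 // version 5
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16])
}

// hashSortedSet hashes a copy of items in sorted order, so the result does
// not depend on the order in which the caller produced them.
func hashSortedSet(items []string) string {
	sorted := append([]string(nil), items...)
	sort.Strings(sorted)
	sum := sha256.Sum256([]byte(strings.Join(sorted, "\x00")))
	return hex.EncodeToString(sum[:])
}

// directCacheID derives one stable ID from the five inputs named in the
// commit message, which allows a single point lookup by ID.
func directCacheID(provider, model, cacheKey, requestHash, paramsHash string) string {
	name := strings.Join([]string{provider, model, cacheKey, requestHash, paramsHash}, "\x00")
	return uuidV5(namespace, name)
}

func main() {
	a := hashSortedSet([]string{"search", "calculator"})
	b := hashSortedSet([]string{"calculator", "search"})

	id1 := directCacheID("openai", "gpt-4o-mini", "session-1", "req-hash", a)
	id2 := directCacheID("openai", "gpt-4o-mini", "session-1", "req-hash", b)
	fmt.Println(id1 == id2, id1)
}
```

Sorting before hashing is what keeps the ID stable when tool lists arrive in a different order, which is the problem the commit attributes to randomized map iteration.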

…3330)

## Summary

Removes the `cleanup_on_shutdown` option from the semantic cache plugin. Cache data now always persists between Bifrost restarts. The previous behavior of deleting all cache entries and the vector store namespace on shutdown is no longer supported.

## Changes

- Removed `CleanUpOnShutdown` field from `Config` struct in `plugins/semanticcache/main.go` and stripped the corresponding shutdown deletion logic from `Cleanup()`
- Removed `cleanup_on_shutdown` from the JSON config schema (`transports/config.schema.json`), Helm values schema (`helm-charts/bifrost/values.schema.json`), Helm template helper (`_helpers.tpl`), and default `values.yaml`
- Removed `cleanup_on_shutdown` from all example Kubernetes values files and documentation code samples
- Added migration guide entry (Breaking Change 16) in `docs/migration-guides/v1.5.0.mdx` describing the removal, how to clear cache data using the existing invalidation endpoints, and how to handle dimension/provider/model rotation without the old escape hatch
- Updated the semantic caching feature docs to remove references to `cleanup_on_shutdown` and the associated warning block
- Removed `TestCleanup_DeletesEntriesAndNamespaceWhenEnabled` test and simplified `newTestPlugin` helper to drop the `cleanupOnShutdown` parameter across all test files

## Type of change

- [ ] Bug fix
- [ ] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [x] Docs

## How to test

```sh
go test ./plugins/semanticcache/...
```

Verify that passing `cleanup_on_shutdown` in a semantic cache plugin config is rejected by schema validation. Confirm that restarting Bifrost with a semantic cache configured leaves existing vector store entries intact.

## Breaking changes

- [x] Yes
- [ ] No

The `cleanup_on_shutdown` field is removed from the semantic cache plugin config schema and will be rejected by validation. Remove it from `config.json`, Helm values, and any `PUT /api/config` payloads. To clear cache data, use `DELETE /api/cache/clear/{cacheId}`, `DELETE /api/cache/clear-by-key/{cacheKey}`, or rotate `vector_store_namespace` to a fresh name.

## Related issues

See Breaking Change 16 in the v1.5.0 migration guide.

## Security considerations

None.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary Replaces the separate `PluginsForm` component with a fully self-contained `CachingView` that introduces a first-class **Direct / Direct + Semantic** mode toggle for the local cache plugin. Previously, the UI only exposed provider-backed semantic cache settings and had no concept of direct-only (hash-based) caching as a distinct, supported mode. This rewrite makes direct-only mode the default and gates semantic configuration behind an explicit mode selection. ## Changes - Deleted `pluginsForm.tsx` and consolidated all local cache configuration logic directly into `cachingView.tsx`. - Introduced a `CacheMode` type (`"direct"` | `"semantic"`) with a tab-based picker. Direct-only mode requires no embedding provider; semantic mode adds vector similarity on top and requires a provider, model, and dimension. - The enable/disable toggle now immediately calls `updatePlugin` or `createPlugin` (for first-time setup) rather than deferring the enabled-state change to the Save button, decoupling the plugin lifecycle from config edits. - Added `inferMode` to derive the active mode from a saved config, `isEmptyConfig` to detect zero-value configs from the API, `buildPayload` to strip semantic-only fields when persisting a direct-only config, and `validateForSave` for inline validation surfaced before the user clicks Save. - Structural change warnings (provider/model/dimension drift vs. server state) are now shown only when the user has actually modified those fields, rather than permanently in semantic mode. - Removed the Zod `cacheConfigSchema` validation path in favor of the new `validateForSave` function. - Removed the effect that auto-seeded a default provider/model/dimension on first load, since direct-only mode no longer requires those fields. - Per-request override documentation expanded to include `x-bf-cache-key` and `x-bf-cache-no-store` with clearer descriptions. 
## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test ```sh cd ui pnpm i || npm i pnpm build || npm run build ``` 1. Navigate to the Workspace → Config → Caching view. 2. Verify the page loads with **Direct only** selected by default and no provider/model/dimension fields visible. 3. Switch to **Direct + Semantic** and confirm provider, model, and dimension fields appear with inline validation. 4. Toggle caching on without a vector store configured and confirm the toggle is disabled. 5. Save a direct-only config and confirm the plugin is created/updated with `dimension: 1` and no provider fields. 6. Save a semantic config with a valid provider, model, and dimension and confirm the full payload is persisted. 7. Reload the page and confirm the saved mode and config are correctly hydrated. ## Screenshots/Recordings Before/after screenshots recommended showing the mode tab picker, the conditional semantic fields, and the structural change warning banner. ## Breaking changes - [x] Yes - [ ] No The `PluginsForm` component is removed. Any code importing it directly will need to be updated. The enable/disable toggle now persists immediately rather than requiring a Save click, which changes the interaction model for existing users. ## Related issues N/A ## Security considerations No new auth, secrets, or PII handling introduced. API keys for embedding providers continue to be inherited from the provider's existing configuration and are not re-entered or stored in the cache config. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…and plugin reloads (#3423)

## Summary

The `CacheHandler` previously captured a reference to the `semantic_cache` plugin at boot time. This caused two bugs: (1) if the plugin was not present in `config.json` at startup, cache-clear routes were never registered, resulting in HTTP 405 for the entire process lifetime; (2) if the plugin was loaded or reloaded via `/api/plugins` after boot, the handler held a stale (or nil) pointer and would silently misbehave. Additionally, `GET /api/plugins/:name` was returning the raw plugin config without runtime status, causing the UI to see an empty status when refetching a single plugin.

## Changes

- `CacheHandler` now accepts a `CacheClearerResolver` function instead of a concrete plugin pointer. The resolver is called on every cache-clear request, so plugin lifecycle changes via `/api/plugins` are always honored.
- `CacheClearer` and `CacheClearerResolver` are exported so server wiring can supply the resolver without importing the plugin's concrete type.
- Cache routes are registered unconditionally at startup. When no plugin is loaded, requests return HTTP 400 with a descriptive message instead of HTTP 405.
- The server wiring in `RegisterAPIRoutes` uses a closure over `lib.FindPluginAs` to resolve the plugin per request, replacing the boot-time capture.
- `getPlugin` now returns the same response shape as list/create/update (with runtime status merged in), fixing the empty status seen by `useGetPluginQuery` in the UI.
- Tests cover the new "plugin not loaded" path for both `clearCache` and `clearCacheByKey`, and existing tests are updated to use the resolver-based constructor.

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
go test ./transports/bifrost-http/handlers/...
go test ./transports/bifrost-http/...
```

1. Start the server **without** `semantic_cache` in `config.json`. Issue `DELETE /api/cache/clear/{cacheId}` and expect HTTP 400 with `"semantic_cache plugin is not loaded"` (previously HTTP 405).
2. Load the `semantic_cache` plugin via `POST /api/plugins`. Repeat the request and expect the cache-clear to succeed.
3. Reload or remove the plugin via `PUT`/`DELETE /api/plugins`. Verify the handler reflects the new state on the next request without a server restart.
4. Issue `GET /api/plugins/{name}` for a loaded plugin and confirm the response includes runtime status fields, matching the shape returned by the list endpoint.

## Breaking changes

- [x] Yes
- [ ] No

`NewCacheHandler` now accepts a `CacheClearerResolver` function instead of a `schemas.LLMPlugin`. Any caller constructing a `CacheHandler` directly must be updated to pass a resolver.

## Related issues

## Security considerations

None. The change does not affect authentication, secrets, or PII handling.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
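The resolver-per-request idea described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual Bifrost code: `CacheClearer` and `CacheClearerResolver` are names taken from the commit message, but the method set, the handler shape, `fakePlugin`, and the query-parameter routing are assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// CacheClearer and CacheClearerResolver are named in the commit message;
// the method and signatures here are assumptions for this sketch.
type CacheClearer interface {
	ClearCache(id string) error
}

type CacheClearerResolver func() CacheClearer

// fakePlugin is a hypothetical stand-in for the semantic_cache plugin.
type fakePlugin struct{ cleared []string }

func (p *fakePlugin) ClearCache(id string) error {
	p.cleared = append(p.cleared, id)
	return nil
}

// newClearHandler resolves the plugin on every request instead of capturing
// a pointer at boot, so plugins loaded or removed later are always honored.
func newClearHandler(resolve CacheClearerResolver) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		clearer := resolve()
		if clearer == nil {
			http.Error(w, "semantic_cache plugin is not loaded", http.StatusBadRequest)
			return
		}
		if err := clearer.ClearCache(r.URL.Query().Get("id")); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
	}
}

// statusFor issues one DELETE against the handler and returns the status code.
func statusFor(h http.Handler) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodDelete, "/api/cache/clear?id=abc", nil)
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	var loaded *fakePlugin
	h := newClearHandler(func() CacheClearer {
		if loaded == nil {
			return nil // return an untyped nil so the handler's nil check works
		}
		return loaded
	})
	fmt.Println(statusFor(h)) // plugin absent at this point
	loaded = &fakePlugin{}
	fmt.Println(statusFor(h)) // plugin "loaded" after boot, no restart needed
}
```

Because the closure is consulted on each request, the handler itself never needs to be rebuilt when the plugin lifecycle changes; the explicit `nil` return avoids the Go pitfall where a nil `*fakePlugin` stored in an interface compares as non-nil.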

…rch paths in semantic cache (#3424) ## Summary Fixes several correctness issues in the semantic cache plugin's `PostLLMHook` and related helpers: cache telemetry (`cache_debug`) was previously invisible to callers using `no-store`, cache-hit replay detection was fragile, non-positive per-request TTL overrides could silently kill cache writes, and requests with a `cache_type` header narrowed to a path the plugin cannot serve would still produce orphan cache entries. ## Changes - **Early exit for unsupported search paths in `PreLLMHook`**: When `resolveCacheTypes` resolves to a path the plugin cannot actually serve (e.g. `x-bf-cache-type=semantic` against a direct-only plugin, or an unknown header value), the hook now clears cache state and returns early instead of proceeding to generate an embedding or write an orphan entry under a random request UUID that no future read can match. - **Separated cache-hit replay handling from write-skip logic**: The `shouldSkipCaching` method (which conflated cache-hit detection with write-skip conditions) is replaced by `shouldSkipCacheWrite`. Cache-hit replay is now handled as a dedicated early return in `PostLLMHook` before any telemetry stamping, while `shouldSkipCacheWrite` gates only the write decision after telemetry is already stamped. This ensures `cache_debug` is always populated for callers using `no-store` or large-payload modes. - **Telemetry stamped before write decision**: `stampCacheDebugForMiss` is now called before `shouldSkipCacheWrite` is consulted, so observability is not conditional on whether the entry is ultimately written. - **Non-positive TTL overrides fall back to plugin default**: `resolveTTL` now treats a zero or negative per-request TTL override as "use default" rather than applying it, which would have set `expires_at=now` and silently discarded the cache write. 
- **Cleaned up stale comments**: Removed an outdated ordering constraint comment in `PostLLMHook` that no longer applies after the restructuring. - **Tests updated**: Test cases for `shouldSkipCaching` are renamed and updated to reflect the new `shouldSkipCacheWrite` contract. The cache-hit replay test case is removed from this suite (it is now an early return in `PostLLMHook`, not a condition inside the helper). A new default-is-false test is added. ## Type of change - [x] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [x] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh go test ./plugins/semanticcache/... ``` Validate the following scenarios: - A request with `x-bf-cache-type=semantic` against a plugin configured with `Provider=""` or `Dimension=1` should log a warning and skip caching entirely — no orphan entry should appear in the store. - A request with `Cache-Control: no-store` should still produce a populated `cache_debug` field in the response with `cache_hit=false`. - A per-request TTL override of `0s` should fall back to the plugin's configured default TTL and not silently discard the cache write. ## Breaking changes - [x] No ## Security considerations None. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [ ] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

## Summary Adds a standalone end-to-end test suite for the `semantic_cache` plugin under `tests/semanticcache`. The suite validates the full caching lifecycle against a live Bifrost instance — plugin creation/teardown, cache miss/hit assertions, cross-provider behavior, streaming, and log cross-checking — without provisioning any infrastructure itself. ## Changes - **`e2e_test.go`** — `TestMain` entry point: loads config, initializes the report directory, checks Bifrost reachability, enforces plugin-absent precondition (with `RUN_FORCE=1` auto-delete), runs all phases, and performs best-effort teardown on exit. - **`preconditions_test.go`** — Phase 0 checks: Bifrost reachable, OpenAI configured, optional providers (Gemini, Anthropic) present with warnings if absent. - **`http_test.go`** — HTTP helpers for all request types: chat completions (streaming and non-streaming), text completions, embeddings, image generation, and the Responses API. Each helper dumps full request/response bodies to the report directory for forensics. - **`plugin_test.go`** — Plugin lifecycle helpers (`pluginCreate`, `pluginUpdate`, `pluginDelete`, `pluginGet`) mirroring the exact wire format the UI sends to `/api/plugins`. - **`assert_test.go`** — Assertion helpers (`assertMiss`, `assertHit`, `assertNoCacheDebug`, `assertSameCacheID`, `assertDifferentCacheID`) plus a configurable async write-settle wait (`SC_WRITE_SETTLE_MS`) to account for the plugin's async PostLLMHook store write. - **`cache_test.go`** — Cache management helpers (`clearByCacheID`, `clearByCacheKey`) wrapping the `/api/cache/clear/*` endpoints. - **`logs_crosscheck_test.go`** — Cross-checks the persisted log row's `cache_debug` against the in-flight response stamp, with polling to handle Bifrost's async logging pipeline and float epsilon tolerance for JSON encoder differences. 
- **`fixtures_test.go`** — Hand-curated paraphrase pairs for Phase 2 semantic cases, designed to land well above (canonical→paraphrase) or well below (canonical→unrelated) the default 0.8 similarity threshold. - **`log_test.go`** — Structured per-run logging to `reports/<UTC-timestamp>/run.log` with optional `TRAIL_SESSION_ID` stamping for trail integration. - **`go.mod`** — Standalone module (`github.com/maximhq/bifrost/tests/semanticcache`), consistent with the `tests/governance` pattern, excluded from the repo's `go.work`. - **`README.md`** — Documents prerequisites, env vars, run commands, trail integration, and report output format. - **`.gitignore`** — Excludes `reports/` and `*.log` from version control. Notable design decisions: the suite is intentionally verify-only (no infrastructure provisioning), uses a dedicated vector store namespace (`BifrostSemanticCachePluginE2E`) to isolate test data, and writes full wire-level request/response artifacts per step to support post-mortem debugging without re-running. ## Type of change - [ ] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [x] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [x] Plugins - [ ] UI (React) - [ ] Docs ## How to test Requires a running Bifrost instance with Weaviate configured, OpenAI (required), and optionally Gemini and Anthropic providers. ```sh cd tests/semanticcache # All phases GOWORK=off go test -v ./... # Single phase GOWORK=off go test -v -run TestPhase1_DirectOnly ./... # Auto-delete any pre-existing plugin row before run RUN_FORCE=1 GOWORK=off go test -v ./... # Keep plugin after run for post-mortem inspection RUN_KEEP_PLUGIN=1 GOWORK=off go test -v ./... 
```

Environment variables:

| Variable | Default | Purpose |
|---|---|---|
| `BIFROST_URL` | `http://localhost:8080` | Bifrost base URL |
| `SC_CHAT_MODEL_OPENAI` | `openai/gpt-4o-mini` | OpenAI chat model |
| `SC_CHAT_MODEL_OPENAI_ALT` | `openai/gpt-4o` | Alternate OpenAI model for cache-by-model cases |
| `SC_EMBED_MODEL_OPENAI` | `text-embedding-3-small` | Embedding model for Phase 2 |
| `SC_CHAT_MODEL_GEMINI` | `gemini/gemini-2.5-flash` | Gemini chat model |
| `SC_CHAT_MODEL_ANTHROPIC` | `anthropic/claude-haiku-4-5` | Anthropic chat model |
| `SC_NAMESPACE` | `BifrostSemanticCachePluginE2E` | Vector store namespace |
| `SC_WRITE_SETTLE_MS` | `500` | Async write settle wait in ms |
| `RUN_FORCE` | unset | `1` to delete pre-existing plugin before run |
| `RUN_KEEP_PLUGIN` | unset | `1` to skip teardown on exit |
| `TRAIL_SESSION_ID` | unset | Stamped onto every log line for trail integration |

## Screenshots/Recordings

N/A

## Breaking changes

- [x] No

## Related issues

N/A

## Security considerations

No secrets are stored in the test suite. API keys are consumed from the existing Bifrost provider configuration and never passed directly through the test harness.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary Adds a comprehensive end-to-end test suite (`TestDirect`) for the semantic cache plugin operating in direct-only mode. The suite covers 55 test cases (plan §1.1–1.55) validating cache hit/miss behavior, key isolation, TTL handling, config flag mutations, normalization, streaming, multi-endpoint support, parameter hashing, tool definitions, and cache management operations. ## Changes - Introduces `tests/semanticcache/direct_test.go` with `TestDirect`, covering: - **Basic hit/miss and key isolation** (1.1, 1.2, 1.3, 1.4) - **`cache_by_model` and `cache_by_provider` flag behavior** (1.5–1.8), including serial config-mutation cases that restore baseline via `t.Cleanup` - **`exclude_system_prompt` flag** (1.9, 1.10) - **Conversation threshold boundary conditions** (1.11, 1.12) - **TTL expiry, per-request TTL override, invalid TTL fallback, and zero/negative TTL fallback** (1.13, 1.14, 1.15, 1.54) - **`no-store` header semantics**, including case-sensitivity and explicit `false` value (1.16, 1.17, 1.45, 1.46) - **`cache-type` header behavior** in direct-only mode, including the `semantic` header bug case (1.18, 1.19) - **Streaming SSE**: hit/miss, chunk replay order, and non-final chunk cache_debug absence (1.24, 1.25, 1.47) - **Multi-endpoint coverage**: text completions, responses API, embeddings, and image generation (1.20–1.23) - **Input normalization**: case folding, whitespace trimming, Unicode, and large prompts (1.26–1.29) - **Image attachment hashing**: same URL hits, different URL misses (1.30, 1.31) - **Edge cases**: nil content messages, empty messages array, unknown cache ID deletion (1.42, 1.43, 1.40) - **Parameter hash isolation**: temperature, top_p, seed, max_tokens, top_logprobs, tools (order-independent and name-change), prompt_cache_key, service_tier, store flag (1.32–1.37, 1.48–1.52) - **Cache management**: clear by cache ID, clear by key (1.38, 1.39) - **Plugin status round-trip** via GET (1.44) - **`/api/logs` cross-check**: verifies 
persisted `cache_debug` matches in-flight response stamp (1.55) - **`responses` API `previous_response_id` isolation** (1.53) - **Threshold header no-op in direct-only mode** (1.41) - Adds helper functions: `simpleChat`, `chatWithSystem`, `chatWithImage`, `restoreDirectBaseline`, `assertHitAndReturnCacheDebug` - Establishes a parallelism contract: cases that mutate plugin config run serially (no `t.Parallel()`); all others run concurrently with unique cache keys to prevent collisions ## Type of change - [ ] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [x] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [x] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh # Run the full direct-mode suite go test ./tests/semanticcache/... -run TestDirect -v -timeout 300s # Skip the expensive image generation case SC_SKIP_IMAGE_GEN=1 go test ./tests/semanticcache/... -run TestDirect -v -timeout 300s ``` Required environment variables (same as the broader semantic cache e2e suite): - `OPENAI_MODEL` — primary OpenAI-compatible model (e.g. `openai/gpt-4o-mini`) - `OPENAI_MODEL_ALT` — secondary model for cross-model isolation cases - `OPENAI_EMBED` — embedding model name (e.g. `text-embedding-3-small`) - `ANTHRO_MODEL` — (optional) Anthropic model; cases 1.7 and 1.8 skip if unset - `SC_SKIP_IMAGE_GEN=1` — (optional) skip case 1.23 to avoid DALL-E costs ## Screenshots/Recordings N/A — test-only change. ## Breaking changes - [x] No ## Related issues ## Security considerations No new auth, secrets, or PII surface. Test prompts are benign and do not contain sensitive data. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [ ] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

## Summary Adds a comprehensive integration test suite for the semantic cache mode (Phase 2), covering the full lifecycle of semantic similarity-based cache hits and misses using Weaviate as the vector store and OpenAI's `text-embedding-3-small` as the embedding model. This suite validates that the semantic cache behaves correctly across a wide range of real-world scenarios, complementing the existing direct-mode (Phase 1) tests. ## Changes - Added `TestParaphraseFixtures` to pre-flight all paraphrase pairs against the live embedding model, asserting cosine similarity thresholds before any semantic cache cases run. This prevents flaky downstream failures caused by borderline fixture pairs. - Added `TestSemantic` containing 44 sub-cases (2.1–2.44) covering: - Semantic hit on paraphrase, miss on unrelated content - Per-request threshold overrides (relax, tighten, clamp above/below valid range) - `x-bf-cache-type` header forcing direct-only or semantic-only lookup paths - Cache key and model/provider isolation in semantic mode - `cache_by_model=false` and `cache_by_provider=false` cross-model/cross-provider hits - Streaming replay of semantic hits, including tool call preservation - TTL expiry, per-request TTL, TTL=0 fallback, and `no-store` header semantics - Namespace isolation and dimension-change silent miss behavior - Embedding endpoint bypass (semantic search skipped for `/v1/embeddings`) - Image generation and Responses API semantic hits - Text completion semantic hits - Gemini provider with OpenAI embedding provider - `params_hash` isolation (temperature, service tier, store flag, prompt cache key, previous response ID) - `exclude_system_prompt` flag effect on semantic matching - Conversation message threshold skipping semantic search - Attachment URL changes causing misses - `cache_debug` field presence and correctness on hits and misses, including log endpoint cross-check - Streaming chunk-level `cache_debug` placement (final chunk only) - Serial 
(non-parallel) cases that mutate plugin config restore baseline via `t.Cleanup` to avoid test pollution. - A dedicated Weaviate namespace (`cfg.Namespace + "Semantic"`) is used to avoid dimension conflicts with the Phase 1 direct-mode namespace. ## Type of change - [ ] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [x] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [x] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh # Run fixture pre-flight (requires OpenAI embedding access) go test ./tests/semanticcache/... -run TestParaphraseFixtures -v # Run full semantic suite go test ./tests/semanticcache/... -run TestSemantic -v -timeout 10m # Skip fixture verification if embedding access is unavailable SC_SKIP_FIXTURE_VERIFY=1 go test ./tests/semanticcache/... -run TestSemantic -v -timeout 10m # Skip image generation cases if DALL-E is unavailable SC_SKIP_IMAGE_GEN=1 go test ./tests/semanticcache/... -run TestSemantic -v -timeout 10m ``` Required environment/config: - `cfg.OpenAIEmbed` — embedding model name (e.g. `text-embedding-3-small`) - `cfg.OpenAIModel` / `cfg.OpenAIModelAlt` — chat models for isolation tests - `cfg.AnthroModel` — optional; skipped if empty (case 2.13) - `cfg.GeminiModel` — optional; skipped if empty (case 2.28) - `cfg.Namespace` — base Weaviate namespace; suite appends `Semantic` suffix - `SC_SKIP_FIXTURE_VERIFY=1` — skip embedding pre-flight - `SC_SKIP_IMAGE_GEN=1` — skip DALL-E case ## Screenshots/Recordings N/A ## Breaking changes - [x] No ## Related issues N/A ## Security considerations No new auth, secrets, or PII handling introduced. Tests call live external APIs (OpenAI, optionally Anthropic/Gemini) and require valid credentials in the test environment; no credentials are hardcoded. 
## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
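The fixture pre-flight in `TestParaphraseFixtures` is described as asserting cosine similarity thresholds on paraphrase pairs. A rough sketch of that kind of check follows, using toy vectors in place of live embeddings. The function names and the `0.8` constant (the default threshold mentioned for the fixtures) are illustrative; the real test calls the embedding model and may structure the assertion differently.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// cosine returns the cosine similarity of two equal-length, non-zero vectors.
func cosine(a, b []float64) (float64, error) {
	if len(a) != len(b) || len(a) == 0 {
		return 0, errors.New("vectors must be non-empty and of equal length")
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0, errors.New("zero-norm vector")
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb)), nil
}

// pairClearsThreshold reports whether a fixture pair lands on the expected
// side of the threshold: above it for paraphrases, below it for unrelated text.
// This helper is hypothetical and only illustrates the pre-flight idea.
func pairClearsThreshold(a, b []float64, threshold float64, wantHit bool) (bool, error) {
	sim, err := cosine(a, b)
	if err != nil {
		return false, err
	}
	if wantHit {
		return sim >= threshold, nil
	}
	return sim < threshold, nil
}

func main() {
	const threshold = 0.8 // default similarity threshold cited for the fixtures
	canonical := []float64{1, 0, 0}
	paraphrase := []float64{0.9, 0.1, 0} // toy stand-in for a close embedding
	unrelated := []float64{0, 1, 0}      // toy stand-in for a distant embedding

	ok, _ := pairClearsThreshold(canonical, paraphrase, threshold, true)
	fmt.Println("paraphrase pair usable:", ok)
	ok, _ = pairClearsThreshold(canonical, unrelated, threshold, false)
	fmt.Println("unrelated pair usable:", ok)
}
```

Running the check before the cache cases means a borderline pair fails loudly as a fixture problem rather than surfacing later as a flaky hit/miss assertion.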

## Summary

Adds an end-to-end lifecycle test for the semantic cache plugin, covering the full disable → re-enable → delete → recreate flow and asserting that namespace data persists across each state transition.

## Changes

- Introduces `TestLifecycle` in `tests/semanticcache/lifecycle_test.go`, which runs 10 serial subtests (3.1–3.10) exercising the plugin's lifecycle state machine:
  - **3.1** – Disabling the plugin via PUT sets `enabled=false` and `status=disabled`
  - **3.2** – Requests while disabled bypass the cache pipeline entirely (no `cache_debug` header)
  - **3.3 / 3.4** – Cache-clear endpoints (`/api/cache/clear/{id}` and `/api/cache/clear-by-key/{k}`) return HTTP 400 when the plugin is not loaded
  - **3.5** – Re-enabling via PUT restores `enabled=true` and `status=active`
  - **3.6** – Entries written before disable are still queryable after re-enable
  - **3.7** – DELETE removes both the DB row and the in-memory plugin instance
  - **3.8** – Requests after delete bypass the cache pipeline (no `cache_debug` header)
  - **3.9** – Recreating the plugin with the same config succeeds and surfaces `status=active`
  - **3.10** – Entries written before delete are still queryable after recreate, validating the namespace-persistence contract introduced by the removal of `CleanUpOnShutdown`
- Tests are intentionally serial (no `t.Parallel()`) because each subtest mutates globally shared plugin lifecycle state
- A `t.Cleanup` handler performs best-effort key clearing regardless of which lifecycle state the plugin is left in at teardown

## Type of change

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
go test ./tests/semanticcache/... -run TestLifecycle -v
```

Expected outcome: all 10 subtests (3.1–3.10) pass, with structured log output at each step confirming correct status transitions and cache hit/miss behaviour.

## Breaking changes

- [x] No

## Related issues

## Security considerations

None. Tests run against a local Bifrost instance and do not introduce new auth paths, secrets handling, or PII exposure.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…kefile targets (#3429) ## Summary Adds Makefile targets for running `semantic_cache` plugin unit tests and end-to-end tests, with optional integration of the `trail` CLI for capture-based debugging sessions. ## Changes - Added `test-semantic-cache` target that runs e2e tests from `tests/semanticcache`, supporting a `CACHE_TYPE` variable (`direct` or `semantic`) to filter which test phases are executed. Automatically wraps the run in `trail run` if the `trail` binary is available on `PATH`. - Added `test-semantic-cache-complete` target that runs both the plugin unit tests (`plugins/semanticcache`) and the e2e tests in sequence, optionally wrapping the entire session in a single `trail run` invocation. - Added `_test-semantic-cache-complete-inner` as an internal helper target that performs the actual sequential execution of unit and e2e tests with formatted output banners. - Registered all three new targets in the `.PHONY` declaration. ## Type of change - [ ] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [x] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [x] Plugins - [ ] UI (React) - [ ] Docs ## How to test ```sh # Run all semantic_cache e2e tests make test-semantic-cache # Run only direct cache tests CACHE_TYPE=direct make test-semantic-cache # Run only semantic cache tests CACHE_TYPE=semantic make test-semantic-cache # Run both unit and e2e tests together make test-semantic-cache-complete # Force e2e run regardless of preconditions RUN_FORCE=1 make test-semantic-cache-complete ``` If `trail` is installed and on `PATH`, all commands will automatically wrap execution in a `trail run` session for capture-based debugging. ## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations None. 
## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable


…#3444)

…tions column (#3480) ## Summary Replaces the direct delete button in the logs and MCP logs action columns with a dropdown menu triggered by a `MoreHorizontal` icon. This improves the UI by providing a more scalable actions pattern while keeping the delete functionality accessible. The actions column is also now properly pinned to the right side of the table when the user has delete access. ## Changes - Replaced the inline destructive `Trash2` button with a `DropdownMenu` containing a "Delete" item for both logs and MCP logs tables - The actions column trigger is now a ghost `MoreHorizontal` icon button, reducing visual noise in the table - The actions column is pinned to the right only when `hasDeleteAccess` is true; otherwise no fixed columns are configured - Fixed `fixedColumnIds` to include `"actions"` so the column receives correct sticky positioning behavior - Removed `overflow-hidden` from pinned cells in the MCP logs table to prevent the dropdown from being clipped - Reduced the actions column size from 72 to 56px ## Type of change - [ ] Bug fix - [x] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test 1. Navigate to the Logs page as a user with delete access. 2. Confirm the actions column is pinned to the right of the table. 3. Click the `⋯` icon on any row and verify the dropdown appears with a "Delete" option. 4. Click "Delete" and confirm the log is deleted without the row click handler firing. 5. Repeat on the MCP Logs page. 6. Log in as a user without delete access and confirm the actions column is not present. ```sh cd ui pnpm i pnpm build ``` ## Screenshots/Recordings _Add before/after screenshots showing the old delete button vs. the new dropdown._ ## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations No new security implications. Delete access gating remains unchanged. 
## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…wing text (#3481) ## Summary Fixes layout overflow issues in the Model Catalog table where long provider names and model badge text would break out of their columns or cause uneven column sizing. ## Changes - Added `table-fixed` layout with explicit `<colgroup>` column widths (26% / 44% / 16% / 14%) to enforce stable column proportions - Added `overflow-hidden` and `truncate` to the Provider name cell so long names are clipped cleanly instead of overflowing - Added `shrink-0` to the "CUSTOM" badge so it doesn't compress when the provider name is long - Added `max-w-[220px] truncate` to model name badges in `ModelsUsedCell` to prevent individual badges from stretching too wide ## Type of change - [x] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test Navigate to the Model Catalog page and verify: 1. Provider names that are long truncate cleanly within their column 2. The "CUSTOM" badge remains visible and does not shrink when next to a long provider name 3. Model name badges in the models column truncate at a reasonable width 4. Column widths remain stable regardless of content length ```sh cd ui pnpm i || npm i pnpm build || npm run build ``` ## Screenshots/Recordings If UI changes, add before/after screenshots or short clips. ## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations None. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…y names (#3482) ## Summary Fixes layout issues in the model provider keys table where long key/model/server names would overflow their cells and cause the table to render incorrectly. ## Changes - Applied `table-fixed` layout to the keys table and defined explicit column widths via `<colgroup>` (64% for the name column, 12% each for the remaining three columns) to enforce stable column sizing regardless of content length - Added `overflow-hidden` to the name cell and `min-w-0` + `truncate` to the name text span so long strings are clipped with an ellipsis instead of breaking the layout - Added a trailing newline at end of file ## Type of change - [x] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test 1. Navigate to the Workspace → Providers page and open a provider that has keys with long names (e.g. a vLLM model name or a long API key label). 2. Verify that the name column truncates with an ellipsis rather than overflowing into adjacent columns. 3. Verify that the three action columns (weight, status, actions) maintain consistent widths. ```sh cd ui pnpm i || npm i pnpm test || npm test pnpm build || npm run build ``` ## Screenshots/Recordings Add before/after screenshots showing the table with a long key name to confirm truncation behavior. ## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations None. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…limits table (#3483) ## Summary Replaces the individual inline Edit and Delete action buttons in the model limits table with a consolidated `DropdownMenu` (three-dot menu). The delete confirmation dialog is also lifted out of the per-row render loop and rendered once at the table level, driven by a `deleteModelConfigId` state value. ## Changes - Replaced per-row Edit and Delete buttons with a single `MoreHorizontal` icon button that opens a `DropdownMenu` containing Edit and Delete items. - Moved the `AlertDialog` for delete confirmation out of the table row loop into a single top-level instance, controlled by `deleteModelConfigId` state. This prevents multiple dialog instances from being mounted simultaneously. - Added `deletingModelConfig` derived from `deleteModelConfigId` via `useMemo`, keeping it in sync with the RTK cache similarly to `editingModelConfig`. - Cleared `deleteModelConfigId` on successful deletion to close the dialog automatically. - Removed the hover/focus-dependent opacity animation on the action cell since the dropdown replaces that pattern. ## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test 1. Navigate to the Model Limits table in the workspace. 2. Hover over any model limit row and click the `...` (MoreHorizontal) button. 3. Verify the dropdown shows **Edit** and **Delete** options. 4. Click **Edit** and confirm the model limit sheet opens with the correct config pre-populated. 5. Click **Delete** and confirm the confirmation dialog appears with the correct model name (truncated if over 30 characters). 6. Confirm deletion succeeds, the dialog closes, and a success toast is shown. 7. Confirm that users without update/delete RBAC access see the respective menu items disabled. 
```sh cd ui pnpm i || npm i pnpm test || npm test pnpm build || npm run build ``` ## Screenshots/Recordings _Add before/after screenshots showing the old separate Edit/Delete buttons vs. the new dropdown menu._ ## Breaking changes - [x] No ## Related issues ## Security considerations RBAC checks for `Governance` update and delete operations are preserved on the new dropdown menu items. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

## Summary Replaces the individual Edit and Delete action buttons in the routing rules table with a consolidated `DropdownMenu` triggered by a `MoreHorizontal` icon. This reduces visual clutter in the actions column and provides a more consistent UX pattern for row-level actions. ## Changes - Replaced separate Edit and Delete ghost buttons with a single `MoreHorizontal` icon button that opens a dropdown menu containing both actions - Edit and Delete items within the dropdown remain gated by `canUpdate` and `canDelete` permissions respectively, now using the `disabled` prop instead of conditional rendering - The Delete item uses the `destructive` variant to visually distinguish it from the Edit action - Changed `catch (error: any)` to `catch (error: unknown)` for improved type safety - Added a trailing newline to the end of the file ## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test Navigate to the routing rules table and verify: 1. Each row displays a `MoreHorizontal` icon button in the actions column 2. Clicking the icon opens a dropdown with Edit and Delete options 3. Edit and Delete options are disabled when the user lacks the respective permissions 4. Selecting Edit opens the edit flow for the correct rule 5. Selecting Delete triggers the delete confirmation dialog for the correct rule 6. Row click propagation is not triggered when interacting with the dropdown ```sh cd ui pnpm i || npm i pnpm test || npm test pnpm build || npm run build ``` ## Screenshots/Recordings Before: Two separate icon buttons (pencil and trash) visible inline on each row. After: A single `⋯` icon button per row that reveals Edit and Delete options in a dropdown, with Delete styled in red. ## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations No security implications. 
Permission checks (`canUpdate`, `canDelete`) are preserved. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…g overrides table (#3485) ## Summary Replaces the individual Edit and Delete action buttons in the pricing overrides table with a consolidated `DropdownMenu` triggered by a `MoreHorizontal` icon. Also fixes a sidebar active state bug where sub-items were incorrectly matching routes, and adds `hasAPIKeyAccess` to the sidebar's memoization dependencies. ## Changes - Replaced separate Edit and Delete icon buttons in the pricing overrides table rows with a single `MoreHorizontal` actions dropdown containing labeled Edit and Delete menu items. The Delete item uses the destructive variant for visual clarity. - Fixed sidebar sub-item active state detection to use `isRouteMatch` instead of `pathname.startsWith`, preventing incorrect active highlighting on partial path matches. - Added `hasAPIKeyAccess` to the sidebar's `useMemo` dependency array, which was previously missing and could cause stale renders. ## Type of change - [ ] Bug fix - [x] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test 1. Navigate to the custom pricing overrides table. 2. Hover over a row and click the `⋯` (MoreHorizontal) button — a dropdown should appear with **Edit** and **Delete** options. 3. Clicking **Edit** should open the edit drawer without triggering row selection. 4. Clicking **Delete** should open the delete confirmation dialog without triggering row selection. 5. Verify sidebar sub-item active states are correct when navigating between nested routes — only the exact matching route should appear active. ```sh cd ui pnpm i || npm i pnpm test || npm test pnpm build || npm run build ``` ## Screenshots/Recordings Before: Two separate ghost icon buttons (pencil and trash) visible inline on each row. After: A single `⋯` button per row that reveals a dropdown with labeled **Edit** and **Delete** actions. 
## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations No security implications. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…s column in MCP clients table (#3486) ## Summary Replaces the per-row inline action buttons (reconnect + delete) in the MCP clients table with a consolidated `MoreHorizontal` dropdown menu, and moves the actions column to a sticky right-pinned position so it remains visible when the table scrolls horizontally. ## Changes - Replaced the individual `Reconnect` (with tooltip) and `Delete` (with inline `AlertDialog`) buttons with a single `DropdownMenu` triggered by a `MoreHorizontal` icon button. - The delete confirmation `AlertDialog` is now lifted out of the table row and controlled via a `clientToDelete` state variable, preventing multiple dialog instances from being mounted inside the DOM simultaneously. - The actions column header and cell are now `sticky right-0` with `PIN_SHADOW_RIGHT` applied, keeping the actions visible during horizontal scroll. - The table container changed from `overflow-hidden` to `overflow-auto` to enable horizontal scrolling. - Reconnect and delete menu items are conditionally rendered based on RBAC access, rather than being rendered-but-disabled. - The `MoreHorizontal` button shows a `Loader2` spinner while a reconnect is in progress for that row. - Added `group` class to table rows to allow the sticky actions cell to mirror the row hover background. ## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test 1. Navigate to the MCP Registry page in the workspace UI. 2. Verify each MCP client row shows a `⋯` (MoreHorizontal) button in the rightmost column. 3. Click the button and confirm the dropdown shows **Reconnect** and **Delete** options (subject to RBAC permissions). 4. Select **Reconnect** and confirm the spinner appears on the button while reconnecting. 5. 
Select **Delete** and confirm the confirmation dialog appears with the correct server name, and that confirming removes the client. 6. Resize the browser window to trigger horizontal scrolling and confirm the actions column remains pinned to the right. ```sh cd ui pnpm i || npm i pnpm build || npm run build ``` ## Screenshots/Recordings Before/after screenshots recommended showing the old inline icon buttons vs. the new dropdown menu. ## Breaking changes - [x] No ## Related issues ## Security considerations RBAC checks are preserved — reconnect and delete menu items are only rendered when the user has the corresponding `Update` or `Delete` permission on `MCPGateway`. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…d active toggle switch in teams and virtual keys tables (#3487) ## Summary Replaces the inline edit/delete action buttons in the Teams and Virtual Keys tables with a consolidated `MoreHorizontal` dropdown menu per row. The actions column is now sticky-pinned to the right edge of the table so it remains visible when the table scrolls horizontally. The Virtual Keys table also replaces the status badge with an inline active/inactive toggle switch. ## Changes - Extracted `TeamActionsMenu` and `VKActionsMenu` components that render a `DropdownMenu` containing Edit and Delete items, with the delete confirmation `AlertDialog` controlled via local state rather than being triggered directly from an `AlertDialogTrigger`. - Removed the hover-only opacity animation on action buttons in favor of always-visible dropdown triggers. - The actions `TableHead` and `TableCell` are now sticky (`sticky right-0 z-10`) with `PIN_SHADOW_RIGHT` applied and background colors that match the row hover state, keeping the pinned column visually consistent. - Tables are given a `min-w` value and their container uses `overflow-auto` to support horizontal scrolling without breaking the sticky column. - `VKStatusBadge` is replaced by `VKActiveSwitch`, which renders a `Switch` component and calls `useUpdateVirtualKeyMutation` to toggle `is_active` inline. Managed-by-profile keys disable the switch and show a tooltip title. - The managed-by-profile delete tooltip/disabled-button pattern is replaced by a disabled destructive `DropdownMenuItem` with a `title` attribute. - `handleEditVirtualKey` no longer requires a `MouseEvent` argument since click propagation is handled at the cell level. ## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test 1. Navigate to the **Governance → Teams** page. 
- Verify the actions column stays pinned to the right when scrolling horizontally. - Click the `MoreHorizontal` button on a row and confirm Edit and Delete items appear. - Confirm the delete confirmation dialog opens from the dropdown and completes successfully. 2. Navigate to the **Virtual Keys** page. - Verify the active toggle switch reflects the current `is_active` state and toggling it updates the key immediately with a success toast. - Confirm keys managed by an access profile show a disabled switch and a disabled Delete item in the dropdown. - Verify the sticky actions column behaves correctly on horizontal scroll. ```sh cd ui pnpm i pnpm build ``` ## Screenshots/Recordings _Add before/after screenshots showing the dropdown menu and sticky column behavior._ ## Breaking changes - [x] No ## Related issues ## Security considerations No new auth surfaces introduced. RBAC checks (`hasUpdateAccess`, `hasDeleteAccess`) are preserved on all action items. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

## Summary

Replaces GORM's `Migrator.HasColumn` / `AddColumn` calls in the `migrate_calendar_aligned` migration with raw SQL, and adds doc comments to all legacy budget migration helpers. The GORM migrator approach was unreliable across dialects; the new `hasColumn` helper queries `pragma_table_info` (SQLite) or `information_schema.columns` (all other databases) directly, and column addition is done via a plain `ALTER TABLE ... ADD COLUMN` statement.

## Changes

- Added a `hasColumn` helper that checks for a column's existence using dialect-aware raw SQL instead of the GORM migrator API.
- Replaced `mig.HasColumn` / `mig.AddColumn` calls in `migrateCalendarAlignedToBudgetsAndRateLimitsTable` with `hasColumn` and raw `ALTER TABLE` statements for both `governance_budgets.calendar_aligned` and `governance_rate_limits.calendar_aligned`.
- Added doc comments to all legacy budget migration types and helper functions (`legacyBudgetVirtualKey`, `legacyBudgetVirtualKeyProviderConfig`, `legacyBudgetTeam`, `sqliteColumnInfo`, `legacyBudgetColumnModel`, `currentBudgetOwnerModel`, `quoteSQLiteIdentifier`, `sqliteTableColumns`, `sqliteTableHasColumn`).

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
go test ./framework/configstore/...
```

Run against both a SQLite and a supported relational database to confirm the `calendar_aligned` column is added correctly on a fresh migration and that re-running the migration is idempotent (column already exists case is skipped without error).

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

No auth, secrets, or PII implications. Raw SQL identifiers in the new helper use parameterised queries, so there is no SQL injection risk.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
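
To make the dialect-aware column check described above concrete, here is a minimal sketch of how such a query could be chosen. The function name `hasColumnQuery`, the dialect strings, and the exact SQL text are assumptions for illustration; the PR's actual `hasColumn` helper may be structured differently.

```go
package main

import "fmt"

// hasColumnQuery is a hypothetical helper (not the PR's code). It returns a
// parameterised query that counts matching columns, plus its bind arguments.
// The "?" placeholders assume a layer such as GORM's Raw that rewrites them
// per dialect; plain database/sql with Postgres would need "$1, $2".
func hasColumnQuery(dialect, table, column string) (string, []any) {
	if dialect == "sqlite" {
		// pragma_table_info is the table-valued form of PRAGMA table_info.
		return "SELECT COUNT(*) FROM pragma_table_info(?) WHERE name = ?",
			[]any{table, column}
	}
	// Every other dialect is assumed to expose information_schema.columns.
	return "SELECT COUNT(*) FROM information_schema.columns WHERE table_name = ? AND column_name = ?",
		[]any{table, column}
}

func main() {
	for _, dialect := range []string{"sqlite", "postgres"} {
		query, args := hasColumnQuery(dialect, "governance_budgets", "calendar_aligned")
		fmt.Println(dialect, "=>", query, args)
	}
}
```

A caller would run the returned query, treat a count greater than zero as "column exists", and only then skip the `ALTER TABLE ... ADD COLUMN` step, which is what makes re-running the migration idempotent.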

) ## Summary Removes the `defaultFilterDataLimit` (500) cap that was previously applied to all `GetDistinct*` and `GetAvailable*` filter-data queries. This ensures that filter dropdowns and autocomplete fields return the full set of distinct values rather than silently truncating results at 500 entries. ## Changes - Removed the `defaultFilterDataLimit = 500` constant and all `.Limit(defaultFilterDataLimit)` calls from both the materialized view query paths (`matviews.go`) and the direct RDB query paths (`rdb.go`). - Affected queries: distinct models, aliases, stop reasons, routing engines, key pairs, tool names, server labels, and MCP virtual keys. - The existing `defaultFilterDataCutoffDays = 30` time-based filter remains in place to keep queries scoped to recent data. ## Type of change - [ ] Bug fix - [x] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [x] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test Verify that filter endpoints return more than 500 distinct values when the dataset contains them, and that no results are truncated. ```sh go test ./framework/logstore/... ``` Seed a test database with more than 500 distinct models (or aliases, etc.) within the 30-day window and confirm all values are returned by the corresponding `GetDistinct*` method. ## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations Removing the result cap means larger payloads may be returned by filter-data endpoints. Ensure that the 30-day cutoff and existing authentication/authorization controls are sufficient to prevent abuse or excessive memory usage in environments with very high cardinality data. 
## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

## Summary

Adds proper `service_tier` translation between Bifrost's OpenAI-compatible values and the native wire formats for Anthropic and Gemini providers, rather than passing the raw string through unchanged.

## Changes

- Introduced four mapping helpers in the Anthropic provider (`MapBifrostServiceTierToAnthropicRequest`, `MapAnthropicRequestServiceTierToBifrost`, `MapAnthropicServiceTierToBifrost`, `MapBifrostServiceTierToAnthropicResponse`) to translate between Bifrost values (`auto`, `default`, `priority`, `flex`) and Anthropic's request values (`auto`, `standard_only`) and response values (`standard`, `priority`, `batch`).
- Applied these mappers in both the chat and responses conversion paths for Anthropic, covering request encoding and response decoding in both directions.
- Added a `ServiceTier` typed string and constants (`unspecified`, `flex`, `standard`, `priority`) to the Gemini types, along with a `ServiceTier` field on `GenerationConfig`.
- Introduced `mapBifrostServiceTierToGemini` and `mapGeminiServiceTierToBifrost` helpers and wired them into both the chat and responses parameter conversion paths for Gemini.
- Added tests covering forward and reverse service tier mapping for Gemini chat, responses, and the reverse-mapping path from a `GeminiGenerationRequest` back to a `BifrostResponsesRequest`.

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
go test ./core/providers/anthropic/...
go test ./core/providers/gemini/...
```

The new Gemini tests (`TestServiceTierMappingChat`, `TestServiceTierMappingResponses`, `TestServiceTierReverseMapping`) exercise all tier values in both directions.

Verify that:

- Bifrost `"default"` → Anthropic request `"standard_only"`, Gemini `"standard"`
- Bifrost `"auto"` / `"priority"` → Anthropic request `"auto"`
- Anthropic response `"standard"` → Bifrost `"default"`
- Gemini `"flex"` / `"priority"` round-trip correctly through Bifrost

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

None.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
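
The mappings listed under "Verify that" can be sketched as small lookup functions. This is an illustrative sketch only: the function names below are hypothetical (they are not the helpers named in the PR), and only the pairs stated in the description are encoded. Any other value is reported as unmapped rather than guessed.

```go
package main

import "fmt"

// bifrostToAnthropicRequest is a hypothetical mapper covering only the
// request-side pairs stated in the description. ok is false for any tier
// whose translation the description does not spell out.
func bifrostToAnthropicRequest(tier string) (mapped string, ok bool) {
	switch tier {
	case "default":
		return "standard_only", true
	case "auto", "priority":
		return "auto", true
	}
	return "", false
}

// anthropicResponseToBifrost covers the single response-side pair stated.
func anthropicResponseToBifrost(tier string) (mapped string, ok bool) {
	if tier == "standard" {
		return "default", true
	}
	return "", false
}

// bifrostToGemini and geminiToBifrost encode "default" <-> "standard" plus
// the flex/priority round trip from the description.
func bifrostToGemini(tier string) (mapped string, ok bool) {
	switch tier {
	case "default":
		return "standard", true
	case "flex", "priority":
		return tier, true
	}
	return "", false
}

func geminiToBifrost(tier string) (mapped string, ok bool) {
	switch tier {
	case "standard":
		return "default", true
	case "flex", "priority":
		return tier, true
	}
	return "", false
}

func main() {
	for _, tier := range []string{"default", "flex", "priority"} {
		g, _ := bifrostToGemini(tier)
		back, _ := geminiToBifrost(g)
		fmt.Printf("bifrost %q -> gemini %q -> bifrost %q\n", tier, g, back)
	}
}
```

Returning an explicit `ok` flag keeps unmapped values visible to the caller instead of silently passing the raw string through, which is the behaviour the PR sets out to replace.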

## Summary

When a tool call contains invalid JSON arguments, the Bedrock provider now falls back to an empty JSON object `{}` instead of forwarding the malformed payload. This prevents downstream errors caused by passing invalid JSON to the Bedrock API.

## Changes

- When `json.Compact` fails on tool call arguments, the input is now set to `{}` rather than preserving the raw (invalid) string. Forwarding invalid JSON to Bedrock would cause API errors, so discarding it in favor of a safe empty object is the correct behavior.

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

Construct a tool call with malformed JSON arguments and send it through the Bedrock provider. Verify that the request succeeds and the tool input is treated as an empty object rather than causing a serialization or API error.

```sh
go test ./core/providers/bedrock/...
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

No security implications. Invalid input is sanitized to an empty object rather than being forwarded.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
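
The fallback described above can be sketched in a few lines. The function name `compactToolArguments` is hypothetical and the surrounding Bedrock request types are omitted; only the `json.Compact` call and the `{}` fallback come from the description.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// compactToolArguments is a hypothetical helper (not the PR's code). It
// returns the compacted JSON for valid input and "{}" when json.Compact
// rejects the input, so malformed arguments are never forwarded.
func compactToolArguments(raw string) string {
	var buf bytes.Buffer
	if err := json.Compact(&buf, []byte(raw)); err != nil {
		return "{}"
	}
	return buf.String()
}

func main() {
	fmt.Println(compactToolArguments(`{"city": "Paris"}`)) // valid: whitespace removed
	fmt.Println(compactToolArguments(`{"city": `))          // truncated: falls back to {}
}
```

One consequence worth noting for reviewers: the tool still receives a call, but with no arguments, so a tool that requires parameters may fail on its own validation instead.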

## Summary

When processing Anthropic streaming events, the `message_start` event was previously discarded entirely. This meant the assistant role information included in the initial message event was never forwarded to the caller, resulting in incomplete stream chunks.

## Changes

- When a `message_start` event contains a non-empty role on the message, a `chat.completion.chunk` stream response is now returned with the role populated in the delta, rather than returning `nil` and silently dropping it.

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

Send a streaming chat completion request through the Anthropic provider and verify that the first chunk in the stream includes the assistant role in the delta.

```sh
go test ./core/providers/anthropic/...
```

Confirm the first streamed chunk contains:

```json
{
  "object": "chat.completion.chunk",
  "choices": [
    {
      "index": 0,
      "delta": {
        "role": "assistant"
      }
    }
  ]
}
```

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

No security implications.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
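
As a rough illustration of the conversion described above, the sketch below turns a `message_start` event into a role-only chunk. All type and function names (`anthropicEvent`, `streamChunk`, `convertMessageStart`) are hypothetical stand-ins; Bifrost's real stream types carry more fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical, trimmed-down shapes for illustration only.
type anthropicMessage struct {
	Role string `json:"role"`
}

type anthropicEvent struct {
	Type    string            `json:"type"`
	Message *anthropicMessage `json:"message,omitempty"`
}

type chunkDelta struct {
	Role string `json:"role,omitempty"`
}

type chunkChoice struct {
	Index int        `json:"index"`
	Delta chunkDelta `json:"delta"`
}

type streamChunk struct {
	Object  string        `json:"object"`
	Choices []chunkChoice `json:"choices"`
}

// convertMessageStart returns a chunk carrying the role when the event is a
// message_start with a non-empty role, and nil for everything else.
func convertMessageStart(ev anthropicEvent) *streamChunk {
	if ev.Type != "message_start" || ev.Message == nil || ev.Message.Role == "" {
		return nil
	}
	return &streamChunk{
		Object:  "chat.completion.chunk",
		Choices: []chunkChoice{{Index: 0, Delta: chunkDelta{Role: ev.Message.Role}}},
	}
}

func main() {
	var ev anthropicEvent
	if err := json.Unmarshal([]byte(`{"type":"message_start","message":{"role":"assistant"}}`), &ev); err != nil {
		panic(err)
	}
	out, _ := json.Marshal(convertMessageStart(ev))
	fmt.Println(string(out))
}
```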

…ver-side search and pagination (#3567)

## Summary

All "get distinct / available" filter-data methods now accept `limit int` and `query string` parameters, enabling server-side search filtering and result capping. Previously these methods returned unbounded result sets with no way to narrow results by a search term, which could cause large memory allocations and slow queries on tables with many distinct values.

## Changes

- Added `limit` and `query` parameters to all `GetDistinct*` and `GetAvailable*` methods across the `LogStore` interface, `RDBLogStore`, `HybridLogStore`, `LoggerPlugin`, `LogManager`, and `PluginLogManager`.
- On the database layer, a `LIKE`/`ILIKE` filter is applied when `query` is non-empty, and `LIMIT` is pushed down to the query so the DB engine caps the result set rather than Go-side slicing.
- A helper `applyLikeFilter` was added to `RDBLogStore` to emit `ILIKE` on Postgres and `LIKE` on other dialects.
- Materialized-view paths (`getDistinct*FromMatView`) were updated in the same way; `ILIKE` filtering and `LIMIT` are applied before results are returned.
- `GetDistinctRoutingEngines` (both raw and matview paths) now sorts results and truncates to `limit` after the in-process comma-split deduplication step.
- `GetDistinctMetadataKeys` applies an in-process case-insensitive substring filter on both key names and values when `query` is set, and caps the number of returned keys at `limit`.
- The `/api/logs/filterdata` and `/api/mcp/logs/filterdata` HTTP handlers now read an optional `q` query parameter and pass it through to all downstream calls with a `defaultFilterDataLimit` of 1000.
- Cache bypass: when `q` is non-empty the filter-data cache is skipped entirely (both read and write), since search results are user-specific and should not be shared across callers.
- Existing performance tests updated to pass `(ctx, 1000, "")` to match the new signatures.

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
go test ./framework/logstore/... ./plugins/logging/... ./transports/bifrost-http/...
```

To validate search filtering end-to-end:

1. Start the server with a populated log database.
2. Call `GET /api/logs/filterdata?dimensions=models&q=gpt`. The response should only contain model names matching `gpt`.
3. Call the same endpoint without `q`. The full list (up to 1000) should be returned and cached.
4. Call `GET /api/mcp/logs/filterdata?dimensions=tool_names&q=search`. Only matching tool names should be returned and the result should not be written to cache.

## Breaking changes

- [x] Yes

All `LogStore`, `LogManager`, and `LoggerPlugin` method signatures have changed. Any external implementations of these interfaces must add `limit int, query string` to the affected methods.

## Related issues

## Security considerations

The `query` string is passed to the database as a parameterised `LIKE`/`ILIKE` pattern (`?` placeholder), so there is no SQL injection risk. The `%` wildcards are added in Go before binding, not interpolated into the query string directly.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
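
Two of the mechanisms described above lend themselves to a short sketch: the dialect-dependent `LIKE`/`ILIKE` filter with wildcards added before binding, and the sort-then-truncate step after comma-split deduplication. The names `likeClause` and `distinctEngines`, the dialect string, and the treatment of a non-positive `limit` as "no cap" are assumptions for illustration, not the PR's `applyLikeFilter` or `GetDistinctRoutingEngines` code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// likeClause is a hypothetical helper. It returns a WHERE fragment and bind
// argument for a substring search, or an empty clause when query is empty.
// column must be a trusted constant; only the search term is bound.
func likeClause(dialect, column, query string) (string, []any) {
	if query == "" {
		return "", nil
	}
	op := "LIKE"
	if dialect == "postgres" {
		op = "ILIKE" // case-insensitive match on Postgres
	}
	// Wildcards are added to the bound value, never to the SQL text.
	return fmt.Sprintf("%s %s ?", column, op), []any{"%" + query + "%"}
}

// distinctEngines splits comma-separated rows, deduplicates the names,
// sorts them, and truncates to limit (a limit <= 0 is assumed to mean no cap).
func distinctEngines(rows []string, limit int) []string {
	seen := map[string]struct{}{}
	out := []string{}
	for _, row := range rows {
		for _, part := range strings.Split(row, ",") {
			name := strings.TrimSpace(part)
			if name == "" {
				continue
			}
			if _, dup := seen[name]; dup {
				continue
			}
			seen[name] = struct{}{}
			out = append(out, name)
		}
	}
	sort.Strings(out)
	if limit > 0 && len(out) > limit {
		out = out[:limit]
	}
	return out
}

func main() {
	clause, args := likeClause("postgres", "model", "gpt")
	fmt.Println(clause, args)
	fmt.Println(distinctEngines([]string{"b,a", "a, c"}, 2))
}
```

Sorting before truncating makes the capped result deterministic; truncating an unsorted, map-ordered set would return a different subset on each call.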

…ounced `q` query param (#3568)

## Summary

Filter dropdowns in the logs and MCP logs sidebars previously fetched all available options upfront. This PR wires the search input in each `SearchableCheckboxList` to the backend filter data query, so that typing in a filter panel sends a debounced `q` parameter to `/logs/filterdata` or `/mcp-logs/filterdata`, narrowing results server-side rather than relying solely on client-side filtering.

## Changes

- Added an optional `onSearch` callback to `SearchableCheckboxList` in both `logsFilterSidebar.tsx` and `mcpFilterSidebar.tsx`. When provided, a 300 ms debounced effect fires `onSearch` with the current trimmed query.
- Each filter component (`StopReasonFilter`, `ModelsFilter`, `AliasesFilter`, `SelectedKeysFilter`, `VirtualKeysFilter`, `RoutingEnginesFilter`, `RoutingRulesFilter`, `ToolNamesFilter`, `ServersFilter`) now holds a `searchQuery` state and passes it as `q` to `useGetAvailableFilterDataQuery` / `useGetMCPLogsFilterDataQuery`.
- `MetadataFilters` gets its own search input with a local debounce (since it doesn't use `SearchableCheckboxList`) and passes the debounced value as `q` to the filter data query.
- The `getAvailableFilterData` and `getMCPAvailableFilterData` API query builders now accept an optional `q` field and append it as a URL query parameter when present.

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

## How to test

1. Open the logs or MCP logs page and expand any filter panel (e.g. Models, Aliases, Virtual Keys).
2. Type in the search box inside the filter panel.
3. After ~300 ms, verify that the displayed options update to reflect server-filtered results matching the query.
4. Clear the search box and confirm the full list is restored.
5. Verify that the Metadata filter search input similarly narrows the displayed metadata keys.

```sh
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

## Screenshots/Recordings

_Add before/after screenshots or a short clip showing the filter search narrowing results._

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

The `q` parameter is passed as a plain URL query string to existing authenticated endpoints. No new auth surface or PII handling is introduced beyond what the filter data endpoints already expose.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…nputs (#3563)

## Summary

Adds a search icon and a loading spinner to the search inputs in the logs and MCP filter sidebars. When a filter list is actively fetching data, the search icon is replaced with an animated spinner to give users visual feedback. When idle, a static search icon is shown instead of a bare input field.

## Changes

- Added a `Search` icon inside the search input for all `SearchableCheckboxList` filter sections in both `logsFilterSidebar.tsx` and `mcpFilterSidebar.tsx`, positioned absolutely on the left side of the input with appropriate padding adjustments.
- Added a `LoaderCircle` spinner that replaces the `Search` icon while `isFetching` is true, providing real-time feedback during server-side search queries.
- Exposed `isFetching` from `useGetAvailableFilterDataQuery` and `useGetMCPLogsFilterDataQuery` in all filter components (Stop Reason, Models, Aliases, Selected Keys, Virtual Keys, Routing Engines, Routing Rules, Metadata, Tool Names, Servers) and passed it down as a `fetching` prop to `SearchableCheckboxList`.
- Added the `Search` icon to the Session and User plain text inputs, which previously had no icon.
- Fixed indentation in the metadata filter entries block.

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

## How to test

1. Open the logs or MCP logs page with the filter sidebar visible.
2. Expand any searchable filter section (e.g., Models, Aliases, Virtual Keys).
3. Verify a search icon appears on the left side of the search input.
4. Type a query and observe the spinner replacing the search icon while results are being fetched, then reverting to the search icon once the fetch completes.
5. Check the Session and User filter inputs also display the search icon.

```sh
cd ui
pnpm i
pnpm build
```

## Screenshots/Recordings

Before: Search inputs had no icon and no loading indicator.

After: Search inputs display a `Search` icon at rest and an animated `LoaderCircle` spinner while fetching filter options.

## Breaking changes

- [x] No

## Related issues

## Security considerations

None.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…cross providers (#3574)

## Summary

Adds Bedrock mantle provider support for stream termination handling and fixes JSON unmarshalling of `created_at`/`completed_at` fields that Bedrock returns as floats instead of integers.

## Changes

- Added `schemas.Bedrock` to the `ProviderSendsDoneMarker` switch case so that Bedrock streams are correctly terminated after `finish_reason` rather than waiting for a `[DONE]` marker that never arrives.
- Added a custom `UnmarshalJSON` method on `BifrostResponsesResponse` to handle Bedrock's float-typed `created_at` and `completed_at` timestamp fields, converting them to `int` as expected by the struct.

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

Send a streaming request through the Bedrock provider and verify the stream terminates correctly after `finish_reason` without hanging. Also verify that responses with float-typed `created_at`/`completed_at` fields are correctly parsed.

```sh
go test ./...
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

No security implications.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
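
The `UnmarshalJSON` change described above can be pictured with a small sketch. The `response` type below is a hypothetical stand-in for `BifrostResponsesResponse` that models only the ID and the two timestamps, and `parseResponse` is a helper added for illustration. The real method may be structured differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// response is a hypothetical, reduced stand-in for BifrostResponsesResponse.
type response struct {
	ID          string
	CreatedAt   int
	CompletedAt *int
}

// UnmarshalJSON decodes the timestamps as float64 first and then converts
// them to int, so both 1716100000 and 1716100000.0 are accepted.
func (r *response) UnmarshalJSON(data []byte) error {
	var wire struct {
		ID          string   `json:"id"`
		CreatedAt   float64  `json:"created_at"`
		CompletedAt *float64 `json:"completed_at"`
	}
	if err := json.Unmarshal(data, &wire); err != nil {
		return err
	}
	r.ID = wire.ID
	r.CreatedAt = int(wire.CreatedAt)
	r.CompletedAt = nil
	if wire.CompletedAt != nil {
		completed := int(*wire.CompletedAt)
		r.CompletedAt = &completed
	}
	return nil
}

// parseResponse is an illustrative helper that decodes a JSON payload.
func parseResponse(payload string) (response, error) {
	var r response
	err := json.Unmarshal([]byte(payload), &r)
	return r, err
}

func main() {
	r, err := parseResponse(`{"id":"resp_1","created_at":1716100000.0,"completed_at":1716100005.0}`)
	fmt.Println(r.CreatedAt, *r.CompletedAt, err)
}
```

Decoding `1716100000.0` straight into an `int` field makes `encoding/json` return an error, which is why the sketch goes through `float64` first.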

The stats field in the MCP logs list response (/api/mcp-logs) was only populated with TotalExecutions, leaving success_rate, average_latency, and total_cost as zero values - stale and misleading. Remove it from the list response struct and from the Go result assembly.

UI now reads total_count from pagination (already present in response) for the paginator, and uses the dedicated /api/mcp-logs/stats endpoint for the stat cards, which computes all fields correctly.

Preserves:
- log rows, pagination, has_logs
- /api/mcp-logs/stats endpoint
- UI stat cards (they use getMCPLogsStatsQuery)

Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>

Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>

…le assignment (#3560)

## Summary

This PR adds support for attaching virtual keys directly to an access profile, enabling enterprise users to associate VKs with access profile templates as an alternative to team or customer assignment. Previously, virtual keys could only be assigned to a team or a customer (mutually exclusive). This change extends that model to include access profiles as a third mutually exclusive assignment target.

## Changes

- Added `access_profile_id` column and index to `governance_virtual_keys` via a new database migration (`migrationAddVKAccessProfileIDColumn`).
- Extended `TableVirtualKey` with an `AccessProfileID *uint` field.
- Updated `CreateVirtualKeyRequest` and `UpdateVirtualKeyRequest` HTTP handler structs to accept `access_profile_id`, and replaced the binary team/customer mutual-exclusion check with a three-way entity count guard.
- In the update path, `access_profile_id` is persisted via a separate `Updates` call because the existing `UpdateVirtualKey` select list excludes it to protect config-sync paths.
- Added `useGetAccessProfilesQuery` stub to the OSS fallback API layer so the UI degrades gracefully when the enterprise backend is absent.
- Added `GetAccessProfilesParams` and `GetAccessProfilesResponse` types to the OSS fallback type definitions.
- Updated `VirtualKeySheet` to support `access_profile` as a form entity type, including a `defaultAccessProfileId` prop, a locked-assignment mode (`isAPLocked`), a reassignment confirmation dialog, and a combobox for selecting an access profile.
- Updated `VirtualKeysTable` to display an "AP: \<name\>" badge for VKs assigned to an access profile and to include access profile names in CSV exports.
- Registered `AccessProfileVirtualKeys` as a cache tag in the base RTK API.

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

## How to test

1. Run the service — the migration will automatically add `access_profile_id` to `governance_virtual_keys`.
2. Create a virtual key via the API with `access_profile_id` set and verify it is persisted.
3. Attempt to create a virtual key with both `team_id` and `access_profile_id` set — expect a 400 error.
4. Update a virtual key to assign it to an access profile and confirm the previous team/customer assignment is cleared.
5. In the UI (enterprise build), open the virtual key sheet and verify the "Assign to Access Profile" option appears when access profiles exist, the combobox populates correctly, and the reassignment warning dialog triggers when changing an already-assigned profile.
6. Verify the virtual keys table shows the "AP: \<name\>" badge and that CSV export includes the access profile name.

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

## Breaking changes

- [ ] Yes
- [x] No

## Security considerations

`access_profile_id` follows the same access-control patterns as `team_id` and `customer_id`. No new auth surfaces are introduced. The OSS fallback returns `undefined` data, ensuring no enterprise-only data is exposed in non-enterprise builds.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
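
The "three-way entity count guard" mentioned in the changes above could look roughly like the sketch below. The `assignment` struct, the error value, and the field types for team and customer IDs are assumptions made for illustration. Only `AccessProfileID *uint` is taken from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// assignment is a hypothetical stand-in for the create/update request
// structs. Only the three assignment targets are modelled.
type assignment struct {
	TeamID          *string
	CustomerID      *string
	AccessProfileID *uint
}

var errMultipleTargets = errors.New(
	"virtual key can be attached to only one of team, customer, or access profile")

// validateAssignment counts how many targets are set and rejects the
// request when more than one is present.
func validateAssignment(a assignment) error {
	count := 0
	if a.TeamID != nil {
		count++
	}
	if a.CustomerID != nil {
		count++
	}
	if a.AccessProfileID != nil {
		count++
	}
	if count > 1 {
		return errMultipleTargets
	}
	return nil
}

func main() {
	team := "team-1"
	profile := uint(7)
	fmt.Println(validateAssignment(assignment{TeamID: &team, AccessProfileID: &profile}))
	fmt.Println(validateAssignment(assignment{AccessProfileID: &profile}))
}
```

Counting set targets scales to a fourth target with one more `if`, whereas pairwise checks grow with every new combination.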

…_teams (#3395)

## Summary

Adds a `source_id` field to the `governance_teams` table to allow teams to be looked up by an external identifier. This enables integration scenarios where teams are created or managed by an external system and need to be referenced by a stable, external ID.

## Changes

- Added an optional `SourceID` field (`*string`) to `TableTeam` with a unique index on `governance_teams.source_id`
- Added a `BeforeSave` hook on `TableTeam` that trims whitespace from `SourceID` and coerces blank strings to `nil`, preventing empty strings from violating the unique constraint
- Added a `migrationAddTeamSourceIDColumn` database migration that creates the `source_id` column and its unique index, with a rollback path that drops both
- Added `GetTeamBySourceID` to the `ConfigStore` interface and its `RDBConfigStore` implementation, which queries by `source_id` and returns `ErrNotFound` when no match exists
- Added the stub implementation of `GetTeamBySourceID` to the `MockConfigStore` used in HTTP transport tests

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
go test ./framework/configstore/...
go test ./transports/bifrost-http/...
```

- Create a team with a `source_id` set and verify it can be retrieved via `GetTeamBySourceID`.
- Attempt to create two teams with the same `source_id` and verify a unique constraint violation is returned.
- Save a team with a whitespace-only `source_id` and verify it is stored as `NULL`.

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

The `source_id` unique index prevents duplicate external identifiers from being inserted. No PII or secrets are involved.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
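
The `BeforeSave` normalisation described above (trim whitespace, coerce blank to `nil`) can be sketched as a plain function. `normalizeSourceID` is a hypothetical name. The actual hook runs on `TableTeam` inside GORM and is not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeSourceID trims surrounding whitespace and returns nil for blank
// input, so blank values are stored as NULL rather than as empty strings
// that would collide on the unique index.
func normalizeSourceID(id *string) *string {
	if id == nil {
		return nil
	}
	trimmed := strings.TrimSpace(*id)
	if trimmed == "" {
		return nil
	}
	return &trimmed
}

func main() {
	blank := "   "
	padded := "  ext-42 "
	fmt.Println(normalizeSourceID(&blank) == nil)
	fmt.Println(*normalizeSourceID(&padded))
}
```

Returning a pointer to a new string leaves the caller's original value untouched, which keeps the helper free of side effects.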

…()` for `ProxyConfig` to prevent partial value leakage in API responses (#3445)

## Summary

Introduces a `FullyRedacted` method on `EnvVar` and a dedicated `MarshalForStorage` method on `ProxyConfig`, along with a shared `EnvVarAsString` helper, to ensure proxy secrets are never partially exposed in API responses and that `EnvVar` fields are consistently serialized as plain strings when persisting proxy configuration to the database.

Previously, `json.Marshal` was used directly in the GORM `BeforeSave` hook, which would serialize `EnvVar` fields as structured objects rather than the flat string format expected in storage. Additionally, the old `Redacted()` logic on `ProxyConfig` could leak substrings of literal passwords through partial masking.

## Changes

- Added `EnvVar.FullyRedacted()` which replaces any non-empty value with the fixed placeholder `<REDACTED>`, ensuring no substring of the original secret is exposed. `FromEnv` and `EnvVar` metadata are preserved so env references remain visible and round-trip update merges still match via `Equals`.
- Added `EnvVarAsString` utility function that returns the wire-form string for an `*EnvVar`: the env var token if sourced from the environment, or the literal value otherwise.
- Added `ProxyConfig.MarshalForStorage()` which uses `EnvVarAsString` to flatten all `EnvVar` fields into plain strings for database persistence. `json.Marshal` on `*ProxyConfig` is preserved for HTTP API responses where clients expect the full `value/env_var/from_env` object structure.
- Replaced `json.Marshal(p.ProxyConfig)` with `p.ProxyConfig.MarshalForStorage()` in the GORM `BeforeSave` hook.
- Simplified `ProxyConfig.Redacted()` by removing redundant `IsFromEnv()` branching. Passwords and CA certificates now use `FullyRedacted()` to guarantee full opacity, while URL and username delegate to `.Redacted()`. A nil receiver guard was also added.
- Applied the same `EnvVarAsString` simplification to `NetworkConfig.MarshalJSON` for `CACertPEM`.

## Type of change

- [x] Bug fix
- [x] Refactor
- [ ] Feature
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
go test ./core/schemas/... ./framework/configstore/...
```

Verify that after saving a provider with a proxy configuration containing both literal and `env.*`-sourced fields, the stored `proxy_config_json` column contains flat strings (e.g. `"url": "http://proxy.example.com"` or `"url": "env.PROXY_URL"`) rather than structured `EnvVar` objects.

Verify that the HTTP API response for the same provider still returns the full `EnvVar` object structure for proxy fields, and that the `password` field is serialized as `{"val":"<REDACTED>"}` with no substring of the original value present.

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

Proxy passwords are now fully opaque in API responses regardless of whether they are literal values or environment-sourced. The old `Redacted()` path could expose a prefix of a literal password through partial masking; `FullyRedacted()` eliminates this by always substituting the fixed `<REDACTED>` placeholder. Storage serialization writes the `env.*` token rather than the resolved secret value when the field is environment-sourced, avoiding accidental secret persistence in the database.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
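
To make the two helpers above concrete, here is a minimal sketch of full redaction and wire-form flattening. The `envVar` struct and its field names are assumptions modelled on the `value/env_var/from_env` shape named in the commit message. The real `EnvVar` type in `core/schemas` may differ.

```go
package main

import "fmt"

// envVar is a hypothetical, reduced stand-in for schemas.EnvVar.
type envVar struct {
	Val     string // resolved or literal value
	EnvVar  string // token such as "env.PROXY_URL" when sourced from the environment
	FromEnv bool
}

const redactedPlaceholder = "<REDACTED>"

// fullyRedacted replaces any non-empty value with a fixed placeholder and
// keeps the env metadata, so no substring of the secret survives.
func (e envVar) fullyRedacted() envVar {
	if e.Val != "" {
		e.Val = redactedPlaceholder
	}
	return e
}

// envVarAsString returns the storage form: the env token when the value is
// environment-sourced, otherwise the literal value.
func envVarAsString(e *envVar) string {
	if e == nil {
		return ""
	}
	if e.FromEnv {
		return e.EnvVar
	}
	return e.Val
}

func main() {
	password := envVar{Val: "hunter2"}
	proxyURL := envVar{Val: "http://10.0.0.5:3128", EnvVar: "env.PROXY_URL", FromEnv: true}
	fmt.Println(password.fullyRedacted().Val)
	fmt.Println(envVarAsString(&proxyURL))
}
```

A fixed placeholder carries no information about the secret's length or prefix, which is the property the partial-masking path lacked.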

coderabbitai · 2026-05-19T08:17:25Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6559bf80-5d6a-4ce0-aa54-be2ff33255d1

📥 Commits

Reviewing files that changed from the base of the PR and between fdcacc5 and 7563ade.

📒 Files selected for processing (5)

core/bifrost.go
core/schemas/bifrost.go
transports/bifrost-http/handlers/config.go
transports/bifrost-http/handlers/wsresponses.go
transports/bifrost-http/lib/ctx.go

✅ Files skipped from review due to trivial changes (1)

core/schemas/bifrost.go

🚧 Files skipped from review as they are similar to previous changes (2)

transports/bifrost-http/handlers/config.go
core/bifrost.go

📝 Walkthrough

Summary by CodeRabbit

New Features
- Added request-scoped provider API key overrides: an override replaces only the key secret while preserving the selected key's identity and eligibility, and is applied per-request.
Bug Fixes / Behavior
- When an override is active, key rotation is avoided on retry after rate limits and override-aware logic ensures retries use the overridden secret without marking keys as used.
Security Enhancements
- Broadened blocking of API-key related headers across HTTP and WebSocket flows to prevent unintended header forwarding.

Walkthrough

This PR introduces request-scoped provider API key override via the x-bf-provider-api-key header: the header is extracted into context, a helper replaces only the selected key's secret with the override, retries skip rotation when overridden, and header propagation paths are denylisted.

Changes

Provider API Key Override

| Layer / File(s) | Summary |
| --- | --- |
| Context key and header parsing `core/schemas/bifrost.go`, `transports/bifrost-http/lib/ctx.go` | Context key constant `BifrostContextKeyProviderAPIKey` is defined; HTTP header parsing extracts `x-bf-provider-api-key`, stores the trimmed value in context, and extends the security denylist to block key-related headers from general processing. |
| Override application logic `core/bifrost.go` | Adds `applyProviderAPIKeyOverride` to replace only `Key.Value` with the override wrapped as `EnvVar`. `SelectKeyForProviderRequestType` and `executeRequestWithRetries` apply this override; retries skip rotation and used-key marking when the override is active. |
| Security header enforcement `transports/bifrost-http/handlers/config.go`, `transports/bifrost-http/handlers/wsresponses.go` | `x-bf-api-key-id` and `x-bf-provider-api-key` are added to the `securityHeaders` blocklist and WS extra-header denylist, preventing unauthorized exposure or forwarding. |
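
As a rough illustration of the override step summarised above, the sketch below replaces only the secret on a copy of the selected key and leaves its identity fields alone. The `key` struct, the context key, and both function names are hypothetical. The PR itself wraps the override as an `EnvVar` and uses `BifrostContextKeyProviderAPIKey`, neither of which is reproduced here.

```go
package main

import (
	"context"
	"fmt"
)

type ctxKey string

// providerAPIKeyCtxKey is a hypothetical context key for this sketch.
const providerAPIKeyCtxKey ctxKey = "provider-api-key"

// key is a reduced stand-in for a configured provider key.
type key struct {
	ID     string
	Name   string
	Models []string
	Value  string // the secret sent upstream
}

// contextWithOverride stores a request-level key, as header parsing would.
func contextWithOverride(value string) context.Context {
	return context.WithValue(context.Background(), providerAPIKeyCtxKey, value)
}

// applyOverride returns a copy of the selected key whose secret is replaced
// by the request-level value, or the key unchanged when none is present.
func applyOverride(ctx context.Context, selected key) key {
	override, ok := ctx.Value(providerAPIKeyCtxKey).(string)
	if !ok || override == "" {
		return selected
	}
	selected.Value = override // selected is a by-value copy of the caller's key
	return selected
}

func main() {
	configured := key{ID: "k1", Name: "primary", Models: []string{"gpt-4o-mini"}, Value: "configured-secret"}
	used := applyOverride(contextWithOverride("customer-secret"), configured)
	fmt.Println(used.ID, used.Name, used.Value)
	fmt.Println(configured.Value)
}
```

Because the key is passed by value, the configured key held by the caller keeps its original secret, which matches the request-scoped intent stated in the walkthrough.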

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant HTTP as ConvertToBifrostContext
  participant Selector as SelectKeyForProviderRequestType
  participant Override as applyProviderAPIKeyOverride
  participant Runner as executeRequestWithRetries
  Client->>HTTP: sends `x-bf-provider-api-key` header
  HTTP->>Selector: context (with override)
  Selector->>Override: selectedKey + context
  Override-->>Selector: key with secret replaced
  Selector->>Runner: selectedKey (override applied)
  loop retry attempts
    Runner->>Override: selectedKey + context
    Override-->>Runner: per-attempt currentKey (override applied)
    Runner->>Runner: on rate-limit, skip rotation if override present
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

akshaydeo

Poem

🐇 I nibble headers, soft and sly,
A secret tucked where carrots lie.
I swap the filling, keep the shell,
Retry the hop, and guard it well.
Hop, hide, and watch the pathways pry.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'fix(inference): support request-level provider api key' directly describes the main feature added—request-scoped BYOK via x-bf-provider-api-key header. |
| Description check | ✅ Passed | The description covers all major template sections: Summary, Changes, Type of change, Affected areas, How to test with build and curl commands, Screenshots, Breaking changes, Related issues, Security considerations, and a mostly-complete Checklist. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/bifrost.go`:
- Around line 5317-5319: When applyProviderAPIKeyOverride(...) returns an
override (i.e., a request-scoped BYOK is active) set a request-scoped flag in
the context (e.g., ctx.SetValue(schemas.BifrostContextKeyRequestScopedBYOK,
true)) right after currentKey is set, and ensure the rate-limit/retry logic that
does configured-key rotation (the code path around the 429 handling / attempt
rotation logic) checks that flag and, if true, skips rotating configured keys
and uses the actual upstream credential from currentKey (and/or ctx value
BifrostContextKeySelectedKeyID) for retries; in short, mark when a header
override is active in applyProviderAPIKeyOverride and update the rotation/retry
code to no-op configured-key rotation when that flag is present.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8949c6a9-ff76-49b3-90e0-0f52c215ca33

📥 Commits

Reviewing files that changed from the base of the PR and between 71d8375 and fdcacc5.

📒 Files selected for processing (5)

core/bifrost.go
core/schemas/bifrost.go
transports/bifrost-http/handlers/config.go
transports/bifrost-http/handlers/wsresponses.go
transports/bifrost-http/lib/ctx.go

greptile-apps · 2026-05-19T08:21:26Z

Confidence Score: 5/5

The change is safe to merge; the override is strictly request-scoped, no customer key touches storage or logs, and the previously reported WebSocket gap is now closed.

All three transport paths (HTTP, WebSocket createBifrostContextFromAuth, and the retry loop) consistently read and apply the BYOK value. The key struct copy is shallow but safe because only the Value (non-pointer EnvVar) field is replaced. Rate-limit rotation is correctly suppressed for BYOK requests. Security deny lists are updated in all required locations. No bugs or security issues were found.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| core/bifrost.go | Adds applyProviderAPIKeyOverride helper and integrates BYOK into both SelectKeyForProviderRequestType and executeRequestWithRetries; rate-limit rotation and used-key tracking are correctly suppressed when BYOK is active. |
| core/schemas/bifrost.go | Adds BifrostContextKeyProviderAPIKey constant; no issues. |
| transports/bifrost-http/handlers/config.go | Adds x-bf-api-key-id (previously missing) and x-bf-provider-api-key to securityHeaders deny list; no issues. |
| transports/bifrost-http/handlers/wsresponses.go | Adds x-bf-provider-api-key case in createBifrostContextFromAuth (fixing the previously flagged WebSocket gap) and blocks the header in isSecurityDeniedExtraHeader; no issues. |
| transports/bifrost-http/lib/ctx.go | Adds x-bf-provider-api-key to the extra-header deny map and reads the header into BifrostContextKeyProviderAPIKey; return true correctly consumes the header before any forwarding logic runs. |
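
The deny-list behaviour noted for the transport files can be sketched as a filter over incoming headers. `deniedExtraHeaders` and `forwardableHeaders` are hypothetical names, and the map holds only the two headers named in this review rather than the full Bifrost list.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// deniedExtraHeaders lists headers that must never be forwarded upstream.
// Only the two headers named in this PR are included in the sketch.
var deniedExtraHeaders = map[string]struct{}{
	"x-bf-api-key-id":       {},
	"x-bf-provider-api-key": {},
}

// forwardableHeaders returns a copy of in without any denied header.
// Matching is case-insensitive because HTTP header names are.
func forwardableHeaders(in http.Header) http.Header {
	out := make(http.Header)
	for name, values := range in {
		if _, denied := deniedExtraHeaders[strings.ToLower(name)]; denied {
			continue
		}
		out[name] = append([]string(nil), values...)
	}
	return out
}

func main() {
	in := http.Header{}
	in.Set("x-bf-provider-api-key", "customer-secret")
	in.Set("X-Trace-Id", "abc123")
	fmt.Println(forwardableHeaders(in))
}
```

Lower-casing before the lookup covers both canonicalised names such as `X-Bf-Provider-Api-Key` and raw names set directly on the map.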

_{Reviews (2): Last reviewed commit: "fix(inference): support request-level pr..." | Re-trigger Greptile}

Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>

akshaydeo · 2026-05-20T07:22:35Z

+	// Detect request-scoped BYOK: when active, the actual upstream credential never changes,
+	// so configured-key rotation on rate-limit is pointless and misleading. We still run
+	// normal eligibility once on attempt 0 but skip rotation on subsequent rate-limit retries.
+	byokVal, hasProviderAPIKeyOverride := ctx.Value(schemas.BifrostContextKeyProviderAPIKey).(string)


We are doing this twice: once here and once in apply overrides. Can we change this, please?
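
One way to read this request is to detect the override once and let the retry loop reuse that result. The sketch below is not the code in `core/bifrost.go`. `runWithRetries`, `errRateLimited`, and the string-slice model of configured keys are all hypothetical simplifications.

```go
package main

import (
	"errors"
	"fmt"
)

var errRateLimited = errors.New("rate limited")

// runWithRetries calls send up to maxAttempts times. configured holds the
// eligible configured secrets in rotation order; override is the
// request-level key, or "" when the header was absent.
func runWithRetries(configured []string, override string, maxAttempts int,
	send func(secret string) error) ([]string, error) {
	if len(configured) == 0 {
		return nil, errors.New("no eligible configured key")
	}
	hasOverride := override != "" // detected once, reused by every attempt
	var used []string
	var err error
	idx := 0
	for attempt := 0; attempt < maxAttempts; attempt++ {
		secret := configured[idx]
		if hasOverride {
			secret = override
		}
		used = append(used, secret)
		if err = send(secret); !errors.Is(err, errRateLimited) {
			return used, err
		}
		if !hasOverride {
			// Rotation only helps when the upstream credential can change.
			idx = (idx + 1) % len(configured)
		}
	}
	return used, err
}

func main() {
	alwaysLimited := func(string) error { return errRateLimited }
	used, err := runWithRetries([]string{"key-a", "key-b"}, "customer-secret", 3, alwaysLimited)
	fmt.Println(used, err)
}
```

Keeping the check in one place means the rotation rule and the secret substitution cannot drift apart if the detection logic changes later.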

The merge-base changed after approval.

CLAassistant · 2026-05-20T13:13:51Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
8 out of 15 committers have signed the CLA.

✅ impoiler
✅ Madhuvod
✅ Javtor
✅ BearTS
✅ binbandit
✅ Vaibhav701161
✅ d3lm
✅ kevinpdev
❌ akshaydeo
❌ Pratham-Mishra04
❌ TejasGhatte
❌ roroghost17
❌ danpiths
❌ etnperlong
❌ SahilChoudhary22

impoiler and others added 30 commits May 16, 2026 03:33

harness improvements (#3457)

06eb289

Preserve Anthropic output schema refs (#3449)

c4a01bc

feat: use the new parameter json schema compliant to json schema spec (…

c3cb27a

…#3444)

impoiler and others added 16 commits May 17, 2026 15:19

chore(npx): bump @maximhq/bifrost to v1.6.3 (#3340)

44b56a9

fix(configstore): improve error message when API key name conflicts a…

ab42a5e

…cross providers (#3574)

fix(images): passthrough extra params (#3572)

dc20c79

Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>

coderabbitai Bot requested a review from akshaydeo May 19, 2026 08:18

coderabbitai Bot requested changes May 19, 2026


Comment thread core/bifrost.go

fix(inference): support request-level provider api key

7563ade

Signed-off-by: Vaibhav mittal <vaibhavmittal929@gmail.com>

Vaibhav701161 force-pushed the fix/request-level-byok branch from fdcacc5 to 7563ade Compare May 19, 2026 08:34

coderabbitai Bot previously approved these changes May 19, 2026


akshaydeo reviewed May 20, 2026


akshaydeo force-pushed the dev branch from f036e5f to 5c46934 Compare May 20, 2026 10:00

akshaydeo requested a review from a team as a code owner May 20, 2026 10:00

akshaydeo force-pushed the dev branch 2 times, most recently from f59c88c to ff463d9 Compare May 22, 2026 15:16

fix(inference): support request-level provider api key #3586

Vaibhav701161 wants to merge 100 commits into dev from fix/request-level-byok

17 participants
