[fix]: allow model catalog URLs (pricing + parameters) to be overridden via env vars by dsherniiazov · Pull Request #3521 · maximhq/bifrost

dsherniiazov · 2026-05-15T08:31:31Z

Description

This PR fixes an issue where custom model catalog URLs (especially for model parameters) could not be properly overridden in certain deployment scenarios (Docker, Helm, air-gapped environments, etc.).

Changes

- Added support for two new environment variables:
  - BIFROST_PRICING_URL
  - BIFROST_MODEL_PARAMETERS_URL
- Refactored the defaultURLWithEnv helper in framework/modelcatalog
- Updated ModelCatalog initialization and sync logic to respect these env vars
- Added comprehensive tests for the new behavior
- Updated config resolution in transports/bifrost-http

Why

Previously the model parameters catalog always fell back to the hardcoded default URL, making it impossible to use a private/internal catalog without forking or complex config workarounds.
This change makes the model catalog fully customizable via environment variables (consistent with how other URLs like schema are handled).

Affected packages

core/framework
transports/bifrost-http

Testing

Added unit tests for defaultURLWithEnv
Verified that both pricing and model parameters URLs respect env overrides
Existing functionality remains unchanged

Type of Change: Bug fix / Improvement

Checklist

Changelog entry added to core/changelog.md
Tests added/updated
Documentation updated (if needed)

…filter inaccessible sidebar items (maximhq#3295)

This PR improves RBAC granularity in the sidebar by introducing dedicated resource types for `APIKeys`, `Inference`, and `Metrics`, and fixes sidebar visibility logic so that items and groups are hidden when the user lacks access rather than relying on broader, less specific permissions.

- Added three new `RbacResource` enum values: `APIKeys`, `Inference`, and `Metrics` to the fallback RBAC context.
- The API Keys sidebar item now gates access via the new `hasAPIKeyAccess` (`RbacResource.APIKeys`) check instead of the generic `hasSettingsAccess`.
- The MCP Logs sidebar item now correctly gates access via `hasMCPGatewayAccess` instead of the unrelated `hasLogsAccess`.
- Introduced an `accessibleItems` memoized computation that filters out sidebar items and entire groups whose sub-items are all inaccessible, ensuring users never see empty navigation sections. Previously, access filtering only happened during search.
- Removed unused imports (`PanelLeft`, `PanelRight`, `cn`).

- [ ] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

1. Log in as a user with restricted RBAC permissions that exclude `APIKeys` and/or `Settings`.
2. Verify the API Keys entry under the Config section is hidden for users without `APIKeys` view permission.
3. Verify the MCP Logs entry is hidden for users without `MCPGateway` view permission.
4. Verify that sidebar groups with no accessible sub-items are hidden entirely rather than showing an empty group.
5. Verify that users with full access see no change in sidebar behavior.

```sh
cd ui
pnpm i || npm i
pnpm build || npm run build
```

_Add before/after screenshots showing sidebar items hidden for restricted users._

- [ ] Yes
- [x] No

_Link related issues here._

Access control checks for API Keys management are now scoped to a dedicated `APIKeys` RBAC resource rather than the broader `Settings` resource, reducing the risk of unintended access to key management for users who have settings visibility but should not manage API keys.

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…elete access (maximhq#3314)

The delete button in log tables was always rendered (just disabled) for users without delete access. This PR hides the actions column entirely when the user lacks delete permissions, and fixes the RBAC resource check for MCP logs to use the correct `MCPGateway` resource instead of `Logs`.

- The actions column in both the workspace logs and MCP logs tables is now conditionally included in the column definitions only when `hasDeleteAccess` is `true`, rather than always rendering a disabled button.
- The delete button styling was updated to use more visible destructive colors (`text-destructive/60 border-destructive/60`) instead of the previous muted secondary foreground styles.
- The RBAC resource used to gate delete access on the MCP logs page was corrected from `RbacResource.Logs` to `RbacResource.MCPGateway`.

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

1. Log in as a user **without** delete access on Logs or MCPGateway resources.
2. Navigate to the workspace logs page and the MCP logs page.
3. Verify the delete button/column is not visible.
4. Log in as a user **with** delete access.
5. Verify the delete button appears and is functional.

```sh
cd ui
pnpm i
pnpm test
pnpm build
```

Before: Delete button rendered but disabled for users without access.
After: Delete column is hidden entirely for users without delete access.

- [ ] Yes
- [x] No

The RBAC fix ensures MCP log deletion is gated on the correct `MCPGateway` resource permission, preventing users with only `Logs` delete access from incorrectly being granted delete access to MCP logs.

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…ogs route and sidebar (maximhq#3316)

Introduces a dedicated `MCPLogs` RBAC resource, decoupling MCP log access control from the `MCPGateway` resource. This allows permissions for viewing and deleting MCP logs to be managed independently from gateway-level permissions.

- Added `MCPLogs` as a new `RbacResource` enum value in the fallback RBAC context.
- The MCP Logs route now checks `MCPLogs` view permission and renders a `NoPermissionView` when access is denied, rather than rendering the page unconditionally.
- Delete access on the MCP Logs page now checks `RbacResource.MCPLogs` instead of `RbacResource.MCPGateway`.
- The sidebar MCP Logs entry now uses `hasMCPLogsAccess` (derived from `RbacResource.MCPLogs`) to control visibility, rather than reusing `hasMCPGatewayAccess`.

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

1. Configure a role that has `MCPGateway` access but **no** `MCPLogs` access.
2. Log in as a user with that role and navigate to the MCP Logs page; the `NoPermissionView` should be displayed and the sidebar entry should be hidden.
3. Grant the role `MCPLogs` view access and confirm the page and sidebar entry become accessible.
4. Verify that delete functionality on the MCP Logs page is gated by `MCPLogs` delete permission independently of `MCPGateway` delete permission.

```sh
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

N/A

- [x] Yes
- [ ] No

Any role configuration that previously relied on `MCPGateway` permissions to grant access to MCP Logs will need to be updated to explicitly grant `MCPLogs` permissions.

N/A

Access to MCP log data (which may contain sensitive tool execution details) is now enforced by a dedicated RBAC resource, reducing the risk of unintended access through overly broad `MCPGateway` permissions.

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…SQL helper to prevent malformed JSON from aborting list queries (maximhq#3407)

## Summary

The `/api/logs` list query was aborting entirely when a single row contained malformed JSON in `input_history` or `responses_input_history`. The previous inline guard only checked the first character before casting to `jsonb`, so rows that appeared array-shaped but contained malformed JSON (unterminated structures, trailing commas, unpaired UTF-16 surrogates, `\u0000` escapes, etc.) would trigger a `22P02`/`22P05` error and kill the entire response. This PR fixes that by introducing a PL/pgSQL helper function (`bifrost_safe_jsonb`) that wraps the cast in an `EXCEPTION` block and falls back to returning the raw text on any parse failure.

## Changes

- Added a new migration `migrationAddSafeJsonbFunction` that installs the `bifrost_safe_jsonb(text)` PL/pgSQL function on Postgres. The function validates the input, attempts the `jsonb` cast inside an `EXCEPTION` block, and returns the last array element on success or the raw text on any failure.
- Replaced the multi-condition inline `CASE` guards in `listSelectColumns` for Postgres with calls to `bifrost_safe_jsonb`, simplifying the SQL and correctly handling all malformed-JSON edge cases that the previous character-check approach missed.
- For SQLite, added `json_valid()`, `json_type()`, and `json_array_length()` guards to the `CASE` expressions to prevent extraction attempts on invalid or empty arrays.
- Added `safe_jsonb_test.go` covering both the SQLite and Postgres dialect branches of `listSelectColumns`, as well as direct invocation of `bifrost_safe_jsonb` across all relevant edge cases (malformed structures, surrogate pairs, `\u0000` escapes, non-array values, SQL `NULL`).

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
cd framework && docker compose up -d postgres
go test ./framework/logstore/ -run 'MalformedInputHistory|BifrostSafeJsonb' -count=1 -v
```

Insert a row into the logs table with a malformed JSON value in `input_history` (e.g., `[{"key": "val"`, which is unterminated) and verify that a call to the list endpoint returns successfully without a 500 error, with the malformed row's `input_history` returned as raw text rather than aborting the query.

## Test Coverage

### `TestSearchLogs_MalformedInputHistory_{SQLite,Postgres}`: end-to-end list query

| # | Case | Column | Payload shape | Pre-fix behavior | Path exercised |
| --- | --- | --- | --- | --- | --- |
| 1 | `unterminated_object_in_array` | `input_history` | `[{"role":"user","content":"hi"` | 22P02 aborts query | EXCEPTION fallback |
| 2 | `garbage_after_bracket` | `input_history` | `[abc, not json]` | 22P02 aborts query | EXCEPTION fallback |
| 3 | `trailing_comma` | `input_history` | `[{"role":"user","content":"hi"},]` | 22P02 aborts query | EXCEPTION fallback |
| 4 | `unclosed_array_only` | `input_history` | `[` | 22P02 aborts query | EXCEPTION fallback |
| 5 | `open_bracket_then_brace_unclosed` | `input_history` | `[{` | 22P02 aborts query | EXCEPTION fallback |
| 6 | `nan_value_not_valid_json` | `input_history` | `[NaN]` | 22P02 aborts query | EXCEPTION fallback |
| 7 | `infinity_value_not_valid_json` | `input_history` | `[Infinity]` | 22P02 aborts query | EXCEPTION fallback |
| 8 | `unpaired_high_surrogate` | `input_history` | `[{"...":"bad \uD800 surrogate"}]` | 22P05 aborts query | EXCEPTION fallback |
| 9 | `unpaired_low_surrogate` | `input_history` | `[{"...":"bad \uDC00 low"}]` | 22P05 aborts query | EXCEPTION fallback |
| 10 | `bad_surrogate_pair_high_then_ascii` | `input_history` | `[{"c":"\uD800A"}]` | 22P05 aborts query | EXCEPTION fallback |
| 11 | `u0000_escape_inside_string` | `input_history` | `[{"...":"null byte \u0000 here"}]` | 22P05 aborts query | EXCEPTION fallback |
| 12 | `literal_backslash_u0000_valid_jsonb` | `input_history` | `[{"...":"... \\u0000 literal"}]` | OK (degraded to raw by old guard) | Fast path, last-element extraction |
| 13 | `single_element_array` | `input_history` | `[{"role":"user","content":"only one"}]` | OK | Fast path |
| 14 | `array_of_primitives` | `input_history` | `[1,2,3]` | OK | Fast path |
| 15 | `array_with_null_last_element` | `input_history` | `[{...}, null]` | OK | Fast path |
| 16 | `deeply_nested_valid` | `input_history` | `[{"role":"user","content":{"nested":{"deep":{"value":42}}}}]` | OK | Fast path |
| 17 | `unicode_emoji_content` | `input_history` | `[{"...":"hello 🎉 world ✨"}]` | OK | Fast path |
| 18 | `large_valid_array` | `input_history` | 1001-element array | OK | Fast path at scale |
| 19 | `leading_whitespace_then_array` | `input_history` | ` [\t{...}]` | OK | `btrim` + fast path |
| 20 | `top_level_object_not_array` | `input_history` | `{"not":"an array"}` | OK | Non-array fall-through |
| 21 | `null_literal` | `input_history` | `null` | OK | Non-array fall-through |
| 22 | `whitespace_only` | `input_history` | `" \t "` | OK | Empty-after-btrim fall-through |
| 23 | `realtime_turn_malformed_passthrough` | `input_history` (object_type=`realtime.turn`) | `[{"role":"user"` | OK (outer CASE bypassed safe fn) | Realtime-turn bypass branch |
| 24 | `malformed_responses_input_history` | `responses_input_history` | `[{"role":"user"` | 22P02 aborts query | Mirror column, EXCEPTION fallback |
| 25 | `valid_responses_input_history` | `responses_input_history` | `[{...},{...}]` | OK | Mirror column, fast path |

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

[https://github.com/maximhq/bifrost/issues/3255](https://github.com/maximhq/bifrost/issues/3255#issuecomment-4427506449)

## Security considerations

None. The function is `IMMUTABLE` and operates only on text values already stored in the database. No new inputs are exposed.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
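The fallback behavior described for `bifrost_safe_jsonb` above (try to parse, return the last array element on success, return the raw text otherwise) can be illustrated in Go. This is only an analogy, not the PL/pgSQL function itself: `lastElementOrRaw` is a hypothetical name, returning the raw text for an empty array is an assumption, and Go's `encoding/json` is more lenient than Postgres `jsonb` for some inputs (for example `\u0000` escapes), so the two do not reject exactly the same payloads.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lastElementOrRaw is a hypothetical Go analogue of the described helper.
// If raw parses as a non-empty JSON array it returns the last element;
// in every other case (non-array, parse error, empty array) it returns raw
// unchanged instead of failing, so one bad row cannot abort a whole listing.
func lastElementOrRaw(raw string) string {
	trimmed := strings.TrimSpace(raw)
	if !strings.HasPrefix(trimmed, "[") {
		return raw // not array-shaped: fall through with the original text
	}
	var elems []json.RawMessage
	if err := json.Unmarshal([]byte(trimmed), &elems); err != nil || len(elems) == 0 {
		return raw // malformed or empty: fall back to the raw text (assumption for empty)
	}
	return string(elems[len(elems)-1])
}

func main() {
	fmt.Println(lastElementOrRaw(`[1,2,3]`))            // prints 3
	fmt.Println(lastElementOrRaw(`[{"role":"user"`))    // prints the raw text
	fmt.Println(lastElementOrRaw(`{"not":"an array"}`)) // prints the raw text
}
```

The point of the design is that the error is absorbed per value, which mirrors how the `EXCEPTION` block confines a cast failure to a single row.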

…aximhq#3412)

## Summary

Adds support for server-configured required request headers in the prompt playground. When the server specifies `required_headers` in its client config, users can now provide values for those headers directly in the settings panel, and they are forwarded with every chat completion request.

## Changes

- Added `customHeaders` state and `requiredHeaders` derived from the core config's `client_config.required_headers` to the `PromptContext`, keeping header keys in sync with the server config while preserving user-entered values.
- Exposed a "Required Headers" section in the settings panel that renders an input field for each required header name when any are configured.
- Extended `ExecutionConfig` in the executor to accept `customHeaders`, which are merged into the fetch request headers (skipping any entries with empty names or values).
- Passed `customHeaders` through both `handleSubmit` and `handleSubmitToolResult` execution paths and included it in their respective `useCallback` dependency arrays.

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

## How to test

1. Configure `required_headers` in the server's client config (e.g., `["X-My-Custom-Header"]`).
2. Open the prompt playground and navigate to the settings panel.
3. Verify a "Required Headers" section appears with an input for each configured header name.
4. Enter a value for each header and send a chat completion request.
5. Confirm the header is present in the outgoing request.
6. Remove a header from the server config and verify it disappears from the UI without affecting other header values.

```sh
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

## Screenshots/Recordings

_Add before/after screenshots of the settings panel showing the new Required Headers section._

## Breaking changes

- [x] No

## Related issues

## Security considerations

Header values are entered by the user and sent only to the configured backend endpoint. Empty header names or values are explicitly skipped before being added to the request, preventing accidental forwarding of blank headers. Users should be cautious not to enter sensitive credentials unless the connection to the server is secured.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…3416)

## Summary

When exporting virtual keys, pagination clamping was being applied unnecessarily, which could interfere with retrieving the full dataset. This PR skips the pagination limit/offset clamping when the export flag is set, while still ensuring the offset is non-negative.

## Changes

- Pagination clamping via `ClampPaginationParams` is now bypassed when `params.Export` is `true`, allowing exports to retrieve data without artificially constrained limits
- A minimal guard ensures `params.Offset` is still set to `0` if negative during an export request

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

Trigger a virtual keys export request and verify that all keys are returned without being truncated by pagination limits. Compare the export result count against the total number of virtual keys in the system.

```sh
go test ./...
```

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

maximhq#3414

## Security considerations

No additional security implications. Export access is still gated by existing authentication and authorization checks.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
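The export bypass in the virtual keys change above can be sketched roughly as follows. Only `ClampPaginationParams`, `params.Export`, and `params.Offset` are named in the source; the struct shape, the default and maximum limits, and the `normalize` wrapper are hypothetical and exist only to make the sketch runnable.

```go
package main

import "fmt"

// PaginationParams is a hypothetical stand-in for the handler's query params.
type PaginationParams struct {
	Limit  int
	Offset int
	Export bool
}

// Assumed bounds for illustration; the real values are not given in the PR.
const (
	defaultLimit = 25
	maxLimit     = 100
)

// clampPaginationParams is a hypothetical version of ClampPaginationParams.
func clampPaginationParams(p *PaginationParams) {
	if p.Limit <= 0 {
		p.Limit = defaultLimit
	}
	if p.Limit > maxLimit {
		p.Limit = maxLimit
	}
	if p.Offset < 0 {
		p.Offset = 0
	}
}

// normalize skips clamping for exports so the full dataset can be fetched,
// but still refuses a negative offset.
func normalize(p *PaginationParams) {
	if p.Export {
		if p.Offset < 0 {
			p.Offset = 0
		}
		return
	}
	clampPaginationParams(p)
}

func main() {
	export := PaginationParams{Limit: 5000, Offset: -3, Export: true}
	normalize(&export)
	fmt.Println(export.Limit, export.Offset) // prints 5000 0

	page := PaginationParams{Limit: 5000, Offset: -3}
	normalize(&page)
	fmt.Println(page.Limit, page.Offset) // prints 100 0
}
```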

## Summary

Bumps the `@maximhq/bifrost` NPX package version to `1.6.3` to align the `package.json` and `package-lock.json` version fields, which were previously out of sync.

## Changes

- Updated `package.json` version from `1.6.2` to `1.6.3`
- Corrected `package-lock.json` to reflect `1.6.3` consistently across both the lockfile root and the package entry (previously mismatched at `1.0.6` and `1.0.4`)

## Type of change

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
cd npx/bifrost
npm install
npm pack --dry-run
# Verify the reported version is 1.6.3
```

## Screenshots/Recordings

N/A

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

N/A

## Security considerations

No security implications.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

… bar click suppression (maximhq#3431)

Adds a log volume histogram chart to the MCP Logs page, matching the existing chart behavior on the main Logs page. Also fixes a bug where clicking a bar immediately after a drag-select would overwrite the dragged time range with a single-bucket zoom.

- Added `useGetMCPHistogramQuery` to the MCP Logs page to fetch histogram data with optional polling, and rendered the `LogsVolumeChart` component in the MCP Logs view.
- Added `handleTimeRangeChange`, `handleResetZoom`, and `isZoomed` logic to the MCP Logs page, mirroring the behavior already present on the main Logs page.
- Fixed `isZoomed` on the main Logs page to return `false` when a named `period` (e.g. `"1h"`) is active, so resetting zoom correctly clears the zoomed state.
- When resetting zoom, `period: "1h"` and `polling: true` are now explicitly set in URL state to ensure the page returns to a live-polling relative range.
- Fixed a race condition in `LogsVolumeChart` where Recharts fires a Bar `onClick` event immediately after a drag-select `mouseUp`, which was overwriting the dragged range with a single-bucket zoom. A `suppressNextBarClickRef` ref is set during drag completion and cleared on the next bar click to suppress the spurious event.

- [x] Bug fix
- [x] Feature

- [x] UI (React)

1. Navigate to the MCP Logs page and confirm the log volume histogram chart renders and updates with polling.
2. Click a bar in the histogram and confirm the time range zooms into that bucket.
3. Drag-select a range on the histogram and confirm the time range updates to the dragged selection without immediately snapping to a single bucket.
4. Click "Reset Zoom" and confirm the chart returns to the default 1-hour live-polling view.

```sh
cd ui
pnpm i
pnpm build
```

Before: MCP Logs page had no histogram chart.
After: MCP Logs page displays the log volume histogram with zoom, drag-select, and reset zoom functionality identical to the main Logs page.

- [x] No

None.

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary

This PR refactors the semantic cache plugin to simplify its internal state management, improves cache lookup correctness, and adds a new `cache_hit_types` filter to the logs API and UI. The direct cache lookup path is now a single deterministic point-fetch by a UUIDv5 `directCacheID` (replacing the previous dual-path of chunk lookup + legacy metadata scan), and several context keys are consolidated. The UI gains a "Local Caching" filter sidebar section and cache hit type badges in the log detail view.

## Changes

- **Semantic cache plugin refactor:**
  - Replaced the dual direct-search path (`performDirectChunkLookup` + `performLegacyDirectSearch`) with a single `performDirectSearch` that does an O(1) `GetChunk` by deterministic `directCacheID` (UUIDv5 derived from provider, model, cacheKey, requestHash, paramsHash).
  - `generateDirectCacheID` now returns an error instead of silently falling back to a string concatenation, making failures explicit.
  - `request_hash` is no longer stored as a top-level metadata field; it is encoded into the `directCacheID` instead.
  - Reduced context keys from ~10 to 4 (`directCacheIDKey`, `paramsHashKey`, `embeddingsKey`, `embeddingsInputTokensKey`), removing stale keys like `requestIDKey`, `requestHashKey`, `isCacheHitKey`, and `cacheHitTypeKey`.
  - `shouldSkipCaching` is extracted into its own method; cache-hit detection now reads `CacheDebug.CacheHit` from the response rather than a context flag.
  - `buildUnifiedMetadata` no longer accepts `requestHash` as a parameter.
  - `addSingleResponse` renamed to `addNonStreamingResponse`.
  - `StreamAccumulator` fields `HasError`, `FinalTimestamp`, and `FinishReason` on `StreamChunk` are removed; error streams are handled by early return in `PostLLMHook`.
  - Streaming replay goroutine now guards every send with `ctx.Done()` to prevent goroutine leaks on dropped consumers.
  - A background `runStreamCleanupLoop` goroutine (started by `Init`, stopped by `Cleanup` via `stopCh`) replaces the one-shot cleanup call, periodically reaping stale stream accumulators.
  - `buildResponseFromResult` now accepts `threshold`, `similarity`, and `inputTokens` as pointers, and `attachCacheDebug` is extracted as a shared helper for both streaming and non-streaming paths.
  - `isExpiredEntry` is extracted as a standalone function.
  - `chunkSortKey` replaces the large inline sort comparator in `processAccumulatedStream`.
  - Tools, stop sequences, modalities, include lists, and other order-insensitive set fields are now hashed with `hashSortedSet` / `sortedStringSet` to prevent MCP's randomized map iteration from perturbing the request hash.
  - `extractAttachmentsForCaching` is extracted so attachment URLs are included in the cache key metadata rather than the embedding text.
  - `extractTextForEmbedding` no longer returns a `paramsHash`; callers compute it once via `buildRequestMetadataForCaching` + `hashMap`.
  - `generateEmbedding` moved from `utils.go` to `search.go`.
  - `generateRequestHash` now accepts prebuilt metadata to avoid recomputing it.
  - `removeField` no longer mutates the input slice's backing array.
  - Added `PronunciationDictionaryLocators`, `TimestampGranularities`, `Include`, `AdditionalFormats`, and `InputImages` to their respective parameter metadata extractors.
  - Public context key names changed from `semantic_cache_*` to `semantic_cache-*` (underscore → hyphen separator after the plugin prefix).
  - `SelectFields` no longer includes `request_hash`.
  - `VectorStoreProperties` no longer includes a `request_hash` entry.
  - `CacheByModel` and `CacheByProvider` default-value log messages added.
- **Log filtering (`cache_hit_types`):**
  - Added `CacheHitTypes []string` to `SearchFilters` in `framework/logstore/tables.go`.
  - `applyFilters` in `rdb.go` applies a JSON path filter on `cache_debug` for both SQLite (`json_extract`) and PostgreSQL (`substring` regex) dialects, restricted to the allowlist `["direct", "semantic"]`.
  - `canUseMatViewFilters` excludes queries with `CacheHitTypes` set from the materialized-view fast path.
  - HTTP handlers (`getLogs`, `getLogsStats`, `parseHistogramFilters`) parse a `cache_hit_types` comma-separated query parameter.
- **UI:**
  - Added a "Local Caching" filter section to `LogsFilterSidebar` with checkboxes for "Direct cache" and "Semantic cache".
  - `cache_hit_types` is added to URL state, filter state, and the `buildFilterParams` API helper.
  - Log detail view shows "Direct Cache" (indigo) and "Semantic Cache" (rose) badges based on `cache_debug.hit_type`.
  - Plugins form now filters the provider dropdown to embedding-capable providers only (`EmbeddingSupportedProviders` for built-ins; `custom_provider_config.allowed_requests.embedding` for custom providers), shows an error message when no embedding provider is configured, and disables the toggle accordingly.
  - Embedding model input replaced with `ModelMultiselect` (single-select mode) scoped to the selected provider.
  - Provider dropdown clears the embedding model when the provider changes.
  - Provider icons rendered in the provider dropdown.
  - `EmbeddingSupportedProviders` constant added to `ui/lib/constants/logs.ts`.
- **Misc:**
  - HTTP request logging in `CorsMiddleware` and an auth debug log are commented out.
  - `transports/bifrost-http/v1.5.x` added to `.gitignore`.
  - Minor formatting fixes in `core/schemas/bifrost.go` and `framework/modelcatalog/sync.go`.
  - Missing newline at end of `sync.go` added.

## Type of change

- [ ] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [x] UI (React)
- [ ] Docs

## How to test

```sh
# Core/Transports
go test ./plugins/semanticcache/...
go test ./framework/logstore/...
go test ./transports/bifrost-http/...

# UI
cd ui
pnpm i
pnpm build
```

- Configure the semantic cache plugin with a direct and/or semantic cache type and verify that cache hits are recorded with the correct `hit_type` in `cache_debug`.
- Query `/logs?cache_hit_types=direct` and `/logs?cache_hit_types=semantic` and confirm only matching entries are returned.
- In the UI, open the logs filter sidebar and verify the "Local Caching" section appears with "Direct cache" and "Semantic cache" checkboxes that correctly filter the log list.
- Open a log detail for a cache hit and confirm the appropriate badge ("Direct Cache" or "Semantic Cache") is displayed.
- In the plugins form, verify that only embedding-capable providers appear in the provider dropdown and that the embedding model field uses the model multiselect.

## Breaking changes

- [x] Yes

The public semantic cache context key names have changed from `semantic_cache_*` to `semantic_cache-*`. Any caller setting `CacheKey`, `CacheTTLKey`, `CacheThresholdKey`, `CacheTypeKey`, or `CacheNoStoreKey` via the old string values will no longer be recognized by the plugin. Update all call sites to use the exported constants from the plugin package rather than raw string literals.

`request_hash` is no longer stored as a top-level metadata field in the vector store. Existing cache entries written by prior versions will not be found by the new direct-search path (they will be treated as misses and re-populated).

`ClearCacheForRequestID` is documented as currently broken for entries written by the new direct-search path; callers should not rely on it until the TODO is resolved.

## Related issues

N/A

## Security considerations

The `CacheHitTypes` filter allowlists values to `"direct"` and `"semantic"` before interpolating them into SQL, preventing arbitrary input from reaching the JSON path expression.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
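The order-insensitive hashing mentioned in the semantic cache refactor above could look something like the sketch below. The name `hashSortedSet` appears in the source, but its real signature, hash function, and deduplication behavior are not shown there; SHA-256, the length-prefixed encoding, and dropping duplicates are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// hashSortedSet hashes a list of strings as a set: duplicates are dropped
// and elements are sorted first, so the input order cannot change the
// digest. The input slice is copied, never mutated. (Hypothetical sketch.)
func hashSortedSet(items []string) string {
	seen := make(map[string]struct{}, len(items))
	unique := make([]string, 0, len(items))
	for _, it := range items {
		if _, ok := seen[it]; ok {
			continue
		}
		seen[it] = struct{}{}
		unique = append(unique, it)
	}
	sort.Strings(unique)

	h := sha256.New()
	for _, it := range unique {
		// Length-prefix each element so ["ab","c"] and ["a","bc"] differ.
		fmt.Fprintf(h, "%d:%s;", len(it), it)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	a := hashSortedSet([]string{"search", "calculator"})
	b := hashSortedSet([]string{"calculator", "search"})
	fmt.Println(a == b) // prints true
}
```

Sorting before hashing is what makes the request hash stable when the tool list is assembled from a map whose iteration order varies between runs.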

…aximhq#3330)

## Summary

Removes the `cleanup_on_shutdown` option from the semantic cache plugin. Cache data now always persists between Bifrost restarts. The previous behavior of deleting all cache entries and the vector store namespace on shutdown is no longer supported.

## Changes

- Removed `CleanUpOnShutdown` field from `Config` struct in `plugins/semanticcache/main.go` and stripped the corresponding shutdown deletion logic from `Cleanup()`
- Removed `cleanup_on_shutdown` from the JSON config schema (`transports/config.schema.json`), Helm values schema (`helm-charts/bifrost/values.schema.json`), Helm template helper (`_helpers.tpl`), and default `values.yaml`
- Removed `cleanup_on_shutdown` from all example Kubernetes values files and documentation code samples
- Added migration guide entry (Breaking Change 16) in `docs/migration-guides/v1.5.0.mdx` describing the removal, how to clear cache data using the existing invalidation endpoints, and how to handle dimension/provider/model rotation without the old escape hatch
- Updated the semantic caching feature docs to remove references to `cleanup_on_shutdown` and the associated warning block
- Removed `TestCleanup_DeletesEntriesAndNamespaceWhenEnabled` test and simplified `newTestPlugin` helper to drop the `cleanupOnShutdown` parameter across all test files

## Type of change

- [ ] Bug fix
- [ ] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [x] Docs

## How to test

```sh
go test ./plugins/semanticcache/...
```

Verify that passing `cleanup_on_shutdown` in a semantic cache plugin config is rejected by schema validation. Confirm that restarting Bifrost with a semantic cache configured leaves existing vector store entries intact.

## Breaking changes

- [x] Yes
- [ ] No

The `cleanup_on_shutdown` field is removed from the semantic cache plugin config schema and will be rejected by validation. Remove it from `config.json`, Helm values, and any `PUT /api/config` payloads. To clear cache data, use `DELETE /api/cache/clear/{cacheId}`, `DELETE /api/cache/clear-by-key/{cacheKey}`, or rotate `vector_store_namespace` to a fresh name.

## Related issues

See Breaking Change 16 in the v1.5.0 migration guide.

## Security considerations

None.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary

Replaces the separate `PluginsForm` component with a fully self-contained `CachingView` that introduces a first-class **Direct / Direct + Semantic** mode toggle for the local cache plugin. Previously, the UI only exposed provider-backed semantic cache settings and had no concept of direct-only (hash-based) caching as a distinct, supported mode. This rewrite makes direct-only mode the default and gates semantic configuration behind an explicit mode selection.

## Changes

- Deleted `pluginsForm.tsx` and consolidated all local cache configuration logic directly into `cachingView.tsx`.
- Introduced a `CacheMode` type (`"direct"` | `"semantic"`) with a tab-based picker. Direct-only mode requires no embedding provider; semantic mode adds vector similarity on top and requires a provider, model, and dimension.
- The enable/disable toggle now immediately calls `updatePlugin` or `createPlugin` (for first-time setup) rather than deferring the enabled-state change to the Save button, decoupling the plugin lifecycle from config edits.
- Added `inferMode` to derive the active mode from a saved config, `isEmptyConfig` to detect zero-value configs from the API, `buildPayload` to strip semantic-only fields when persisting a direct-only config, and `validateForSave` for inline validation surfaced before the user clicks Save.
- Structural change warnings (provider/model/dimension drift vs. server state) are now shown only when the user has actually modified those fields, rather than permanently in semantic mode.
- Removed the Zod `cacheConfigSchema` validation path in favor of the new `validateForSave` function.
- Removed the effect that auto-seeded a default provider/model/dimension on first load, since direct-only mode no longer requires those fields.
- Per-request override documentation expanded to include `x-bf-cache-key` and `x-bf-cache-no-store` with clearer descriptions.

## Type of change

- [ ] Bug fix
- [ ] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

## How to test

```sh
cd ui
pnpm i || npm i
pnpm build || npm run build
```

1. Navigate to the Workspace → Config → Caching view.
2. Verify the page loads with **Direct only** selected by default and no provider/model/dimension fields visible.
3. Switch to **Direct + Semantic** and confirm provider, model, and dimension fields appear with inline validation.
4. Toggle caching on without a vector store configured and confirm the toggle is disabled.
5. Save a direct-only config and confirm the plugin is created/updated with `dimension: 1` and no provider fields.
6. Save a semantic config with a valid provider, model, and dimension and confirm the full payload is persisted.
7. Reload the page and confirm the saved mode and config are correctly hydrated.

## Screenshots/Recordings

Before/after screenshots recommended showing the mode tab picker, the conditional semantic fields, and the structural change warning banner.

## Breaking changes

- [x] Yes
- [ ] No

The `PluginsForm` component is removed. Any code importing it directly will need to be updated. The enable/disable toggle now persists immediately rather than requiring a Save click, which changes the interaction model for existing users.

## Related issues

N/A

## Security considerations

No new auth, secrets, or PII handling introduced. API keys for embedding providers continue to be inherited from the provider's existing configuration and are not re-entered or stored in the cache config.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…and plugin reloads (maximhq#3423)

## Summary

The `CacheHandler` previously captured a reference to the `semantic_cache` plugin at boot time. This caused two bugs: (1) if the plugin was not present in `config.json` at startup, cache-clear routes were never registered, resulting in HTTP 405 for the entire process lifetime; (2) if the plugin was loaded or reloaded via `/api/plugins` after boot, the handler held a stale (or nil) pointer and would silently misbehave. Additionally, `GET /api/plugins/:name` was returning the raw plugin config without runtime status, causing the UI to see an empty status when refetching a single plugin.

## Changes

- `CacheHandler` now accepts a `CacheClearerResolver` function instead of a concrete plugin pointer. The resolver is called on every cache-clear request, so plugin lifecycle changes via `/api/plugins` are always honored.
- `CacheClearer` and `CacheClearerResolver` are exported so server wiring can supply the resolver without importing the plugin's concrete type.
- Cache routes are registered unconditionally at startup. When no plugin is loaded, requests return HTTP 400 with a descriptive message instead of HTTP 405.
- The server wiring in `RegisterAPIRoutes` uses a closure over `lib.FindPluginAs` to resolve the plugin per request, replacing the boot-time capture.
- `getPlugin` now returns the same response shape as list/create/update (with runtime status merged in), fixing the empty status seen by `useGetPluginQuery` in the UI.
- Tests cover the new "plugin not loaded" path for both `clearCache` and `clearCacheByKey`, and existing tests are updated to use the resolver-based constructor.

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
go test ./transports/bifrost-http/handlers/...
go test ./transports/bifrost-http/...
```

1. Start the server **without** `semantic_cache` in `config.json`. Issue `DELETE /api/cache/clear/{cacheId}`; expect HTTP 400 with `"semantic_cache plugin is not loaded"` (previously HTTP 405).
2. Load the `semantic_cache` plugin via `POST /api/plugins`. Repeat the request; expect the cache-clear to succeed.
3. Reload or remove the plugin via `PUT`/`DELETE /api/plugins`. Verify the handler reflects the new state on the next request without a server restart.
4. Issue `GET /api/plugins/{name}` for a loaded plugin and confirm the response includes runtime status fields, matching the shape returned by the list endpoint.

## Breaking changes

- [x] Yes
- [ ] No

`NewCacheHandler` now accepts a `CacheClearerResolver` function instead of a `schemas.LLMPlugin`. Any caller constructing a `CacheHandler` directly must be updated to pass a resolver.

## Related issues

## Security considerations

None. The change does not affect authentication, secrets, or PII handling.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable
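
The resolver pattern described above can be illustrated with a small standalone sketch. It borrows the names `CacheClearer`, `CacheClearerResolver`, `CacheHandler`, and `NewCacheHandler` from the commit message, but everything else is an assumption for illustration: the `ClearByID` method, the use of `net/http` (the real transport may use a different HTTP stack), the path parsing, and the `memoryCache` stand-in.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// CacheClearer mirrors the exported interface name from the PR; the method
// set shown here is an assumption.
type CacheClearer interface {
	ClearByID(id string) error
}

// CacheClearerResolver returns the currently loaded plugin, or nil when the
// plugin is not loaded.
type CacheClearerResolver func() CacheClearer

type CacheHandler struct{ resolve CacheClearerResolver }

func NewCacheHandler(r CacheClearerResolver) *CacheHandler {
	return &CacheHandler{resolve: r}
}

// clearCache resolves the plugin on every request instead of capturing a
// pointer at boot, so later loads and unloads are observed.
func (h *CacheHandler) clearCache(w http.ResponseWriter, r *http.Request) {
	clearer := h.resolve()
	if clearer == nil {
		http.Error(w, "semantic_cache plugin is not loaded", http.StatusBadRequest)
		return
	}
	id := strings.TrimPrefix(r.URL.Path, "/api/cache/clear/")
	if err := clearer.ClearByID(id); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// memoryCache is a hypothetical stand-in for the real plugin.
type memoryCache struct{ cleared []string }

func (m *memoryCache) ClearByID(id string) error {
	m.cleared = append(m.cleared, id)
	return nil
}

// statusFor issues one in-process DELETE and returns the response status.
func statusFor(h *CacheHandler, id string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodDelete, "/api/cache/clear/"+id, nil)
	h.clearCache(rec, req)
	return rec.Code
}

func main() {
	var current CacheClearer // nil: plugin absent at boot
	h := NewCacheHandler(func() CacheClearer { return current })

	fmt.Println(statusFor(h, "abc")) // plugin not loaded
	current = &memoryCache{}         // plugin loaded after boot
	fmt.Println(statusFor(h, "abc"))
}
```

One caveat with this shape in Go: a resolver that returns a typed nil pointer wrapped in the interface would not compare equal to `nil`, so a real resolver should return an untyped `nil` when the plugin is missing.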

…rch paths in semantic cache (maximhq#3424)

## Summary

Fixes several correctness issues in the semantic cache plugin's `PostLLMHook` and related helpers: cache telemetry (`cache_debug`) was previously invisible to callers using `no-store`, cache-hit replay detection was fragile, non-positive per-request TTL overrides could silently kill cache writes, and requests with a `cache_type` header narrowed to a path the plugin cannot serve would still produce orphan cache entries.

## Changes

- **Early exit for unsupported search paths in `PreLLMHook`**: When `resolveCacheTypes` resolves to a path the plugin cannot actually serve (e.g. `x-bf-cache-type=semantic` against a direct-only plugin, or an unknown header value), the hook now clears cache state and returns early instead of proceeding to generate an embedding or write an orphan entry under a random request UUID that no future read can match.
- **Separated cache-hit replay handling from write-skip logic**: The `shouldSkipCaching` method (which conflated cache-hit detection with write-skip conditions) is replaced by `shouldSkipCacheWrite`. Cache-hit replay is now handled as a dedicated early return in `PostLLMHook` before any telemetry stamping, while `shouldSkipCacheWrite` gates only the write decision after telemetry is already stamped. This ensures `cache_debug` is always populated for callers using `no-store` or large-payload modes.
- **Telemetry stamped before write decision**: `stampCacheDebugForMiss` is now called before `shouldSkipCacheWrite` is consulted, so observability is not conditional on whether the entry is ultimately written.
- **Non-positive TTL overrides fall back to plugin default**: `resolveTTL` now treats a zero or negative per-request TTL override as "use default" rather than applying it, which would have set `expires_at=now` and silently discarded the cache write.
- **Cleaned up stale comments**: Removed an outdated ordering constraint comment in `PostLLMHook` that no longer applies after the restructuring.
- **Tests updated**: Test cases for `shouldSkipCaching` are renamed and updated to reflect the new `shouldSkipCacheWrite` contract. The cache-hit replay test case is removed from this suite (it is now an early return in `PostLLMHook`, not a condition inside the helper). A new default-is-false test is added.

## Type of change

- [x] Bug fix
- [ ] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
go test ./plugins/semanticcache/...
```

Validate the following scenarios:

- A request with `x-bf-cache-type=semantic` against a plugin configured with `Provider=""` or `Dimension=1` should log a warning and skip caching entirely; no orphan entry should appear in the store.
- A request with `Cache-Control: no-store` should still produce a populated `cache_debug` field in the response with `cache_hit=false`.
- A per-request TTL override of `0s` should fall back to the plugin's configured default TTL and not silently discard the cache write.

## Breaking changes

- [x] No

## Security considerations

None.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary

Adds a standalone end-to-end test suite for the `semantic_cache` plugin under `tests/semanticcache`. The suite validates the full caching lifecycle against a live Bifrost instance (plugin creation/teardown, cache miss/hit assertions, cross-provider behavior, streaming, and log cross-checking) without provisioning any infrastructure itself.

## Changes

- **`e2e_test.go`** - `TestMain` entry point: loads config, initializes the report directory, checks Bifrost reachability, enforces plugin-absent precondition (with `RUN_FORCE=1` auto-delete), runs all phases, and performs best-effort teardown on exit.
- **`preconditions_test.go`** - Phase 0 checks: Bifrost reachable, OpenAI configured, optional providers (Gemini, Anthropic) present with warnings if absent.
- **`http_test.go`** - HTTP helpers for all request types: chat completions (streaming and non-streaming), text completions, embeddings, image generation, and the Responses API. Each helper dumps full request/response bodies to the report directory for forensics.
- **`plugin_test.go`** - Plugin lifecycle helpers (`pluginCreate`, `pluginUpdate`, `pluginDelete`, `pluginGet`) mirroring the exact wire format the UI sends to `/api/plugins`.
- **`assert_test.go`** - Assertion helpers (`assertMiss`, `assertHit`, `assertNoCacheDebug`, `assertSameCacheID`, `assertDifferentCacheID`) plus a configurable async write-settle wait (`SC_WRITE_SETTLE_MS`) to account for the plugin's async PostLLMHook store write.
- **`cache_test.go`** - Cache management helpers (`clearByCacheID`, `clearByCacheKey`) wrapping the `/api/cache/clear/*` endpoints.
- **`logs_crosscheck_test.go`** - Cross-checks the persisted log row's `cache_debug` against the in-flight response stamp, with polling to handle Bifrost's async logging pipeline and float epsilon tolerance for JSON encoder differences.
- **`fixtures_test.go`** - Hand-curated paraphrase pairs for Phase 2 semantic cases, designed to land well above (canonical→paraphrase) or well below (canonical→unrelated) the default 0.8 similarity threshold.
- **`log_test.go`** - Structured per-run logging to `reports/<UTC-timestamp>/run.log` with optional `TRAIL_SESSION_ID` stamping for trail integration.
- **`go.mod`** - Standalone module (`github.com/maximhq/bifrost/tests/semanticcache`), consistent with the `tests/governance` pattern, excluded from the repo's `go.work`.
- **`README.md`** - Documents prerequisites, env vars, run commands, trail integration, and report output format.
- **`.gitignore`** - Excludes `reports/` and `*.log` from version control.

Notable design decisions: the suite is intentionally verify-only (no infrastructure provisioning), uses a dedicated vector store namespace (`BifrostSemanticCachePluginE2E`) to isolate test data, and writes full wire-level request/response artifacts per step to support post-mortem debugging without re-running.

## Type of change

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

Requires a running Bifrost instance with Weaviate configured, OpenAI (required), and optionally Gemini and Anthropic providers.

```sh
cd tests/semanticcache

# All phases
GOWORK=off go test -v ./...

# Single phase
GOWORK=off go test -v -run TestPhase1_DirectOnly ./...

# Auto-delete any pre-existing plugin row before run
RUN_FORCE=1 GOWORK=off go test -v ./...

# Keep plugin after run for post-mortem inspection
RUN_KEEP_PLUGIN=1 GOWORK=off go test -v ./...
```

Environment variables:

| Variable | Default | Purpose |
|---|---|---|
| `BIFROST_URL` | `http://localhost:8080` | Bifrost base URL |
| `SC_CHAT_MODEL_OPENAI` | `openai/gpt-4o-mini` | OpenAI chat model |
| `SC_CHAT_MODEL_OPENAI_ALT` | `openai/gpt-4o` | Alternate OpenAI model for cache-by-model cases |
| `SC_EMBED_MODEL_OPENAI` | `text-embedding-3-small` | Embedding model for Phase 2 |
| `SC_CHAT_MODEL_GEMINI` | `gemini/gemini-2.5-flash` | Gemini chat model |
| `SC_CHAT_MODEL_ANTHROPIC` | `anthropic/claude-haiku-4-5` | Anthropic chat model |
| `SC_NAMESPACE` | `BifrostSemanticCachePluginE2E` | Vector store namespace |
| `SC_WRITE_SETTLE_MS` | `500` | Async write settle wait in ms |
| `RUN_FORCE` | unset | `1` to delete pre-existing plugin before run |
| `RUN_KEEP_PLUGIN` | unset | `1` to skip teardown on exit |
| `TRAIL_SESSION_ID` | unset | Stamped onto every log line for trail integration |

## Screenshots/Recordings

N/A

## Breaking changes

- [x] No

## Related issues

N/A

## Security considerations

No secrets are stored in the test suite. API keys are consumed from the existing Bifrost provider configuration and never passed directly through the test harness.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary

Adds a comprehensive end-to-end test suite (`TestDirect`) for the semantic cache plugin operating in direct-only mode. The suite covers 55 test cases (plan §1.1–1.55) validating cache hit/miss behavior, key isolation, TTL handling, config flag mutations, normalization, streaming, multi-endpoint support, parameter hashing, tool definitions, and cache management operations.

## Changes

- Introduces `tests/semanticcache/direct_test.go` with `TestDirect`, covering:
  - **Basic hit/miss and key isolation** (1.1, 1.2, 1.3, 1.4)
  - **`cache_by_model` and `cache_by_provider` flag behavior** (1.5–1.8), including serial config-mutation cases that restore baseline via `t.Cleanup`
  - **`exclude_system_prompt` flag** (1.9, 1.10)
  - **Conversation threshold boundary conditions** (1.11, 1.12)
  - **TTL expiry, per-request TTL override, invalid TTL fallback, and zero/negative TTL fallback** (1.13, 1.14, 1.15, 1.54)
  - **`no-store` header semantics**, including case-sensitivity and explicit `false` value (1.16, 1.17, 1.45, 1.46)
  - **`cache-type` header behavior** in direct-only mode, including the `semantic` header bug case (1.18, 1.19)
  - **Streaming SSE**: hit/miss, chunk replay order, and non-final chunk cache_debug absence (1.24, 1.25, 1.47)
  - **Multi-endpoint coverage**: text completions, responses API, embeddings, and image generation (1.20–1.23)
  - **Input normalization**: case folding, whitespace trimming, Unicode, and large prompts (1.26–1.29)
  - **Image attachment hashing**: same URL hits, different URL misses (1.30, 1.31)
  - **Edge cases**: nil content messages, empty messages array, unknown cache ID deletion (1.42, 1.43, 1.40)
  - **Parameter hash isolation**: temperature, top_p, seed, max_tokens, top_logprobs, tools (order-independent and name-change), prompt_cache_key, service_tier, store flag (1.32–1.37, 1.48–1.52)
  - **Cache management**: clear by cache ID, clear by key (1.38, 1.39)
  - **Plugin status round-trip** via GET (1.44)
  - **`/api/logs` cross-check**: verifies persisted `cache_debug` matches in-flight response stamp (1.55)
  - **`responses` API `previous_response_id` isolation** (1.53)
  - **Threshold header no-op in direct-only mode** (1.41)
- Adds helper functions: `simpleChat`, `chatWithSystem`, `chatWithImage`, `restoreDirectBaseline`, `assertHitAndReturnCacheDebug`
- Establishes a parallelism contract: cases that mutate plugin config run serially (no `t.Parallel()`); all others run concurrently with unique cache keys to prevent collisions

## Type of change

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
# Run the full direct-mode suite
go test ./tests/semanticcache/... -run TestDirect -v -timeout 300s

# Skip the expensive image generation case
SC_SKIP_IMAGE_GEN=1 go test ./tests/semanticcache/... -run TestDirect -v -timeout 300s
```

Required environment variables (same as the broader semantic cache e2e suite):

- `OPENAI_MODEL`: primary OpenAI-compatible model (e.g. `openai/gpt-4o-mini`)
- `OPENAI_MODEL_ALT`: secondary model for cross-model isolation cases
- `OPENAI_EMBED`: embedding model name (e.g. `text-embedding-3-small`)
- `ANTHRO_MODEL`: (optional) Anthropic model; cases 1.7 and 1.8 skip if unset
- `SC_SKIP_IMAGE_GEN=1`: (optional) skip case 1.23 to avoid DALL-E costs

## Screenshots/Recordings

N/A (test-only change).

## Breaking changes

- [x] No

## Related issues

## Security considerations

No new auth, secrets, or PII surface. Test prompts are benign and do not contain sensitive data.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary

Adds a comprehensive integration test suite for the semantic cache mode (Phase 2), covering the full lifecycle of semantic similarity-based cache hits and misses using Weaviate as the vector store and OpenAI's `text-embedding-3-small` as the embedding model. This suite validates that the semantic cache behaves correctly across a wide range of real-world scenarios, complementing the existing direct-mode (Phase 1) tests.

## Changes

- Added `TestParaphraseFixtures` to pre-flight all paraphrase pairs against the live embedding model, asserting cosine similarity thresholds before any semantic cache cases run. This prevents flaky downstream failures caused by borderline fixture pairs.
- Added `TestSemantic` containing 44 sub-cases (2.1–2.44) covering:
  - Semantic hit on paraphrase, miss on unrelated content
  - Per-request threshold overrides (relax, tighten, clamp above/below valid range)
  - `x-bf-cache-type` header forcing direct-only or semantic-only lookup paths
  - Cache key and model/provider isolation in semantic mode
  - `cache_by_model=false` and `cache_by_provider=false` cross-model/cross-provider hits
  - Streaming replay of semantic hits, including tool call preservation
  - TTL expiry, per-request TTL, TTL=0 fallback, and `no-store` header semantics
  - Namespace isolation and dimension-change silent miss behavior
  - Embedding endpoint bypass (semantic search skipped for `/v1/embeddings`)
  - Image generation and Responses API semantic hits
  - Text completion semantic hits
  - Gemini provider with OpenAI embedding provider
  - `params_hash` isolation (temperature, service tier, store flag, prompt cache key, previous response ID)
  - `exclude_system_prompt` flag effect on semantic matching
  - Conversation message threshold skipping semantic search
  - Attachment URL changes causing misses
  - `cache_debug` field presence and correctness on hits and misses, including log endpoint cross-check
  - Streaming chunk-level `cache_debug` placement (final chunk only)
- Serial (non-parallel) cases that mutate plugin config restore baseline via `t.Cleanup` to avoid test pollution.
- A dedicated Weaviate namespace (`cfg.Namespace + "Semantic"`) is used to avoid dimension conflicts with the Phase 1 direct-mode namespace.

## Type of change

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
# Run fixture pre-flight (requires OpenAI embedding access)
go test ./tests/semanticcache/... -run TestParaphraseFixtures -v

# Run full semantic suite
go test ./tests/semanticcache/... -run TestSemantic -v -timeout 10m

# Skip fixture verification if embedding access is unavailable
SC_SKIP_FIXTURE_VERIFY=1 go test ./tests/semanticcache/... -run TestSemantic -v -timeout 10m

# Skip image generation cases if DALL-E is unavailable
SC_SKIP_IMAGE_GEN=1 go test ./tests/semanticcache/... -run TestSemantic -v -timeout 10m
```

Required environment/config:

- `cfg.OpenAIEmbed`: embedding model name (e.g. `text-embedding-3-small`)
- `cfg.OpenAIModel` / `cfg.OpenAIModelAlt`: chat models for isolation tests
- `cfg.AnthroModel`: optional; skipped if empty (case 2.13)
- `cfg.GeminiModel`: optional; skipped if empty (case 2.28)
- `cfg.Namespace`: base Weaviate namespace; suite appends `Semantic` suffix
- `SC_SKIP_FIXTURE_VERIFY=1`: skip embedding pre-flight
- `SC_SKIP_IMAGE_GEN=1`: skip DALL-E case

## Screenshots/Recordings

N/A

## Breaking changes

- [x] No

## Related issues

N/A

## Security considerations

No new auth, secrets, or PII handling introduced. Tests call live external APIs (OpenAI, optionally Anthropic/Gemini) and require valid credentials in the test environment; no credentials are hardcoded.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary

Adds an end-to-end lifecycle test for the semantic cache plugin, covering the full disable → re-enable → delete → recreate flow and asserting that namespace data persists across each state transition.

## Changes

- Introduces `TestLifecycle` in `tests/semanticcache/lifecycle_test.go`, which runs 10 serial subtests (3.1–3.10) exercising the plugin's lifecycle state machine:
  - **3.1** – Disabling the plugin via PUT sets `enabled=false` and `status=disabled`
  - **3.2** – Requests while disabled bypass the cache pipeline entirely (no `cache_debug` header)
  - **3.3 / 3.4** – Cache-clear endpoints (`/api/cache/clear/{id}` and `/api/cache/clear-by-key/{k}`) return HTTP 400 when the plugin is not loaded
  - **3.5** – Re-enabling via PUT restores `enabled=true` and `status=active`
  - **3.6** – Entries written before disable are still queryable after re-enable
  - **3.7** – DELETE removes both the DB row and the in-memory plugin instance
  - **3.8** – Requests after delete bypass the cache pipeline (no `cache_debug` header)
  - **3.9** – Recreating the plugin with the same config succeeds and surfaces `status=active`
  - **3.10** – Entries written before delete are still queryable after recreate, validating the namespace-persistence contract introduced by the removal of `CleanUpOnShutdown`
- Tests are intentionally serial (no `t.Parallel()`) because each subtest mutates globally shared plugin lifecycle state
- A `t.Cleanup` handler performs best-effort key clearing regardless of which lifecycle state the plugin is left in at teardown

## Type of change

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
go test ./tests/semanticcache/... -run TestLifecycle -v
```

Expected outcome: all 10 subtests (3.1–3.10) pass, with structured log output at each step confirming correct status transitions and cache hit/miss behaviour.

## Breaking changes

- [x] No

## Related issues

## Security considerations

None. Tests run against a local Bifrost instance and do not introduce new auth paths, secrets handling, or PII exposure.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…kefile targets (maximhq#3429)

## Summary

Adds Makefile targets for running `semantic_cache` plugin unit tests and end-to-end tests, with optional integration of the `trail` CLI for capture-based debugging sessions.

## Changes

- Added `test-semantic-cache` target that runs e2e tests from `tests/semanticcache`, supporting a `CACHE_TYPE` variable (`direct` or `semantic`) to filter which test phases are executed. Automatically wraps the run in `trail run` if the `trail` binary is available on `PATH`.
- Added `test-semantic-cache-complete` target that runs both the plugin unit tests (`plugins/semanticcache`) and the e2e tests in sequence, optionally wrapping the entire session in a single `trail run` invocation.
- Added `_test-semantic-cache-complete-inner` as an internal helper target that performs the actual sequential execution of unit and e2e tests with formatted output banners.
- Registered all three new targets in the `.PHONY` declaration.

## Type of change

- [ ] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
# Run all semantic_cache e2e tests
make test-semantic-cache

# Run only direct cache tests
CACHE_TYPE=direct make test-semantic-cache

# Run only semantic cache tests
CACHE_TYPE=semantic make test-semantic-cache

# Run both unit and e2e tests together
make test-semantic-cache-complete

# Force e2e run regardless of preconditions
RUN_FORCE=1 make test-semantic-cache-complete
```

If `trail` is installed and on `PATH`, all commands will automatically wrap execution in a `trail run` session for capture-based debugging.

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

None.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable


…maximhq#3444)

…tions column (maximhq#3480)

## Summary

Replaces the direct delete button in the logs and MCP logs action columns with a dropdown menu triggered by a `MoreHorizontal` icon. This improves the UI by providing a more scalable actions pattern while keeping the delete functionality accessible. The actions column is also now properly pinned to the right side of the table when the user has delete access.

## Changes

- Replaced the inline destructive `Trash2` button with a `DropdownMenu` containing a "Delete" item for both logs and MCP logs tables
- The actions column trigger is now a ghost `MoreHorizontal` icon button, reducing visual noise in the table
- The actions column is pinned to the right only when `hasDeleteAccess` is true; otherwise no fixed columns are configured
- Fixed `fixedColumnIds` to include `"actions"` so the column receives correct sticky positioning behavior
- Removed `overflow-hidden` from pinned cells in the MCP logs table to prevent the dropdown from being clipped
- Reduced the actions column size from 72 to 56px

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

## How to test

1. Navigate to the Logs page as a user with delete access.
2. Confirm the actions column is pinned to the right of the table.
3. Click the `⋯` icon on any row and verify the dropdown appears with a "Delete" option.
4. Click "Delete" and confirm the log is deleted without the row click handler firing.
5. Repeat on the MCP Logs page.
6. Log in as a user without delete access and confirm the actions column is not present.

```sh
cd ui
pnpm i
pnpm build
```

## Screenshots/Recordings

_Add before/after screenshots showing the old delete button vs. the new dropdown._

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

No new security implications. Delete access gating remains unchanged.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…wing text (maximhq#3481)

## Summary

Fixes layout overflow issues in the Model Catalog table where long provider names and model badge text would break out of their columns or cause uneven column sizing.

## Changes

- Added `table-fixed` layout with explicit `<colgroup>` column widths (26% / 44% / 16% / 14%) to enforce stable column proportions
- Added `overflow-hidden` and `truncate` to the Provider name cell so long names are clipped cleanly instead of overflowing
- Added `shrink-0` to the "CUSTOM" badge so it doesn't compress when the provider name is long
- Added `max-w-[220px] truncate` to model name badges in `ModelsUsedCell` to prevent individual badges from stretching too wide

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

## How to test

Navigate to the Model Catalog page and verify:

1. Provider names that are long truncate cleanly within their column
2. The "CUSTOM" badge remains visible and does not shrink when next to a long provider name
3. Model name badges in the models column truncate at a reasonable width
4. Column widths remain stable regardless of content length

```sh
cd ui
pnpm i || npm i
pnpm build || npm run build
```

## Screenshots/Recordings

If UI changes, add before/after screenshots or short clips.

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

None.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…y names (maximhq#3482) ## Summary Fixes layout issues in the model provider keys table where long key/model/server names would overflow their cells and cause the table to render incorrectly. ## Changes - Applied `table-fixed` layout to the keys table and defined explicit column widths via `<colgroup>` (64% for the name column, 12% each for the remaining three columns) to enforce stable column sizing regardless of content length - Added `overflow-hidden` to the name cell and `min-w-0` + `truncate` to the name text span so long strings are clipped with an ellipsis instead of breaking the layout - Added a trailing newline at end of file ## Type of change - [x] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test 1. Navigate to the Workspace → Providers page and open a provider that has keys with long names (e.g. a vLLM model name or a long API key label). 2. Verify that the name column truncates with an ellipsis rather than overflowing into adjacent columns. 3. Verify that the three action columns (weight, status, actions) maintain consistent widths. ```sh cd ui pnpm i || npm i pnpm test || npm test pnpm build || npm run build ``` ## Screenshots/Recordings Add before/after screenshots showing the table with a long key name to confirm truncation behavior. ## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations None. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…limits table (maximhq#3483) ## Summary Replaces the individual inline Edit and Delete action buttons in the model limits table with a consolidated `DropdownMenu` (three-dot menu). The delete confirmation dialog is also lifted out of the per-row render loop and rendered once at the table level, driven by a `deleteModelConfigId` state value. ## Changes - Replaced per-row Edit and Delete buttons with a single `MoreHorizontal` icon button that opens a `DropdownMenu` containing Edit and Delete items. - Moved the `AlertDialog` for delete confirmation out of the table row loop into a single top-level instance, controlled by `deleteModelConfigId` state. This prevents multiple dialog instances from being mounted simultaneously. - Added `deletingModelConfig` derived from `deleteModelConfigId` via `useMemo`, keeping it in sync with the RTK cache similarly to `editingModelConfig`. - Cleared `deleteModelConfigId` on successful deletion to close the dialog automatically. - Removed the hover/focus-dependent opacity animation on the action cell since the dropdown replaces that pattern. ## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test 1. Navigate to the Model Limits table in the workspace. 2. Hover over any model limit row and click the `...` (MoreHorizontal) button. 3. Verify the dropdown shows **Edit** and **Delete** options. 4. Click **Edit** and confirm the model limit sheet opens with the correct config pre-populated. 5. Click **Delete** and confirm the confirmation dialog appears with the correct model name (truncated if over 30 characters). 6. Confirm deletion succeeds, the dialog closes, and a success toast is shown. 7. Confirm that users without update/delete RBAC access see the respective menu items disabled. 
```sh cd ui pnpm i || npm i pnpm test || npm test pnpm build || npm run build ``` ## Screenshots/Recordings _Add before/after screenshots showing the old separate Edit/Delete buttons vs. the new dropdown menu._ ## Breaking changes - [x] No ## Related issues ## Security considerations RBAC checks for `Governance` update and delete operations are preserved on the new dropdown menu items. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…#3484) ## Summary Replaces the individual Edit and Delete action buttons in the routing rules table with a consolidated `DropdownMenu` triggered by a `MoreHorizontal` icon. This reduces visual clutter in the actions column and provides a more consistent UX pattern for row-level actions. ## Changes - Replaced separate Edit and Delete ghost buttons with a single `MoreHorizontal` icon button that opens a dropdown menu containing both actions - Edit and Delete items within the dropdown remain gated by `canUpdate` and `canDelete` permissions respectively, now using the `disabled` prop instead of conditional rendering - The Delete item uses the `destructive` variant to visually distinguish it from the Edit action - Changed `catch (error: any)` to `catch (error: unknown)` for improved type safety - Added a trailing newline to the end of the file ## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test Navigate to the routing rules table and verify: 1. Each row displays a `MoreHorizontal` icon button in the actions column 2. Clicking the icon opens a dropdown with Edit and Delete options 3. Edit and Delete options are disabled when the user lacks the respective permissions 4. Selecting Edit opens the edit flow for the correct rule 5. Selecting Delete triggers the delete confirmation dialog for the correct rule 6. Row click propagation is not triggered when interacting with the dropdown ```sh cd ui pnpm i || npm i pnpm test || npm test pnpm build || npm run build ``` ## Screenshots/Recordings Before: Two separate icon buttons (pencil and trash) visible inline on each row. After: A single `⋯` icon button per row that reveals Edit and Delete options in a dropdown, with Delete styled in red. 
## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations No security implications. Permission checks (`canUpdate`, `canDelete`) are preserved. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…g overrides table (maximhq#3485) ## Summary Replaces the individual Edit and Delete action buttons in the pricing overrides table with a consolidated `DropdownMenu` triggered by a `MoreHorizontal` icon. Also fixes a sidebar active state bug where sub-items were incorrectly matching routes, and adds `hasAPIKeyAccess` to the sidebar's memoization dependencies. ## Changes - Replaced separate Edit and Delete icon buttons in the pricing overrides table rows with a single `MoreHorizontal` actions dropdown containing labeled Edit and Delete menu items. The Delete item uses the destructive variant for visual clarity. - Fixed sidebar sub-item active state detection to use `isRouteMatch` instead of `pathname.startsWith`, preventing incorrect active highlighting on partial path matches. - Added `hasAPIKeyAccess` to the sidebar's `useMemo` dependency array, which was previously missing and could cause stale renders. ## Type of change - [ ] Bug fix - [x] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test 1. Navigate to the custom pricing overrides table. 2. Hover over a row and click the `⋯` (MoreHorizontal) button — a dropdown should appear with **Edit** and **Delete** options. 3. Clicking **Edit** should open the edit drawer without triggering row selection. 4. Clicking **Delete** should open the delete confirmation dialog without triggering row selection. 5. Verify sidebar sub-item active states are correct when navigating between nested routes — only the exact matching route should appear active. ```sh cd ui pnpm i || npm i pnpm test || npm test pnpm build || npm run build ``` ## Screenshots/Recordings Before: Two separate ghost icon buttons (pencil and trash) visible inline on each row. After: A single `⋯` button per row that reveals a dropdown with labeled **Edit** and **Delete** actions. 
## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations No security implications. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…s column in MCP clients table (maximhq#3486) ## Summary Replaces the per-row inline action buttons (reconnect + delete) in the MCP clients table with a consolidated `MoreHorizontal` dropdown menu, and moves the actions column to a sticky right-pinned position so it remains visible when the table scrolls horizontally. ## Changes - Replaced the individual `Reconnect` (with tooltip) and `Delete` (with inline `AlertDialog`) buttons with a single `DropdownMenu` triggered by a `MoreHorizontal` icon button. - The delete confirmation `AlertDialog` is now lifted out of the table row and controlled via a `clientToDelete` state variable, preventing multiple dialog instances from being mounted inside the DOM simultaneously. - The actions column header and cell are now `sticky right-0` with `PIN_SHADOW_RIGHT` applied, keeping the actions visible during horizontal scroll. - The table container changed from `overflow-hidden` to `overflow-auto` to enable horizontal scrolling. - Reconnect and delete menu items are conditionally rendered based on RBAC access, rather than being rendered-but-disabled. - The `MoreHorizontal` button shows a `Loader2` spinner while a reconnect is in progress for that row. - Added `group` class to table rows to allow the sticky actions cell to mirror the row hover background. ## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test 1. Navigate to the MCP Registry page in the workspace UI. 2. Verify each MCP client row shows a `⋯` (MoreHorizontal) button in the rightmost column. 3. Click the button and confirm the dropdown shows **Reconnect** and **Delete** options (subject to RBAC permissions). 4. Select **Reconnect** and confirm the spinner appears on the button while reconnecting. 5. 
Select **Delete** and confirm the confirmation dialog appears with the correct server name, and that confirming removes the client. 6. Resize the browser window to trigger horizontal scrolling and confirm the actions column remains pinned to the right. ```sh cd ui pnpm i || npm i pnpm build || npm run build ``` ## Screenshots/Recordings Before/after screenshots recommended showing the old inline icon buttons vs. the new dropdown menu. ## Breaking changes - [x] No ## Related issues ## Security considerations RBAC checks are preserved — reconnect and delete menu items are only rendered when the user has the corresponding `Update` or `Delete` permission on `MCPGateway`. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…d active toggle switch in teams and virtual keys tables (maximhq#3487) ## Summary Replaces the inline edit/delete action buttons in the Teams and Virtual Keys tables with a consolidated `MoreHorizontal` dropdown menu per row. The actions column is now sticky-pinned to the right edge of the table so it remains visible when the table scrolls horizontally. The Virtual Keys table also replaces the status badge with an inline active/inactive toggle switch. ## Changes - Extracted `TeamActionsMenu` and `VKActionsMenu` components that render a `DropdownMenu` containing Edit and Delete items, with the delete confirmation `AlertDialog` controlled via local state rather than being triggered directly from an `AlertDialogTrigger`. - Removed the hover-only opacity animation on action buttons in favor of always-visible dropdown triggers. - The actions `TableHead` and `TableCell` are now sticky (`sticky right-0 z-10`) with `PIN_SHADOW_RIGHT` applied and background colors that match the row hover state, keeping the pinned column visually consistent. - Tables are given a `min-w` value and their container uses `overflow-auto` to support horizontal scrolling without breaking the sticky column. - `VKStatusBadge` is replaced by `VKActiveSwitch`, which renders a `Switch` component and calls `useUpdateVirtualKeyMutation` to toggle `is_active` inline. Managed-by-profile keys disable the switch and show a tooltip title. - The managed-by-profile delete tooltip/disabled-button pattern is replaced by a disabled destructive `DropdownMenuItem` with a `title` attribute. - `handleEditVirtualKey` no longer requires a `MouseEvent` argument since click propagation is handled at the cell level. ## Type of change - [ ] Bug fix - [ ] Feature - [x] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [x] UI (React) - [ ] Docs ## How to test 1. Navigate to the **Governance → Teams** page. 
- Verify the actions column stays pinned to the right when scrolling horizontally. - Click the `MoreHorizontal` button on a row and confirm Edit and Delete items appear. - Confirm the delete confirmation dialog opens from the dropdown and completes successfully. 2. Navigate to the **Virtual Keys** page. - Verify the active toggle switch reflects the current `is_active` state and toggling it updates the key immediately with a success toast. - Confirm keys managed by an access profile show a disabled switch and a disabled Delete item in the dropdown. - Verify the sticky actions column behaves correctly on horizontal scroll. ```sh cd ui pnpm i pnpm build ``` ## Screenshots/Recordings _Add before/after screenshots showing the dropdown menu and sticky column behavior._ ## Breaking changes - [x] No ## Related issues ## Security considerations No new auth surfaces introduced. RBAC checks (`hasUpdateAccess`, `hasDeleteAccess`) are preserved on all action items. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…ses to Chat (maximhq#3584)

## Summary

Reasoning content (e.g. from DeepSeek thinking mode or Anthropic extended thinking) was being silently dropped when converting between the Responses and Chat message formats. This caused multi-turn flows that route through the Responses→Chat fallback path to 400 on providers that require reasoning content to be echoed back. This PR fixes the bidirectional conversion so reasoning is preserved across both directions.

## Changes

- **`ToResponsesMessages`**: Reasoning content on an assistant `ChatMessage` is now emitted as a `reasoning`-typed `ResponsesMessage` *before* any tool calls or text content, matching the order providers expect and allowing clients to echo it back correctly.
- **`ToChatMessages`**: Reasoning messages are no longer silently skipped. Instead, they are buffered and attached to the next assistant turn (whether that turn carries text content or tool calls), populating both `Reasoning` and `ReasoningDetails` fields including signatures, summaries, and encrypted content.
- Null-safety guards were added in the UI log detail view and column helpers to prevent crashes when `input_history` contains `null`/`undefined` entries (which can occur when reasoning messages are injected into history).

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (React)
- [ ] Docs

## How to test

```sh
# Core/Transports
go version
go test ./...
```

Specific test cases added:

- `TestToChatMessages_AttachesReasoningToNextAssistantMessage`: verifies reasoning is buffered and attached to the following assistant text message, including signature preservation.
- `TestToChatMessages_AttachesReasoningToToolCallAssistantMessage`: verifies reasoning is attached to tool-call assistant messages.
- `TestToResponsesMessages_EmitsReasoningMessageBeforeToolCalls`: verifies reasoning is emitted first when an assistant message has both reasoning and tool calls.
- `TestToResponsesMessages_EmitsReasoningMessageBeforeTextContent`: verifies reasoning is emitted first when an assistant message has both reasoning and text content.

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Fixes multi-turn DeepSeek thinking mode and Anthropic extended thinking flows that were returning 400 errors due to missing reasoning content in echoed history.

## Security considerations

No auth, secrets, PII, or sandboxing implications. Reasoning content is treated the same as other message content.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
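
The buffering behaviour described for `ToChatMessages` in the commit above can be illustrated with a minimal sketch. The `item` and `chatMessage` types and the `toChat` function are hypothetical simplifications, not the real Bifrost schemas, and details such as signatures and encrypted content are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// item is a hypothetical, simplified Responses-style entry. Type is one of
// "message", "reasoning", or "function_call".
type item struct {
	Type string
	Role string
	Text string
}

// chatMessage is a hypothetical, simplified Chat-style message.
type chatMessage struct {
	Role      string
	Content   string
	ToolCall  string
	Reasoning string
}

// toChat buffers reasoning items and attaches them to the next assistant
// turn, whether that turn carries text or a tool call. In this sketch,
// reasoning with no following assistant turn is simply dropped.
func toChat(items []item) []chatMessage {
	var out []chatMessage
	var pending []string
	takePending := func() string {
		joined := strings.Join(pending, "\n")
		pending = nil
		return joined
	}
	for _, it := range items {
		switch it.Type {
		case "reasoning":
			pending = append(pending, it.Text)
		case "function_call":
			out = append(out, chatMessage{Role: "assistant", ToolCall: it.Text, Reasoning: takePending()})
		default:
			m := chatMessage{Role: it.Role, Content: it.Text}
			if it.Role == "assistant" {
				m.Reasoning = takePending()
			}
			out = append(out, m)
		}
	}
	return out
}

func main() {
	msgs := toChat([]item{
		{Type: "message", Role: "user", Text: "What is 2+2?"},
		{Type: "reasoning", Text: "Simple arithmetic."},
		{Type: "function_call", Text: "calculator(2+2)"},
	})
	for _, m := range msgs {
		fmt.Printf("%+v\n", m)
	}
}
```

The key design point is that the reasoning item never becomes a message of its own on the Chat side; it travels with the assistant turn that follows it, so it can be echoed back to the provider on the next request.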

…aximhq#3585)

## Summary

This PR adds proper `anthropic-beta` header propagation for Vertex AI's `Responses` and `ResponsesStream` endpoints, ensuring beta feature headers are correctly filtered and applied when making Anthropic-on-Vertex requests.

## Changes

- Applied `FilterBetaHeadersForProvider` and `MergeBetaHeaders` logic to the Vertex `Responses` method, replacing the previous unconditional `SetExtraHeaders` behavior with explicit beta header setting or deletion.
- Applied the same beta header logic to `ResponsesStream`, injecting the filtered beta headers into the request headers map before streaming begins.
- Removed an unnecessary blank line in `Responses` after the `bifrostErr` check.
- Removed redundant parentheses around a boolean condition in `ChatCompletionStream`.

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

Send a request to the Vertex `Responses` or `ResponsesStream` endpoint using an Anthropic model that requires a beta header (e.g., a model requiring `interleaved-thinking-20250522`). Verify that the `anthropic-beta` header is correctly set in the outgoing request and that the response is successful.

```sh
go test ./...
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

No security implications. Changes are limited to header propagation logic for Vertex AI Anthropic requests.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary Briefly explain the purpose of this PR and the problem it solves. ## Changes - What was changed and why - Any notable design decisions or trade-offs ## Type of change - [ ] Bug fix - [ ] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [ ] Core (Go) - [ ] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test Describe the steps to validate this change. Include commands and expected outcomes. ```sh # Core/Transports go version go test ./... # UI cd ui pnpm i || npm i pnpm test || npm test pnpm build || npm run build ``` If adding new configs or environment variables, document them here. ## Screenshots/Recordings If UI changes, add before/after screenshots or short clips. ## Breaking changes - [ ] Yes - [ ] No If yes, describe impact and migration instructions. ## Related issues Link related issues and discussions. Example: Closes maximhq#123 ## Security considerations Note any security implications (auth, secrets, PII, sandboxing, etc.). ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [ ] I added/updated tests where appropriate - [ ] I updated documentation where needed - [ ] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable

…imhq#3591)

## Summary

Fixes a nil-dereference panic in fasthttp's `connsCleaner` that occurred when a mid-stream client disconnect caused `LargeResponseReader.Close` to call `ReleaseResponse` on a connection that had already been torn down by `SetupStreamCancellation`. The fix prevents the double-release by checking `BifrostContextKeyConnectionClosed` before draining and releasing the fasthttp response.

## Changes

- `LargeResponseReader` now holds a `*schemas.BifrostContext` reference so that `Close()` can inspect `BifrostContextKeyConnectionClosed` before attempting to drain and release the underlying fasthttp response. When the flag is set (indicating the connection was already closed mid-stream), `Close()` skips the drain and release, leaking `r.Resp` to the GC instead, mirroring the existing behavior of `ReleaseStreamingResponse`.
- `SetupStreamCancellation` is now wired into `SetupStreamingPassthrough` so that mid-stream client disconnects unblock the transport's `Read` via `CloseWithError` and set `BifrostContextKeyConnectionClosed` before `LargeResponseReader.Close` runs.
- In `SetupStreamCancellation`, `BifrostContextKeyConnectionClosed` is now set unconditionally after a close attempt in the `done`+cancelled-context race branch, regardless of whether the close returned an error. Previously the flag was only set on a successful close, meaning a failed close (e.g. against an already-pooled conn) left the flag unset and allowed a second release to proceed.

## Type of change

- [x] Bug fix

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] UI (React)

## How to test

Simulate a mid-stream client disconnect against a streaming endpoint and confirm the server does not panic with a nil-dereference in fasthttp's `connsCleaner`. Verify that normal stream completion (EOF) still correctly releases the response.

```sh
go version
go test ./...
```

## Breaking changes

- [x] No

## Security considerations

None. This change only affects connection lifecycle management for streaming responses and does not touch auth, secrets, or PII handling.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

…aximhq#3590)

## Summary

Add optional cluster metadata columns to the logs table and a per-node usage aggregation query on the LogStore interface. These are foundational hooks for improved governance accuracy in multi-node deployments.

## Changes

- 3 new context keys for passing node ID and governance resource IDs through the request pipeline
- 3 new nullable columns on the `Log` struct: `cluster_node_id`, `budget_ids`, `rate_limit_ids`
- `NodeUsageAggregate` struct and `GetNodeUsageSince` method on the `LogStore` interface for aggregating a node's usage since a given timestamp
- Non-blocking migration: column additions are instant (`ALTER TABLE ADD COLUMN`), index built via `CREATE INDEX CONCURRENTLY` in a background goroutine
- Logging plugin gains `SetClusterNodeID()` setter and stamps entries with cluster metadata from the request context when set

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
make build LOCAL=1
make test-core
make test-plugins
```

Start with a fresh database and verify migration completes without errors. The new columns are nullable and unused unless explicitly wired by the caller.

## Screenshots/Recordings

N/A (no UI changes).

## Breaking changes

- [ ] Yes
- [x] No

New columns are nullable and optional. The `GetNodeUsageSince` method is additive to the `LogStore` interface. Existing log entries are unaffected.

## Related issues

N/A

## Security considerations

No security implications. New columns store opaque IDs only.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary

Fixes a bug in `ReleaseStreamingResponse` where the response was being released inside a `defer` (which runs even after a panic) before the body stream had been drained, and also removes redundant stream closing logic that could interfere with proper connection handling.

## Changes

- Moved `fasthttp.ReleaseResponse(resp)` out of the `defer` block and into the body drain block, so the response is only released after the body stream has been fully drained. This ensures connections are not reused in a dirty state.
- Removed the explicit `io.Closer` and `streamCloserWithError` closing logic after draining, as closing the body stream separately is no longer necessary and was causing unintended side effects.
- Replaced the ASCII art broker diagram in the clustering docs with a Mermaid flowchart for better rendering in supported environments, and added a plain-text description of the message flow for clarity.

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [x] Docs

## How to test

```sh
go test ./core/providers/utils/...
```

Validate that streaming responses do not produce "whitespace in header" errors on connection reuse, and that no response leaks occur when a panic is triggered mid-stream.

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

None.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [ ] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [ ] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

## Summary

Fixes flaky Fireworks provider tests by hardcoding model names and increasing token limits to prevent truncation-related test failures.

## Changes

- Replaced dynamic model resolution (`resolveFireworksModels`) with hardcoded model names (`deepseek-v4-pro` for chat/text, `qwen3-embedding-8b` for embeddings) in the Fireworks test suite, removing the dependency on runtime model discovery
- Set text completion and embedding test scenarios to always enabled (`true`) since models are now statically defined
- Updated the Fireworks key configuration to use `"*"` as the model wildcard instead of an empty slice, ensuring the key applies to all models
- Increased `MaxCompletionTokens` / `MaxOutputTokens` limits across streaming tests (150→300, 200→300, 50→500) to reduce the likelihood of responses being cut off mid-stream, which was causing assertion failures

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (React)
- [ ] Docs

## How to test

```sh
cd core
go test ./providers/fireworks/... -v -run TestFireworks
go test ./internal/llmtests/... -v
```

Expected: all Fireworks provider tests pass without flakiness related to model resolution or truncated streaming responses.

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

## Security considerations

None. No secrets or auth changes introduced; the `"*"` wildcard applies only to internal key-to-model routing logic.

## Checklist

- [ ] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [ ] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [ ] I verified the CI pipeline passes locally if applicable

The merge-base changed after approval.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

transports/bifrost-http/lib/config_test.go (1)

16337-16370: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make default-fallback subtests hermetic against ambient env vars.

These subtests can become flaky if BIFROST_PRICING_URL or BIFROST_MODEL_PARAMETERS_URL are set in the runner environment. Explicitly neutralize non-target vars in each case before asserting defaults.

Suggested fix

 t.Run("fallback to defaults when db and file are missing", func(t *testing.T) {
+    t.Setenv(modelcatalog.PricingURLEnvVar, "")
+    t.Setenv(modelcatalog.ModelParametersURLEnvVar, "")
     normalizedTable, normalizedModelCatalog, needsDBUpdate := ResolveFrameworkPricingConfig(nil, nil)
     require.False(t, needsDBUpdate)
     require.Equal(t, defaultURL, *normalizedTable.PricingURL)
     require.Equal(t, defaultModelParamsURL, *normalizedTable.ModelParametersURL)
@@
 t.Run("fallback default pricing url uses env override", func(t *testing.T) {
     envURL := "https://env-default.example.com/pricing.json"
     t.Setenv(modelcatalog.PricingURLEnvVar, envURL)
+    t.Setenv(modelcatalog.ModelParametersURLEnvVar, "")
@@
 t.Run("fallback default model parameters url uses env override", func(t *testing.T) {
     envURL := "https://env-default.example.com/model-parameters.json"
     t.Setenv(modelcatalog.ModelParametersURLEnvVar, envURL)
+    t.Setenv(modelcatalog.PricingURLEnvVar, "")

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@transports/bifrost-http/lib/config_test.go` around lines 16337 - 16370, These
subtests are flaky because ambient env vars (modelcatalog.PricingURLEnvVar /
modelcatalog.ModelParametersURLEnvVar) can affect ResolveFrameworkPricingConfig;
in each subtest (the three that assert fallback defaults) neutralize the
non-target env vars by calling t.Setenv for both modelcatalog.PricingURLEnvVar
and modelcatalog.ModelParametersURLEnvVar (set to empty string or an explicit
unset) before invoking ResolveFrameworkPricingConfig so the defaults are
hermetic; update the subtests that reference ResolveFrameworkPricingConfig and
the env var constants to ensure the environment is controlled for each case.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@transports/bifrost-http/lib/config.go`:
- Around line 3091-3094: The current block calls
configstore.GenerateFrameworkConfigHash once with all three override inputs so
an unresolved filePricingURL can cause the whole hash to fail and prevent
fileChanged from being set for other valid overrides; change the logic to
generate hashes independently per override source (call
GenerateFrameworkConfigHash or an equivalent helper separately for
filePricingURL, fileModelParametersURL, and fileSyncSeconds), check each call's
error individually, and set fileChanged if any of the per-source hashes succeed
(combine results rather than short-circuiting), keeping references to
fileConfig, fileConfig.Pricing, skipURLBackfill, filePricingURL,
fileModelParametersURL, fileSyncSeconds,
configstore.GenerateFrameworkConfigHash, h, err, and fileChanged to locate where
to change the code.

---

Outside diff comments:
In `@transports/bifrost-http/lib/config_test.go`:
- Around line 16337-16370: These subtests are flaky because ambient env vars
(modelcatalog.PricingURLEnvVar / modelcatalog.ModelParametersURLEnvVar) can
affect ResolveFrameworkPricingConfig; in each subtest (the three that assert
fallback defaults) neutralize the non-target env vars by calling t.Setenv for
both modelcatalog.PricingURLEnvVar and modelcatalog.ModelParametersURLEnvVar
(set to empty string or an explicit unset) before invoking
ResolveFrameworkPricingConfig so the defaults are hermetic; update the subtests
that reference ResolveFrameworkPricingConfig and the env var constants to ensure
the environment is controlled for each case.
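
The inline comment on `config.go` asks for hashes to be computed per override source so that one unresolved value does not block the others. The following is only an illustrative sketch of that idea: `overrideSource`, `hashSource`, and `combineHashes` are hypothetical names, and the real `configstore.GenerateFrameworkConfigHash` signature is not shown in this thread, so it is not used here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// overrideSource models one file-level override. A nil value stands for an
// override that could not be resolved (an assumption made for this sketch).
type overrideSource struct {
	name  string
	value *string
}

// hashSource hashes a single source and fails only for that source.
func hashSource(s overrideSource) (string, error) {
	if s.value == nil {
		return "", errors.New(s.name + ": unresolved value")
	}
	sum := sha256.Sum256([]byte(s.name + "=" + *s.value))
	return hex.EncodeToString(sum[:]), nil
}

// combineHashes folds every successfully hashed source into one digest.
// Unresolved sources are skipped instead of aborting the whole computation;
// the boolean reports whether at least one source contributed.
func combineHashes(sources []overrideSource) (string, bool) {
	combined := sha256.New()
	anyOK := false
	for _, s := range sources {
		part, err := hashSource(s)
		if err != nil {
			continue // one bad source must not mask the others
		}
		combined.Write([]byte(part))
		anyOK = true
	}
	if !anyOK {
		return "", false
	}
	return hex.EncodeToString(combined.Sum(nil)), true
}

func main() {
	params := "https://internal.example.com/model-parameters.json"
	syncSeconds := "3600"
	sources := []overrideSource{
		{name: "pricing_url", value: nil}, // unresolved
		{name: "model_parameters_url", value: &params},
		{name: "sync_seconds", value: &syncSeconds},
	}
	h, ok := combineHashes(sources)
	fmt.Println(ok, h)
}
```

With this shape, a caller could compare the combined digest against the stored one and set a change flag even when the pricing URL alone fails to resolve.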

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 16ddcece-63dd-4166-ab56-de9c485ab18b

📥 Commits

Reviewing files that changed from the base of the PR and between 607dd80 and 9fc1eb9.

📒 Files selected for processing (5)

core/changelog.md
framework/modelcatalog/config.go
framework/modelcatalog/main.go
transports/bifrost-http/lib/config.go
transports/bifrost-http/lib/config_test.go

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@transports/bifrost-http/lib/config.go`:
- Around line 3003-3006: The code sets defaultModelParametersURL from
modelcatalog.DefaultModelParametersURL and then overrides it with the env var
(modelcatalog.ModelParametersURLEnvVar) and later persists that resolved value
into the DB, which makes env-backed overrides "sticky" across redeploys because
dbConfig.* wins in Phase 3; change the logic so env-derived values are not
persisted as the canonical DB default or are treated as a refresh: either (A)
keep env overrides only in-memory and do not write them into the persisted
defaults, or (B) when loading persisted dbConfig.* compare the current env value
(modelcatalog.ModelParametersURLEnvVar and the analogous BIFROST_PRICING env
var) to the stored value and, if they differ, treat that as a refresh condition
that replaces the DB value before Phase 3 precedence is applied; update the code
paths that compute defaultModelParametersURL and the similar blocks (lines
around the other two occurrences) to implement one of these approaches so
Docker/Helm env overrides are honored on redeploy.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ddfefebe-67f2-45c0-81c9-459222a34b59

📥 Commits

Reviewing files that changed from the base of the PR and between 9fc1eb9 and 7f5b84e.

📒 Files selected for processing (4)

framework/modelcatalog/config.go
framework/modelcatalog/main.go
transports/bifrost-http/lib/config.go
transports/bifrost-http/lib/config_test.go

✅ Files skipped from review due to trivial changes (1)

framework/modelcatalog/config.go

coderabbitai · 2026-05-21T07:13:07Z

 	defaultModelParametersURL := modelcatalog.DefaultModelParametersURL
+	if value := strings.TrimSpace(os.Getenv(modelcatalog.ModelParametersURLEnvVar)); value != "" {
+		defaultModelParametersURL = value
+	}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Env-backed URL overrides become sticky after the first DB backfill.

BIFROST_PRICING_URL / BIFROST_MODEL_PARAMETERS_URL are only folded into the default layer here, and the resolved values are then persisted. After the first boot with either env var set, a later redeploy with a different env value keeps using the stale DB row because Phase 3 still lets dbConfig.* win. That defeats the deployment-level override this PR is adding for Docker/Helm/air-gapped installs.

Please either avoid persisting env-derived defaults, or treat a current env override that differs from the stored row as a refresh condition before applying DB precedence.

Also applies to: 3119-3138, 3177-3197

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@transports/bifrost-http/lib/config.go` around lines 3003 - 3006, The code sets defaultModelParametersURL from modelcatalog.DefaultModelParametersURL and then overrides it with the env var (modelcatalog.ModelParametersURLEnvVar) and later persists that resolved value into the DB, which makes env-backed overrides "sticky" across redeploys because dbConfig.* wins in Phase 3; change the logic so env-derived values are not persisted as the canonical DB default or are treated as a refresh: either (A) keep env overrides only in-memory and do not write them into the persisted defaults, or (B) when loading persisted dbConfig.* compare the current env value (modelcatalog.ModelParametersURLEnvVar and the analogous BIFROST_PRICING env var) to the stored value and, if they differ, treat that as a refresh condition that replaces the DB value before Phase 3 precedence is applied; update the code paths that compute defaultModelParametersURL and the similar blocks (lines around the other two occurrences) to implement one of these approaches so Docker/Helm env overrides are honored on redeploy.

greptile-apps · 2026-05-21T07:14:17Z

Want your agent to iterate on Greptile's feedback? Try greploops.

BearTS · 2026-05-21T08:23:50Z

Hey @dsherniiazov, we already support overriding the pricing URL and the model parameters URL through the config JSON.
I recently fixed a related bug in the config hash mechanism in #3610.

impoiler and others added 30 commits May 15, 2026 11:29

harness improvements (maximhq#3457)

cda4391

Preserve Anthropic output schema refs (maximhq#3449)

17e45f5

feat: use the new parameter json schema compliant to json schema spec (maximhq#3444)

c66bc56

roroghost17 and others added 9 commits May 19, 2026 15:01

Merge branch 'dev' into fix/3238-custom-model-parameters

607dd80

dsherniiazov dismissed coderabbitai[bot]’s stale review via 607dd80 May 19, 2026 18:08

coderabbitai Bot previously approved these changes May 19, 2026

akshaydeo force-pushed the dev branch from f036e5f to 5c46934 Compare May 20, 2026 10:00

dsherniiazov force-pushed the fix/3238-custom-model-parameters branch from a23c861 to bf3a8e2 Compare May 20, 2026 13:56

greptile-apps Bot reviewed May 20, 2026

Comment thread transports/bifrost-http/lib/config.go Outdated

coderabbitai Bot previously approved these changes May 20, 2026

dsherniiazov dismissed coderabbitai[bot]’s stale review via 9fc1eb9 May 20, 2026 14:19

dsherniiazov force-pushed the fix/3238-custom-model-parameters branch from bf3a8e2 to 9fc1eb9 Compare May 20, 2026 14:19

coderabbitai Bot requested changes May 20, 2026

Comment thread transports/bifrost-http/lib/config.go Outdated

Merge upstream dev into custom model parameters fix

6f32c34

dsherniiazov force-pushed the fix/3238-custom-model-parameters branch from 9fc1eb9 to 6f32c34 Compare May 20, 2026 14:35

test: isolate pricing URL env fallback cases

1a3ce06

coderabbitai Bot previously approved these changes May 20, 2026

Merge branch 'dev' into fix/3238-custom-model-parameters

7f5b84e

dsherniiazov dismissed coderabbitai[bot]’s stale review via 7f5b84e May 21, 2026 07:08

coderabbitai Bot requested a review from roroghost17 May 21, 2026 07:10

coderabbitai Bot requested changes May 21, 2026

dsherniiazov closed this May 21, 2026

[fix]: allow model catalog URLs (pricing + parameters) to be overridden via env vars #3521
dsherniiazov wants to merge 194 commits into maximhq:dev from dsherniiazov:fix/3238-custom-model-parameters

18 participants
