Skip to content

feat(bench): inline run-time settings modal on bench tabs#1187

Open
jasonpaulso wants to merge 2 commits into
jundot:mainfrom
jasonpaulso:pr/bench-inline-settings
Open

feat(bench): inline run-time settings modal on bench tabs#1187
jasonpaulso wants to merge 2 commits into
jundot:mainfrom
jasonpaulso:pr/bench-inline-settings

Conversation

@jasonpaulso
Copy link
Copy Markdown
Contributor

Summary

Add a per-tab run-time settings modal to the Benchmark and Accuracy Benchmark tabs. Each tab can now adjust the full set of model settings (sampling, MTP, dflash, specprefill, turboquant, VLM MTP, etc.) for the next run without permanently mutating the model's saved settings — useful when iterating on bench configurations.

The current dashboard requires opening the model-settings modal (which writes back to disk on save), making it awkward to do quick what-if comparisons on bench runs. This change makes those overrides ephemeral by default with an explicit Save / Save as Profile / Reset workflow.

What changed

  • New shared macro dashboard/_settings_fields.html — the Basic + Advanced settings form, parameterized on a state object name (modelSettings, benchSettings, or accSettings). Replaces the inline form in _modal_model_settings.html.
  • New bench-tab modal dashboard/_modal_bench_settings.html — wraps the macro, adds bench-specific actions (Save, Save as Profile, Run, Reset) and a profile-creation form. Bound to a per-tab state object so the bench and accuracy tabs maintain independent overrides.
  • Generic panel ops in dashboard.js_modelToSettingsState / _settingsStateToPayload / _emptySettingsState / _ENGINE_INIT_KEYS / _panelHydrate / _panelDirty / _panelRequiresReload / _panelReset / _panelSave / _panelSaveAsProfile. Bench and accuracy tabs expose thin wrappers (benchSettingsSave(), accSettingsReset(), etc.) so templates remain readable.
  • Bench/accuracy backends (omlx/admin/benchmark.py, omlx/admin/accuracy_benchmark.py, omlx/model_settings.py) — accept an optional settings_override payload on the run request and apply it ephemerally for that run. Persisted model settings are unchanged unless the user explicitly saves.
  • Tests — added coverage in tests/test_benchmark.py, tests/test_accuracy_benchmark.py, tests/test_model_settings.py for the override path and ephemeral application.

Field coverage

The shared macro covers every field the model-settings modal already exposes, including the recently-added upstream additions:

_ENGINE_INIT_KEYS is updated so the "requires reload" hint surfaces correctly when any of those new fields change in a bench panel.

Test plan

  • Bench tab → open settings modal → all fields render and bind to benchSettings.
  • Accuracy bench tab → same flow with accSettings, independent of bench tab state.
  • Edit a setting in the bench panel, run benchmark → run uses the override; saved model settings unchanged.
  • Save writes the override back to the model's persisted settings.
  • Save as Profile creates a named profile from the current panel state.
  • Reset restores the last hydrated baseline.
  • Original model-settings modal in the admin tab still renders and saves correctly (no regression from macro extraction).
  • Profile pills row renders on bench panels and applies on click.
  • pytest tests/test_model_settings.py tests/test_benchmark.py tests/test_accuracy_benchmark.py — 113 passed (1 pre-existing failure on main unrelated to this PR).

Notes

  • The macro lives in a separate file (_settings_fields.html) so future model-settings additions only need to be made in one place to flow to all three contexts (modal + 2 bench panels).
  • Bench-panel save reuses the existing PUT /admin/api/models/{id}/settings endpoint; there's no new persistence path.

Copilot AI review requested due to automatic review settings May 11, 2026 18:25
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds per-tab “Run-time Settings” modals for the Benchmark and Accuracy Benchmark tabs, enabling per-run (ephemeral) ModelSettings overrides without mutating persisted model_settings.json unless the user explicitly saves.

Changes:

  • Refactors the model settings form into a shared Jinja macro and introduces new bench/accuracy run-time settings modals bound to per-tab Alpine state.
  • Adds a backend “ephemeral overrides” layer in ModelSettingsManager and plumbs settings_override through benchmark/accuracy run requests for per-run application.
  • Updates dashboard JS to hydrate/compare/save per-tab settings state, and adds tests for override behavior and request schema.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
tests/test_model_settings.py Adds coverage for ModelSettingsManager.ephemeral_overrides behavior (stacking, unknown keys, reversion).
tests/test_benchmark.py Verifies BenchmarkRequest.settings_override defaults/accepts dict.
tests/test_accuracy_benchmark.py Verifies AccuracyBenchmarkRequest.settings_override defaults/accepts dict.
omlx/model_settings.py Implements ephemeral override stacking and merge-on-read in get_settings().
omlx/admin/templates/dashboard/_settings_fields.html New shared macro for rendering settings fields in modal/panel contexts.
omlx/admin/templates/dashboard/_modal_model_settings.html Replaces large inline form with shared settings_fields macro.
omlx/admin/templates/dashboard/_modal_bench_settings.html New run-time settings modal macro for bench/accuracy tabs (Save / Save as Profile / Run / Reset).
omlx/admin/templates/dashboard/_bench.html Adds “Run-time settings” button and modified badge for throughput bench tab.
omlx/admin/templates/dashboard/_bench_accuracy.html Adds “Run-time settings” button and modified badge for accuracy bench tab.
omlx/admin/templates/dashboard.html Includes one run-time settings modal instance per bench tab.
omlx/admin/templates/base.html Replaces lucide polling with a MutationObserver-based icon replacement pass.
omlx/admin/static/js/dashboard.js Adds per-tab settings state, hydrate/dirty/save/apply ops, and sends settings_override on run requests.
omlx/admin/benchmark.py Accepts settings_override and applies it ephemerally for the duration of the run.
omlx/admin/accuracy_benchmark.py Accepts settings_override and applies it ephemerally for the duration of the run.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread omlx/admin/static/js/dashboard.js Outdated
Comment thread omlx/admin/static/js/dashboard.js Outdated
Comment on lines +117 to +123
<label class="block text-xs font-bold uppercase tracking-wider text-neutral-500 mb-2">{{ t('modal.model_settings.ttl_seconds') }}</label>
<input type="number" x-model.number="{{ state }}.ttl_seconds"
{% if mode == 'modal' %}:placeholder="ttlPlaceholder"{% else %}placeholder="default"{% endif %}
:disabled="{{ selected_model }}?.pinned"
min="0" step="60"
:class="{{ selected_model }}?.pinned ? 'bg-neutral-100 text-neutral-400 cursor-not-allowed' : ''"
class="w-full px-4 py-2.5 border border-neutral-200 rounded-xl text-sm focus:ring-2 focus:ring-neutral-900 focus:border-transparent transition-all">
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Comment thread omlx/admin/static/js/dashboard.js Outdated
jasonpaulso added a commit to jasonpaulso/omlx-plus that referenced this pull request May 11, 2026
Four issues called out in review:

1. **Panel apply helpers persisted to the wrong model and hit the
   backend.** `_panelApplyProfile` / `_panelApplyTemplate` reused
   `applyProfileToForm` / `applyTemplateToForm`, which both POST to
   `/admin/api/models/${selectedModel.id}/...` and update server-side
   `active_profile_name`. In the bench/accuracy panels `selectedModel`
   may be unset or differ from the panel's `benchModelId` / `accModelId`,
   so the helpers (a) could target the wrong model, and (b) broke the
   ephemeral-by-default contract for bench panels.

   Fix: extract pure `_mergeProfileSettingsIntoState`,
   `_mergePresetSettingsIntoState`, and `_resetStatePresetFields` helpers
   that operate on any target state object without touching
   `selectedModel`/`activeProfileName` or calling the backend. The modal
   wrappers (`applyProfileToForm`, `applyPresetToForm`) now use these
   helpers and keep their own POST/active-profile bookkeeping. The
   panel wrappers (`_panelApplyProfile`, `_panelApplyTemplate`,
   `_panelApplyPreset`) use the helpers directly — no fetch, no
   `selectedModel` dependency, panel state only.

   `_panelApplyTemplate` previously also created a server-side profile
   when the template's name was new; that side effect is dropped too.
   Templates carry the same shape as profiles, so we merge them into
   panel state directly.

2. **`ttl_seconds || null` collapsed explicit 0.** The TTL field accepts
   0 (meaning "no TTL"), but `s.ttl_seconds || null` mapped 0 → null,
   making it impossible to send an explicit 0 via `settings_override`
   or the bench panel's Save. Use `Number.isFinite(s.ttl_seconds) ?
   s.ttl_seconds : null` to mirror the temperature/top_p/top_k pattern
   already used elsewhere in `_settingsStateToPayload`.

3. **TTL placeholder hard-coded `"default"` in panel mode.** The
   modal already has a localized `ttlPlaceholder` getter that produces
   the right hint (`ttl_pinned` / `ttl_global_fallback` / `ttl_no_ttl`).
   The macro now binds `:placeholder="ttlPlaceholder"` unconditionally
   so the bench/accuracy panels render the same localized hint. The
   getter degrades cleanly when `selectedModel` is unset (the `pinned`
   branch is gated by an optional-chain).

4. **`_panelHydrate` silently swallowed non-OK profile fetches.**
   Other dashboard fetches redirect to `/admin` on 401 so users get a
   visible session-expired prompt; the profile fetch did not. It also
   discarded all other errors. Now: 401 redirects, other non-OK
   responses surface via `settingsStatus`, and network errors are
   both logged and surfaced.

Verified: `node --check` clean, `dashboard.html` renders (3
`:placeholder="ttlPlaceholder"` bindings, 0 `placeholder="default"`
residues), `tests/test_benchmark.py tests/test_accuracy_benchmark.py
tests/test_model_settings.py` — 113 passed.
jasonpaulso added a commit to jasonpaulso/omlx-plus that referenced this pull request May 11, 2026
Four issues called out in review:

1. **Panel apply helpers persisted to the wrong model and hit the
   backend.** `_panelApplyProfile` / `_panelApplyTemplate` reused
   `applyProfileToForm` / `applyTemplateToForm`, which both POST to
   `/admin/api/models/${selectedModel.id}/...` and update server-side
   `active_profile_name`. In the bench/accuracy panels `selectedModel`
   may be unset or differ from the panel's `benchModelId` / `accModelId`,
   so the helpers (a) could target the wrong model, and (b) broke the
   ephemeral-by-default contract for bench panels.

   Fix: extract pure `_mergeProfileSettingsIntoState`,
   `_mergePresetSettingsIntoState`, and `_resetStatePresetFields` helpers
   that operate on any target state object without touching
   `selectedModel`/`activeProfileName` or calling the backend. The modal
   wrappers (`applyProfileToForm`, `applyPresetToForm`) now use these
   helpers and keep their own POST/active-profile bookkeeping. The
   panel wrappers (`_panelApplyProfile`, `_panelApplyTemplate`,
   `_panelApplyPreset`) use the helpers directly — no fetch, no
   `selectedModel` dependency, panel state only.

   `_panelApplyTemplate` previously also created a server-side profile
   when the template's name was new; that side effect is dropped too.
   Templates carry the same shape as profiles, so we merge them into
   panel state directly.

2. **`ttl_seconds || null` collapsed explicit 0.** The TTL field accepts
   0 (meaning "no TTL"), but `s.ttl_seconds || null` mapped 0 → null,
   making it impossible to send an explicit 0 via `settings_override`
   or the bench panel's Save. Use `Number.isFinite(s.ttl_seconds) ?
   s.ttl_seconds : null` to mirror the temperature/top_p/top_k pattern
   already used elsewhere in `_settingsStateToPayload`.

3. **TTL placeholder hard-coded `"default"` in panel mode.** The
   modal already has a localized `ttlPlaceholder` getter that produces
   the right hint (`ttl_pinned` / `ttl_global_fallback` / `ttl_no_ttl`).
   The macro now binds `:placeholder="ttlPlaceholder"` unconditionally
   so the bench/accuracy panels render the same localized hint. The
   getter degrades cleanly when `selectedModel` is unset (the `pinned`
   branch is gated by an optional-chain).

4. **`_panelHydrate` silently swallowed non-OK profile fetches.**
   Other dashboard fetches redirect to `/admin` on 401 so users get a
   visible session-expired prompt; the profile fetch did not. It also
   discarded all other errors. Now: 401 redirects, other non-OK
   responses surface via `settingsStatus`, and network errors are
   both logged and surfaced.

Verified: `node --check` clean, `dashboard.html` renders (3
`:placeholder="ttlPlaceholder"` bindings, 0 `placeholder="default"`
residues), `tests/test_benchmark.py tests/test_accuracy_benchmark.py
tests/test_model_settings.py` — 113 passed.
@jasonpaulso jasonpaulso force-pushed the pr/bench-inline-settings branch from 3b2e672 to 43b09ad Compare May 11, 2026 18:48
Run-time settings now live in a per-tab modal triggered from the
Configuration card header on Throughput and Intelligence. Settings are
ephemeral by default — applied only to the next bench run via a new
SettingsManager.ephemeral_overrides() context manager that stacks
overrides on top of the persisted layer. Both runners accept an optional
settings_override field on the request body and wrap their model load +
run in the override context, covering engine-init flags and
sampling-class settings (bench reloads the model each run).

Inside the modal: profile / global-template / preset pills scoped to the
selected model, Reset (header), Save as Profile, Save to model settings
(with dynamic save / saving / saved icon states), and a Run-benchmark
shortcut that fires the run and dismisses the modal. Settings field
markup is extracted to a shared Jinja macro (_settings_fields.html) so
the bench-settings and model-settings modals stay in lockstep.

base.html: Lucide processor is now resilient — per-icon try/catch (one
bad icon no longer kills the whole pass, which was silently dropping
icons later in DOM order on the accuracy tab) and a MutationObserver
replaces the 300ms poll, catching dynamically revealed icons
immediately.

Tests: 13 ephemeral-override unit tests covering nested / concurrent /
exception / mutual-exclusion paths, plus schema tests for both bench
endpoints.
Four issues called out in review:

1. **Panel apply helpers persisted to the wrong model and hit the
   backend.** `_panelApplyProfile` / `_panelApplyTemplate` reused
   `applyProfileToForm` / `applyTemplateToForm`, which both POST to
   `/admin/api/models/${selectedModel.id}/...` and update server-side
   `active_profile_name`. In the bench/accuracy panels `selectedModel`
   may be unset or differ from the panel's `benchModelId` / `accModelId`,
   so the helpers (a) could target the wrong model, and (b) broke the
   ephemeral-by-default contract for bench panels.

   Fix: extract pure `_mergeProfileSettingsIntoState`,
   `_mergePresetSettingsIntoState`, and `_resetStatePresetFields` helpers
   that operate on any target state object without touching
   `selectedModel`/`activeProfileName` or calling the backend. The modal
   wrappers (`applyProfileToForm`, `applyPresetToForm`) now use these
   helpers and keep their own POST/active-profile bookkeeping. The
   panel wrappers (`_panelApplyProfile`, `_panelApplyTemplate`,
   `_panelApplyPreset`) use the helpers directly — no fetch, no
   `selectedModel` dependency, panel state only.

   `_panelApplyTemplate` previously also created a server-side profile
   when the template's name was new; that side effect is dropped too.
   Templates carry the same shape as profiles, so we merge them into
   panel state directly.

2. **`ttl_seconds || null` collapsed explicit 0.** The TTL field accepts
   0 (meaning "no TTL"), but `s.ttl_seconds || null` mapped 0 → null,
   making it impossible to send an explicit 0 via `settings_override`
   or the bench panel's Save. Use `Number.isFinite(s.ttl_seconds) ?
   s.ttl_seconds : null` to mirror the temperature/top_p/top_k pattern
   already used elsewhere in `_settingsStateToPayload`.

3. **TTL placeholder hard-coded `"default"` in panel mode.** The
   modal already has a localized `ttlPlaceholder` getter that produces
   the right hint (`ttl_pinned` / `ttl_global_fallback` / `ttl_no_ttl`).
   The macro now binds `:placeholder="ttlPlaceholder"` unconditionally
   so the bench/accuracy panels render the same localized hint. The
   getter degrades cleanly when `selectedModel` is unset (the `pinned`
   branch is gated by an optional-chain).

4. **`_panelHydrate` silently swallowed non-OK profile fetches.**
   Other dashboard fetches redirect to `/admin` on 401 so users get a
   visible session-expired prompt; the profile fetch did not. It also
   discarded all other errors. Now: 401 redirects, other non-OK
   responses surface via `settingsStatus`, and network errors are
   both logged and surfaced.

Verified: `node --check` clean, `dashboard.html` renders (3
`:placeholder="ttlPlaceholder"` bindings, 0 `placeholder="default"`
residues), `tests/test_benchmark.py tests/test_accuracy_benchmark.py
tests/test_model_settings.py` — 113 passed.
@jasonpaulso jasonpaulso force-pushed the pr/bench-inline-settings branch from 43b09ad to ec4cb0f Compare May 19, 2026 15:42
jasonpaulso added a commit to jasonpaulso/omlx-plus that referenced this pull request May 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants