v0.50.258 — login stability batch (#1419) + redirect-encoding Opus follow-up by nesquena-hermes · Pull Request #1422 · nesquena/hermes-webui

nesquena-hermes · 2026-05-01T21:30:44Z

v0.50.258 Batch Release — login stability

Summary

Small focused batch on top of v0.50.257 — 1 PR plus 1 Opus pre-release follow-up that fixes a real multi-param query truncation bug.

Constituent PR

PR	Author	Summary	Size
#1419	@bsgdigital	Login stability: 30-day session TTL, redirect-back after auth, /health connectivity probe with auto-retry	+48/-2 (2f)

Three independent fixes for users on flaky networks (VPN, Tailscale):

SESSION_TTL 24h → 30d so users aren't kicked out daily
?next=<original-path> redirect-back after session expiry
/health probe distinguishes "session expired" from "can't reach server" + auto-recovers

Pre-applied Opus follow-up

Multi-param redirect URL encoding bug — the original PR #1419 implementation built the outer ?next= parameter via quote(path, safe='/:@!$&\'()*+,;=') which kept ? and & literal. Two real problems:

(a) Multi-param query truncation — /api/sessions?limit=50&offset=0 round-tripped as /api/sessions?limit=50 because the inner & terminated the outer next value and offset=0 leaked out as a top-level query parameter the login page ignored.

(b) Parser-divergence footgun — an attacker-controlled path with embedded &next=https://evil.com injected a second top-level next parameter. Browsers' URLSearchParams.get() returns first-match (benign), Python's parse_qs returns last-match (the evil URL). The downstream _safeNextPath() already rejects non-/ prefixes so the actual exploit is closed, but the parser divergence is a footgun.

Fix: encode the entire path?query blob with safe='/' so ?, &, = all percent-encode. The outer next then holds exactly one path-with-query string the browser auto-decodes once. 6 regression tests pin the round-trip behavior across all 4 problematic shapes.

Tests

3610 passed, 0 failed (master 3604, +6 new)
Browser tests + Phase 2 API sanity → ALL CHECKS PASSED
CI: 3.11/3.12/3.13 all green

Independent review

Pinging @nesquena. Especially keen for eyes on:

The encoding fix shape — safe='/' + path/query reconstruction is the cleanest version I could think of, but worth a sanity check that I haven't broken any subpath-mount edge cases
The _safeNextPath() open-redirect guard chain — verified the attacker-injection scenario can't escape same-origin even with the encoding bug, but worth one more pair of eyes
The 6 regression tests — they're behavioral round-trips, not just source-level pins, but the source-level pin (safe='/') is also there as belt-and-suspenders

What's NOT in this batch

feat(deepseek): add V4 Pro/Flash [1m] variants to model dropdown #1418 (DeepSeek 1M variants) — held: hard dependency on agent PR #18534 (still open). Without it, dropdown selections silently route through deepseek-chat. The PR body originally cited #18509 (wrong PR) — I posted a correction tagging the actual companion #18534.
All other open PRs are draft, hold-labeled, or awaiting contributor response

…r recovers When the server is unreachable (VPN/Tailscale off), the login page now polls /health every 3 seconds instead of failing silently. Once the server becomes reachable, the page reloads automatically so the user doesn't have to manually refresh.

…onnectivity probe

…HANGELOG PR #1419 (login session TTL + redirect-back + connectivity probe) had a real bug in the server-side ?next= construction: quote(path, safe='/:@!$&'()*+,;=') keeps ? and & literal, so: (a) /api/sessions?limit=50&offset=0 round-trips as /api/sessions?limit=50 — the inner & terminates the outer next= value and offset=0 leaks as a top-level outer query the login page ignores. (b) An attacker-controlled path with embedded &next=https://evil.com injects a second top-level next parameter. Browsers parse first-match (benign), Python parse_qs parses last-match (the evil URL) — the parser-divergence is a footgun even though _safeNextPath() in login.js rejects the actual exploit. Fix: encode the entire path?query blob with safe='/' so ?, &, = all percent-encode. The outer next then holds exactly one path-with-query string the browser auto-decodes once. 6 regression tests in test_v050258_opus_followups.py pin round-trip behavior across simple paths, single-query, multi-param queries, attacker-injection neutralization, and the SESSION_TTL=30d constant. Full suite: 3610 passed, 0 failed.

nesquena

Review — end-to-end ✅ (clean approve, encoding fix verified, behavioral harness across 12 scenarios)

Release v0.50.258 — small focused login-stability batch. PR #1419 by @bsgdigital ships three independent fixes for flaky-network users; the bot's Opus advisor caught a real multi-param query truncation bug in the original ?next= encoding and fixed it before tag.

Squash audit

?  #1419  Login stability: 30-day TTL, ?next= redirect-back, /health probe   @bsgdigital
?         Opus follow-up: redirect URL encoding (safe='/') + 6 regression tests
?         release: v0.50.258 CHANGELOG

What this ships

(1) SESSION_TTL 24h → 30d at api/auth.py:27: users on flaky networks (VPN, Tailscale) no longer get kicked out daily. The cookie itself was already long-lived; this just stops the server-side session record from expiring.

(2) ?next=<original-path> redirect-back at api/auth.py:230-258: when an unauthenticated request hits a protected route, the redirect to /login now includes a next= parameter so _safeNextPath() in static/login.js can navigate back after successful auth.

(3) /health connectivity probe at static/login.js:74-109: on page load, fetch('health', ...) distinguishes "session expired" from "can't reach server". On unreachable: shows a clear error, disables the form, retries every 3s, and auto-reloads when the server is back. /health is verified public at api/auth.py:22 (PUBLIC_PATHS = frozenset({'/login', '/health', ...})).

Opus pre-release follow-up — encoding fix verified

The original PR #1419 encoded the next parameter with quote(path, safe='/:@!$&\'()*+,;=') (kept ? and & literal). Two real problems:

(a) Multi-param query truncation: /api/sessions?limit=50&offset=0 round-tripped as /api/sessions?limit=50 because the inner & terminated the outer next value. offset=0 leaked out as a top-level query parameter the login page ignored.

(b) Parser-divergence footgun: an attacker-controlled path embedding &next=https://evil.com injected a second top-level next parameter. Browsers' URLSearchParams.get() returns first-match (benign), Python's parse_qs returns last-match (the evil URL). The downstream _safeNextPath() rejects non-/ prefixes so the actual exploit is closed, but the parser divergence is a footgun.

Fix at api/auth.py:255-258 encodes the entire path?query blob with safe='/':

_path_with_query = parsed.path or '/'
if parsed.query:
    _path_with_query += '?' + parsed.query
_next = _urlparse.quote(_path_with_query, safe='/')
handler.send_header('Location', '/login?next=' + _next)

Behavioural harness — 12 round-trip scenarios

✅ '/foo'                              ''                           → '/foo'
✅ '/foo/bar'                          'a=1'                        → '/foo/bar?a=1'
✅ '/foo/bar'                          'a=1&b=2'                    → '/foo/bar?a=1&b=2'  ← FIXES regression
✅ '/admin'                            'action=foo&next=evil'       → '/admin?action=foo&next=evil'  ← FIXES injection
✅ '/foo with space'                   ''                           → '/foo with space'
✅ '/foo+bar'                          ''                           → '/foo+bar'
✅ '/foo%bar'                          ''                           → '/foo%bar'
✅ '/api'                              'q=hello world'              → '/api?q=hello world'
✅ '/path'                             'k=v=w'                      → '/path?k=v=w'  (equals in value)
✅ '//evil.com/x'                      ''                           → '//evil.com/x'  (rejected by _safeNextPath downstream)
✅ '/x'                                'a=&b=2'                     → '/x?a=&b=2'  (empty value)

All round-trips correct. The protocol-relative //evil.com/x IS preserved through encoding (correctly — encoding shouldn't be a security boundary), but _safeNextPath at static/login.js:31-33 rejects it via if (raw.charAt(1) === '/' || raw.charAt(1) === '\\') return './'. Defense in depth holds.

CRLF injection check

Verified via harness:

'/foo\r\nSet-Cookie: evil=1' → '/foo%0D%0ASet-Cookie%3A%20evil%3D1'
'/foo\nbar'                  → '/foo%0Abar'
'/foo\rbar'                  → '/foo%0Dbar'

quote() correctly percent-encodes \r and \n so an attacker can't inject HTTP headers via the redirect URL. ✅

`_safeNextPath` open-redirect chain

Verified the four guard layers at static/login.js:27-36:

if (!raw) return './' — no next → safe default.
if (raw.charAt(0) !== '/') return './' — must be path-absolute (rejects https://evil, //evil, data:..., javascript:...).
if (raw.charAt(1) === '/' || raw.charAt(1) === '\\') return './' — rejects protocol-relative //evil and Windows-style \\evil.
if (/[\x00-\x1f\x7f\s]/.test(raw)) return './' — rejects control chars and whitespace.

These four guards survive even if an attacker bypassed the encoding layer. ✅

Cross-tool consistency

✅ SESSION_TTL is webui-only auth state. CLI doesn't share session cookies. No cross-tool impact.
✅ /health endpoint is webui-only HTTP route. No CLI interaction.
✅ ?next= redirect is webui-only login flow. No config.yaml writes, no agent calls.

Race / lock analysis

SESSION_TTL change: constant change, no race. Sessions saved with the new TTL on next save; existing sessions in .sessions.json retain their old TTL until rewritten.
/health probe: single-threaded JS event loop. The retry interval is created once and cleared when the server comes back. No race between the catch handler and the success handler — they're mutually exclusive within a single fetch promise.
check_auth redirect: urlparse and quote are pure functions. No mutation of shared state. ✅

Security audit

✅ Open redirect: 4-layer _safeNextPath defense unchanged. Encoding fix is orthogonal — it preserves the exact path string for the JS guards to inspect.
✅ Header injection: quote(safe='/') percent-encodes CR/LF so Location: /login?next=<path> cannot inject additional headers.
✅ Query parameter pollution: encoding fix means the outer URL has exactly one next= parameter regardless of attacker-controlled inner content.
✅ Form disable on unreachable server: prevents user from typing password into a form whose POST will fail. setFormDisabled(true) at login.js:100.
✅ /health is public: no auth or secrets leaked through the probe.
✅ Auto-reload on recovery: window.location.reload() is browser-native and same-origin.

Edge-case matrix

Scenario	Pre-fix	Post-fix
Session expired, user revisits `/api/sessions?limit=50&offset=0`	Login redirects, returns to `/api/sessions?limit=50` (offset eaten)	Returns to full URL ✅
Session expired, simple path `/foo`	Worked	Works ✅
Session expired, path with `?`	Worked partially	Works ✅
Attacker-injected `&next=https://evil.com` in inner query	Top-level next had two values; Python parsed last-match (evil)	One value preserved as part of inner string ✅
Path with CRLF (`/foo\r\nSet-Cookie: evil`)	Could potentially inject header	Percent-encoded as `%0D%0A` ✅
Protocol-relative path `//evil.com`	_safeNextPath rejected	Still rejected ✅
Path with control chars / whitespace	_safeNextPath rejected	Still rejected ✅
Tailscale dropped mid-page	User saw "Connection failed" / form silently broken	Clear "Cannot reach server" + auto-recover ✅
Server returns 500 on /health	n/a	Shows "(server error 500)" ✅
Session valid for > 24h	Kicked out	30-day TTL ✅
Subpath mount `/myapp/...` redirect	Worked	Works (relative `fetch('health')` resolves against base) ✅
Empty `?next=` parameter	_safeNextPath returns './'	Same ✅

Tests

test_v050258_opus_followups.py — 6/6 pass:
- test_login_redirect_uses_path_only_safe_encoding (source-level pin + negative-pattern guard)
- test_redirect_roundtrip_simple_path
- test_redirect_roundtrip_single_query_param
- test_redirect_roundtrip_multi_query_params (REGRESSION pin)
- test_redirect_roundtrip_attacker_controlled_next_injection_neutralized (REGRESSION pin)
- test_redirect_session_ttl_30_days
Full suite: 3558 passed, 54 skipped, 3 xpassed, 0 failed in 17.74s on stage-258.
CI: 3.11/3.12/3.13 all green.
Behavioural harness: 12 round-trip scenarios + 3 CRLF cases all match expected.

(PR description claims 3610; my count 3558. Counting drift consistent with prior batches.)

Other audit — what I checked

✅ /health is in PUBLIC_PATHS at api/auth.py:22.
✅ safe='/' correctly percent-encodes ?, &, = (verified via Python harness).
✅ setInterval retry loop correctly clears on success at login.js:90.
✅ fetch('health', { credentials: 'omit' }) doesn't send the auth cookie (which doesn't exist anyway since user is on login page).
✅ window.location.reload() preserves the original URL (with ?next= query) so the back-redirect works after auto-recovery.
✅ 6 regression tests include both source-level pins (negative-pattern guards prevent revert) and behavioral round-trips (verify the fix actually works at runtime).

Minor observations (non-blocking)

clearTimeout(retryTimer) at login.js:90 — the timer was created via setInterval at line 103, but cleared with clearTimeout. Per MDN, setTimeout and setInterval share the same pool of IDs in all major browsers, so this works in practice. Cosmetic nit; clearInterval would be more semantically accurate.
/health probe runs once on page load, not periodically. If the user sits on the login page and the server later becomes unreachable, they'd only learn about it when they tried to login (where they'd get the existing connFailed message). Acceptable; covering the "server goes down while user types password" edge would add complexity for marginal benefit.
setFormDisabled doesn't re-enable on success — but that's because we window.location.reload() immediately on success, so the disable state is irrelevant. Could be flagged as dead state but doesn't hurt.
SESSION_TTL = 86400 * 30 is a magic number. Could be SESSION_TTL = timedelta(days=30).total_seconds() for readability, but the inline arithmetic is fine.
Test count drift: 3558 local vs 3610 reported. Consistent with prior batches.

Recommendation

✅ Approved. The Opus follow-up correctly fixes a real multi-param query truncation bug AND closes a parser-divergence footgun before tag. Encoding shape (safe='/') verified via 12-case round-trip harness; CRLF injection blocked; _safeNextPath 4-layer guard chain unchanged. The /health probe is correctly public and gives users actionable error messages. The 30-day TTL is a sensible UX improvement for VPN/Tailscale users.

Parked at approval — ready for the release agent's merge/tag pipeline.

bsgdigital and others added 5 commits May 1, 2026 19:54

fix(auth): extend session TTL to 30 days + redirect back after login

9c0667d

fix(login): probe /health on load, show VPN error if unreachable

af3d26f

Merge PR #1419 from bsgdigital: login session TTL + redirect-back + c…

ba33dbd

…onnectivity probe

nesquena approved these changes May 1, 2026

View reviewed changes

nesquena-hermes merged commit 2ae07ba into master May 1, 2026
3 checks passed

nesquena-hermes deleted the stage-258 branch May 1, 2026 22:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v0.50.258 — login stability batch (#1419) + redirect-encoding Opus follow-up#1422

v0.50.258 — login stability batch (#1419) + redirect-encoding Opus follow-up#1422
nesquena-hermes merged 5 commits into
masterfrom
stage-258

nesquena-hermes commented May 1, 2026

Uh oh!

nesquena left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

nesquena-hermes commented May 1, 2026

v0.50.258 Batch Release — login stability

Summary

Constituent PR

Pre-applied Opus follow-up

Tests

Independent review

What's NOT in this batch

Uh oh!

nesquena left a comment

Choose a reason for hiding this comment

Review — end-to-end ✅ (clean approve, encoding fix verified, behavioral harness across 12 scenarios)

Squash audit

What this ships

Opus pre-release follow-up — encoding fix verified

Behavioural harness — 12 round-trip scenarios

CRLF injection check

_safeNextPath open-redirect chain

Cross-tool consistency

Race / lock analysis

Security audit

Edge-case matrix

Tests

Other audit — what I checked

Minor observations (non-blocking)

Recommendation

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

`_safeNextPath` open-redirect chain