v0.50.258 — login stability batch (#1419) + redirect-encoding Opus follow-up#1422
Conversation
…r recovers When the server is unreachable (VPN/Tailscale off), the login page now polls /health every 3 seconds instead of failing silently. Once the server becomes reachable, the page reloads automatically so the user doesn't have to manually refresh.
…onnectivity probe
…HANGELOG PR #1419 (login session TTL + redirect-back + connectivity probe) had a real bug in the server-side ?next= construction: quote(path, safe='/:@!$&'()*+,;=') keeps ? and & literal, so: (a) /api/sessions?limit=50&offset=0 round-trips as /api/sessions?limit=50 — the inner & terminates the outer next= value and offset=0 leaks as a top-level outer query the login page ignores. (b) An attacker-controlled path with embedded &next=https://evil.com injects a second top-level next parameter. Browsers parse first-match (benign), Python parse_qs parses last-match (the evil URL) — the parser-divergence is a footgun even though _safeNextPath() in login.js rejects the actual exploit. Fix: encode the entire path?query blob with safe='/' so ?, &, = all percent-encode. The outer next then holds exactly one path-with-query string the browser auto-decodes once. 6 regression tests in test_v050258_opus_followups.py pin round-trip behavior across simple paths, single-query, multi-param queries, attacker-injection neutralization, and the SESSION_TTL=30d constant. Full suite: 3610 passed, 0 failed.
nesquena
left a comment
There was a problem hiding this comment.
Review — end-to-end ✅ (clean approve, encoding fix verified, behavioral harness across 12 scenarios)
Release v0.50.258 — small focused login-stability batch. PR #1419 by @bsgdigital ships three independent fixes for flaky-network users; the bot's Opus advisor caught a real multi-param query truncation bug in the original ?next= encoding and fixed it before tag.
Squash audit
? #1419 Login stability: 30-day TTL, ?next= redirect-back, /health probe @bsgdigital
? Opus follow-up: redirect URL encoding (safe='/') + 6 regression tests
? release: v0.50.258 CHANGELOG
What this ships
(1) SESSION_TTL 24h → 30d at api/auth.py:27: users on flaky networks (VPN, Tailscale) no longer get kicked out daily. The cookie itself was already long-lived; this just stops the server-side session record from expiring.
(2) ?next=<original-path> redirect-back at api/auth.py:230-258: when an unauthenticated request hits a protected route, the redirect to /login now includes a next= parameter so _safeNextPath() in static/login.js can navigate back after successful auth.
(3) /health connectivity probe at static/login.js:74-109: on page load, fetch('health', ...) distinguishes "session expired" from "can't reach server". On unreachable: shows a clear error, disables the form, retries every 3s, and auto-reloads when the server is back. /health is verified public at api/auth.py:22 (PUBLIC_PATHS = frozenset({'/login', '/health', ...})).
Opus pre-release follow-up — encoding fix verified
The original PR #1419 encoded the next parameter with quote(path, safe='/:@!$&\'()*+,;=') (kept ? and & literal). Two real problems:
(a) Multi-param query truncation: /api/sessions?limit=50&offset=0 round-tripped as /api/sessions?limit=50 because the inner & terminated the outer next value. offset=0 leaked out as a top-level query parameter the login page ignored.
(b) Parser-divergence footgun: an attacker-controlled path embedding &next=https://evil.com injected a second top-level next parameter. Browsers' URLSearchParams.get() returns first-match (benign), Python's parse_qs returns last-match (the evil URL). The downstream _safeNextPath() rejects non-/ prefixes so the actual exploit is closed, but the parser divergence is a footgun.
Fix at api/auth.py:255-258 encodes the entire path?query blob with safe='/':
_path_with_query = parsed.path or '/'
if parsed.query:
_path_with_query += '?' + parsed.query
_next = _urlparse.quote(_path_with_query, safe='/')
handler.send_header('Location', '/login?next=' + _next)Behavioural harness — 12 round-trip scenarios
✅ '/foo' '' → '/foo'
✅ '/foo/bar' 'a=1' → '/foo/bar?a=1'
✅ '/foo/bar' 'a=1&b=2' → '/foo/bar?a=1&b=2' ← FIXES regression
✅ '/admin' 'action=foo&next=evil' → '/admin?action=foo&next=evil' ← FIXES injection
✅ '/foo with space' '' → '/foo with space'
✅ '/foo+bar' '' → '/foo+bar'
✅ '/foo%bar' '' → '/foo%bar'
✅ '/api' 'q=hello world' → '/api?q=hello world'
✅ '/path' 'k=v=w' → '/path?k=v=w' (equals in value)
✅ '//evil.com/x' '' → '//evil.com/x' (rejected by _safeNextPath downstream)
✅ '/x' 'a=&b=2' → '/x?a=&b=2' (empty value)
All round-trips correct. The protocol-relative //evil.com/x IS preserved through encoding (correctly — encoding shouldn't be a security boundary), but _safeNextPath at static/login.js:31-33 rejects it via if (raw.charAt(1) === '/' || raw.charAt(1) === '\\') return './'. Defense in depth holds.
CRLF injection check
Verified via harness:
'/foo\r\nSet-Cookie: evil=1' → '/foo%0D%0ASet-Cookie%3A%20evil%3D1'
'/foo\nbar' → '/foo%0Abar'
'/foo\rbar' → '/foo%0Dbar'
quote() correctly percent-encodes \r and \n so an attacker can't inject HTTP headers via the redirect URL. ✅
_safeNextPath open-redirect chain
Verified the four guard layers at static/login.js:27-36:
if (!raw) return './'— nonext→ safe default.if (raw.charAt(0) !== '/') return './'— must be path-absolute (rejectshttps://evil,//evil,data:...,javascript:...).if (raw.charAt(1) === '/' || raw.charAt(1) === '\\') return './'— rejects protocol-relative//eviland Windows-style\\evil.if (/[\x00-\x1f\x7f\s]/.test(raw)) return './'— rejects control chars and whitespace.
These four guards survive even if an attacker bypassed the encoding layer. ✅
Cross-tool consistency
- ✅
SESSION_TTLis webui-only auth state. CLI doesn't share session cookies. No cross-tool impact. - ✅
/healthendpoint is webui-only HTTP route. No CLI interaction. - ✅
?next=redirect is webui-only login flow. Noconfig.yamlwrites, no agent calls.
Race / lock analysis
SESSION_TTLchange: constant change, no race. Sessions saved with the new TTL on next save; existing sessions in.sessions.jsonretain their old TTL until rewritten./healthprobe: single-threaded JS event loop. The retry interval is created once and cleared when the server comes back. No race between the catch handler and the success handler — they're mutually exclusive within a single fetch promise.check_authredirect:urlparseandquoteare pure functions. No mutation of shared state. ✅
Security audit
- ✅ Open redirect: 4-layer
_safeNextPathdefense unchanged. Encoding fix is orthogonal — it preserves the exact path string for the JS guards to inspect. - ✅ Header injection:
quote(safe='/')percent-encodes CR/LF soLocation: /login?next=<path>cannot inject additional headers. - ✅ Query parameter pollution: encoding fix means the outer URL has exactly one
next=parameter regardless of attacker-controlled inner content. - ✅ Form disable on unreachable server: prevents user from typing password into a form whose POST will fail.
setFormDisabled(true)at login.js:100. - ✅
/healthis public: no auth or secrets leaked through the probe. - ✅ Auto-reload on recovery:
window.location.reload()is browser-native and same-origin.
Edge-case matrix
| Scenario | Pre-fix | Post-fix |
|---|---|---|
Session expired, user revisits /api/sessions?limit=50&offset=0 |
Login redirects, returns to /api/sessions?limit=50 (offset eaten) |
Returns to full URL ✅ |
Session expired, simple path /foo |
Worked | Works ✅ |
Session expired, path with ? |
Worked partially | Works ✅ |
Attacker-injected &next=https://evil.com in inner query |
Top-level next had two values; Python parsed last-match (evil) | One value preserved as part of inner string ✅ |
Path with CRLF (/foo\r\nSet-Cookie: evil) |
Could potentially inject header | Percent-encoded as %0D%0A ✅ |
Protocol-relative path //evil.com |
_safeNextPath rejected | Still rejected ✅ |
| Path with control chars / whitespace | _safeNextPath rejected | Still rejected ✅ |
| Tailscale dropped mid-page | User saw "Connection failed" / form silently broken | Clear "Cannot reach server" + auto-recover ✅ |
| Server returns 500 on /health | n/a | Shows "(server error 500)" ✅ |
| Session valid for > 24h | Kicked out | 30-day TTL ✅ |
Subpath mount /myapp/... redirect |
Worked | Works (relative fetch('health') resolves against base) ✅ |
Empty ?next= parameter |
_safeNextPath returns './' | Same ✅ |
Tests
test_v050258_opus_followups.py— 6/6 pass:test_login_redirect_uses_path_only_safe_encoding(source-level pin + negative-pattern guard)test_redirect_roundtrip_simple_pathtest_redirect_roundtrip_single_query_paramtest_redirect_roundtrip_multi_query_params(REGRESSION pin)test_redirect_roundtrip_attacker_controlled_next_injection_neutralized(REGRESSION pin)test_redirect_session_ttl_30_days
- Full suite: 3558 passed, 54 skipped, 3 xpassed, 0 failed in 17.74s on stage-258.
- CI: 3.11/3.12/3.13 all green.
- Behavioural harness: 12 round-trip scenarios + 3 CRLF cases all match expected.
(PR description claims 3610; my count 3558. Counting drift consistent with prior batches.)
Other audit — what I checked
- ✅
/healthis in PUBLIC_PATHS at api/auth.py:22. - ✅
safe='/'correctly percent-encodes?,&,=(verified via Python harness). - ✅
setIntervalretry loop correctly clears on success at login.js:90. - ✅
fetch('health', { credentials: 'omit' })doesn't send the auth cookie (which doesn't exist anyway since user is on login page). - ✅
window.location.reload()preserves the original URL (with?next=query) so the back-redirect works after auto-recovery. - ✅ 6 regression tests include both source-level pins (negative-pattern guards prevent revert) and behavioral round-trips (verify the fix actually works at runtime).
Minor observations (non-blocking)
clearTimeout(retryTimer)at login.js:90 — the timer was created viasetIntervalat line 103, but cleared withclearTimeout. Per MDN,setTimeoutandsetIntervalshare the same pool of IDs in all major browsers, so this works in practice. Cosmetic nit;clearIntervalwould be more semantically accurate./healthprobe runs once on page load, not periodically. If the user sits on the login page and the server later becomes unreachable, they'd only learn about it when they tried to login (where they'd get the existingconnFailedmessage). Acceptable; covering the "server goes down while user types password" edge would add complexity for marginal benefit.setFormDisableddoesn't re-enable on success — but that's because wewindow.location.reload()immediately on success, so the disable state is irrelevant. Could be flagged as dead state but doesn't hurt.SESSION_TTL = 86400 * 30is a magic number. Could beSESSION_TTL = timedelta(days=30).total_seconds()for readability, but the inline arithmetic is fine.- Test count drift: 3558 local vs 3610 reported. Consistent with prior batches.
Recommendation
✅ Approved. The Opus follow-up correctly fixes a real multi-param query truncation bug AND closes a parser-divergence footgun before tag. Encoding shape (safe='/') verified via 12-case round-trip harness; CRLF injection blocked; _safeNextPath 4-layer guard chain unchanged. The /health probe is correctly public and gives users actionable error messages. The 30-day TTL is a sensible UX improvement for VPN/Tailscale users.
Parked at approval — ready for the release agent's merge/tag pipeline.
v0.50.258 Batch Release — login stability
Summary
Small focused batch on top of v0.50.257 — 1 PR plus 1 Opus pre-release follow-up that fixes a real multi-param query truncation bug.
Constituent PR
Three independent fixes for users on flaky networks (VPN, Tailscale):
SESSION_TTL24h → 30d so users aren't kicked out daily?next=<original-path>redirect-back after session expiry/healthprobe distinguishes "session expired" from "can't reach server" + auto-recoversPre-applied Opus follow-up
Multi-param redirect URL encoding bug — the original PR #1419 implementation built the outer
?next=parameter viaquote(path, safe='/:@!$&\'()*+,;=')which kept?and&literal. Two real problems:(a) Multi-param query truncation —
/api/sessions?limit=50&offset=0round-tripped as/api/sessions?limit=50because the inner&terminated the outernextvalue andoffset=0leaked out as a top-level query parameter the login page ignored.(b) Parser-divergence footgun — an attacker-controlled path with embedded
&next=https://evil.cominjected a second top-levelnextparameter. Browsers'URLSearchParams.get()returns first-match (benign), Python'sparse_qsreturns last-match (the evil URL). The downstream_safeNextPath()already rejects non-/prefixes so the actual exploit is closed, but the parser divergence is a footgun.Fix: encode the entire
path?queryblob withsafe='/'so?,&,=all percent-encode. The outernextthen holds exactly one path-with-query string the browser auto-decodes once. 6 regression tests pin the round-trip behavior across all 4 problematic shapes.Tests
Independent review
Pinging @nesquena. Especially keen for eyes on:
safe='/'+ path/query reconstruction is the cleanest version I could think of, but worth a sanity check that I haven't broken any subpath-mount edge cases_safeNextPath()open-redirect guard chain — verified the attacker-injection scenario can't escape same-origin even with the encoding bug, but worth one more pair of eyessafe='/') is also there as belt-and-suspendersWhat's NOT in this batch
deepseek-chat. The PR body originally cited #18509 (wrong PR) — I posted a correction tagging the actual companion #18534.