Skip to content

fix(upload): archive extraction respects HERMES_WEBUI_ATTACHMENT_DIR#2520

Merged
2 commits merged into
nesquena:masterfrom
OneFat3:fix/archive-extract-respects-attachment-dir
May 18, 2026
Merged

fix(upload): archive extraction respects HERMES_WEBUI_ATTACHMENT_DIR#2520
2 commits merged into
nesquena:masterfrom
OneFat3:fix/archive-extract-respects-attachment-dir

Conversation

@OneFat3
Copy link
Copy Markdown
Contributor

@OneFat3 OneFat3 commented May 18, 2026

Thinking Path

  • Hermes WebUI supports HERMES_WEBUI_ATTACHMENT_DIR to redirect uploads away from the workspace
  • handle_upload() already honours this via _attachment_root()
  • handle_upload_extract() ignores the override and hardcodes extraction into s.workspace
  • This means archive uploads bypass the configured attachment directory
  • This PR fixes handle_upload_extract() to use _attachment_root() consistently

What Changed

api/upload.pyhandle_upload_extract() now calls _attachment_root() instead of using Path(s.workspace) as the extraction target.

Why It Matters

Users who set HERMES_WEBUI_ATTACHMENT_DIR (e.g. /workspace/uploads) expect all uploads — including archives — to land there. Without this fix, archives silently extract into the workspace root.

Verification

  • Set HERMES_WEBUI_ATTACHMENT_DIR=/workspace/uploads
  • Uploaded a zip archive via the UI
  • Confirmed extraction landed in /workspace/uploads/<archive_stem>/ instead of /workspace/<archive_stem>/
  • Single-file uploads continue to work as before

Risks / Follow-ups

Minimal — the change is two lines and only affects the archive extraction path. extract_archive() itself is unchanged; only the root directory passed to it differs.

Model Used

AI-assisted change with repository inspection, targeted editing, and shell-based test verification.

Ref #2247

@nesquena-hermes
Copy link
Copy Markdown
Collaborator

Summary

Reading api/upload.py:96-126 (handle_upload), :130-240 (extract_archive), and :242-269 (handle_upload_extract) on origin/master, plus the diff at :259-262 on this branch, the inconsistency the PR identifies is real:

  • handle_upload() resolves uploads under _session_attachment_dir(session_id), which itself walks under _attachment_root() — honoring HERMES_WEBUI_ATTACHMENT_DIR per the helper's docstring at api/upload.py:65-76.
  • handle_upload_extract() instead hardcoded Path(s.workspace) as the extraction root, which is the active workspace — bypassing the override and putting archive contents into the user's project tree.

The PR correctly redirects extract_archive's root to _attachment_root(). That restores parity between single-file uploads and archive uploads.

However, there are two concrete concerns worth raising before merge.

Code reference

The diff applied at api/upload.py:259-262:

-        workspace = Path(s.workspace)
-        result = extract_archive(file_bytes, filename, workspace)
+        upload_root = _attachment_root()
+        upload_root.mkdir(parents=True, exist_ok=True)
+        result = extract_archive(file_bytes, filename, upload_root)

Compare against the analogous single-file path at :114-115:

safe_name = _sanitize_upload_name(filename)
dest = _upload_destination(session_id, safe_name)  # → _session_attachment_dir(session_id)

Diagnosis / Recommendation

1. The fix loses session scoping. Single-file uploads go to _attachment_root() / <session_id_sanitized> via _session_attachment_dir() at api/upload.py:87-92. The new path for archives goes directly into _attachment_root() (the inbox root), not into a per-session subfolder. That means two sessions that upload archives with the same stem will collide on disk under the configured attachment dir — extract_archive() does append _NNN suffixes when the dir already exists (api/upload.py:151-156), so the second upload doesn't overwrite, but you also lose the "this archive came from session X" trail that single-file uploads keep. Recommendation: use _session_attachment_dir(session_id) as the root instead of _attachment_root() directly, e.g.:

upload_root = _session_attachment_dir(session_id)
upload_root.mkdir(parents=True, exist_ok=True)
result = extract_archive(file_bytes, filename, upload_root)

This keeps the session-scoping contract that handle_upload() already enforces and matches the user's mental model that "uploads from this chat go in this chat's folder."

2. extract_archive()'s relative_to(workspace.resolve()) at :194 and :228 will compute paths relative to the new root. The returned files list in the response now lists paths relative to _attachment_root() instead of s.workspace. Any frontend code that resolves those paths against the workspace will see them as orphans. Worth a grep for callers of /api/upload-extract and result.files to confirm nothing assumes a workspace-relative format:

grep -rn "extract_archive\|/api/upload-extract" ~/hermes-webui-public/{static,api}

If the WebUI just displays the relative path as text, the change is harmless; if it tries to open the file via the workspace file API, the path becomes broken. Worth confirming in the verification section.

3. Missing test coverage. The PR description says verification was a manual upload check, no automated test. The single-file upload path has tests under tests/test_upload*.py; the archive path should get a parallel test that:

  • Sets HERMES_WEBUI_ATTACHMENT_DIR=/tmp/test-uploads
  • POSTs an archive to /api/upload-extract
  • Asserts the resulting files are under /tmp/test-uploads/<...> and not under s.workspace

That pins the behavior the PR claims to fix and prevents regression if someone later restores Path(s.workspace) in a refactor.

4. CHANGELOG entry is missing. This is a user-visible behavior change for anyone using HERMES_WEBUI_ATTACHMENT_DIR — archive uploads previously landed in the workspace, now they land in the inbox. Worth a release-note entry per the AGENTS.md "user-visible behavior" rule. Could also reference #2247 in the entry since this is the same feature surface.

Verification

The two-line minimal patch is correct in intent. The concerns above are about completeness (session subfolder, returned paths) and contract durability (test + changelog), not about whether the fix points at the right helper. Ref #2247 is the right framing — this slots into the configurable-attachment-directory work tracked in that issue.

handle_upload_extract() used Path(s.workspace) as the extraction root,
bypassing HERMES_WEBUI_ATTACHMENT_DIR entirely. Route through
_session_attachment_dir(session_id) so archives land alongside
single-file uploads and session cleanup covers them.

Add tests and CHANGELOG entry.

Ref nesquena#2247
@OneFat3 OneFat3 force-pushed the fix/archive-extract-respects-attachment-dir branch from b648e0a to 2fe0ece Compare May 18, 2026 18:24
@OneFat3
Copy link
Copy Markdown
Contributor Author

OneFat3 commented May 18, 2026

Thanks for the thorough review! Addressed all four concerns in the updated commit:

1. Session scoping — Switched from _attachment_root() to _session_attachment_dir(session_id). Archives now land at <attachment_root>/<session_id>/<archive_stem>/, matching the single-file upload path and ensuring shutil.rmtree(_session_attachment_dir(sid)) covers them on session deletion.

2. Relative paths — Confirmed: the frontend (ui.js:7915) only consumes data.dest (absolute) and data.extracted (count). data.files is unused by the frontend. With workspace = session_dir, data.files now contains session-relative paths (e.g. archive_stem/file.txt), which is correct.

3. Tests — Added tests/test_pr2520_extract_attachment_dir.py with 4 test cases:

  • Extraction lands in session dir (not bare root)
  • Session cleanup (shutil.rmtree) covers extracted archives
  • dest is scoped under session dir, not directly under inbox
  • data.files paths are relative to session dir and resolve correctly

4. CHANGELOG — Added entry under [Unreleased] > Fixed.

All squashed into a single commit.

@nesquena-hermes nesquena-hermes closed this pull request by merging all changes into nesquena:master in 718a4c7 May 18, 2026
Michaelyklam pushed a commit to Michaelyklam/hermes-webui that referenced this pull request May 18, 2026
# Conflicts:
#	CHANGELOG.md
eleboucher pushed a commit to eleboucher/homelab that referenced this pull request May 19, 2026
… 0.51.92) (#560)

This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [ghcr.io/nesquena/hermes-webui](https://github.com/nesquena/hermes-webui) | patch | `0.51.90` → `0.51.92` |

---

### Release Notes

<details>
<summary>nesquena/hermes-webui (ghcr.io/nesquena/hermes-webui)</summary>

### [`v0.51.92`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v05192--2026-05-19--Release-BP-stage-385--7-PR-full-sweep-batch--RFC-Slice-3c-clarification--workspace-tree-icon-alignment--project-move-cache-refresh--auto-compression-handoff-metadata--Grok-OAuth-provider-catalog--anonymous-custom-endpoint-picker-fallback--PWA-standalone-reload--pull-to-refresh)

[Compare Source](nesquena/hermes-webui@v0.51.91...v0.51.92)

##### Fixed

- **PR [#&#8203;2563](nesquena/hermes-webui#2563 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (closes [#&#8203;2554](nesquena/hermes-webui#2554)) — Align workspace-tree file rows with sibling directory rows by reserving the same expand/collapse toggle slot for files via a new `.file-tree-toggle-placeholder` element. Expanded directories now show child files stepped in at the same icon column as child folders. Directory toggles and file interactions are unchanged; source-level regression coverage and before/after PNGs included.
- **PR [#&#8203;2561](nesquena/hermes-webui#2561 by [@&#8203;nanookclaw](https://github.com/nanookclaw) (closes [#&#8203;2551](nesquena/hermes-webui#2551)) — Refresh the authoritative `_allSessions` cache when the project picker moves a session to/from a project. Previous code mutated only the shallow sidebar row copy, so `renderSessionListFromCache()` re-read the unchanged cache and repainted a stale project dot until the next `/api/sessions` poll healed the UI. Both the "Removed from project" and "Moved to <project>" branches now write the new `project_id` into `_allSessions[idx]` before re-rendering.
- **PR [#&#8203;2567](nesquena/hermes-webui#2567 by [@&#8203;dso2ng](https://github.com/dso2ng) (refs [#&#8203;2477](nesquena/hermes-webui#2477)) — Surface automatic-compression handoff metadata through the `compressed` SSE event so the active browser stream keeps its completion card even after the backend rotates the session id from the origin to a compressed continuation. The event now carries both `old_session_id` and `new_session_id`/`continuation_session_id`; the frontend `compressed` listener accepts either, and the automatic-compression detail line names the compressed continuation session so the done state isn't silently dropped.
- **PR [#&#8203;2568](nesquena/hermes-webui#2568 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (closes [#&#8203;2545](nesquena/hermes-webui#2545)) — Add the Hermes Agent `xai-oauth` provider to the WebUI's OAuth provider catalog so Grok OAuth accounts authenticated via the Hermes CLI appear in Settings → Providers and the `/api/models` picker. The provider is treated as CLI-managed OAuth (no WebUI API-key form) and uses the live Hermes CLI model catalog when available with a Grok 4.20 static fallback.
- **PR [#&#8203;2550](nesquena/hermes-webui#2550 by [@&#8203;espokaos-ops](https://github.com/espokaos-ops) (refs [#&#8203;2542](nesquena/hermes-webui#2542)) — Keep anonymous custom OpenAI-compatible endpoints in the model picker even when the configured `/v1/models` probe fails. Lightweight relays and llama-server-style deployments that authenticate `/v1/chat/completions` but not `/v1/models` no longer have their provider group silently dropped from the picker. Users can type a model id manually in the free-form input when no live catalog is available.

##### Added

- **PR [#&#8203;2548](nesquena/hermes-webui#2548 by [@&#8203;espokaos-ops](https://github.com/espokaos-ops) — Add a PWA-standalone reload affordance. A small refresh button appears in the app titlebar (visible only under `@media (display-mode: standalone), (display-mode: fullscreen)`) so users running the WebUI as an installed home-screen PWA can reload without re-launching the app. Adds a complementary pull-to-refresh gesture on the messages container with an 80px threshold and a smooth-scroll-to-top guard so accidental triggers while reading history feel intentional. 4-viewport screenshots (390/1280/1440/1920, light/dark, hover/idle) included under `docs/pr-media/2548/`.

##### Documentation

- **PR [#&#8203;2560](nesquena/hermes-webui#2560 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (refs [#&#8203;1925](nesquena/hermes-webui#1925)) — Clarify the RuntimeAdapter Slice 3c state after [#&#8203;2544](nesquena/hermes-webui#2544) shipped. The RFC now distinguishes shipped `/api/goal` routing through `RuntimeAdapter.update_goal(...)` from the still-staged `queue_message(...)` protocol method, and explicitly warns not to add a new server-side queue endpoint or queue scheduler merely for adapter symmetry while `/queue` remains browser-side queue/drain behavior.

### [`v0.51.91`](https://github.com/nesquena/hermes-webui/blob/HEAD/CHANGELOG.md#v05191--2026-05-18--Release-BO-stage-384--5-PR-full-sweep-batch--reasoning-replay-history-fix--archive-extract-per-session-inbox--fallback-streaming-warnings--sanitized-custom-provider-env-hints--Slice-3c-queuegoal-adapter-routing)

[Compare Source](nesquena/hermes-webui@v0.51.90...v0.51.91)

##### Fixed

- **PR [#&#8203;2536](nesquena/hermes-webui#2536 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (closes [#&#8203;2514](nesquena/hermes-webui#2514), refs [#&#8203;2535](nesquena/hermes-webui#2535)) — Stop reasoning-only Thinking entries from being replayed into provider-facing history as blank assistant turns. Long WebUI sessions were accumulating duplicated stale Thinking blocks and inflated Activity/tool metadata on later turns when reasoning-only display entries (from interrupted/canceled turns) got reinserted into the restored conversation history. The fix keeps visible Thinking cards in the transcript while filtering them out of provider-facing replay. Settled compact Activity rerenders now also clear previously inserted Thinking rows before rebuilding the visible transcript.
- **PR [#&#8203;2520](nesquena/hermes-webui#2520 by [@&#8203;OneFat3](https://github.com/OneFat3) (refs [#&#8203;2247](nesquena/hermes-webui#2247)) — Route archive extraction (`/api/upload/extract`) through the per-session attachment inbox (`_session_attachment_dir`) instead of hardcoded `Path(s.workspace)`, matching the single-file upload path. Extracted archives now land at `<attachment_root>/<session_id>/<archive_stem>/` so session deletion cleanup covers them and per-session isolation is preserved when `HERMES_WEBUI_ATTACHMENT_DIR` is configured.
- **PR [#&#8203;2505](nesquena/hermes-webui#2505 by [@&#8203;cyberdyne187](https://github.com/cyberdyne187) — Surface provider fallback and rate-limit lifecycle notices as auto-clearing fallback warnings in the streaming composer status. The new bridge in `_agent_status_callback` matches agent lifecycle messages containing `rate limited` / `switching to fallback` / `falling back` / `fallback activated` / `trying fallback` and emits them as `warning` events with `type=fallback`, so the existing `static/messages.js` warning channel surfaces them with the correct auto-clear contract instead of letting them drop silently.
- **PR [#&#8203;2556](nesquena/hermes-webui#2556 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (closes [#&#8203;2541](nesquena/hermes-webui#2541)) — Sanitize auto-generated custom-provider API-key environment variable names so endpoint-derived provider ids such as `custom:gpu.local-8000` use POSIX-safe names like `CUSTOM_GPU_LOCAL_8000_API_KEY`. Runtime custom-provider key resolution now checks the sanitized env var first and falls back to the legacy punctuation-preserving name with a one-shot deprecation warning. Configured literal `api_key` values and explicit `key_env` config are unchanged.

##### Documentation

- **PR [#&#8203;2544](nesquena/hermes-webui#2544 by [@&#8203;Michaelyklam](https://github.com/Michaelyklam) (refs [#&#8203;1925](nesquena/hermes-webui#1925)) — Implement the first Slice 3c RuntimeAdapter control routing. `RuntimeAdapter` / `LegacyJournalRuntimeAdapter` now expose `queue_message(...)` and `update_goal(...)` as protocol-translator delegates, and the `/api/goal` route uses `update_goal(...)` only when `HERMES_WEBUI_RUNTIME_ADAPTER=legacy-journal` is enabled while preserving the legacy-direct response shape. The change keeps `/queue`'s existing browser-side drain semantics and goal post-turn evaluation in the current agent loop; no runner/sidecar, WebUI-owned queue, goal scheduler, cached-agent table, or execution-survives-restart claim is introduced.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about these updates again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiI0My4xMDEuMSIsInVwZGF0ZWRJblZlciI6IjQzLjEwMS4xIiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJyZW5vdmF0ZS9jb250YWluZXIiLCJ0eXBlL3BhdGNoIl19-->

Reviewed-on: https://git.erwanleboucher.dev/eleboucher/homelab/pulls/560
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants