fix: harden RDNA device property patch against OOM and attr loss by on22s · Pull Request #53 · Finrandojin/alexandria-audiobook

on22s · 2026-06-04T21:28:57Z

Summary

Three bugs in _patch_rdna_device_properties and the batch warmup paths:

1. Device key TypeError: int(device) raises TypeError when get_device_properties is called with a torch.device object (e.g. torch.device('cuda:0')). Fixed by using dev.index, falling back to current_device() for unindexed devices and to 0 on any error.
2. SimpleNamespace silently drops C-extension attrs: dir(props) does not enumerate all C-extension attributes on the real props object, so any property not surfaced by dir() is missing from the patched result. Replaced with _RDNADeviceProps, a thin proxy that delegates all attribute lookups to the real props object and only overrides multi_processor_count and warp_size. Future PyTorch device property fields are forwarded automatically with no maintenance required.
3. Warmup loads second model into VRAM (OOM): _local_batch_clone and _local_batch_lora both called self._init_local_custom() to load the warmup model while the clone/LoRA model was already loaded. Having two full models resident simultaneously causes OOM on 12–16 GB cards. Fixed by using the already-loaded model for warmup. For the LoRA path, the warmup block is moved inside the adapter loop so the LoRA model is resident before warmup runs.
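The first two fixes can be sketched without importing torch. This is a minimal illustration, not the code in this PR: normalize_device_index, RDNADevicePropsSketch, and FakeProps are hypothetical names, and the override values (60 CUs, warp size 64) are placeholders rather than measured figures.

```python
from types import SimpleNamespace


def normalize_device_index(device, current_index=0):
    """Return an int index for an int, a 'cuda:N' string, or a torch.device-like object."""
    try:
        if isinstance(device, int):
            return device
        if isinstance(device, str):
            # 'cuda:1' -> 1; a bare 'cuda' falls back to the current device.
            _, _, tail = device.partition(":")
            return int(tail) if tail else current_index
        index = getattr(device, "index", None)
        return current_index if index is None else int(index)
    except Exception:
        # Any unexpected input falls back to device 0.
        return 0


class RDNADevicePropsSketch:
    """Forward every attribute to the real props object except two overrides."""

    def __init__(self, props, multi_processor_count, warp_size):
        self._props = props
        self.multi_processor_count = multi_processor_count
        self.warp_size = warp_size

    def __getattr__(self, name):
        # Called only when normal lookup fails, so the overrides above win.
        if name == "_props":
            raise AttributeError(name)  # avoid recursion before __init__ has run
        return getattr(self._props, name)


class FakeProps:
    """Stand-in for a props object whose dir() does not list every attribute."""

    name = "Fake RDNA GPU"
    multi_processor_count = 30
    warp_size = 32

    def __dir__(self):
        return ["name"]  # hides the remaining attributes from dir()

    def __getattr__(self, attr):
        if attr == "total_memory":
            return 16 * 1024**3
        raise AttributeError(attr)


patched = RDNADevicePropsSketch(FakeProps(), multi_processor_count=60, warp_size=64)
```

Because the proxy resolves attributes through getattr rather than copying whatever dir() lists, patched.total_memory is still reachable even though dir() on the fake object hides it.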

Test plan

Verify get_device_properties called with torch.device('cuda:0') no longer raises TypeError on RDNA hardware
Verify patched props object forwards arbitrary attributes (e.g. props.total_memory, props.name) correctly to the underlying device
Verify batch clone generation completes on a 12–16 GB RDNA card without OOM (previously required a restart after warmup)
Verify batch LoRA generation completes with warmup running against the LoRA model, not a separately loaded CustomVoice model

🤖 Generated with Claude Code

System updates can change the ROCm kernel module version without warning, breaking the previously hardcoded rocm6.3 torch wheel. Detect the installed ROCm version from /opt/rocm/.info/version and map to the nearest PyTorch index URL (7.x→rocm7.2, 6.3→rocm6.3, 6.2→rocm6.2.4, fallback rocm6.3). Ref pinokiocomputer/pinokio#1087 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
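A minimal sketch of the version mapping described above, assuming the version file holds a string such as 6.3.1-48. The function names are hypothetical, and the actual install script may be written in another language. The returned tag is assumed to be appended to the PyTorch wheel index base URL (https://download.pytorch.org/whl/).

```python
from pathlib import Path

ROCM_VERSION_FILE = Path("/opt/rocm/.info/version")
FALLBACK_TAG = "rocm6.3"


def rocm_index_tag(version_text):
    """Map a ROCm version string to the nearest PyTorch wheel index tag."""
    parts = version_text.strip().split(".")
    try:
        major = int(parts[0])
        # Tolerate a build suffix on the minor component, e.g. '3-48'.
        minor = int(parts[1].split("-")[0]) if len(parts) > 1 else 0
    except ValueError:
        return FALLBACK_TAG
    if major == 7:
        return "rocm7.2"
    if (major, minor) == (6, 3):
        return "rocm6.3"
    if (major, minor) == (6, 2):
        return "rocm6.2.4"
    return FALLBACK_TAG


def detect_rocm_index_tag(version_file=ROCM_VERSION_FILE):
    """Read the installed ROCm version, falling back when the file is absent."""
    try:
        return rocm_index_tag(Path(version_file).read_text())
    except OSError:
        return FALLBACK_TAG
```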

Resolves Dependabot alert for DoS via unbounded multipart part headers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The previous implementation iterated directly over process.stdout on the calling thread, which blocks cancellation checks and would require select.select() if stderr were separated — an API that does not work on Windows pipes. A dedicated daemon thread now drains stdout into a queue.Queue. The drain loop calls queue.get(timeout=0.05) so it can honour a per-task "cancel" flag via process.terminate() between reads, with no platform-specific I/O multiplexing needed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
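The thread-plus-queue pattern described above might look roughly like this. It is a sketch under stated assumptions: stream_subprocess, on_line, and is_cancelled are hypothetical names, and the signature of the real _stream_subprocess_to_logs helper is not shown in this PR.

```python
import queue
import subprocess
import sys
import threading

_EOF = object()  # sentinel marking the end of the child's output


def _drain(stream, out_queue):
    """Runs on a daemon thread: copy lines from the pipe into the queue."""
    for line in stream:
        out_queue.put(line)
    out_queue.put(_EOF)


def stream_subprocess(cmd, on_line, is_cancelled=lambda: False):
    """Run cmd, pass each output line to on_line, and honour a cancel flag."""
    process = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = queue.Queue()
    threading.Thread(target=_drain, args=(process.stdout, lines), daemon=True).start()

    terminated = False
    while True:
        # The short timeout below lets this check run between reads.
        if is_cancelled() and not terminated:
            process.terminate()
            terminated = True
        try:
            item = lines.get(timeout=0.05)
        except queue.Empty:
            continue
        if item is _EOF:
            break
        on_line(item.rstrip("\n"))
    return process.wait()
```

The calling thread never blocks on the pipe itself, so no select.select() is needed and the same code path works on Windows pipes.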

Adds a Preparer tab to the Alexandria web UI allowing users to generate LoRA training datasets from audiobooks, either one at a time or in a batch.
Backend (app/app.py):
- PreparerConfig, BatchPreparerTask, BatchPreparerRequest Pydantic models
- check_disk_space(), _normalize_filename_tokens(), _fuzzy_score() helpers
- _stream_subprocess_to_logs(): cross-platform stdout capture via thread + queue.Queue; no select.select(), works on Windows pipes
- /api/preparer/suggest_source: fuzzy-match uploaded EPUB/TXT to audio file
- /api/preparer/start: upload + run preparer script (single file)
- /api/preparer/cancel: send SIGTERM to running preparer
- /api/preparer/list: list generated dataset ZIPs
- /api/preparer/download/{path}: download a dataset ZIP
- /api/preparer/batch/start: queue multiple files, run sequentially
- /api/preparer/batch/cancel: cancel in-progress batch
- 503 guard on both start endpoints when app/alexandria_preparer.py absent
Frontend (app/static/index.html):
- Preparer nav tab (Advanced section)
- Single-mode: file pickers for audio + source, "Match" auto-suggest button
- Batch mode: multi-file picker, auto-match queue table, per-task status badges
- Shared config: language, confidence, min SNR, keep-unaligned toggle
- Live log window with 1 s polling, Cancel button, status message
Tests (app/test_api.py):
- preparer and batch_preparer added to status_known_tasks check
- 8 new quick tests: suggest_source, status, cancel-when-idle, list, download-404, batch-schema, batch-cancel
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Drop EPUB/ebook source alignment step per upstream feedback (discussion Finrandojin#40). Finrandojin wants a tool for processing audio the user owns, not one that implies extracting from commercial audiobooks.
- Remove suggest_source endpoint and all EPUB fuzzy-match logic
- Remove source_filename and keep_unaligned from Pydantic models and API
- Remove source file picker and Match button from UI
- Rename nav label and tab header to "Voice Training Dataset Builder"
- Update descriptions to emphasise own recordings / CC audio use cases
- Remove test_preparer_suggest_source_no_match (endpoint deleted)
Tests: 52 passed, 0 failed
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Persona prompts were hardcoded in generate_personas.py, unlike script generation and review prompts which are user-editable via the UI. Extract persona prompts to persona_prompts.txt, add a loader module following the existing review_prompts.py pattern, and wire into the config API and frontend Prompt Settings section with three new textareas. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add tests for /api/system/stats, persona prompt config roundtrip, persona cancel endpoint, M4B audiobook endpoint, and persona status polling. Update default_prompts test to verify persona prompt fields. 72 tests, all passing in full mode. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove unused imports (tempfile, DEFAULT_SYSTEM_PROMPT/DEFAULT_USER_PROMPT), unused constants (VOICES_PATH, BUILTIN_LORA_MANIFEST), redundant _atomic_json_write wrappers in app.py and project.py, obsolete parse_voices.py and its endpoint/test, a debug print in update_chunk, and a bare except clause. Fix all dynamically-generated onclick attributes where JSON.stringify double quotes conflicted with the attribute's double-quote delimiters, silently breaking buttons in saved scripts, voice designer, LoRA datasets, and LoRA models. Also fix loadScript to force full chunk redraw and show a success toast. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
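The onclick quoting problem is easy to reproduce outside the browser. The sketch below is in Python for illustration only; the frontend fix itself is JavaScript and is not shown in this PR, and onclick_attr and openItem are hypothetical names. It shows one common remedy: HTML-escaping the JSON-encoded arguments so their double quotes cannot terminate the attribute.

```python
import html
import json


def onclick_attr(func_name, *args):
    """Build an onclick attribute whose JSON arguments cannot break the quoting."""
    call = f"{func_name}({', '.join(json.dumps(a) for a in args)})"
    # quote=True turns " into &quot;, which the browser decodes back
    # to " before running the handler.
    return f'onclick="{html.escape(call, quote=True)}"'


def broken_onclick_attr(func_name, *args):
    """The failure mode: raw JSON double quotes end the attribute early."""
    call = f"{func_name}({', '.join(json.dumps(a) for a in args)})"
    return f'onclick="{call}"'


safe = onclick_attr("openItem", "chapter 1")
broken = broken_onclick_attr("openItem", "chapter 1")
```

In the broken variant the attribute value ends at the first inner double quote, so the handler the browser sees is the incomplete fragment openItem( and the button silently does nothing.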

Resolve conflicts: keep import signal from PR, keep updateSystemStats from main, combine both. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ss, scope preparer output
Re-apply cleanup from 392e8a2 that PR Finrandojin#47 reintroduced (dead imports, unused constants, _atomic_json_write wrapper, parse_voices endpoint, broken onclick quoting, loadScript fixes).
Additional fixes to PR Finrandojin#47:
- Remove unused _fuzzy_score and _normalize_filename_tokens functions
- Replace os.kill(pid, SIGTERM) with process.terminate() for cross-platform
- Deduplicate run_process by delegating to _stream_subprocess_to_logs
- Remove duplicate check_disk_space definition
- Scope preparer output to dedicated preparer_output/ directory
- Rename nav tab from "Dataset Builder" to "Preparer" to avoid confusion
- Add missing Form import for FastAPI endpoint
- Add preparer_output/ to .gitignore
Ref Finrandojin#46, Ref Finrandojin#47
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Resolve conflicts: keep _stream_subprocess_to_logs from dev, combine persona + preparer/batch_preparer task names in tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Flash and memory-efficient SDPA kernels on ROCm 7.2+ hang during batched attention with left-padded sequences. Disable these backends for batch clone/LoRA methods on AMD GPUs, falling back to the math SDPA kernel. Single generation and NVIDIA users are unaffected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The batch generation "hang" on ROCm 7.x was caused by the GPU's DPM controller aggressively downclocking the shader engine between autoregressive steps, not an SDPA kernel bug. Setting the COMPUTE power profile (pp_power_profile_mode=5) enforces a min clock floor and resolves the issue.
- Remove SDPA backend disable workaround (was masking the real cause)
- Remove GPU keepalive thread (unnecessary with COMPUTE profile)
- Fix batch warmup to use CustomVoice model (Base model lacks custom voice speakers, causing warmup to fail silently)
- Add RDNA2/3 CU count and warp size correction (ROCm reports half the CUs and warp_size=32 instead of 64 on consumer GPUs)
- Document COMPUTE power profile fix in README
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
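Selecting the COMPUTE profile could be scripted along these lines. This is a sketch under several assumptions that the commit message does not confirm: the sysfs directory /sys/class/drm/card0/device, the need to switch power_dpm_force_performance_level to manual first (as the amdgpu driver documentation describes), and root privileges for the writes. Only the profile value 5 comes from the commit message; the function names are hypothetical.

```python
from pathlib import Path

COMPUTE_PROFILE = "5"  # value quoted in the commit message; may differ per GPU


def compute_profile_writes(card="card0"):
    """Return the (path, value) pairs to write, in order."""
    base = Path("/sys/class/drm") / card / "device"
    return [
        # Assumed prerequisite: manual mode before a profile can be selected.
        (base / "power_dpm_force_performance_level", "manual"),
        (base / "pp_power_profile_mode", COMPUTE_PROFILE),
    ]


def apply_sysfs_writes(writes):
    """Perform the writes; needs root and resets on reboot."""
    for path, value in writes:
        path.write_text(value)
```

Keeping the plan separate from the write step makes it possible to log or inspect the target paths before touching sysfs.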

Three bugs in _patch_rdna_device_properties and the batch warmup paths:
1. Device key normalization: int(device) raises TypeError when called with a torch.device object (e.g. torch.device('cuda:0')). Use dev.index instead, falling back to current_device() for unindexed devices and 0 on any error.
2. SimpleNamespace proxy silently drops C-extension attributes that don't appear in dir(props). Replace with _RDNADeviceProps, a thin proxy that delegates all attribute lookups to the real props object and only overrides multi_processor_count and warp_size. Future PyTorch device property fields are forwarded automatically.
3. _local_batch_clone and _local_batch_lora both loaded the CustomVoice model as the warmup target while the clone/LoRA model was already in VRAM. Having two full models resident simultaneously causes OOM on 12–16 GB cards. Use the already-loaded model for warmup instead. For the LoRA path, move the warmup block inside the adapter loop so the LoRA model is resident before warmup runs.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
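The control flow of the third fix can be modelled with a toy residency tracker. Everything here is hypothetical (ModelSlot, batch_lora, the callbacks), and unloading each adapter's model after generation is an assumption made for the illustration. The point is only that warmup runs against the model that is already loaded, so at most one model is resident at a time.

```python
class ModelSlot:
    """Tracks which models are resident, standing in for VRAM."""

    def __init__(self):
        self.resident = []
        self.peak = 0  # highest number of models resident at once

    def load(self, name):
        self.resident.append(name)
        self.peak = max(self.peak, len(self.resident))
        return name

    def unload(self, name):
        self.resident.remove(name)


def batch_lora(slot, adapters, warmup, generate):
    """Warm up inside the adapter loop, using the model that is already loaded."""
    results = []
    for adapter in adapters:
        model = slot.load(f"lora:{adapter}")
        try:
            warmup(model)  # no second model is loaded just for warmup
            results.append(generate(model))
        finally:
            slot.unload(model)
    return results


slot = ModelSlot()
warmed = []
outputs = batch_lora(slot, ["a", "b"], warmed.append, lambda m: m.upper())
```

After this run slot.peak stays at 1. A variant that loaded a separate warmup model while the LoRA model was resident would push it to 2, which is the situation that exhausted memory on 12–16 GB cards.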

Finrandojin and others added 19 commits May 28, 2026 12:19

Bump python-multipart to 0.0.27 (CVE fix)

9f67f90


Merge branch 'dev'

d151059

Add acknowledgement for Michii's System Health Dashboard contribution

5912826

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'feature/cross-platform-subprocess' into dev

914f995

Merge branch 'dev'

781af62

Merge feature/web-batch-preparer into dev

1745280


Merge dev into main (PRs Finrandojin#46, Finrandojin#47 with fixes)

59a749f


Add acknowledgement for Michii's PRs Finrandojin#46 and Finrandojin#47

b05c54f

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

on22s changed the base branch from main to dev June 5, 2026 12:29

on22s wants to merge 19 commits into Finrandojin:dev from on22s:fix/rdna-device-props


2 participants
