Merged
19 commits
6f2051d
Align table formatting and improve consistency in DEVELOPMENT.md.
chris-colinsky Mar 5, 2026
e1fb72c
Bump project version to 0.1.1 in `pyproject.toml`.
chris-colinsky Mar 5, 2026
82b78d8
Refactor worker status reporting and failure aggregation for improved…
chris-colinsky Mar 5, 2026
e4addbc
Bump `audio-refinery` version to 0.1.1 in `uv.lock`.
chris-colinsky Mar 5, 2026
32e4b6c
Containerize the project with CUDA support: add `Dockerfile`, setup `…
chris-colinsky Mar 5, 2026
4cc91eb
Add Makefile target to test Slack integration with `SLACK_WEBHOOK_URL`
chris-colinsky Mar 5, 2026
a60d784
Expand DEPLOYMENT.md: add HuggingFace token setup, NVIDIA driver requ…
chris-colinsky Mar 5, 2026
017b115
Update .gitignore to exclude `.claude/` directory
chris-colinsky Mar 5, 2026
da98fb3
Refactor Slack notifier: make dotenv import conditional and narrow ex…
chris-colinsky Mar 5, 2026
879029e
Refactor transcriber: clarify docstrings, add `_whisperx_model` param…
chris-colinsky Mar 5, 2026
6df0521
Fix typo in `SeparationError` docstring
chris-colinsky Mar 5, 2026
63c18f2
Add sentiment directory support and improve docstring clarity
chris-colinsky Mar 5, 2026
f61341d
Narrow exception handling in `gpu_utils.py` and improve return statem…
chris-colinsky Mar 5, 2026
aa0ab28
Improve docstring clarity and add `_pipeline` parameter to `diarize()…
chris-colinsky Mar 5, 2026
d12817f
Refactor CLI: centralize Demucs scratch directory handling, improve c…
chris-colinsky Mar 5, 2026
138df85
Refactor Slack notifier: add detailed per-stage stats, average proces…
chris-colinsky Mar 5, 2026
ddf06d7
Refactor CLI: add derived metrics to combined report for detailed pro…
chris-colinsky Mar 6, 2026
3a8d09d
Document combined report fields in README for improved usability and …
chris-colinsky Mar 6, 2026
7e999d8
Prepare release v0.1.1
chris-colinsky Mar 6, 2026
45 changes: 45 additions & 0 deletions .dockerignore
@@ -0,0 +1,45 @@
# Virtual environment and build artifacts
.venv/
*.egg-info/
dist/
build/
__pycache__/
*.py[cod]

# Test artifacts
.pytest_cache/
.coverage
htmlcov/
coverage.xml

# Type / lint caches
.mypy_cache/
.ruff_cache/

# Secrets and local config
.env
assets/

# Dev tooling
.pre-commit-config.yaml
.claude/
.idea/
.vscode/
*.swp
*.swo
.DS_Store

# Git
.git/
.github/

# Docs and non-runtime files
docs/
tests/
CHANGELOG.md
CONTRIBUTING.md
CLAUDE.md
SECURITY.md
Makefile
README.md
uv.lock
1 change: 1 addition & 0 deletions .gitignore
@@ -1,4 +1,5 @@
assets/
.claude/

# Python
.venv/
26 changes: 25 additions & 1 deletion CHANGELOG.md
@@ -7,6 +7,29 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [0.1.1] - 2026-03-06

### Added

- `combined_report.json` now includes four derived metrics: `avg_time_per_file_seconds`, `avg_time_per_mb_seconds`, `processing_speed_ratio` (real-time factor), and `words_per_audio_hour` (transcription density)
- Slack notifications now include detailed per-stage stats (processed / skipped / failed counts) and average processing time per file
- `make test-slack` Makefile target for validating Slack webhook integration
- Dockerfile and `.dockerignore` for containerized deployment
- Sentiment output directory (`<base>/sentiment/`) support in batch pipeline

### Changed

- Centralized Demucs scratch directory resolution in CLI — RAM disk detection and fallback confirmation now happen in one place
- Worker status reporting and failure aggregation in `pipeline-parallel` refactored for improved accuracy
- `python-dotenv` import in Slack notifier is now conditional — avoids import-time failure when the package is absent
- DEPLOYMENT.md expanded: HuggingFace token setup, NVIDIA driver requirements, cloud instance guidelines, and Docker usage
- Combined report fields documented in README under the Parallel Pipeline section

### Fixed

- Narrowed exception handling in `gpu_utils.py`, `transcriber.py`, and `notifier.py` to avoid masking unexpected errors
- Typo in `SeparationError` docstring

## [0.1.0] - 2026-03-01

### Added
@@ -35,5 +58,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `transformers` capped at `<4.40.0` — versions 4.40+ use `torch.utils._pytree.register_pytree_node`, an API introduced in PyTorch 2.2, which breaks with the pinned PyTorch 2.1.2
- `make dev-setup` now reinstalls CUDA torch wheels (`torch==2.1.2+cu121`, `torchaudio==2.1.2+cu121`) as its final step — `uv sync` resolves torch from PyPI and installs the CPU-only build, silently breaking GPU inference

- [Unreleased]: https://github.com/LunarCommand/audio-refinery/compare/v0.1.0...HEAD
+ [Unreleased]: https://github.com/LunarCommand/audio-refinery/compare/v0.1.1...HEAD
+ [0.1.1]: https://github.com/LunarCommand/audio-refinery/compare/v0.1.0...v0.1.1
[0.1.0]: https://github.com/LunarCommand/audio-refinery/releases/tag/v0.1.0
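
The CHANGELOG entry about the conditional `python-dotenv` import can be illustrated with a minimal sketch. This is not the notifier code from the PR, which is not shown in this diff; the function name `get_webhook_url` and its `env` parameter are hypothetical and exist only for illustration. The point is that a missing optional package degrades to "read the process environment only" instead of failing at import time.

```python
import os

try:
    # python-dotenv is treated as optional: import it only if it is installed.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None


def get_webhook_url(env=None):
    """Return SLACK_WEBHOOK_URL, or None when it is unset or blank.

    Hypothetical helper: when no mapping is passed, load .env first
    (only if python-dotenv is available) and read os.environ.
    """
    if env is None:
        if load_dotenv is not None:
            load_dotenv()
        env = os.environ
    url = env.get("SLACK_WEBHOOK_URL", "").strip()
    return url or None
```

Catching only `ImportError` matches the spirit of the "narrowed exception handling" entry above: any other failure during import still surfaces.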
39 changes: 39 additions & 0 deletions Dockerfile
@@ -0,0 +1,39 @@
FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04

# System dependencies
RUN apt-get update && apt-get install -y \
python3.11 python3.11-dev python3-pip python3.11-venv \
ffmpeg git curl \
&& rm -rf /var/lib/apt/lists/*

# Non-root user
RUN useradd -m -u 1000 refinery
WORKDIR /app
USER refinery

# Install uv
RUN pip install --user uv

# Copy and install the package (resolves main deps; may pull CPU-only torch)
COPY --chown=refinery:refinery . .
RUN uv pip install -e .

# Install WhisperX at the pinned commit — no-deps to avoid overwriting torch
# v3.1.1 tag has the old API without device_index; use the correct commit instead
RUN uv pip install --no-deps \
"whisperx @ git+https://github.com/m-bain/whisperX.git@741ab9a2a8a1076c171e785363b23c55a91ceff1"

# Install pinned WhisperX runtime deps
# transformers must stay <4.40.0 — 4.40+ uses torch.utils._pytree.register_pytree_node
# which was added in PyTorch 2.2 and breaks with the pinned 2.1.2
RUN uv pip install \
"av==16.1.0" "ctranslate2==4.7.1" "faster-whisper==1.2.1" \
"flatbuffers==25.12.19" "nltk==3.9.2" "onnxruntime==1.24.1" \
"transformers>=4.30.0,<4.40.0"

# Reinstall PyTorch with CUDA 12.1 wheels last — uv pip install -e . above may have
# pulled CPU-only builds; this guarantees the CUDA wheel is what's actually used
RUN uv pip install torch==2.1.2+cu121 torchaudio==2.1.2+cu121 \
--extra-index-url https://download.pytorch.org/whl/cu121

CMD ["audio-refinery", "--help"]
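
Because the Dockerfile reinstalls PyTorch last to guarantee the CUDA wheel wins, a quick check inside the built image can confirm which build is importable. The sketch below is an assumption-laden illustration, not part of the PR: `is_cuda_wheel` and `check_installed_torch` are hypothetical names, and the check assumes the wheel reports the same `+cu121` local version suffix that appears in the pinned requirement `torch==2.1.2+cu121`. A version string without that suffix only means "not the pinned wheel"; it does not prove the build is CPU-only.

```python
def is_cuda_wheel(version: str, cuda_tag: str = "cu121") -> bool:
    """True when a version string carries the expected local CUDA tag."""
    _, sep, local = version.partition("+")
    return bool(sep) and local == cuda_tag


def check_installed_torch() -> str:
    """Describe the importable torch build; torch is imported lazily."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    tag = "cu121 wheel" if is_cuda_wheel(torch.__version__) else "not the cu121 wheel"
    # torch.version.cuda is None for builds compiled without CUDA.
    return f"torch {torch.__version__} ({tag}), CUDA runtime: {torch.version.cuda}"
```

Running `print(check_installed_torch())` in the container is one way to catch a silently replaced wheel before starting a long batch run.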
11 changes: 11 additions & 0 deletions Makefile
@@ -79,6 +79,17 @@ dev-setup: install-dev install-whisperx install-torch-cuda pre-commit-install ##
@echo " 2. Run 'make test' to verify everything works"
@echo " 3. Run 'audio-refinery --help' to see available commands"

test-slack: ## Send a test Slack notification to verify SLACK_WEBHOOK_URL is configured
@uv run python -c "\
from dotenv import load_dotenv; \
load_dotenv(); \
import os, sys, json, urllib.request; \
url = os.getenv('SLACK_WEBHOOK_URL') or (print('SLACK_WEBHOOK_URL is not set — add it to .env or export it') or sys.exit(1)); \
data = json.dumps({'text': ':white_check_mark: *Test notification* from \`audio-refinery\` — Slack integration is working.'}).encode(); \
req = urllib.request.Request(url, data=data, headers={'Content-Type': 'application/json'}); \
urllib.request.urlopen(req, timeout=5); \
print('Test notification sent — check your Slack channel')"

stats: ## Show project statistics
@echo "Project Statistics:"
@echo "==================="
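
The `test-slack` one-liner above is dense, so here is the same flow unrolled into a small script for readability. It is a sketch under assumptions: the function names `build_test_request` and `main` are hypothetical, the message text is shortened, and the `.env` loading step from the Makefile target is left out so the example depends only on the standard library.

```python
import json
import os
import urllib.request


def build_test_request(url: str) -> urllib.request.Request:
    """Build the JSON POST request that a Slack incoming webhook expects."""
    payload = {"text": ":white_check_mark: *Test notification* from `audio-refinery`"}
    data = json.dumps(payload).encode()
    # Passing data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )


def main(env=None) -> int:
    """Send one test message; return a shell-style exit code."""
    env = os.environ if env is None else env
    url = env.get("SLACK_WEBHOOK_URL")
    if not url:
        print("SLACK_WEBHOOK_URL is not set")
        return 1
    # urlopen raises on network errors and HTTP error statuses.
    with urllib.request.urlopen(build_test_request(url), timeout=5):
        pass
    print("Test notification sent")
    return 0
```

Calling `raise SystemExit(main())` at the bottom of a script file would reproduce the Makefile behaviour, including the non-zero exit when the variable is missing.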
23 changes: 23 additions & 0 deletions README.md
@@ -607,6 +607,29 @@ Options:
--help Show this message and exit.
```

### Combined report fields

`combined_report.json` is always written after all workers finish. It contains aggregate metrics across all workers:

| Field | Type | Description |
|---|---|---|
| `run_at` | string | ISO 8601 timestamp of run start (UTC) |
| `total_discovered` | int | Total WAV files found in `extracted/` |
| `total_time_seconds` | float | Wall-clock seconds from first worker start to last finish |
| `total_audio_hours` | float | Total audio duration processed across all workers |
| `source_audio_bytes` | int | Combined size of all input WAV files |
| `total_words` | int | Total words transcribed across all files |
| `total_segments` | int | Total transcript segments across all files |
| `avg_time_per_file_seconds` | float | `total_time_seconds / total_discovered`: average wall-clock cost per file |
| `avg_time_per_mb_seconds` | float | `total_time_seconds / source_MB`, where `source_MB` is `source_audio_bytes` expressed in MB: processing seconds per MB of source audio |
| `processing_speed_ratio` | float | `audio_seconds / wall_seconds` — real-time factor (e.g. `3.7` means the pipeline processed audio 3.7× faster than its playback duration) |
| `words_per_audio_hour` | float | Transcription density — useful for detecting sparse/silent audio or diarization misses |
| `gpu_temp_celsius` | object | Per-device temperature summary: `peak_celsius`, `avg_celsius`, `sample_count` |
| `workers` | array | Per-worker label, device, exit code, and individual summary |
| `combined_failures` | array | Aggregated failure records from all workers |

`null` is written for derived metrics when the divisor is zero (e.g. `avg_time_per_file_seconds` is `null` if no files were discovered).
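
The derived metrics can be recomputed from the base fields, which is a handy sanity check on a report. The sketch below is illustrative rather than the CLI's actual code: `safe_div` and `derived_metrics` are hypothetical names, `words_per_audio_hour` is assumed to be `total_words / total_audio_hours`, and 1 MB is assumed to be 1,000,000 bytes (the table does not say which convention the tool uses, so `bytes_per_mb` is a parameter).

```python
def safe_div(numerator, denominator):
    """Return numerator / denominator, or None when the divisor is zero."""
    if not denominator:
        return None
    return numerator / denominator


def derived_metrics(report, bytes_per_mb=1_000_000):
    """Recompute the four derived metrics from a combined-report dict."""
    total_time = report["total_time_seconds"]
    audio_hours = report["total_audio_hours"]
    source_mb = report["source_audio_bytes"] / bytes_per_mb
    return {
        "avg_time_per_file_seconds": safe_div(total_time, report["total_discovered"]),
        "avg_time_per_mb_seconds": safe_div(total_time, source_mb),
        # Audio seconds per wall-clock second (real-time factor).
        "processing_speed_ratio": safe_div(audio_hours * 3600, total_time),
        "words_per_audio_hour": safe_div(report["total_words"], audio_hours),
    }
```

`None` serializes to JSON `null` with `json.dumps`, which lines up with the zero-divisor behaviour described above.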

### Power limit / sudoers

`--power-limit` invokes `sudo nvidia-smi -pl <watts>`. To allow this without a password prompt: