Releases: pretyflaco/millet

v0.12.5 — Title-aware schedule matching + collision guard for sync

03 Jun 11:30

Fixed

  • Title-aware schedule matching. detect_meeting_type now considers the session title (when present in *.session.json). A titled session only auto-matches a scheduled meeting whose name/folder slug equals the title's slug; otherwise it returns None so the caller files it under its own folder. This stops an ad-hoc meeting recorded inside a schedule window (e.g. a "post-scrum" at 09:03 inside the 06:30-09:30 standup window) from being misfiled as the scheduled meeting. Untitled sessions keep the prior pure time-window behavior (back-compat).
  • Collision guard: never silently overwrite a different meeting. sync_session writes a small local-only .session-id marker into each synced folder and, before reusing a dated folder, checks it. If an existing folder belongs to a different session, the new meeting is filed into a disambiguated folder (<folder>-<sessionid-suffix>) instead of clobbering the existing one.
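The title-aware rule can be sketched roughly as follows. This is a minimal illustration, not millet's actual detect_meeting_type: the ScheduledMeeting type, the slugify helper, the match_meeting name, and the assumption that a titled session must still fall inside the schedule window are all hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import time
from typing import List, Optional


@dataclass
class ScheduledMeeting:
    # Hypothetical schedule entry; field names are illustrative only.
    name: str
    folder: str
    start: time
    end: time


def slugify(text: str) -> str:
    """Lowercase and collapse runs of non-alphanumerics into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def match_meeting(
    schedule: List[ScheduledMeeting],
    started_at: time,
    title: Optional[str] = None,
) -> Optional[ScheduledMeeting]:
    """Return the scheduled meeting a session belongs to, or None."""
    in_window = [m for m in schedule if m.start <= started_at <= m.end]
    if not title:
        # Untitled session: fall back to pure time-window matching.
        return in_window[0] if in_window else None
    wanted = slugify(title)
    for meeting in in_window:
        # Titled session: only match when the title slug equals the
        # meeting's name slug or folder slug.
        if wanted in (slugify(meeting.name), slugify(meeting.folder)):
            return meeting
    # Titled but no slug match: the caller files it under its own folder.
    return None
```

With a standup scheduled for 06:30-09:30, a session titled "post-scrum" that starts at 09:03 returns None here, while an untitled session at the same time still matches the standup.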

Notes

  • The .session-id marker is kept strictly local: it is registered in the clone's .git/info/exclude, so it is never committed/pushed and never trips the "uncommitted changes" sync guard.
  • Pairs with vezir v0.7.16, which injects the session title into *.session.json and adds an explicit "sync as" folder override.
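A rough sketch of the collision guard and the local-only marker, under stated assumptions: the function names, the 8-character suffix length, and the choice to reuse a folder that has no marker yet are hypothetical and not taken from sync_session. Only the .session-id filename and the use of .git/info/exclude come from the notes above.

```python
from pathlib import Path

MARKER = ".session-id"  # marker filename named in the release notes


def resolve_target_folder(base: Path, session_id: str, suffix_len: int = 8) -> Path:
    """Pick a folder for this session without clobbering another session's."""
    marker = base / MARKER
    if marker.exists() and marker.read_text().strip() != session_id:
        # The folder belongs to a different session: disambiguate.
        # The suffix length is an assumption for illustration.
        base = base.with_name(f"{base.name}-{session_id[-suffix_len:]}")
    base.mkdir(parents=True, exist_ok=True)
    (base / MARKER).write_text(session_id + "\n")
    return base


def register_local_exclude(repo: Path, pattern: str = MARKER) -> None:
    """Add the marker to .git/info/exclude so git ignores it in this clone only."""
    exclude = repo / ".git" / "info" / "exclude"
    exclude.parent.mkdir(parents=True, exist_ok=True)
    lines = exclude.read_text().splitlines() if exclude.exists() else []
    if pattern not in lines:
        lines.append(pattern)
        exclude.write_text("\n".join(lines) + "\n")
```

Because .git/info/exclude is per-clone and never committed, the marker stays out of git status and out of the pushed history, which is what keeps it from tripping an "uncommitted changes" check.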

v0.12.4 — robust language detection + sync exit-code

02 Jun 15:07

Fixed

  • Wrong summary/transcript language from a misleading channel opener. whisperx detects language from only the first ~30s of each channel, so an opening word in another language (e.g. "Gracias") could mislabel an English meeting — even after the v0.12.3 dominant-channel fix. Detection now samples several windows across each channel via faster-whisper's detect_language(language_detection_segments=N) (whisperx backend; --language-detection-segments, default 6).
  • millet sync exited 0 even when the push failed. A failed sync (e.g. git push rejected) now raises SystemExit(1), so callers don't have to scrape the log to notice failures.

Added

  • Soft default-language bias. --default-language <lang> keeps a team/operator default unless a channel confidently detects another language (>= default_language_override_confidence, default 0.70). Feeds the dominant-channel selection so single-language teams don't drift.
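The soft bias described above amounts to a small decision rule. The sketch below is illustrative only: the function name and signature are hypothetical, and only the 0.70 default is taken from the notes.

```python
from typing import Optional


def pick_channel_language(
    detected: str,
    confidence: float,
    default_language: Optional[str] = None,
    override_confidence: float = 0.70,  # default stated in the release notes
) -> str:
    """Keep the operator default unless detection confidently disagrees."""
    if default_language is None or detected == default_language:
        return detected
    # A different language wins only at or above the override confidence.
    return detected if confidence >= override_confidence else default_language
```

For example, with a default of "en", a channel detected as "es" at 0.55 stays "en", while the same detection at 0.90 switches to "es".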

Tests

  • +default-language bias, +CLI sync exit-code. Full suite 295 pass; 7 pre-existing env-only failures.

v0.12.3 — summary language from dominant channel + per-language summaries

02 Jun 13:00

Fixed

  • Summary generated in the wrong language for dual-channel meetings. The transcript/summary language was taken from the mic channel only, so a local speaker's minority-language asides (e.g. a few Portuguese phrases) made the whole summary that language even when the meeting was mostly English on the system channel. The language is now chosen from the channel with the most speech (_dominant_channel_language; mic wins exact ties).
  • Each channel is word-aligned with its OWN detected language (_align_channel) instead of sharing the mic's alignment model.
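As a sketch of the dominant-channel rule: the ChannelResult type and the use of speech seconds as the measure of "most speech" are assumptions, and this is not the code of _dominant_channel_language itself.

```python
from dataclasses import dataclass


@dataclass
class ChannelResult:
    # Hypothetical per-channel summary; speech_seconds is an assumed metric.
    language: str
    speech_seconds: float


def dominant_channel_language(mic: ChannelResult, system: ChannelResult) -> str:
    """Pick the language of the channel with more speech; mic wins exact ties."""
    if mic.speech_seconds >= system.speech_seconds:
        return mic.language
    return system.language
```

A mic channel with 40 seconds of Portuguese asides against 600 seconds of English on the system channel yields "en" under this rule.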

Added

  • Additional-language summaries. apply_labels(summary_language=...) regenerates the summary in a chosen language and saves it as an ADDITIONAL <base>.summary.<lang>.md (with suffixed sidecars), leaving the primary intact. MeetingSummary.save gains lang_suffix.

Changed

  • Sync pushes <base>.summary.<lang>.md as a distinct summary.<lang>.md; .frontmatter.json is now excluded from sync (also fixes a latent transcript.json collision).

Tests

  • +8 (dominant-channel language, additional-language save/override). Full suite 285 pass; 7 pre-existing env-only failures.

v0.12.2 — suppress phantom remote speakers in dual-diarize

02 Jun 12:23

Fixed

  • Phantom extra speakers in the dual-diarize path. pyannote can over-segment a single remote stream into multiple clusters (e.g. peeling short backchannel "yeah/cool/awesome" off the main speaker into a separate cluster), which voiceprint matching then mis-names from a weak, barely-over-threshold match. Observed on a real 2-speaker call that surfaced as 4 speakers (a false "Roark" @0.69 and a 0.4s "REMOTE" one-liner).

Added

  • Voiceprint auto-apply gate. A match at/above MATCH_THRESHOLD is auto-applied only when it has enough embeddable speech and is unambiguous — either a strong absolute confidence (MATCH_AUTOAPPLY_CONFIDENCE = 0.72) or a clear margin over the runner-up profile (MATCH_AUTOAPPLY_MARGIN = 0.15). SpeakerMatch now carries evidence_seconds + margin; identify_speakers computes the per-cluster margin. Weak/ambiguous matches stay raw and route to needs_labeling rather than mislabeling. The auto-id sidecar records only applied matches.
  • Remote-cluster consolidation (dual-diarize). After diarizing the system channel, merge same-speaker clusters (voiceprint cosine ≥ cluster_merge_similarity) and absorb thin clusters (< cluster_min_speech_seconds embeddable) into the dominant remote; attach trivial unassigned segments to the nearest remote so a brief one-liner no longer becomes a generic REMOTE. On by default; disable with --no-consolidate-remote-clusters.
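The auto-apply gate can be read as the policy sketched below. The two MATCH_AUTOAPPLY_* values come from the notes above; the MATCH_THRESHOLD value, the MIN_EVIDENCE_SECONDS constant, and the should_auto_apply function are placeholders assumed for illustration, and the SpeakerMatch fields are limited to those the notes mention.

```python
from dataclasses import dataclass

MATCH_THRESHOLD = 0.65             # assumed placeholder; real value not stated
MATCH_AUTOAPPLY_CONFIDENCE = 0.72  # from the release notes
MATCH_AUTOAPPLY_MARGIN = 0.15      # from the release notes
MIN_EVIDENCE_SECONDS = 3.0         # assumed placeholder for "enough speech"


@dataclass
class SpeakerMatch:
    name: str
    confidence: float
    evidence_seconds: float  # embeddable speech backing the match
    margin: float            # gap to the runner-up profile


def should_auto_apply(match: SpeakerMatch) -> bool:
    """Auto-apply only matches that are well-evidenced and unambiguous."""
    if match.confidence < MATCH_THRESHOLD:
        return False
    if match.evidence_seconds < MIN_EVIDENCE_SECONDS:
        return False
    # Either a strong absolute score or a clear lead over the runner-up.
    return (
        match.confidence >= MATCH_AUTOAPPLY_CONFIDENCE
        or match.margin >= MATCH_AUTOAPPLY_MARGIN
    )
```

Under these assumed numbers, a 0.69 match with a small margin stays raw and would route to needs_labeling, while the same score with a 0.20 margin is applied.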

Validated

  • Real 2-speaker session: 4 → 2 named + 1 raw cluster, no false name, orphan merged.
  • Real 13-speaker session: every legitimate speaker still auto-named (no over-suppression).

Tests

  • +18 (consolidation merge/absorb/no-over-merge/orphan/config + auto-apply gate policy). Full suite 277 pass; the 7 failures are pre-existing env-only (device-default, Gdk import, offline diarization).

v0.12.1 — fix auto-label discarding matches in non-interactive runs

02 Jun 09:20

Fixed

  • label --auto discarded all voiceprint matches when any speaker was unmatched in a non-interactive (worker) context. It auto-applied confident matches, then prompted for unrecognized speakers; with no TTY, click.prompt hit EOF → Abort → matches never persisted. Meetings with fully-recognizable speakers were stuck in needs_labeling with raw SPEAKER_N ids. Now skips prompting when stdin isn't a TTY.
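The shape of the fix is a TTY check before prompting. The sketch below is dependency-free and hypothetical: it injects a plain callable in place of click.prompt, and the function name is not millet's.

```python
import sys
from typing import Callable, Dict, Iterable, TextIO


def prompt_for_unmatched(
    unmatched: Iterable[str],
    ask: Callable[[str], str] = input,
    stdin: TextIO = sys.stdin,
) -> Dict[str, str]:
    """Ask for names of unrecognized speakers, but only when interactive."""
    if not stdin.isatty():
        # Worker / piped context: skip prompting so that matches already
        # auto-applied can be persisted instead of aborting on EOF.
        return {}
    return {speaker: ask(f"Name for {speaker}: ") for speaker in unmatched}
```

In a non-interactive run this returns an empty mapping, leaving the unmatched speakers with their raw ids while the confident matches are kept.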

Added

  • *.autoid.json sidecar (name + confidence per speaker) so labeling UIs can pre-fill recognized names and show confidence. Excluded from sync + transcript resolution.

3 new tests.

v0.12.0 — dual-diarize: per-channel ASR + remote speaker diarization

01 Jun 10:21

New default for stereo: dual-diarize. Transcribes mic and system channels separately (local speaker = continuous YOU from mic, immune to overlap), then runs pyannote diarization on the system channel to split distinct remote speakers. Overlapping segments preserved.

Headline fix

Eliminates overlap-fragmentation: mono mixdown + diarization flickered words between speakers during talk-over. Now: each channel's speech is a continuous stream — no per-word channel guessing.

Also includes

  • Channel-energy correction (mono path, --channel-correct): per-segment/word RMS reassignment + --channel-correct-margin tuning. On by default for --mixdown mono.
  • DNS-retry hardening for millet sync git operations (5x backoff).
  • 11 new tests; 256 pass, 0 regressions.

Validated on

  • DEVSTANDUP (5 speakers, 62 min) — overlap-fragmentation eliminated
  • LUKAS_2 (2 speakers, 45 min) — clean overlapping segments
  • AB_BOARD (4 speakers, 88 min, .ogg) — distinct remotes preserved

--mixdown mono and --mixdown dual remain available as fallbacks.

v0.11.0 — opt-in Parakeet ASR backend

30 May 11:59

Adds an opt-in Parakeet ASR backend (NVIDIA Parakeet TDT via onnx-asr) alongside whisperx and mlx.

Highlights

  • --asr-backend parakeet (English, ONNX Runtime, pure-Python — no extra torch/transformers).
  • Long audio auto-chunked via Silero VAD; WhisperX-shaped output so alignment/diarization/dual-channel labeling are unchanged.
  • --parakeet-keep-alignment toggle (native timestamps vs WhisperX alignment).
  • millet download parakeet (explicit, lazy model fetch).
  • millet-pipeline[parakeet] extra (onnx-asr[hub]); install onnxruntime-gpu for CUDA.
  • scripts/bench_asr.py harness + results.

Notes

  • Opt-in only; auto selection unchanged.
  • On a 3090, whisperx is faster than Parakeet; Parakeet's value is finer segmentation, not speed. Stays opt-in pending further validation.
  • 12 new tests; no behavior change for existing users.

v0.10.0 — tech-debt sweep (import fix, CI, ruff, cli split)

29 May 21:06

Code-health release. No user-facing behavior change; minor bump because internal cli.py became a cli/ package.

Fixed

  • Latent clean-install crash: the millet/{capture,audio,utils,languages}.py shims imported from meet_record.* (pre-rename package). On a clean pip install millet-pipeline (which depends on millet-record, import name millet_record) every shim raised ModuleNotFoundError: meet_record; it only worked where the legacy meetscribe-record was co-installed. Now they import from millet_record.*. New tests/test_shim_imports.py guards against regression.

Changed

  • cli.py (1929 lines) split into a cli/ package — one module per command + cli/_helpers.py; cli/__init__.py defines the main group and re-exports every command symbol so the millet.subcommands/meet.subcommands entry points keep resolving.
  • CI fixed — was linting/testing dead meet/ paths and installing meetscribe-record (old name). Now lints millet/+tests/, installs millet-record, runs the no-torch/no-GTK suites.
  • Ruff config added (mirrors vezir's ruleset); ~120 findings cleaned. Legacy meet.* test refs rewritten to millet.* (recovered ~29 tests).

Tests: 233 passed, 7 environment-only fails (GTK/torch); ruff clean.

v0.9.2 — resilient Tinfoil (confidential) summarization

29 May 18:50

The confidential summary preset (Tinfoil TEE backend) could hard-fail on a single transient DNS/network blip: the Tinfoil SDK does a network fetch at client construction (router discovery, GET https://atc.tinfoil.sh/routers) and the init was outside the retry path, so one flaky lookup aborted the whole summarization.

Fixed

  • _summarize_tinfoil now retries transient network/DNS errors with exponential backoff (3 attempts, ~2s/4s/8s), around BOTH client construction (router discovery) and the completion call.
  • Genuine auth/model errors still fail fast (no retry); a persistent outage surfaces a clear "Tinfoil TEE unreachable after N attempts" message.
  • New _is_transient_network_error() walks the exception cause chain (the SDK wraps URLError in ValueError("Failed to fetch router addresses…")) and matches common DNS/connection failure text.
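The retry behavior described above could look roughly like this. It is a sketch under assumptions, not the code of _summarize_tinfoil: the marker strings, the call_with_retry helper, and the choice to sleep only between attempts are illustrative, and the callable passed in stands for client construction plus the completion call.

```python
import time
import urllib.error
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

# Assumed examples of DNS/connection failure text; the real list is not stated.
TRANSIENT_MARKERS = (
    "name resolution",
    "name or service not known",
    "connection refused",
    "connection reset",
    "timed out",
)


def is_transient_network_error(exc: Optional[BaseException]) -> bool:
    """Walk the cause/context chain looking for a network-level failure."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        network_types = (urllib.error.URLError, ConnectionError, TimeoutError)
        # HTTPError (a URLError subclass) carries a status code such as 401,
        # so it is not treated as transient here.
        if isinstance(exc, network_types) and not isinstance(exc, urllib.error.HTTPError):
            return True
        if any(marker in str(exc).lower() for marker in TRANSIENT_MARKERS):
            return True
        exc = exc.__cause__ or exc.__context__
    return False


def call_with_retry(
    fn: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry fn on transient network errors with exponential backoff."""
    last_error: Optional[BaseException] = None
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if not is_transient_network_error(exc):
                raise  # auth/model errors fail fast
            last_error = exc
            if attempt < attempts:
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(f"Tinfoil TEE unreachable after {attempts} attempts") from last_error
```

Walking the chain matters because the visible exception is a ValueError; the URLError that marks the failure as transient only appears as its cause.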

Note: the Tinfoil SDK itself is current (0.12.1, latest on PyPI) — this was a host DNS-flakiness resilience fix. 6 new tests.

v0.9.1: team-aware paths + --team flag + MILLET_* env aliases

28 May 19:58

Team-aware path resolution and --team flag for millet sync/enroll/label, plus MILLET_* environment-variable aliases (one-release fallback to legacy MEETSCRIBE_*/MEET_* with a deprecation warning). On-disk ~/.config/meet and ~/meet-recordings paths are unchanged. See CHANGELOG.md for details.
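A minimal sketch of how an alias fallback with a deprecation warning might work. The get_setting helper and the TEAM suffix used in the example are hypothetical; only the MILLET_, MEETSCRIBE_, and MEET_ prefixes come from the notes above.

```python
import os
import warnings
from typing import Mapping, Optional, Sequence


def get_setting(
    suffix: str,
    environ: Mapping[str, str] = os.environ,
    legacy_prefixes: Sequence[str] = ("MEETSCRIBE_", "MEET_"),
) -> Optional[str]:
    """Read MILLET_<suffix>, falling back to legacy names with a warning."""
    value = environ.get(f"MILLET_{suffix}")
    if value is not None:
        return value
    for prefix in legacy_prefixes:
        value = environ.get(prefix + suffix)
        if value is not None:
            warnings.warn(
                f"{prefix}{suffix} is deprecated; use MILLET_{suffix}",
                DeprecationWarning,
                stacklevel=2,
            )
            return value
    return None
```

For instance, get_setting("TEAM") would prefer MILLET_TEAM and only consult MEETSCRIBE_TEAM or MEET_TEAM, with a warning, when the new name is unset.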