feat(language): operator language picker — six-flag UI + end-to-end translation#5
…cking
Two ships in one. Language picker is the headline; the Director refactor
is the supporting infra fix that was already in the working tree.
── Multi-language support (operator-facing) ──
Surface the existing translator + ElevenLabs-multilingual plumbing as a
real UI affordance. Six languages (en/es/fr/de/zh/tl) — flash_v2_5 speaks
all of them with the same voice ID, so picking a language flips the
spoken output without changing the avatar.
* dashboard/src/components/LanguagePicker.jsx (new)
Two presentations driven by `compact`. Pre-upload: prominent six-
tile grid centered above the drop zone ("STEP 1 · LANGUAGE").
Post-upload: compact corner chip with click-to-expand popout so
the operator can flip languages mid-stream.
* dashboard/src/hooks/useEmpireSocket.js
Adds `activeLanguage` + setter, seeded on mount from
GET /api/live/language and kept in sync via the existing
`language_changed` WS broadcast (multi-tab coherent). Also lifts
the `play_clip` debug HUD state into the hook (used by the new
ClipHudRow in App.jsx).
* dashboard/src/App.jsx
Mounts the picker — full version in the empty hint, compact chip
top-right post-upload (zIndex above the 9:16 phone overlay so the
popout isn't clipped). Also mounts the T0/T1 clip debug HUD.
* backend/main.py — api_respond_to_comment
Patches the live Q&A path to translate response_text + pass
language_code to ElevenLabs, matching what run_sell_pipeline
already does for the pitch. Cache hits in
translator.translate() are free; only the first time we ever
speak a particular response in a new language costs one Claude
Haiku call. Failure modes fall through to English.
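The cache-then-fallback behaviour described for `translator.translate()` can be sketched roughly as below. This is a minimal illustration, not the code in `agents/translator.py`: the `CachedTranslator` class, its table layout, and the injected `translate_fn` (standing in for the Claude Haiku call) are all assumptions.

```python
import sqlite3
from typing import Callable

# Language codes named in this PR.
SUPPORTED = {"en", "es", "fr", "de", "zh", "tl"}


class CachedTranslator:
    """Hypothetical stand-in for the translator agent (names are assumptions)."""

    def __init__(self, translate_fn: Callable[[str, str], str], db_path: str = ":memory:"):
        # translate_fn stands in for the real model call; it may raise.
        self._translate_fn = translate_fn
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS translations ("
            "text TEXT NOT NULL, lang TEXT NOT NULL, translated TEXT NOT NULL, "
            "PRIMARY KEY (text, lang))"
        )

    def translate(self, text: str, lang: str) -> str:
        # English and unknown languages fall through to the original text.
        if lang == "en" or lang not in SUPPORTED:
            return text
        row = self._db.execute(
            "SELECT translated FROM translations WHERE text = ? AND lang = ?",
            (text, lang),
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: no model call
        try:
            translated = self._translate_fn(text, lang)
        except Exception:
            return text  # model error: speak the English text instead
        self._db.execute(
            "INSERT OR REPLACE INTO translations VALUES (?, ?, ?)",
            (text, lang, translated),
        )
        self._db.commit()
        return translated
```

With this shape, only the first request for a given (text, language) pair reaches the model; every later request is served from sqlite.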
End-to-end verified live: click 🇪🇸 → POST /api/live/language → backend
flips state, broadcasts language_changed → dashboard tagline updates to
"avatar will speak in español" → drop video → translated pitch + Q&A.
── Director refactor (supporting infra, was in working tree) ──
* backend/agents/avatar_director.py
- Adds `_tier1_busy_until` horizon. Every deliberate Tier 1 emit
(bridge / pitch / response / reading_chat / listening_attentive
/ processing / fetching) extends the horizon by ttl_ms. The
idle rotation loop checks it and skips the random misc_*
interjection branch while a deliberate clip owns the layer.
Kills the "she glances aside silently DURING the
processing.mp4 readback" overlap class. Cleared by
fade_to_idle.
- Adds `_processing_chain_id` for the walk_off → processing.mp4
narrative. Chain id bumps on each call so a back-to-back
upload (or an early pitch via dispatch_audio_first_pitch)
cleanly cancels the queued processing tail.
- Adds `_IDLE_TIER1_EMITTERS` frozenset as the single source of
truth for "this emit doesn't claim the Tier 1 layer".
- Fixes misc_glance_aside URL → _silent.mp4 variant. The
previous _speaking.mp4, played muted as an idle interjection,
read as the avatar silently mouthing words (uncanny). The
silent render at veo_silent_idle_renders.py was made for
exactly this rotation context.
* backend/main.py — _play_upload_bridge() helper
Single-call-site refactor (Option A). Bridge ownership moves
from the pipeline layer to the route layer (api_sell +
api_sell_video). run_video_sell_pipeline → run_sell_pipeline
previously fired the walk_off chain twice — visible as a re-walk
mid-fetch. One call per upload regardless of which pipeline runs.
Same commit fixes the embedded T1-list URL (CarouselTester
preview template) to match the live Director config.
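For readers unfamiliar with the busy-horizon idea, a minimal sketch follows. It is not the code in `avatar_director.py`: the `BusyHorizon` class, the injected clock, the members of the emitter set, and the choice to extend the horizon as `max(current, now + ttl_ms)` are assumptions made for illustration.

```python
import time

# Hypothetical membership; the real _IDLE_TIER1_EMITTERS contents are not shown in this PR.
IDLE_TIER1_EMITTERS = frozenset({"idle", "misc_glance_aside"})


class BusyHorizon:
    """Tracks until when a deliberate clip owns the Tier 1 layer."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._busy_until = 0.0

    def on_emit(self, emitter: str, ttl_ms: int) -> None:
        # Idle-class emits never claim the layer.
        if emitter in IDLE_TIER1_EMITTERS:
            return
        # Assumption: the horizon only ever moves forward.
        self._busy_until = max(self._busy_until, self._clock() + ttl_ms / 1000.0)

    def fade_to_idle(self) -> None:
        # Clearing the horizon lets the idle rotation resume immediately.
        self._busy_until = 0.0

    def may_interject(self) -> bool:
        # The idle loop would check this before picking a random misc_* clip.
        return self._clock() >= self._busy_until
```

Injecting the clock keeps the check deterministic in tests; a production loop would use the default monotonic clock.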
Caution: Review failed. Pull request was closed or merged during review.
Walkthrough
The changes introduce live language switching for the avatar director system and refactor the upload-phase processing bridge from pipelines to routes. Backend tier-1 busy-state management is enhanced with explicit classification of autonomous events and chain coordination for the processing narrative. Frontend gains a language picker UI component and clip debug visualization, with WebSocket/HTTP integration for live language updates.
Sequence Diagram(s)
sequenceDiagram
    actor User
    participant Dashboard as Dashboard Client
    participant API as Backend Routes
    participant Director as Director & Pipelines
    participant Broadcast as WebSocket Broadcast
    User->>Dashboard: Select language in LanguagePicker
    Dashboard->>API: POST /api/live/language {lang: code}
    API->>Director: Update language state
    API->>Broadcast: Emit language_changed {lang: code}
    Broadcast->>Dashboard: Receive language_changed message
    Dashboard->>Dashboard: Update activeLanguage state
    Note over API,Director: When responding to comments
    API->>API: Translate response using activeLanguage
    API->>API: text_to_speech(..., language_code=activeLanguage)
Estimated code review effort: 4 (Complex), ~60 minutes
Summary
Real, lightweight language support shipped end-to-end. The translator
agent + multilingual ElevenLabs path already existed in the backend;
this PR surfaces them as an operator-facing UI, wires the live Q&A
path so it also speaks the chosen language, and bundles a Director
refactor that was already sitting in the working tree as supporting
infra.
What changed
1. Language picker (the headline)
Six languages: `en` 🇺🇸, `es` 🇪🇸, `fr` 🇫🇷, `de` 🇩🇪, `zh` 🇨🇳, `tl` 🇵🇭. Same voice ID across all six: ElevenLabs
`flash_v2_5` is multilingual, so the avatar's voice doesn't change, only the language coming out of
its mouth.
* `dashboard/src/components/LanguagePicker.jsx` (new, ~310 LOC)
  Two presentations driven by `compact`:
  - Pre-upload: prominent six-tile grid centered above the drop zone
    ("STEP 1 · LANGUAGE")
  - Post-upload: compact corner chip with click-to-expand popout so
    the operator can flip languages mid-stream
* `dashboard/src/hooks/useEmpireSocket.js`
  Adds `activeLanguage` state, seeded on mount from
  `GET /api/live/language`, kept in sync via the existing
  `language_changed` WS broadcast (so multiple open tabs stay
  coherent). Also lifts the `play_clip` debug HUD state into the hook.
* `dashboard/src/App.jsx`
  Mounts the picker: full version inside the empty hint, compact chip
  top-right post-upload (zIndex above the 9:16 phone overlay so the
  popout isn't clipped by an internal layer). Mounts the T0/T1 clip
  debug HUD.
* `backend/main.py`: `api_respond_to_comment`
  Patches the live Q&A TTS call to translate `response_text` and pass
  `language_code` to ElevenLabs, mirroring what `run_sell_pipeline`
  already does for the pitch. Cache hits in `translator.translate()`
  are free; only the first time we ever speak a particular response
  in a new language costs one Claude Haiku call. Failure modes
  (Bedrock error, unknown lang) fall through to the original English
  text; see `translator.translate()` for the fallback contract.
2. Director refactor (supporting infra, was in working tree)
* `backend/agents/avatar_director.py`
  - `_tier1_busy_until` horizon: every deliberate Tier 1 emit
    (bridge / pitch / response / reading_chat / listening_attentive /
    processing / fetching) extends a busy horizon by `ttl_ms`. The
    idle rotation loop checks it and skips the random `misc_*`
    interjection branch while a deliberate clip owns the layer. Kills
    the "she glances aside silently DURING the processing.mp4
    readback" overlap class. Cleared by `fade_to_idle`.
  - `_processing_chain_id` for the walk_off → processing.mp4
    narrative. Bumped on each `play_processing` call so a back-to-back
    upload (or an early pitch via `dispatch_audio_first_pitch`)
    cleanly cancels the queued processing tail instead of overlaying
    stale content on the pitch.
  - `_IDLE_TIER1_EMITTERS` frozenset: single source of truth for
    "this emit doesn't claim the Tier 1 layer".
  - `misc_glance_aside` URL fix → `_silent.mp4` variant. The
    previous `_speaking.mp4` URL, played muted as an idle interjection,
    read as the avatar silently mouthing words (uncanny). The silent
    render was made for exactly this rotation context; mouth stays
    closed.
* `backend/main.py`: `_play_upload_bridge()` helper
  Single-call-site refactor (Option A). Bridge ownership moves from
  the pipeline layer to the route layer (`api_sell` + `api_sell_video`).
  `run_video_sell_pipeline → run_sell_pipeline` previously fired the
  walk_off chain twice, visible as a re-walk mid-fetch. One call per
  upload regardless of which pipeline runs.
  Same diff fixes the embedded `T1` list URL inside the CarouselTester
  preview template so it matches the live Director config.
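The chain-id cancellation for the walk_off → processing.mp4 narrative follows a common generation-counter pattern, sketched below with asyncio. This is an illustration under assumptions, not the Director code: `ProcessingChain`, its `played` log, and the sleep standing in for the walk_off clip duration are hypothetical.

```python
import asyncio


class ProcessingChain:
    """Hypothetical model of a two-step clip chain guarded by a chain id."""

    def __init__(self):
        self._chain_id = 0
        self.played = []  # log of clips that actually reached the layer

    def cancel(self) -> None:
        # An early pitch could bump the id to invalidate any queued tail.
        self._chain_id += 1

    async def play_processing(self, label: str, walk_off_s: float = 0.01) -> None:
        # Each call claims a fresh id; older chains become stale.
        self._chain_id += 1
        my_id = self._chain_id
        self.played.append(f"walk_off:{label}")
        await asyncio.sleep(walk_off_s)  # stands in for the walk_off clip duration
        if my_id != self._chain_id:
            return  # superseded while waiting: drop the queued processing tail
        self.played.append(f"processing:{label}")


async def demo():
    chain = ProcessingChain()
    first = asyncio.create_task(chain.play_processing("upload-1", 0.05))
    await asyncio.sleep(0.01)  # second upload arrives mid walk_off
    second = asyncio.create_task(chain.play_processing("upload-2", 0.05))
    await asyncio.gather(first, second)
    return chain.played
```

In `demo()`, the first upload's processing tail never plays because the second call bumped the id before the first chain resumed.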
End-to-end flow
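* Operator clicks 🇪🇸 → `POST /api/live/language?lang=es`
* Backend sets `pipeline_state["active_language"] = "es"`, broadcasts
  `language_changed` over WS
* Dashboard tagline updates ("avatar will speak in español")
* Operator drops a video → `run_sell_pipeline` reads `active_language`
  → `translator.translate(script, "es")` (sqlite-cached) →
  `text_to_speech(..., language_code="es")` → ElevenLabs `flash_v2_5`
  speaks the Spanish pitch with the same voice ID
* Live Q&A goes through `api_respond_to_comment` the same way; repeat
  responses are translator cache hits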
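The state flip plus broadcast at the start of this flow could look roughly like the sketch below. It is framework-free on purpose and is not the route in `backend/main.py`: the `set_active_language` function, the subscriber callables, and the `{"type": ..., "lang": ...}` message shape are assumptions.

```python
import json

# Language codes named in this PR.
SUPPORTED = ("en", "es", "fr", "de", "zh", "tl")


def set_active_language(state: dict, lang: str, subscribers) -> str:
    """Flip the active language and notify every connected client.

    `subscribers` is any iterable of callables that accept one string,
    standing in for open WebSocket connections.
    """
    if lang not in SUPPORTED:
        raise ValueError(f"unsupported language: {lang!r}")
    state["active_language"] = lang
    # Assumed message shape; the real broadcast payload may differ.
    message = json.dumps({"type": "language_changed", "lang": lang})
    for send in subscribers:
        send(message)
    return message
```

Because every client receives the same message, several open dashboard tabs converge on one `active_language` without polling.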
Why it's lightweight
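* The translator agent + multilingual ElevenLabs path were already
  shipped from a prior pass; they just had no UI
* Same voice ID across all six languages
* One Claude Haiku call the first time a text is spoken in a new
  language (sqlite-cached in `backend/data/brain.db`); repeat plays
  cost nothing
* Adding a language: one entry in `SUPPORTED` in `agents/translator.py`
  + one row append to `LANGUAGES` in `LanguagePicker.jsx`. No schema,
  no pipeline changes.
Test plan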
* `GET /api/live/language` returns supported list + cache stats
* `POST /api/live/language` (lang=es) flips state + broadcasts WS
* Click 🇪🇸 in the dashboard: tagline updates, purple-glow border
  moves to Spanish tile
* Drop video: translated pitch is in Spanish (deferred to operator on
  hardware)
* Live Q&A: response audio is in Spanish (deferred to operator on
  hardware)
Known followups (not in this PR)
* `phase0/fixtures/CODY_PROMPT.md` had unrelated markdown
  reformatting in the working tree (table format mangled, code-block
  backticks misplaced); intentionally left out so the PR stays focused
* Translation for the `run_comment_pipeline` / `make_avatar_speak`
  paths was skipped; those are degraded fallbacks that don't fire on
  the happy path