feat(language): operator language picker — six-flag UI + end-to-end translation by adityasingh2400 · Pull Request #5 · adityasingh2400/Zo

adityasingh2400 · 2026-04-19T13:51:04Z

Summary

Real, lightweight language support shipped end-to-end. The translator
agent + multilingual ElevenLabs path already existed in the backend;
this PR surfaces them as an operator-facing UI, wires the live Q&A
path so it also speaks the chosen language, and bundles a Director
refactor that was already sitting in the working tree as supporting
infra.

What changed

1. Language picker (the headline)

Six languages: en 🇺🇸, es 🇪🇸, fr 🇫🇷, de 🇩🇪, zh 🇨🇳, tl 🇵🇭.
Same voice ID across all six — ElevenLabs flash_v2_5 is multilingual,
so the avatar's voice doesn't change, only the language coming out of
its mouth.

dashboard/src/components/LanguagePicker.jsx (new, ~310 LOC)
Two presentations driven by compact:
- Pre-upload: prominent six-tile grid centered above the drop zone
  ("STEP 1 · LANGUAGE")
- Post-upload: compact corner chip with click-to-expand popout so
  the operator can flip languages mid-stream
dashboard/src/hooks/useEmpireSocket.js
Adds activeLanguage state, seeded on mount from
GET /api/live/language, kept in sync via the existing
language_changed WS broadcast (so multiple open tabs stay
coherent). Also lifts the play_clip debug HUD state into the hook.
dashboard/src/App.jsx
Mounts the picker — full version inside the empty hint, compact chip
top-right post-upload (zIndex above the 9:16 phone overlay so the
popout isn't clipped by an internal layer). Mounts the T0/T1 clip
debug HUD.
backend/main.py — api_respond_to_comment
Patches the live Q&A TTS call to translate response_text and pass
language_code to ElevenLabs, mirroring what run_sell_pipeline
already does for the pitch. Cache hits in translator.translate()
are free; only the first time we ever speak a particular response
in a new language costs one Claude Haiku call. Failure modes
(Bedrock error, unknown lang) fall through to the original English
text — see translator.translate() for the fallback contract.
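
The fallback behaviour described for api_respond_to_comment can be sketched roughly as below. This is an illustration only, not the code in backend/main.py: `translate_or_fallback` and the injected `translate_fn` are hypothetical names standing in for the real translator.translate() call, and returning "en" as the language code on fallback is an assumption.

```python
from typing import Callable, Tuple

# The six language codes listed in this PR.
SUPPORTED = {"en", "es", "fr", "de", "zh", "tl"}


def translate_or_fallback(
    text: str,
    lang: str,
    translate_fn: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Return (text_to_speak, language_code), falling back to English.

    translate_fn is a hypothetical stand-in for translator.translate().
    """
    # English and unknown codes skip translation entirely.
    if lang == "en" or lang not in SUPPORTED:
        return text, "en"
    try:
        translated = translate_fn(text, lang)
    except Exception:
        # Any upstream failure (e.g. a Bedrock error) keeps the English text.
        return text, "en"
    if not translated:
        return text, "en"
    return translated, lang
```

The returned pair would then feed the TTS call, so the spoken text and the language code can never disagree.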

2. Director refactor (supporting infra, was in working tree)

backend/agents/avatar_director.py
- _tier1_busy_until horizon — every deliberate Tier 1 emit
  (bridge / pitch / response / reading_chat / listening_attentive /
  processing / fetching) extends a busy horizon by ttl_ms. The
  idle rotation loop checks it and skips the random misc_*
  interjection branch while a deliberate clip owns the layer. Kills
  the "she glances aside silently DURING the processing.mp4
  readback" overlap class. Cleared by fade_to_idle.
- _processing_chain_id for the walk_off → processing.mp4
  narrative. Bumped on each play_processing call so a back-to-back
  upload (or an early pitch via dispatch_audio_first_pitch)
  cleanly cancels the queued processing tail instead of overlaying
  stale content on the pitch.
- _IDLE_TIER1_EMITTERS frozenset — single source of truth for
  "this emit doesn't claim the Tier 1 layer".
- misc_glance_aside URL fix → _silent.mp4 variant. The
  previous _speaking.mp4 URL, played muted as an idle interjection,
  read as the avatar silently mouthing words (uncanny). The silent
  render was made for exactly this rotation context; mouth stays
  closed.
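
A minimal sketch of the busy-horizon idea from the first bullet, assuming a monotonic clock and a simplified emitter set. `BusyHorizon`, `on_emit`, and `may_interject` are hypothetical names for illustration; the real Director tracks this inside avatar_director.py, and the membership of `_IDLE_TIER1_EMITTERS` shown here is a guess.

```python
import time
from typing import Callable

# Assumed membership: intents that do NOT claim the Tier 1 layer.
IDLE_TIER1_EMITTERS = frozenset({"misc_glance_aside"})


class BusyHorizon:
    """Tracks how long a deliberate Tier 1 clip owns the layer."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._busy_until = 0.0

    def on_emit(self, intent: str, ttl_ms: int) -> None:
        # Idle interjections never extend the horizon.
        if intent in IDLE_TIER1_EMITTERS:
            return
        candidate = self._clock() + ttl_ms / 1000.0
        # Never shorten a horizon set by a longer, earlier clip.
        self._busy_until = max(self._busy_until, candidate)

    def fade_to_idle(self) -> None:
        # Returning to idle releases the layer immediately.
        self._busy_until = 0.0

    def may_interject(self) -> bool:
        # The idle rotation loop would check this before a misc_* clip.
        return self._clock() >= self._busy_until
```

Injecting the clock keeps the sketch testable without sleeping; production code would simply use the default.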
backend/main.py — _play_upload_bridge() helper
Single-call-site refactor (Option A). Bridge ownership moves from
the pipeline layer to the route layer (api_sell + api_sell_video).
run_video_sell_pipeline → run_sell_pipeline previously fired the
walk_off chain twice (visible as a re-walk mid-fetch). There is now
one call per upload regardless of which pipeline runs.
Same diff fixes the embedded T1 list URL inside the CarouselTester
preview template so it matches the live Director config.
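
For the _processing_chain_id cancellation described under avatar_director.py, a rough asyncio sketch follows. It assumes the walk_off clip is followed by a delayed processing tail; `ProcessingChain`, `cancel`, and the `played` log are hypothetical and exist only to show the bump-and-compare pattern.

```python
import asyncio
from typing import List


class ProcessingChain:
    """Bump-and-compare cancellation for a two-step clip narrative."""

    def __init__(self) -> None:
        self._chain_id = 0
        self.played: List[str] = []  # stand-in for real clip emits

    def cancel(self) -> None:
        # Invalidate any queued tail, e.g. when an early pitch arrives.
        self._chain_id += 1

    async def play_processing(self, walk_off_s: float) -> None:
        # Each call claims a fresh id; older chains become stale.
        self._chain_id += 1
        my_id = self._chain_id
        self.played.append("walk_off")
        await asyncio.sleep(walk_off_s)
        if my_id != self._chain_id:
            return  # superseded by a newer upload or a cancel()
        self.played.append("processing")
```

With two back-to-back calls, only the second chain reaches its processing step; the first one wakes up, sees a newer id, and exits without emitting anything.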

End-to-end flow

1. Operator clicks 🇪🇸 Español → POST /api/live/language?lang=es
2. Backend updates pipeline_state["active_language"] = "es" and
   broadcasts language_changed over WS
3. All connected dashboards update the active tile + tagline
   ("avatar will speak in español")
4. Operator drops video → run_sell_pipeline reads active_language
   → translator.translate(script, "es") (sqlite-cached) →
   text_to_speech(..., language_code="es") → ElevenLabs
   flash_v2_5 speaks the Spanish pitch with the same voice ID
5. Live viewer comments → identical translate-then-TTS flow with
   cache hits
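
The state flip and broadcast at the start of this flow might look roughly like the sketch below. It is framework-free on purpose: `set_language`, `subscribers`, and the `type` key in the event are assumptions for illustration, and the real endpoint lives behind POST /api/live/language.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

SUPPORTED = ("en", "es", "fr", "de", "zh", "tl")

# Shared state read later by the sell pipeline and the Q&A path.
pipeline_state: Dict[str, str] = {"active_language": "en"}

# Each subscriber stands in for one connected dashboard WebSocket.
Subscriber = Callable[[dict], Awaitable[None]]
subscribers: List[Subscriber] = []


async def set_language(lang: str) -> dict:
    """Update the active language and notify every open dashboard."""
    if lang not in SUPPORTED:
        # Unknown codes leave the current language untouched.
        return {"ok": False, "active_language": pipeline_state["active_language"]}
    pipeline_state["active_language"] = lang
    event = {"type": "language_changed", "lang": lang}
    await asyncio.gather(*(send(event) for send in subscribers))
    return {"ok": True, "active_language": lang}
```

Broadcasting to every subscriber, including the tab that made the request, is what keeps multiple open dashboards coherent.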

Why it's lightweight

- Zero new backend dependencies: translator agent + cache + endpoints
  were already shipped from a prior pass; they just had no UI
- ElevenLabs multilingual flash_v2_5 means same voice across all
  six languages
- Pitch translation = one Claude Haiku call per unique pitch per
  language (sqlite-cached in backend/data/brain.db); repeat plays
  cost nothing
- Adding a 7th language = one row append to SUPPORTED in
  agents/translator.py + one row append to LANGUAGES in
  LanguagePicker.jsx. No schema, no pipeline changes.
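
To illustrate why repeat plays cost nothing, here is a hedged sketch of a sqlite-backed translation cache keyed on language plus text. The table layout, the `CachedTranslator` class, and the shape of `SUPPORTED` are assumptions, not the actual contents of agents/translator.py; `translate_fn` stands in for the Claude Haiku call.

```python
import hashlib
import sqlite3
from typing import Callable

# Assumed shape: code -> display name. Adding a 7th language is one new row.
SUPPORTED = {
    "en": "English",
    "es": "Español",
    "fr": "Français",
    "de": "Deutsch",
    "zh": "中文",
    "tl": "Tagalog",
}


class CachedTranslator:
    def __init__(self, db_path: str, translate_fn: Callable[[str, str], str]) -> None:
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS translations ("
            "key TEXT PRIMARY KEY, translated TEXT NOT NULL)"
        )
        self._translate_fn = translate_fn
        self.misses = 0  # number of paid model calls so far

    def translate(self, text: str, lang: str) -> str:
        if lang == "en" or lang not in SUPPORTED:
            return text
        # One cache row per unique (language, text) pair.
        key = hashlib.sha256(f"{lang}\x00{text}".encode("utf-8")).hexdigest()
        row = self._db.execute(
            "SELECT translated FROM translations WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: no model call
        self.misses += 1
        translated = self._translate_fn(text, lang)
        self._db.execute(
            "INSERT OR REPLACE INTO translations (key, translated) VALUES (?, ?)",
            (key, translated),
        )
        self._db.commit()
        return translated
```

Because the key includes the language code, the same pitch in a second language is a separate miss, which matches the "one call per unique pitch per language" cost model above.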

Test plan

- GET /api/live/language returns supported list + cache stats
- POST /api/live/language (lang=es) flips state + broadcasts WS
- Browser: clicking Español → backend reflects "es", UI tagline
  updates, purple-glow border moves to Spanish tile
- Reset to English → backend reflects "en"
- Lint clean across all four touched files
- Drop a real video while Spanish is active → verify pitch audio
  is in Spanish (deferred to operator on hardware)
- Send a Q&A comment while Spanish is active → verify response
  audio is in Spanish (deferred to operator on hardware)

Known followups (not in this PR)

- phase0/fixtures/CODY_PROMPT.md had unrelated markdown
  reformatting in the working tree (table format mangled, code-block
  backticks misplaced); intentionally left out so the PR stays focused
- Adding language to the legacy run_comment_pipeline /
  make_avatar_speak paths was skipped; those are degraded fallbacks
  that don't fire on the happy path

Summary by CodeRabbit

New Features
- Added live language switching supporting 6 languages with real-time response translation and localized text-to-speech.
- Enhanced processing animation with an improved walk-off/return sequence.
- Added debug clip display showing active avatar animation state.
- Updated miscellaneous glance gesture to use a silent variant.

…cking

Two ships in one. Language picker is the headline; the Director refactor is the supporting infra fix that was already in the working tree.

── Multi-language support (operator-facing) ──

Surface the existing translator + ElevenLabs-multilingual plumbing as a real UI affordance. Six languages (en/es/fr/de/zh/tl): flash_v2_5 speaks all of them with the same voice ID, so picking a language flips the spoken output without changing the avatar.

* dashboard/src/components/LanguagePicker.jsx (new)
  Two presentations driven by `compact`. Pre-upload: prominent six-tile grid centered above the drop zone ("STEP 1 · LANGUAGE"). Post-upload: compact corner chip with click-to-expand popout so the operator can flip languages mid-stream.
* dashboard/src/hooks/useEmpireSocket.js
  Adds `activeLanguage` + setter, seeded on mount from GET /api/live/language and kept in sync via the existing `language_changed` WS broadcast (multi-tab coherent). Also lifts the `play_clip` debug HUD state into the hook (used by the new ClipHudRow in App.jsx).
* dashboard/src/App.jsx
  Mounts the picker: full version in the empty hint, compact chip top-right post-upload (zIndex above the 9:16 phone overlay so the popout isn't clipped). Also mounts the T0/T1 clip debug HUD.
* backend/main.py: api_respond_to_comment
  Patches the live Q&A path to translate response_text + pass language_code to ElevenLabs, matching what run_sell_pipeline already does for the pitch. Cache hits in translator.translate() are free; only the first time we ever speak a particular response in a new language costs one Claude Haiku call. Failure modes fall through to English.

End-to-end verified live: click 🇪🇸 → POST /api/live/language → backend flips state, broadcasts language_changed → dashboard tagline updates to "avatar will speak in español" → drop video → translated pitch + Q&A.

── Director refactor (supporting infra, was in working tree) ──

* backend/agents/avatar_director.py
  - Adds `_tier1_busy_until` horizon. Every deliberate Tier 1 emit (bridge / pitch / response / reading_chat / listening_attentive / processing / fetching) extends the horizon by ttl_ms. The idle rotation loop checks it and skips the random misc_* interjection branch while a deliberate clip owns the layer. Kills the "she glances aside silently DURING the processing.mp4 readback" overlap class. Cleared by fade_to_idle.
  - Adds `_processing_chain_id` for the walk_off → processing.mp4 narrative. Chain id bumps on each call so a back-to-back upload (or an early pitch via dispatch_audio_first_pitch) cleanly cancels the queued processing tail.
  - Adds `_IDLE_TIER1_EMITTERS` frozenset as the single source of truth for "this emit doesn't claim the Tier 1 layer".
  - Fixes misc_glance_aside URL → _silent.mp4 variant. The previous _speaking.mp4, played muted as an idle interjection, read as the avatar silently mouthing words (uncanny). The silent render at veo_silent_idle_renders.py was made for exactly this rotation context.
* backend/main.py: _play_upload_bridge() helper
  Single-call-site refactor (Option A). Bridge ownership moves from the pipeline layer to the route layer (api_sell + api_sell_video). run_video_sell_pipeline → run_sell_pipeline previously fired the walk_off chain twice (visible as a re-walk mid-fetch). One call per upload regardless of which pipeline runs. Same commit fixes the embedded T1-list URL (CarouselTester preview template) to match the live Director config.

coderabbitai · 2026-04-19T13:51:52Z

Caution

Review failed

Pull request was closed or merged during review

Walkthrough

The changes introduce live language switching for the avatar director system and refactor the upload-phase processing bridge from pipelines to routes. Backend tier-1 busy-state management is enhanced with explicit classification of autonomous events and chain coordination for the processing narrative. Frontend gains a language picker UI component and clip debug visualization, with WebSocket/HTTP integration for live language updates.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Backend Director Tier-1 State Management `backend/agents/avatar_director.py` | Added `_IDLE_TIER1_EMITTERS` classification to distinguish autonomous tier-1 events from deliberate ones; introduced `_tier1_busy_until` and `_processing_chain_id` to coordinate overlap suppression and two-step processing narrative; refactored `play_processing()` into walk-off/return plus queued processing chain; updated interjection and sip suppression logic to check busy horizon timestamps. |
| Backend Route Refactoring & Multilingual Support `backend/main.py` | Moved upload-bridge emission (`play_processing()` + `set_voice_state()`) from pipelines to REST routes via new `_play_upload_bridge()` helper; added response-language translation and multilingual TTS wiring to `api_respond_to_comment` and pipeline flow; updated misc glance-aside clip URL to silent variant. |
| Frontend Language Selection Component `dashboard/src/components/LanguagePicker.jsx` | New component exporting `LANGUAGES` constant and `LanguagePicker` React component; renders full pre-upload tile picker or compact post-upload chip; POSTs language selection to `/api/live/language`; guards against duplicates via pending state. |
| Frontend WebSocket State Extension `dashboard/src/hooks/useEmpireSocket.js` | Added `activeClips` (per-layer tier0/tier1 clip state) and `activeLanguage` state management; extended message handlers for `play_clip` (storing intent, URL, mute status) and `language_changed` events; added on-mount HTTP fetch to seed language from server. |
| Frontend UI & Debug Integration `dashboard/src/App.jsx` | Integrated `LanguagePicker` into pre-upload empty-state and post-upload chip UI; added debug HUD with `ClipHudRow` component displaying active tier0/tier1 clip details (intent, filename, muted state) with visual highlighting for muted-speaking conflicts; extended `useEmpireSocket()` consumption for new state values. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Dashboard as Dashboard Client
    participant API as Backend Routes
    participant Director as Director & Pipelines
    participant Broadcast as WebSocket Broadcast

    User->>Dashboard: Select language in LanguagePicker
    Dashboard->>API: POST /api/live/language {lang: code}
    API->>Director: Update language state
    API->>Broadcast: Emit language_changed {lang: code}
    Broadcast->>Dashboard: Receive language_changed message
    Dashboard->>Dashboard: Update activeLanguage state
    
    Note over API,Director: When responding to comments
    API->>API: Translate response using activeLanguage
    API->>API: text_to_speech(..., language_code=activeLanguage)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A hop and a skip, the languages now flow,
Tier-1 horizons keep interjections in tow,
With clips that are muted and states that align,
The director conducts through a busy-time line! 🎬✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding an operator-facing language picker UI with six-language support and end-to-end translation infrastructure. |


adityasingh2400 merged commit 740db89 into main Apr 19, 2026
1 of 3 checks passed

adityasingh2400 merged 1 commit into main from cody/language-picker
