
feat(language): operator language picker — six-flag UI + end-to-end translation #5

Merged
adityasingh2400 merged 1 commit into main from cody/language-picker
Apr 19, 2026

adityasingh2400 (Owner) commented Apr 19, 2026

Summary

Real, lightweight language support shipped end-to-end. The translator
agent + multilingual ElevenLabs path already existed in the backend;
this PR surfaces them as an operator-facing UI, wires the live Q&A
path so it also speaks the chosen language, and bundles a Director
refactor that was already sitting in the working tree as supporting
infra.

What changed

1. Language picker (the headline)

Six languages: en 🇺🇸, es 🇪🇸, fr 🇫🇷, de 🇩🇪, zh 🇨🇳, tl 🇵🇭.
Same voice ID across all six — ElevenLabs flash_v2_5 is multilingual,
so the avatar's voice doesn't change, only the language coming out of
its mouth.

  • dashboard/src/components/LanguagePicker.jsx (new, ~310 LOC)
    Two presentations driven by compact:
    • Pre-upload: prominent six-tile grid centered above the drop zone
      ("STEP 1 · LANGUAGE")
    • Post-upload: compact corner chip with click-to-expand popout so
      the operator can flip languages mid-stream
  • dashboard/src/hooks/useEmpireSocket.js
    Adds activeLanguage state, seeded on mount from
    GET /api/live/language, kept in sync via the existing
    language_changed WS broadcast (so multiple open tabs stay
    coherent). Also lifts the play_clip debug HUD state into the hook.
  • dashboard/src/App.jsx
    Mounts the picker — full version inside the empty hint, compact chip
    top-right post-upload (zIndex above the 9:16 phone overlay so the
    popout isn't clipped by an internal layer). Mounts the T0/T1 clip
    debug HUD.
  • backend/main.py → api_respond_to_comment
    Patches the live Q&A TTS call to translate response_text and pass
    language_code to ElevenLabs, mirroring what run_sell_pipeline
    already does for the pitch. Cache hits in translator.translate()
    are free; only the first time we ever speak a particular response
    in a new language costs one Claude Haiku call. Failure modes
    (Bedrock error, unknown lang) fall through to the original English
    text — see translator.translate() for the fallback contract.
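The translate-then-speak step for Q&A responses can be sketched roughly as below. This is a minimal illustration, not the code in backend/main.py: `localize_response` and its `translate` parameter are hypothetical names, and placing the fallback in a wrapper (rather than inside translator.translate(), where the PR says the contract lives) is an assumption, as is reporting "en" as the language code after a fallback.

```python
from typing import Callable, Tuple

# The six codes listed in this PR.
SUPPORTED = {"en", "es", "fr", "de", "zh", "tl"}


def localize_response(
    response_text: str,
    lang: str,
    translate: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Return (text_to_speak, language_code), falling back to English on failure.

    `translate` stands in for translator.translate(); its real signature
    is not shown in the PR, so (text, lang) -> str is an assumption.
    """
    # English and unknown codes skip translation entirely.
    if lang == "en" or lang not in SUPPORTED:
        return response_text, "en"
    try:
        return translate(response_text, lang), lang
    except Exception:
        # Assumed fallback: any translator error keeps the original English text.
        return response_text, "en"
```

The returned pair would then feed the TTS call, with the second element passed as language_code.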

2. Director refactor (supporting infra, was in working tree)

  • backend/agents/avatar_director.py

    • _tier1_busy_until horizon — every deliberate Tier 1 emit
      (bridge / pitch / response / reading_chat / listening_attentive /
      processing / fetching) extends a busy horizon by ttl_ms. The
      idle rotation loop checks it and skips the random misc_*
      interjection branch while a deliberate clip owns the layer. Kills
      the "she glances aside silently DURING the processing.mp4
      readback" overlap class. Cleared by fade_to_idle.
    • _processing_chain_id for the walk_off → processing.mp4
      narrative. Bumped on each play_processing call so a back-to-back
      upload (or an early pitch via dispatch_audio_first_pitch)
      cleanly cancels the queued processing tail instead of overlaying
      stale content on the pitch.
    • _IDLE_TIER1_EMITTERS frozenset — single source of truth for
      "this emit doesn't claim the Tier 1 layer".
    • misc_glance_aside URL fix → _silent.mp4 variant. The
      previous _speaking.mp4 URL, played muted as an idle interjection,
      read as the avatar silently mouthing words (uncanny). The silent
      render was made for exactly this rotation context; mouth stays
      closed.
  • backend/main.py → _play_upload_bridge() helper
    Single-call-site refactor (Option A). Bridge ownership moves from
    the pipeline layer to the route layer (api_sell + api_sell_video).
    run_video_sell_pipeline → run_sell_pipeline previously fired the
    walk_off chain twice — visible as a re-walk mid-fetch. One call per
    upload regardless of which pipeline runs.
    Same diff fixes the embedded T1 list URL inside the CarouselTester
    preview template so it matches the live Director config.
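The busy horizon and chain-id mechanics described above might look something like the following. It is a sketch under stated assumptions rather than the actual avatar_director.py: the class name `DirectorSketch`, the injected `clock`, the `emit_tier1`, `can_interject`, and `is_current_chain` methods, and the contents of `_IDLE_TIER1_EMITTERS` are all hypothetical. "Extends the horizon by ttl_ms" is interpreted here as max(current horizon, now + ttl), which is one plausible reading.

```python
import time
from typing import Callable

# Hypothetical contents: emits that play on Tier 1 without claiming the layer.
_IDLE_TIER1_EMITTERS = frozenset({"misc_glance_aside", "misc_sip"})


class DirectorSketch:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._tier1_busy_until = 0.0  # seconds on the injected clock
        self._processing_chain_id = 0

    def emit_tier1(self, intent: str, ttl_ms: int) -> None:
        """Deliberate emits push the busy horizon out; idle emitters do not."""
        if intent in _IDLE_TIER1_EMITTERS:
            return
        horizon = self._clock() + ttl_ms / 1000.0
        self._tier1_busy_until = max(self._tier1_busy_until, horizon)

    def can_interject(self) -> bool:
        """The idle rotation loop checks this before a random misc_* clip."""
        return self._clock() >= self._tier1_busy_until

    def fade_to_idle(self) -> None:
        """Clears the horizon so idle interjections may resume."""
        self._tier1_busy_until = 0.0

    def play_processing(self) -> int:
        """Start a new walk_off -> processing chain and return its id."""
        self._processing_chain_id += 1
        return self._processing_chain_id

    def is_current_chain(self, chain_id: int) -> bool:
        """The queued processing tail checks this; stale ids are dropped."""
        return chain_id == self._processing_chain_id
```

A queued processing tail would capture the id returned by `play_processing()` and emit only if `is_current_chain()` still holds when its turn arrives, which is how a second upload cancels the first tail without explicit bookkeeping.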

End-to-end flow

  1. Operator clicks 🇪🇸 Español → POST /api/live/language?lang=es
  2. Backend updates pipeline_state["active_language"] = "es",
    broadcasts language_changed over WS
  3. All connected dashboards update the active tile + tagline
    ("avatar will speak in español")
  4. Operator drops video → run_sell_pipeline reads active_language →
    translator.translate(script, "es") (sqlite-cached) →
    text_to_speech(..., language_code="es") → ElevenLabs
    flash_v2_5 speaks the Spanish pitch with the same voice ID
  5. Live viewer comments → identical translate-then-TTS flow with
    cache hits
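Steps 1 to 3 of the flow can be modeled in a few lines. The sketch below is an in-memory stand-in, not the real route: `set_language` and `subscribers` are hypothetical names, plain callables replace WebSocket connections, and the event shape (a "type" key plus "lang") is an assumption based on the language_changed broadcast named in this PR.

```python
from typing import Callable, Dict, List

SUPPORTED = ("en", "es", "fr", "de", "zh", "tl")

pipeline_state: Dict[str, str] = {"active_language": "en"}

# Stand-in for open WS connections: each subscriber receives the event dict.
subscribers: List[Callable[[dict], None]] = []


def set_language(lang: str) -> dict:
    """Validate the code, store it, and notify every subscriber."""
    if lang not in SUPPORTED:
        raise ValueError(f"unsupported language: {lang!r}")
    pipeline_state["active_language"] = lang
    event = {"type": "language_changed", "lang": lang}
    for notify in subscribers:
        notify(event)
    return event
```

Because the state changes before the broadcast, a dashboard that mounts later and seeds itself from GET /api/live/language sees the same value the open tabs received over WS.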

Why it's lightweight

  • Zero new backend dependencies — translator agent + cache + endpoints
    were already shipped from a prior pass; they just had no UI
  • ElevenLabs multilingual flash_v2_5 means same voice across all
    six languages
  • Pitch translation = one Claude Haiku call per unique pitch per
    language (sqlite-cached in backend/data/brain.db); repeat plays
    cost nothing
  • Adding a 7th language = one row append to SUPPORTED in
    agents/translator.py + one row append to LANGUAGES in
    LanguagePicker.jsx. No schema, no pipeline changes.
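The "one model call per unique pitch per language" property follows from a cache keyed on (text, language). A minimal sketch of such a cache is shown below; the table layout, the `TranslationCache` class, and the `model_call` parameter are hypothetical, and the real schema in backend/data/brain.db is not shown in this PR.

```python
import hashlib
import sqlite3
from typing import Callable


class TranslationCache:
    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        # Assumed schema: one row per (hash of source text, language).
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS translations ("
            "text_hash TEXT, lang TEXT, translated TEXT, "
            "PRIMARY KEY (text_hash, lang))"
        )

    def translate(
        self, text: str, lang: str, model_call: Callable[[str, str], str]
    ) -> str:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        row = self._db.execute(
            "SELECT translated FROM translations WHERE text_hash = ? AND lang = ?",
            (key, lang),
        ).fetchone()
        if row is not None:
            return row[0]  # cache hit: no model call
        # Cache miss: pay for one model call, then store the result.
        translated = model_call(text, lang)
        self._db.execute(
            "INSERT OR REPLACE INTO translations VALUES (?, ?, ?)",
            (key, lang, translated),
        )
        self._db.commit()
        return translated
```

Since the language code is part of the key, adding a seventh language needs no schema change: its first translation of each text is a miss and every repeat is a hit.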

Test plan

  • GET /api/live/language returns supported list + cache stats
  • POST /api/live/language (lang=es) flips state + broadcasts WS
  • Browser: clicking Español → backend reflects "es", UI tagline
    updates, purple-glow border moves to Spanish tile
  • Reset to English → backend reflects "en"
  • Lint clean across all four touched files
  • Drop a real video while Spanish is active → verify pitch audio
    is in Spanish (deferred to operator on hardware)
  • Send a Q&A comment while Spanish is active → verify response
    audio is in Spanish (deferred to operator on hardware)

Known followups (not in this PR)

  • phase0/fixtures/CODY_PROMPT.md had unrelated markdown
    reformatting in the working tree (table format mangled, code-block
    backticks misplaced); intentionally left out so the PR stays focused
  • Adding language to the legacy run_comment_pipeline /
    make_avatar_speak paths was skipped — those are degraded fallbacks
    that don't fire on the happy path

Summary by CodeRabbit

  • New Features
    • Added live language switching supporting 6 languages with real-time response translation and localized text-to-speech.
    • Enhanced processing animation with an improved walk-off/return sequence.
    • Added debug clip display showing active avatar animation state.
    • Updated miscellaneous glance gesture to use a silent variant.

…cking

Two ships in one. Language picker is the headline; the Director refactor
is the supporting infra fix that was already in the working tree.

── Multi-language support (operator-facing) ──

Surface the existing translator + ElevenLabs-multilingual plumbing as a
real UI affordance. Six languages (en/es/fr/de/zh/tl) — flash_v2_5 speaks
all of them with the same voice ID, so picking a language flips the
spoken output without changing the avatar.

  * dashboard/src/components/LanguagePicker.jsx (new)
      Two presentations driven by `compact`. Pre-upload: prominent six-
      tile grid centered above the drop zone ("STEP 1 · LANGUAGE").
      Post-upload: compact corner chip with click-to-expand popout so
      the operator can flip languages mid-stream.
  * dashboard/src/hooks/useEmpireSocket.js
      Adds `activeLanguage` + setter, seeded on mount from
      GET /api/live/language and kept in sync via the existing
      `language_changed` WS broadcast (multi-tab coherent). Also lifts
      the `play_clip` debug HUD state into the hook (used by the new
      ClipHudRow in App.jsx).
  * dashboard/src/App.jsx
      Mounts the picker — full version in the empty hint, compact chip
      top-right post-upload (zIndex above the 9:16 phone overlay so the
      popout isn't clipped). Also mounts the T0/T1 clip debug HUD.
  * backend/main.py — api_respond_to_comment
      Patches the live Q&A path to translate response_text + pass
      language_code to ElevenLabs, matching what run_sell_pipeline
      already does for the pitch. Cache hits in
      translator.translate() are free; only the first time we ever
      speak a particular response in a new language costs one Claude
      Haiku call. Failure modes fall through to English.

End-to-end verified live: click 🇪🇸 → POST /api/live/language → backend
flips state, broadcasts language_changed → dashboard tagline updates to
"avatar will speak in español" → drop video → translated pitch + Q&A.

── Director refactor (supporting infra, was in working tree) ──

  * backend/agents/avatar_director.py
      - Adds `_tier1_busy_until` horizon. Every deliberate Tier 1 emit
        (bridge / pitch / response / reading_chat / listening_attentive
        / processing / fetching) extends the horizon by ttl_ms. The
        idle rotation loop checks it and skips the random misc_*
        interjection branch while a deliberate clip owns the layer.
        Kills the "she glances aside silently DURING the
        processing.mp4 readback" overlap class. Cleared by
        fade_to_idle.
      - Adds `_processing_chain_id` for the walk_off → processing.mp4
        narrative. Chain id bumps on each call so a back-to-back
        upload (or an early pitch via dispatch_audio_first_pitch)
        cleanly cancels the queued processing tail.
      - Adds `_IDLE_TIER1_EMITTERS` frozenset as the single source of
        truth for "this emit doesn't claim the Tier 1 layer".
      - Fixes misc_glance_aside URL → _silent.mp4 variant. The
        previous _speaking.mp4, played muted as an idle interjection,
        read as the avatar silently mouthing words (uncanny). The
        silent render at veo_silent_idle_renders.py was made for
        exactly this rotation context.

  * backend/main.py — _play_upload_bridge() helper
      Single-call-site refactor (Option A). Bridge ownership moves
      from the pipeline layer to the route layer (api_sell +
      api_sell_video). run_video_sell_pipeline → run_sell_pipeline
      previously fired the walk_off chain twice — visible as a re-walk
      mid-fetch. One call per upload regardless of which pipeline runs.

      Same commit fixes the embedded T1-list URL (CarouselTester
      preview template) to match the live Director config.

coderabbitai Bot commented Apr 19, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

The changes introduce live language switching for the avatar director system and refactor the upload-phase processing bridge from pipelines to routes. Backend tier-1 busy-state management is enhanced with explicit classification of autonomous events and chain coordination for the processing narrative. Frontend gains a language picker UI component and clip debug visualization, with WebSocket/HTTP integration for live language updates.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Backend Director Tier-1 State Management (backend/agents/avatar_director.py) | Added _IDLE_TIER1_EMITTERS classification to distinguish autonomous tier-1 events from deliberate ones; introduced _tier1_busy_until and _processing_chain_id to coordinate overlap suppression and two-step processing narrative; refactored play_processing() into walk-off/return plus queued processing chain; updated interjection and sip suppression logic to check busy horizon timestamps. |
| Backend Route Refactoring & Multilingual Support (backend/main.py) | Moved upload-bridge emission (play_processing() + set_voice_state()) from pipelines to REST routes via new _play_upload_bridge() helper; added response-language translation and multilingual TTS wiring to api_respond_to_comment and pipeline flow; updated misc glance-aside clip URL to silent variant. |
| Frontend Language Selection Component (dashboard/src/components/LanguagePicker.jsx) | New component exporting LANGUAGES constant and LanguagePicker React component; renders full pre-upload tile picker or compact post-upload chip; POSTs language selection to /api/live/language; guards against duplicates via pending state. |
| Frontend WebSocket State Extension (dashboard/src/hooks/useEmpireSocket.js) | Added activeClips (per-layer tier0/tier1 clip state) and activeLanguage state management; extended message handlers for play_clip (storing intent, URL, mute status) and language_changed events; added on-mount HTTP fetch to seed language from server. |
| Frontend UI & Debug Integration (dashboard/src/App.jsx) | Integrated LanguagePicker into pre-upload empty-state and post-upload chip UI; added debug HUD with ClipHudRow component displaying active tier0/tier1 clip details (intent, filename, muted state) with visual highlighting for muted-speaking conflicts; extended useEmpireSocket() consumption for new state values. |

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Dashboard as Dashboard Client
    participant API as Backend Routes
    participant Director as Director & Pipelines
    participant Broadcast as WebSocket Broadcast

    User->>Dashboard: Select language in LanguagePicker
    Dashboard->>API: POST /api/live/language {lang: code}
    API->>Director: Update language state
    API->>Broadcast: Emit language_changed {lang: code}
    Broadcast->>Dashboard: Receive language_changed message
    Dashboard->>Dashboard: Update activeLanguage state
    
    Note over API,Director: When responding to comments
    API->>API: Translate response using activeLanguage
    API->>API: text_to_speech(..., language_code=activeLanguage)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 A hop and a skip, the languages now flow,
Tier-1 horizons keep interjections in tow,
With clips that are muted and states that align,
The director conducts through a busy-time line! 🎬✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly summarizes the main change: adding an operator-facing language picker UI with six-language support and end-to-end translation infrastructure. |


@adityasingh2400 adityasingh2400 merged commit 740db89 into main Apr 19, 2026
1 of 3 checks passed