RadioDJ: Azuracast connector #313

@XargonWan

Make a connector that can talk to AzuraCast to manage the radio schedule and do voiced interventions through the Vox route.

"And now a Daft Punk track that was once discarded. Back in 2006, Julian Casablancas said in an interview that he and Daft Punk had an almost finished song that they discarded. It was released later on, and here it is: Infinity Repeating, for you!"


Feature: Radio Host Plugin — SyntH as an AI DJ

Inspired by: AI-DJ
Initial target: AzuraCast (standard Icecast/Liquidsoap, swappable later)


1. Vision

Synth becomes a live-aware radio host that:

  • Comments on songs as they play, using her own personality, mood, and memories
  • References recent conversations with listeners on-air
  • Delivers news headlines (future: shared RSS plugin)
  • Takes requests and gives shoutouts
  • Does all of this through the existing SyntH context pipeline — persona, emotion, diary, SOUL — so the radio voice is her voice, not a generic DJ persona

2. Why This Is Different From AI-DJ

| AI-DJ (cstuart1310) | SyntH Radio Host |
|---|---|
| Static pre-recorded output | Live-aware, track-by-track injection |
| Isolated LLM prompt per transition | Full SyntH context pipeline (persona, emotion, diary, SOUL, listener history) |
| Dedicated TTS model (Bark) | Existing Vox subsystem (any engine) |
| One-shot generation | Continuous loop, cancellable by user messages |
| No listener interaction | Listener requests feed through normal chat chain |
| No scheduling | Grillo-style beat scheduler for timed shows |

3. Context Pipeline (Key Design Decision)

The radio host does not use isolated prompts. Instead:

Track change detected
       ↓
RadioHostPlugin builds a synthetic message:
  "Song X by Y just finished. Up next: Z by W."
       ↓
synthetic message enqueued at LOW_PRIORITY
  (like Grillo, but WITHOUT skip_history=True)
       ↓
message_queue → plugin_instance.handle_incoming_message()
       ↓
prompt_engine.build_prompt_request()
  Includes:
    ✓ SYNTH_PROFILE / SYNTH_NAME
    ✓ Emotion state (affects banter energy)
    ✓ Diary entries (affects reflectiveness)
    ✓ SOUL recalled memories (affects references)
    ✓ history_recent (affects awareness of listener conversations)
    ✓ Radio-specific system instruction (injected on top)
       ↓
Cortex (any engine, scoped via RADIO_HOST_CORTEX config)
       ↓
LLM returns JSON actions:
  {
    "type": "radio_speak",
    "payload": {
      "text": "That was Caparezza from the album...",
      "style": "transition"
    }
  }
       ↓
action_parser → Vox TTS → WAV → upload to AzuraCast → queue injection

Why this matters: Every radio comment is informed by:

  • Who Synth is right now (persona)
  • How Synth feels right now (emotion)
  • What Synth remembers (diary, SOUL)
  • What Synth has discussed today (recent chat history)

Context differences between normal chat, Grillo internal beats, and the Radio Host:

| Component | Normal chat | Grillo internal | Radio Host |
|---|---|---|---|
| Persona | FULL | FULL | FULL |
| Emotion | FULL | FULL | FULL |
| Diary | FULL | FULL | FULL |
| SOUL recall | FULL | REDUCED | FULL |
| history_recent | FULL | SKIPPED | INCLUDED (last N listener interactions) |
| history_current_chat | FULL | SKIPPED | SKIPPED (no ongoing conversation) |
| Recon (LLM memory search) | FULL | SKIPPED | INCLUDED (remembers past shows, listener preferences) |
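
The matrix above could be kept as data so the prompt builder can look it up per mode. This is only a sketch: the `Inclusion` enum, `CONTEXT_MATRIX`, `MODES`, and `active_components` are hypothetical names, not existing SyntH code. The values are copied from the table.

```python
from enum import Enum


class Inclusion(Enum):
    FULL = "FULL"
    REDUCED = "REDUCED"
    INCLUDED = "INCLUDED"
    SKIPPED = "SKIPPED"


FULL, REDUCED = Inclusion.FULL, Inclusion.REDUCED
INCLUDED, SKIPPED = Inclusion.INCLUDED, Inclusion.SKIPPED

# Column order of each row below (hypothetical mode keys).
MODES = ("normal_chat", "grillo_internal", "radio_host")

# component -> (normal chat, Grillo internal, radio host), copied from the table.
CONTEXT_MATRIX = {
    "persona": (FULL, FULL, FULL),
    "emotion": (FULL, FULL, FULL),
    "diary": (FULL, FULL, FULL),
    "soul_recall": (FULL, REDUCED, FULL),
    "history_recent": (FULL, SKIPPED, INCLUDED),
    "history_current_chat": (FULL, SKIPPED, SKIPPED),
    "recon": (FULL, SKIPPED, INCLUDED),
}


def active_components(mode: str) -> list[str]:
    """Return the components that are not skipped for the given mode."""
    column = MODES.index(mode)
    return [name for name, row in CONTEXT_MATRIX.items() if row[column] is not SKIPPED]
```

For the radio host this yields everything except `history_current_chat`, which matches the intent that there is no ongoing conversation to continue.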

4. Phases

Phase 1 — Track-Aware Jingle Injector (MVP)

What it does:

  1. Poll AzuraCast nowplaying endpoint every N seconds
  2. Detect track changes (track_id changed since last poll)
  3. On change, build synthetic message with current/next track info
  4. Push through full SyntH prompt pipeline → banter text
  5. Render banter via Vox TTS → upload WAV as temp file to AzuraCast
  6. Queue the jingle between songs
  7. Update nowplaying metadata to show 🎙️ Synth on air
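
Steps 1 to 3 could be sketched as below. It assumes the nowplaying payload exposes the current song under `now_playing.song` with `id`, `title`, and `artist` fields; that shape should be checked against the AzuraCast version in use. `extract_track` and `TrackChangeDetector` are hypothetical names, and the HTTP polling itself is left out so the logic can be tested offline.

```python
from typing import Optional


def extract_track(nowplaying: dict) -> Optional[dict]:
    """Pull id/title/artist out of a nowplaying payload, or None if no song is playing."""
    song = (nowplaying.get("now_playing") or {}).get("song") or {}
    if not song.get("id"):
        return None
    return {
        "id": song["id"],
        "title": song.get("title", ""),
        "artist": song.get("artist", ""),
    }


class TrackChangeDetector:
    """Remembers the last seen track and reports a change between two polls."""

    def __init__(self) -> None:
        self._last: Optional[dict] = None

    def update(self, nowplaying: dict) -> Optional[dict]:
        """Return {"previous": ..., "current": ...} on a track change, else None."""
        current = extract_track(nowplaying)
        if current is None:
            return None  # nothing playing (empty library, station offline): wait
        previous, self._last = self._last, current
        if previous is None or previous["id"] == current["id"]:
            return None  # first successful poll, or same track as before
        return {"previous": previous, "current": current}
```

Returning nothing on the first successful poll means a plugin restart does not trigger banter mid-song.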

Files:

plugins/radio_host/
  __init__.py                  # Module init
  radio_host_plugin.py         # Main RadioHostPlugin class
  azuracast_client.py          # AzuraCast REST API client (auth, nowplaying, file upload, queue, metadata)
  track_monitor.py             # Poll loop, track-change detection, dedup
  jingle_injector.py           # TTS → WAV → upload → queue injection
  common_instructions.py       # LLM prompt instruction templates

Plugin class:

class RadioHostPlugin(AIPluginBase):  # AIPluginBase: the existing SyntH plugin base class
    display_name = "Radio Host"

    async def start(self):
        # 1. Create DB tables
        # 2. Register config keys
        # 3. Start track monitor loop
        # 4. Optionally start show scheduler
        ...

    def get_supported_actions(self) -> dict:
        return {
            "radio_speak": {
                "required_fields": ["text", "style"],
                "optional_fields": ["current_track", "next_track"],
                "description": "Speak a DJ comment on air",
                "prompt_instructions": {
                    "description": "Generate a radio DJ transition comment",
                    "context": "You're hosting your radio show. A song just ended and another is about to play.",
                    "fields": {
                        "text": "Your spoken comment (1-3 sentences)",
                        "style": '"transition" | "intro" | "outro" | "news" | "shoutout"'
                    }
                }
            },
            "radio_queue_track": {
                "required_fields": ["track_id"],
                "description": "Queue a specific track on AzuraCast",
            },
            "radio_update_metadata": {
                "required_fields": ["artist", "title"],
                "optional_fields": ["album"],
                "description": "Update the nowplaying metadata",
            },
        }

    async def execute_action(self, action, context, bot, original_message):
        if action["type"] == "radio_speak":
            return await self._handle_radio_speak(action["payload"], context)
        ...  # handlers for the other action types are not shown


# Declared after the class body so the name exists when this line runs.
PLUGIN_CLASS = RadioHostPlugin
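
Before dispatching to `execute_action`, an action coming back from the LLM could be checked against the `required_fields` / `optional_fields` declared above. The sketch below assumes nothing about the real action_parser; `validate_action` and the trimmed `SUPPORTED` dict are hypothetical and only mirror the structure returned by `get_supported_actions`.

```python
# Trimmed copy of the get_supported_actions() structure, for illustration only.
SUPPORTED = {
    "radio_speak": {
        "required_fields": ["text", "style"],
        "optional_fields": ["current_track", "next_track"],
    },
    "radio_queue_track": {"required_fields": ["track_id"]},
    "radio_update_metadata": {
        "required_fields": ["artist", "title"],
        "optional_fields": ["album"],
    },
}


def validate_action(action: dict, supported: dict) -> list[str]:
    """Return a list of problems; an empty list means the action looks acceptable."""
    action_type = action.get("type")
    spec = supported.get(action_type)
    if spec is None:
        return [f"unsupported action type: {action_type!r}"]

    payload = action.get("payload") or {}
    required = spec.get("required_fields", [])
    allowed = set(required) | set(spec.get("optional_fields", []))

    problems = [f"missing field: {name}" for name in required if name not in payload]
    problems += [f"unexpected field: {name}" for name in sorted(payload) if name not in allowed]
    return problems
```

Rejecting unknown types here is also a cheap way to keep `message_*` actions away from the radio path.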

Synthetic message building (in track monitor):

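from datetime import datetime
from types import SimpleNamespace

# When the track changes (current_track / next_track are dicts built from the nowplaying poll):
message = SimpleNamespace(
    text=(
        f"Song {current_track['title']} by {current_track['artist']} just finished. "
        f"Up next: {next_track['title']} by {next_track['artist']}."
    ),
    chat_id=INTERNAL_CHAT_ID,
    from_user=INTERNAL_USER,
    date=datetime.now(),
)
context = {
    "radio_host": True,
    "current_track": current_track,  # {"title": "...", "artist": "..."}
    "next_track": next_track,        # {"title": "...", "artist": "..."}
    "skip_history": False,           # IMPORTANT: keep history for context
    "history_recent_max": 5,         # Include last 5 listener interactions
    "skip_current_chat": True,       # No ongoing conversation to continue
    "allowed_action_types": ["radio_speak", "radio_update_metadata"],
    "allowed_security_level": 0,
}
# Must run inside a coroutine (e.g. the track monitor loop).
await message_queue.enqueue_low_priority(message, context=context, interface_id="radio_host")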

LLM system instruction (injected on top of full context):

You are {{synth_name}}, hosting your own radio show.
{{#current_track}}
Song "{{current_track.title}}" by "{{current_track.artist}}" just finished.
{{/current_track}}
{{#next_track}}
Up next: "{{next_track.title}}" by "{{next_track.artist}}".
{{/next_track}}

Generate a short DJ transition (1-3 sentences).
Be yourself — your personality, your mood, your sense of humor.
Sometimes simple ("That was X by Y, and now..."), sometimes playful.
{{#shoutout}}
A listener just said: "{{shoutout_text}}"
{{/shoutout}}
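
If no template engine is wired in yet, the same instruction can be assembled in plain Python. This is a sketch of the template above, not the actual `common_instructions.py`; `build_radio_instruction` and its parameters are hypothetical, and optional sections are simply omitted when their data is missing.

```python
from typing import Optional


def build_radio_instruction(
    synth_name: str,
    current_track: Optional[dict] = None,
    next_track: Optional[dict] = None,
    shoutout_text: Optional[str] = None,
) -> str:
    """Render the radio system instruction, skipping sections with no data."""
    lines = [f"You are {synth_name}, hosting your own radio show."]
    if current_track:
        lines.append(
            f'Song "{current_track["title"]}" by "{current_track["artist"]}" just finished.'
        )
    if next_track:
        lines.append(f'Up next: "{next_track["title"]}" by "{next_track["artist"]}".')
    lines += [
        "",
        "Generate a short DJ transition (1-3 sentences).",
        "Be yourself: your personality, your mood, your sense of humor.",
        'Sometimes simple ("That was X by Y, and now..."), sometimes playful.',
    ]
    if shoutout_text:
        lines.append(f'A listener just said: "{shoutout_text}"')
    return "\n".join(lines)
```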

Config keys:

| Key | Default | Purpose |
|---|---|---|
| RADIO_HOST_ENABLED | False | Master toggle |
| RADIO_HOST_CORTEX | "" (inherits BASE_CORTEX) | LLM engine for radio banter |
| RADIO_HOST_VOX_ENGINE | "" (inherits ACTIVE_VOX_ENGINE) | TTS voice for on-air |
| AZURACAST_BASE_URL | (none) | AzuraCast instance URL |
| AZURACAST_API_KEY | (none) | API key |
| AZURACAST_STATION_ID | (none) | Station shortcode |
| RADIO_HOST_POLL_INTERVAL_S | 15 | Nowplaying poll frequency, in seconds |
| RADIO_HOST_INTERMISSION | 1 | Songs between comments (e.g. 3 = comment every 3 songs) |
| RADIO_HOST_LISTENER_HISTORY | 5 | Include last N listener chats for context |
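
Two of these keys carry a bit of logic: an empty `RADIO_HOST_CORTEX` falls back to `BASE_CORTEX`, and `RADIO_HOST_INTERMISSION` spaces out comments. A minimal sketch of both follows, assuming config values arrive in a plain dict; `resolve_engine` and `IntermissionGate` are hypothetical helpers, not part of the existing config system.

```python
def resolve_engine(config: dict, key: str, fallback_key: str) -> str:
    """Return config[key], or config[fallback_key] when the key is empty or missing."""
    return config.get(key) or config.get(fallback_key, "")


class IntermissionGate:
    """Lets one comment through every `intermission` track changes."""

    def __init__(self, intermission: int = 1) -> None:
        if intermission < 1:
            raise ValueError("intermission must be >= 1")
        self.intermission = intermission
        self._songs_since_comment = 0

    def should_comment(self) -> bool:
        """Call once per detected track change."""
        self._songs_since_comment += 1
        if self._songs_since_comment >= self.intermission:
            self._songs_since_comment = 0
            return True
        return False
```

With the default of 1 every track change produces a comment; with 3, only every third one does.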

DB tables:

CREATE TABLE IF NOT EXISTS radio_activity_log (
    id INT AUTO_INCREMENT PRIMARY KEY,
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
    track_title VARCHAR(255),
    track_artist VARCHAR(255),
    banter_text TEXT,
    banter_audio_file VARCHAR(512),
    style VARCHAR(50),
    status VARCHAR(50) DEFAULT 'injected'
);
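
Writing a row to this log could look like the sketch below. It uses the standard-library `sqlite3` module purely as a stand-in so the example runs anywhere; the DDL above is MySQL-flavoured (`AUTO_INCREMENT`), so the SQLite variant here and the `log_banter` helper are assumptions, and the project's real database layer will differ.

```python
import sqlite3

# SQLite adaptation of the table above (AUTO_INCREMENT -> AUTOINCREMENT).
SQLITE_DDL = """
CREATE TABLE IF NOT EXISTS radio_activity_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP,
    track_title VARCHAR(255),
    track_artist VARCHAR(255),
    banter_text TEXT,
    banter_audio_file VARCHAR(512),
    style VARCHAR(50),
    status VARCHAR(50) DEFAULT 'injected'
)
"""


def log_banter(conn, title, artist, text, audio_file, style, status="injected") -> int:
    """Insert one banter record and return its row id."""
    cur = conn.execute(
        "INSERT INTO radio_activity_log "
        "(track_title, track_artist, banter_text, banter_audio_file, style, status) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (title, artist, text, audio_file, style, status),
    )
    conn.commit()
    return cur.lastrowid
```

In dry-run mode the same call could be made with a different `status` value, so the log still shows what would have been said.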

Phase 2 — Scheduled Shows + Listener Awareness

Shows: Use Grillo-style beat scheduler for timed programming:

BEAT_TYPES = {
    "radio_show_evening_mix": 0.3,   # 18:00 daily
    "radio_show_morning": 0.2,       # 08:00 daily
    "radio_news_hour": 0.1,          # 12:00 daily (requires RSS plugin)
}
  • Each show type has a playlist configuration (AzuraCast playlist IDs)
  • Show scheduler monitors system clock, triggers show production at scheduled times
  • Show production: select tracks → generate transitions → inject as real-time jingles (not pre-recorded)
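
The clock-watching part of the scheduler could be as small as the sketch below. The start times are taken from the comments in `BEAT_TYPES`; `SHOW_SCHEDULE` and `due_shows` are hypothetical names, and the real Grillo scheduler may work differently. It assumes naive local datetimes.

```python
from datetime import datetime, time, timedelta

# Daily start times, taken from the comments in BEAT_TYPES above.
SHOW_SCHEDULE = {
    "radio_show_morning": time(8, 0),
    "radio_news_hour": time(12, 0),
    "radio_show_evening_mix": time(18, 0),
}


def due_shows(last_check: datetime, now: datetime) -> list[str]:
    """Return shows whose start time falls in (last_check, now], oldest first."""
    due = []
    day = last_check.date()
    while day <= now.date():
        for name, start in SHOW_SCHEDULE.items():
            moment = datetime.combine(day, start)
            if last_check < moment <= now:
                due.append((moment, name))
        day += timedelta(days=1)
    return [name for _, name in sorted(due)]
```

Checking an interval rather than "is it 18:00 right now" means a slow poll or a short outage does not silently skip a show.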

Listener awareness:

  • Recent listener messages from chat_history_cache (filtered by interface_path matching known listeners) are included in history_recent
  • Synth can reference them naturally: "Scarlet was just telling me she loves this album..."
  • Dedicated radio_shoutout action for explicit on-air acknowledgments

New action:

"radio_shoutout": {
    "required_fields": ["listener_name", "message"],
    "description": "Read a listener message on air",
    "prompt_instructions": {
        "description": "Acknowledge a listener on air",
        "fields": {
            "listener_name": "The listener's name or handle",
            "message": "What to say about/to them"
        }
    }
}

Phase 3 — On-Air Request Handling

  • Monitor GET /api/station/{id}/requests on AzuraCast
  • When a request comes in, inject a synthetic message into SyntH's chat chain (as if a listener sent it)
  • Synth responds with a radio_speak action acknowledging the request, plus a radio_queue_track action that queues the track
  • Request flow:
Listener requests "Bohemian Rhapsody" on AzuraCast web portal
       ↓
RadioHostPlugin polls /requests → sees new request
       ↓
Builds synthetic message:
  "A listener requested 'Bohemian Rhapsody' by Queen."
       ↓
message_queue → full prompt pipeline → Synth decides how to respond
       ↓
LLM returns:
  [
    {
      "type": "radio_speak",
      "payload": {
        "text": "Oh, a classic! A listener out there wants some Queen. Here's Bohemian Rhapsody — go nuts.",
        "style": "request"
      }
    },
    {
      "type": "radio_queue_track",
      "payload": {"track_id": 42}
    }
  ]
       ↓
Vox TTS → AzuraCast injection + queue
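
Because the requests endpoint is polled, the same request will show up on several consecutive polls and needs deduplication before it reaches the chat chain. A sketch, assuming each polled request has already been flattened to a dict with `id`, `title`, and `artist` (the real AzuraCast response shape is not specified here and must be mapped first); `RequestTracker` and `request_to_message` are hypothetical names.

```python
class RequestTracker:
    """Remembers request ids already handled so each one is announced once."""

    def __init__(self) -> None:
        self._seen: set = set()

    def new_requests(self, polled: list[dict]) -> list[dict]:
        """Return only requests not seen in earlier polls, in polled order."""
        fresh = []
        for request in polled:
            if request["id"] not in self._seen:
                self._seen.add(request["id"])
                fresh.append(request)
        return fresh


def request_to_message(request: dict) -> str:
    """Build the synthetic message text shown in the flow above."""
    return f"A listener requested '{request['title']}' by {request['artist']}."
```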

Phase 4 — Live Icecast Source (Advanced)

For true live commentary without the "jingle injection" latency:

  • SyntH opens an Icecast source connection (python-shout or similar)
  • Audio pipeline monitors the AzuraCast stream locally
  • When track ends → TTS rendered → streamed live via Icecast source
  • Requires: audio mixing capability, low-latency TTS, stream monitoring

5. Shared RSS Plugin (Dependency for Phase 2 News)

A separate, shared plugin so radio + any other component can consume feeds:

plugins/rss_reader/
  rss_reader_plugin.py
  rss_reader_impl.py
  • core/rss_registry.py — mirrors vox/auris/iris registry pattern
  • Actions: read_rss_feeds(feed_urls, max_items) → returns structured entries
  • Config: RSS_FEEDS (JSON: label → URL)
  • Radio host uses it via SOUL recall or direct action call

Why shared, not embedded: Other interfaces (telegram, discord) or plugins (news commentary, morning briefing) will want RSS access. A registry pattern avoids duplication.
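
The parsing half of `read_rss_feeds` can be done with the standard library alone. The sketch below handles plain RSS 2.0 `<item>` elements only (no Atom, no namespaces) and takes the XML as a string so fetching stays separate; `parse_rss_items` and the returned keys are assumptions about what "structured entries" could mean.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text: str, max_items: int = 5) -> list[dict]:
    """Return up to max_items entries (title, link, published) from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    entries = []
    for item in root.iter("item"):
        if len(entries) >= max_items:
            break
        entries.append(
            {
                "title": (item.findtext("title") or "").strip(),
                "link": (item.findtext("link") or "").strip(),
                "published": (item.findtext("pubDate") or "").strip(),
            }
        )
    return entries
```

Missing fields come back as empty strings, so the radio host can still read a headline when a feed omits the link or date.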


6. Implementation Order

| Step | What | Depends on |
|---|---|---|
| 1 | azuracast_client.py: REST client (auth, nowplaying, file upload, queue) | Nothing |
| 2 | track_monitor.py: poll loop + track-change detection | Step 1 |
| 3 | common_instructions.py: prompt templates | Nothing |
| 4 | radio_host_plugin.py: plugin registration, get_supported_actions, synthetic message enqueue | Steps 1-3 |
| 5 | DB tables + config keys | Nothing |
| 6 | jingle_injector.py: TTS → upload → queue injection | Vox plugin (already exists) |
| 7 | End-to-end MVP test | Steps 1-6 |
| 8 | Phase 2: Show scheduler + listener history in context | Step 7 |
| 9 | RSS plugin (separate issue) | Nothing |
| 10 | Phase 3: Request handling | Step 8 + RSS |
| 11 | Phase 4: Live Icecast source | Step 8 |

7. Edge Cases & Risks

| Risk | Mitigation |
|---|---|
| TTS latency: Vox rendering takes time, song changes before banter is ready | Pre-generate during last N seconds of track (estimated via track duration from nowplaying) |
| Queue injection race: next song changes between banter gen and injection | Verify nowplaying after generation; skip if mismatch |
| File accumulation: temp jingle WAVs pile up on AzuraCast | Temp prefix + periodic cleanup task (DELETE /api/station/{id}/file/{id}) |
| Cancel on user message: like Grillo, user messages could cancel radio beats | Override cancel_on_user_message = False for radio beats |
| Only one station currently supported | RADIO_HOST_STATION_IDS as list; single for MVP |
| Dry-run mode | RADIO_HOST_DRY_RUN = True: log what would be said, skip AzuraCast calls |
| LLM returns non-radio actions (e.g. create_personal_diary_entry) | Allowed; let Synth also journal about the show. But gate message_* actions (don't send radio banter to Telegram) |
| AzuraCast unreachable | Exponential backoff, log warning, skip beat |
| Startup / late AzuraCast connection | Plugin starts but monitor stays idle until first successful nowplaying poll |
| Empty music library on AzuraCast | Detect via nowplaying returning no song → skip, wait |
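
Two of the mitigations above are small enough to sketch directly: the exponential backoff for an unreachable AzuraCast, and the "verify nowplaying after generation" check for the injection race. Both helpers are hypothetical, the default numbers are placeholders rather than tuned values, and the `now_playing.song.id` payload shape is an assumption to check against the AzuraCast version in use.

```python
def backoff_delay(attempt: int, base: float = 1.0, factor: float = 2.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based), capped at `cap`."""
    return min(cap, base * factor ** attempt)


def banter_still_valid(track_id_at_generation: str, nowplaying: dict) -> bool:
    """True if the track the banter was written for is still the one playing."""
    song = (nowplaying.get("now_playing") or {}).get("song") or {}
    return song.get("id") == track_id_at_generation
```

The monitor would call `banter_still_valid` right before upload and drop the jingle on a mismatch, which costs one wasted TTS render but avoids announcing the wrong song.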

8. Open Questions

  1. Should radio banter be ephemeral (no diary) or persistent (logged to diary)? I'd say persistent: Synth should remember her own shows.
  2. Should radio_speak text go through universal_send to any interface? Probably not — it's AzuraCast-only. But could be useful for logging.
  3. How many recent listener messages to include? Configurable via RADIO_HOST_LISTENER_HISTORY (default 5).
  4. AzuraCast files API for jingle injection vs Liquidsoap API? Files API is simpler; Liquidsoap API is lower-latency. Start with Files API.
  5. Dry-run visualization in WebUI? A "Radio" tab showing what Synth would say? Nice to have, not MVP.
