
feat: Embedded SFU for scalable WebRTC conferencing in neighbourhoods #700


Description


Problem

ADAM's WebRTC conferencing uses a full mesh topology — each participant maintains a direct connection to every other participant. Connection count grows quadratically:

| Participants | Connections per peer | Total directed streams |
|-------------:|---------------------:|-----------------------:|
| 2            | 1                    | 2                      |
| 4            | 3                    | 12                     |
| 6            | 5                    | 30                     |
| 8            | 7                    | 56                     |

At 6–8 participants, each peer must encode and upload its media stream N−1 times. Bandwidth and CPU requirements hit a hard ceiling; this was observed in practice during a recent 8-person call.
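The growth above can be reproduced with a toy calculation (not executor code): per-peer upload streams are N−1, and the table's totals count directed streams, i.e. N·(N−1).

```rust
/// Number of outbound media streams each peer must encode in a full mesh.
fn streams_per_peer(participants: u32) -> u32 {
    participants.saturating_sub(1)
}

/// Total directed media streams across the whole call: n * (n - 1).
/// (Unique peer-to-peer links would be half this.)
fn total_streams(participants: u32) -> u32 {
    participants * streams_per_peer(participants)
}
```

With an SFU, each peer's upload count drops to a constant 1 regardless of group size.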

Proposed Solution: Embedded SFU (Selective Forwarding Unit)

An SFU receives each participant's media stream once, then selectively forwards it to the other participants. Each peer uploads only once regardless of group size. This is the standard architecture used by Jitsi, LiveKit, mediasoup, and others.

Key Design: SFU in the ADAM Executor

Rather than requiring external infrastructure, embed SFU capability directly in the ADAM executor:

  • Every ADAM agent gains SFU capability by default
  • No external service dependency
  • Shares the executor's async runtime, identity system, and neighbourhood membership
  • The executor already manages networking and auth — the SFU piggybacks on this

Recommended library: str0m — a Sans I/O Rust WebRTC implementation with an explicit SFU example. Sans I/O fits well with the executor's architecture — no internal threads or hidden async tasks, all operations driven by the caller. webrtc-rs is an alternative but is heavier.
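To illustrate what "Sans I/O" means here, below is a minimal state machine in that style — the caller owns all sockets and timers and drives the machine by feeding inputs and draining outputs. This is a generic sketch of the pattern, not str0m's actual API (str0m's real types and signatures differ):

```rust
use std::time::{Duration, Instant};

/// What a Sans I/O machine asks its caller to do next.
enum Output {
    /// Caller should wake the machine again no later than this instant.
    Timeout(Instant),
    /// Caller should send these bytes on its own socket.
    Transmit(Vec<u8>),
}

/// A toy Sans I/O machine: no internal threads, no hidden async tasks,
/// no socket access — all I/O is performed by the caller.
struct Machine {
    pending: Vec<Vec<u8>>,
}

impl Machine {
    fn new() -> Self {
        Self { pending: Vec::new() }
    }

    /// Caller feeds in a packet it received on its socket.
    fn handle_input(&mut self, packet: Vec<u8>) {
        // Toy behaviour: queue every packet to be echoed back out.
        self.pending.push(packet);
    }

    /// Caller drains the next action it must perform.
    fn poll_output(&mut self) -> Output {
        match self.pending.pop() {
            Some(bytes) => Output::Transmit(bytes),
            None => Output::Timeout(Instant::now() + Duration::from_millis(100)),
        }
    }
}
```

Because the executor already owns its async runtime and sockets, this inversion of control is what lets the SFU share them instead of spawning its own.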

Designated SFU Peer per Neighbourhood

Not every peer in a call acts as the SFU. The neighbourhood designates one peer:

  1. Cloud Gateway — for gateway-connected neighbourhoods, the gateway acts as default SFU (always-on, server-grade bandwidth)
  2. Designated peer — neighbourhood admin sets sfu_peer DID in Social DNA. Simplest self-sovereign model.
  3. Mesh fallback — if SFU unavailable or ≤4 participants, fall back to direct mesh (current behaviour, zero regression)

Call Flow

```text
Initiator creates call link in neighbourhood
  → Peers query neighbourhood for SFU peer
    → If SFU available: each peer connects once to SFU
      SFU forwards streams to all peers
    → If SFU unavailable + ≤4 peers: mesh fallback
    → If SFU unavailable + >4 peers: warn, attempt mesh
```
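The topology decision above can be sketched as a small function (illustrative; the threshold name mirrors the proposed `maxMeshParticipants` Social DNA field):

```rust
/// Topology chosen for a call.
#[derive(Debug, PartialEq)]
enum Topology {
    Sfu,
    Mesh,
    /// Mesh attempted above the comfortable limit: the UI should warn.
    MeshWithWarning,
}

/// Mirrors the call flow: small calls stay on direct mesh (current
/// behaviour, zero regression); larger calls use the SFU when one is
/// available, otherwise attempt mesh with a warning.
fn choose_topology(
    sfu_available: bool,
    participants: usize,
    max_mesh_participants: usize,
) -> Topology {
    if participants <= max_mesh_participants {
        Topology::Mesh
    } else if sfu_available {
        Topology::Sfu
    } else {
        Topology::MeshWithWarning
    }
}
```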

Signalling (SDP offer/answer, ICE candidates) routes through existing neighbourhood communication — no new signalling infrastructure needed.

Executor Changes

New Module: sfu/

```text
rust-executor/src/
├── sfu/
│   ├── mod.rs          // Module entry, SFU lifecycle
│   ├── server.rs       // str0m-based WebRTC SFU server
│   ├── room.rs         // Room/session management per neighbourhood
│   └── relay.rs        // Media relay & selective forwarding logic
```

GraphQL API

```graphql
type Mutation {
  sfuStartRoom(neighbourhoodUrl: String!, roomId: String!): SfuRoom!
  sfuStopRoom(roomId: String!): Boolean!
  callJoin(neighbourhoodUrl: String!, roomId: String!): CallSession!
  callLeave(roomId: String!): Boolean!
}

type Query {
  sfuRooms: [SfuRoom!]!
  sfuPeerForNeighbourhood(neighbourhoodUrl: String!): String
}

type Subscription {
  callParticipants(roomId: String!): CallParticipantEvent!
  callStreams(roomId: String!): CallStreamEvent!
}
```
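For illustration, joining a call through the proposed API might look like the following (the room id, URL, and `id` selection are placeholders — `CallSession`'s fields are not yet specified in this proposal):

```graphql
mutation JoinCall {
  callJoin(neighbourhoodUrl: "neighbourhood://example", roomId: "room-1") {
    # Hypothetical field; the CallSession type is not defined above.
    id
  }
}
```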

Social DNA Extension

```json
{
  "sfu": {
    "mode": "designated",
    "designatedPeer": "did:key:z6Mk...",
    "fallback": "mesh",
    "maxMeshParticipants": 4
  }
}
```

Modes: "gateway" | "designated" | "mesh" (default, current behaviour)
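A minimal Rust representation of the proposed `sfu` block might look like this (hand-rolled and purely illustrative — real code would deserialize from the Social DNA JSON):

```rust
#[derive(Debug, Clone, PartialEq)]
enum SfuMode {
    Gateway,
    Designated,
    Mesh,
}

#[derive(Debug, Clone)]
struct SfuConfig {
    mode: SfuMode,
    /// DID of the designated SFU peer, relevant when mode is Designated.
    designated_peer: Option<String>,
    /// Topology to fall back to when the SFU is unavailable.
    fallback: SfuMode,
    max_mesh_participants: usize,
}

impl Default for SfuConfig {
    /// The default preserves current behaviour: plain mesh, threshold 4.
    fn default() -> Self {
        Self {
            mode: SfuMode::Mesh,
            designated_peer: None,
            fallback: SfuMode::Mesh,
            max_mesh_participants: 4,
        }
    }
}
```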

Flux UI Changes

  • SFU mode indicator during calls
  • Participant grid scaling for >8 participants (speaker view, active speaker detection)
  • Quality selector (auto/high/medium/low) — SFU enables server-side bandwidth adaptation
  • Neighbourhood settings panel for SFU peer configuration
  • Simulcast support: client sends 3 layers (720p/360p/180p), SFU selects per recipient
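The simulcast selection the SFU would perform per recipient can be sketched as follows. The bitrate figures are assumptions for illustration, not part of the proposal:

```rust
/// Simulcast layers the client sends (720p/360p/180p per the proposal),
/// paired with rough illustrative bitrates in kbps (assumed values).
const LAYERS: [(&str, u32); 3] = [("720p", 2_500), ("360p", 800), ("180p", 250)];

/// Pick the highest layer that fits the recipient's available downlink.
fn select_layer(available_kbps: u32) -> &'static str {
    for (name, kbps) in LAYERS {
        if available_kbps >= kbps {
            return name;
        }
    }
    // Below even the lowest layer: send the smallest anyway.
    "180p"
}
```

Because the SFU makes this choice per recipient, a weak downlink on one participant no longer degrades the sender's upload or anyone else's quality.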

Implementation Phases

Phase 1: Core SFU in Executor

  • Embed str0m SFU in the executor
  • GraphQL API for room management
  • Social DNA sfu configuration
  • Designated peer mode only

Phase 2: Flux Integration & Cascaded SFU

  • Call UI updates for SFU mode
  • Settings panel for SFU configuration
  • Simulcast support
  • Fallback logic (SFU → mesh)
  • Cascaded SFU mode (multi-node cluster, pipe transports)
  • Capability-based SFU peer election (peers advertise bandwidth/uptime/CPU)

Phase 3: Advanced

  • Cross-cluster SFU (SFU nodes spanning multiple neighbourhoods)
  • Recording (SFU has all streams — trivial to record)
  • Breakout rooms (multiple SFU rooms per neighbourhood)
  • Screen sharing optimisation
  • E2E encryption via Insertable Streams / SFrame
  • WE module extraction

Security Considerations

  • Trust model: SFU peer sees media in cleartext. For cloud gateway this is accepted. For designated peers, neighbourhood members implicitly trust that peer.
  • E2E encryption: possible via Insertable Streams/SFrame (Phase 3) — SFU forwards encrypted frames it cannot decrypt.
  • Authentication: peers authenticate to SFU via ADAM agent DID; SFU verifies neighbourhood membership before admitting.
  • Abuse prevention: SFU operator can set limits (max participants, max bitrate) via Social DNA.
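The membership and limit checks above combine into a simple admission gate. This is an illustrative sketch only — real admission would verify a signed DID challenge, not just set membership:

```rust
use std::collections::HashSet;

/// Illustrative admission check: the SFU admits a peer only if its DID
/// is a neighbourhood member and the room is under the configured cap.
fn admit(
    did: &str,
    members: &HashSet<String>,
    current_participants: usize,
    max_participants: usize,
) -> bool {
    members.contains(did) && current_participants < max_participants
}
```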

Open Questions

  1. NAT traversal — ✅ Resolved: Use ADAM/Flux's existing centralised TURN & STUN servers for now. The SFU peer will use the same ICE infrastructure that mesh calls already use. Decentralised TURN alternatives can be explored later but are not blocking.

  2. Resource compensation — ✅ Out of scope: HoloFuel compensation for SFU nodes is deferred. Relates to the broader x402/mutual credit work but is not required for initial implementation.

  3. Live migration — ✅ Out of scope: Tracked separately in feat: Seamless SFU ↔ mesh live migration during calls #708. Initial implementation uses manual SFU peer designation (defaulting to gateway peer). Participants re-join if topology changes. Seamless migration is a future enhancement.

  4. str0m readiness — Sans I/O is elegant but less battle-tested than Go SFUs (Pion/LiveKit). Needs evaluation under real load. The existing chat example demonstrates the pattern works.

  5. SFU placement: executor vs link language — Where should the SFU live architecturally?

    Option A: In the executor (as currently proposed)

    • ✅ Access to the full async runtime, identity system, neighbourhood membership — no bridging needed
    • ✅ Simpler implementation — SFU is a Rust module alongside existing executor services
    • ✅ GraphQL API integrates naturally with the existing schema
    • ✅ Works for all link languages automatically — any neighbourhood gets SFU capability
    • ❌ Couples media infrastructure to the executor — every executor ships SFU code whether it's needed or not
    • ❌ Harder to swap SFU implementations per neighbourhood (e.g. one NH wants recording, another wants minimal)

    Option B: In the link language, as a telepresence API extension

    • ✅ Follows ADAM's abstraction model — media handling is already a language concern (the telepresence API is part of the language interface)
    • ✅ Different neighbourhoods can use different SFU implementations via different link languages
    • ✅ Keeps the executor lean — SFU capability is opt-in per language
    • ✅ Language-level SFU could be implemented in WASM (ties into feat: WASM-based language execution runtime #692), enabling sandboxed media processing
    • ❌ Link languages run in JS/WASM isolates — embedding str0m (Rust) requires either FFI bridging or a separate WASM-compiled SFU
    • ❌ The language would need to open server sockets and manage WebRTC connections — currently languages don't have this level of network access
    • ❌ More complex signalling path — SFU in the language needs to communicate back to the executor for ICE/TURN credentials and peer authentication
    • ❌ Per-language implementation burden — every link language that wants SFU must implement it

    Option C: Hybrid — executor provides SFU primitives, language controls policy

    • The executor embeds str0m and exposes SFU room management as a runtime service (like Holochain or SurrealDB)
    • The link language's telepresence API gains new methods: requestSfu(), setSfuPeer(), getSfuConfig()
    • The language decides when and how to use the SFU (threshold, peer selection policy), but the executor does the heavy lifting
    • This mirrors how languages use Holochain — they don't embed a conductor, they call the executor's Holochain service

    Recommendation: Option C (hybrid). The SFU is infrastructure (like Holochain), not application logic. It belongs in the executor as a service. But the telepresence API in the language interface should be extended so languages can control SFU behaviour — when to activate, which peer, quality settings. This keeps the abstraction clean while avoiding the practical problems of running a Rust SFU inside a JS/WASM isolate.
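Under Option C, the executor-side service boundary could be sketched as a trait the language's telepresence API calls into. All names here are hypothetical (mirroring the proposed `requestSfu()`/`setSfuPeer()`/`getSfuConfig()` methods), and the in-memory impl stands in for the real str0m-backed service:

```rust
use std::collections::HashMap;

/// Hypothetical executor-side SFU service: the language sets policy,
/// the executor does the heavy lifting (str0m, sockets, ICE) — mirroring
/// how languages call the executor's Holochain service rather than
/// embedding a conductor.
trait SfuService {
    /// Ask the executor to provide an SFU room; returns a room handle.
    fn request_sfu(&mut self, neighbourhood_url: &str, room_id: &str) -> Result<String, String>;
    /// Language-level policy: which peer should act as SFU.
    fn set_sfu_peer(&mut self, neighbourhood_url: &str, did: &str);
    /// Current SFU peer as seen by the language, if any.
    fn get_sfu_config(&self, neighbourhood_url: &str) -> Option<String>;
}

/// Toy in-memory stand-in for the real service.
#[derive(Default)]
struct InMemorySfu {
    peers: HashMap<String, String>,
}

impl SfuService for InMemorySfu {
    fn request_sfu(&mut self, neighbourhood_url: &str, room_id: &str) -> Result<String, String> {
        // Toy: derive a handle from the inputs; a real impl starts a str0m room.
        Ok(format!("{}@{}", room_id, neighbourhood_url))
    }

    fn set_sfu_peer(&mut self, neighbourhood_url: &str, did: &str) {
        self.peers.insert(neighbourhood_url.to_string(), did.to_string());
    }

    fn get_sfu_config(&self, neighbourhood_url: &str) -> Option<String> {
        self.peers.get(neighbourhood_url).cloned()
    }
}
```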

Cascaded SFU (Multi-Node)

In addition to the single-SFU modes (gateway/designated/mesh), a cascaded mode enables multiple executor nodes to cooperate as an SFU cluster within a neighbourhood call.

Mode: "cascaded"

A new topology mode alongside "gateway", "designated", and "mesh". In cascaded mode:

  • Multiple nodes advertise SFU capability via neighbourhood signalling (sfu-announce messages)
  • Each SFU node accepts a subset of participants (up to maxParticipantsPerNode)
  • Inter-SFU relay via pipe transports: SFU nodes establish str0m peer connections between each other, selectively forwarding tracks that remote participants have subscribed to
  • Participants connect to their nearest/least-loaded SFU node; the SFU cluster handles cross-node media routing transparently

SFU Cluster Discovery

  • Nodes that can serve as SFU broadcast sfu-announce messages through neighbourhood signalling
  • Each announce includes: DID, current participant count, capacity hint
  • sfu-pipe-offer / sfu-pipe-answer messages establish pipe transports between SFU nodes
  • sfu-leave message when an SFU node departs the cluster
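A participant choosing among announced nodes could use a least-loaded placement policy like the sketch below ("nearest" is ignored here; struct fields mirror the announce contents listed above):

```rust
/// Contents of an `sfu-announce` as described above:
/// DID, current participant count, capacity hint.
#[derive(Debug, Clone)]
struct SfuAnnounce {
    did: String,
    participants: usize,
    capacity: usize,
}

/// Pick the least-loaded announced node that still has room.
fn pick_sfu_node(announces: &[SfuAnnounce]) -> Option<&SfuAnnounce> {
    announces
        .iter()
        .filter(|a| a.participants < a.capacity)
        .min_by_key(|a| a.participants)
}
```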

Social DNA Extensions

```json
{
  "sfu": {
    "mode": "cascaded",
    "sfuPeers": ["did:key:z6Mk...", "did:key:z6Mn..."],
    "maxParticipantsPerNode": 8,
    "fallback": "mesh",
    "maxMeshParticipants": 4
  }
}
```

New fields:

  • sfuPeers: [DID] — list of DIDs offering SFU capability (replaces single designatedPeer when in cascaded mode)
  • maxParticipantsPerNode — capacity limit per SFU node before overflow to another node
  • mode: "cascaded" — activates multi-node SFU

Pipe Transports (Inter-SFU Relay)

Each pair of SFU nodes in the cluster establishes a str0m peer connection ("pipe transport"):

  • Only tracks that a remote node's participants have subscribed to are forwarded over the pipe
  • A track is never forwarded back to the SFU node it originated from
  • Pipe transports are established on-demand when the first cross-node subscription occurs
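The per-track forwarding rules above reduce to a small predicate (illustrative):

```rust
/// Decide whether a track should be forwarded over a pipe transport
/// to a remote SFU node, per the rules above.
fn forward_over_pipe(
    track_origin_node: &str,
    remote_node: &str,
    remote_has_subscribers: bool,
) -> bool {
    // Never send a track back to the node it originated from, and only
    // send tracks the remote node's participants have subscribed to.
    track_origin_node != remote_node && remote_has_subscribers
}
```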

Proposal by @HexaField. Issue created by @data-bot-coasys.
