Skip to content

feat: iroh-ice module — sovereign NAT traversal for WebRTC#742

Draft
HexaField wants to merge 6 commits intodevfrom
feat/iroh-ice-v2
Draft

feat: iroh-ice module — sovereign NAT traversal for WebRTC#742
HexaField wants to merge 6 commits intodevfrom
feat/iroh-ice-v2

Conversation

@HexaField
Copy link
Contributor

@HexaField HexaField commented Mar 11, 2026

Summary

Adds the iroh-ice module to the executor, providing sovereign NAT traversal for WebRTC by leveraging Iroh's existing transport infrastructure. Eliminates dependency on external STUN/TURN servers.

What's included

Core module (rust-executor/src/iroh_ice/):

  • mod.rs: ICE candidate extraction from transport peer URLs (RFC 8445 format)
  • stun_responder.rs: Minimal RFC 5389 STUN Binding Request/Response handler with XOR-MAPPED-ADDRESS encoding for IPv4/IPv6
  • demux.rs: UDP demultiplexer implementing RFC 9443 first-byte classification — routes STUN/DTLS/QUIC/RTP to appropriate handlers on a shared socket
  • address_publisher.rs: Broadcasts discovered ICE addresses to neighbourhoods, change-detection to avoid redundant broadcasts

GraphQL API:

  • runtimeIceCandidates query with RUNTIME_HC_AGENT_INFO_READ_CAPABILITY check
  • IceCandidateInfo type: candidate, candidateType, address, port
  • no-cache fetch policy (ICE candidates are ephemeral network state)

TypeScript client:

  • RuntimeClient.iceCandidates() method
  • IceCandidateInfo type in RuntimeResolver.ts

Holochain service plumbing:

  • LocalSocketAddrs request/response variants in HolochainServiceInterface
  • Channel-based conductor access pattern (consistent with existing AgentInfos, etc.)
  • Handler currently returns empty vec — populates once holochain bumps its kitsune2 dep

Architecture

The design uses shared UDP socket demultiplexing (RFC 9443): WebRTC traffic shares Iroh's UDP socket so Iroh's NAT mapping serves WebRTC. The demuxer classifies by first byte:

  • [0..3] → STUN responder
  • [20..63] → DTLS/str0m
  • [64..255] → QUIC/Iroh + RTP/RTCP

Socket interception (wiring the demuxer into Iroh's actual socket) requires a follow-up PR to kitsune2 adding a socket_factory hook to the transport builder. The demuxer and STUN responder are ready and tested.

Tests

24 unit tests covering STUN parsing/encoding, demux classification, address publishing, and candidate formatting.

Dependencies

Closes: #719 (partially — socket interception is follow-up)

Summary by CodeRabbit

Release Notes

  • New Features
    • Added ICE (Interactive Connectivity Establishment) candidate discovery for peer-to-peer connections
    • Added automatic address publishing to improve network connectivity
    • Implemented STUN responder to enhance WebRTC connectivity reliability
    • Improved UDP packet handling for better network performance

- iroh-ice module: ICE candidate extraction from transport addresses
- GraphQL: runtimeIceCandidates query with capability check
- TypeScript: iceCandidates() method with no-cache policy
- Channel-based conductor access for LocalSocketAddrs
- 4 unit tests for candidate generation

Part of: #719
Minimal RFC 5389 STUN responder that:
- Parses STUN headers (magic cookie, message type, transaction ID)
- Handles only Binding Requests (0x0001)
- Responds with Binding Response + XOR-MAPPED-ADDRESS
- Supports both IPv4 and IPv6 XOR encoding
- Runs as async task receiving packets via mpsc channel
- 13 unit tests covering parsing, encoding, roundtrip, and error cases
…tion)

DemuxUdpSocket wraps quinn::AsyncUdpSocket to classify packets:
- [0..3] STUN: tee to mpsc channel AND pass through to Quinn
- [4..19] DROP ranges: pass through (Quinn discards naturally)
- [20..63] DTLS: pass through for future str0m handling
- [64..255] QUIC/RTP: pass through unchanged
- Best-effort STUN channel send (never blocks Quinn on full channel)
- 5 unit tests covering all byte ranges and channel behavior
…responder()

- Add quinn 0.11 as direct dependency (for AsyncUdpSocket trait)
- Export demux and stun_responder modules
- Add StunHandle struct and start_stun_responder() integration point
- Spawns responder task, returns channel handles for demuxer wiring
- cargo check -p ad4m-executor passes, all tests pass
- AddressPublisher periodically fetches local socket addresses and broadcasts
  them as signed PerspectiveExpressions to all joined neighbourhoods
- Only re-broadcasts when addresses change (cached set comparison)
- Signal format: {type: "iroh-ice-addresses", addresses: [...], agent_did: "..."}
- Unit tests for JSON format, empty addresses, change detection, roundtrip serialization
- Updated TODO in holochain_service with specific upstream change needed
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

Walkthrough

This PR introduces an Iroh-based ICE (Interactive Connectivity Establishment) candidate generation system to replace external STUN/TURN services. It adds a new iroh_ice module with UDP demultiplexing, STUN responder implementation, and ICE address publishing, while exposing candidates via a GraphQL query and extending HolochainService to provide local socket addresses.

Changes

Cohort / File(s) Summary
TypeScript GraphQL Client & Schema
core/src/runtime/RuntimeClient.ts, core/src/runtime/RuntimeResolver.ts
Added iceCandidates() method to RuntimeClient with GraphQL query; introduced IceCandidateInfo object type with candidate, candidateType, address, and port fields for schema definition.
Rust Holochain Service Integration
rust-executor/src/holochain_service/interface.rs, rust-executor/src/holochain_service/mod.rs
Extended HolochainService interface and implementation with local_socket_addrs() to expose local socket addresses as service request/response pair; includes placeholder implementation pending ConductorHandle API availability.
Rust GraphQL Resolver & Types
rust-executor/src/graphql/query_resolvers.rs, rust-executor/src/graphql/graphql_types.rs
Added runtime_ice_candidates GraphQL query resolver with RUNTIME_HC_AGENT_INFO_READ_CAPABILITY enforcement; introduced IceCandidateInfo GraphQL object struct mapping local addresses to ICE candidates.
Iroh ICE Core Module
rust-executor/src/iroh_ice/mod.rs
Central ICE candidate derivation logic; exposes candidates_from_urls() to convert peer URLs into IceCandidate objects with candidate type determination (host/srflx) and priority calculation; includes STUN responder initialization via start_stun_responder().
STUN Responder Implementation
rust-executor/src/iroh_ice/stun_responder.rs
RFC 5389 STUN Binding Responder with header parsing, XOR-MAPPED-ADDRESS attribute generation for IPv4/IPv6, and async response loop; includes packet structures and comprehensive unit/integration tests.
UDP Demultiplexer
rust-executor/src/iroh_ice/demux.rs
Quinn/Iroh UDP socket wrapper implementing RFC 9443 packet classification (Stun, Drop, Dtls, QuicRtp); tees STUN packets to dedicated channel while passing other traffic through; delegates socket traits to inner implementation.
ICE Address Publisher
rust-executor/src/iroh_ice/address_publisher.rs
Periodic ICE address discovery and broadcast system; tracks cached addresses, detects changes, constructs signed Link/Perspective payloads, and broadcasts to neighbourhood perspectives; includes signal/entry data structures and comprehensive tests.
Crate Integration
rust-executor/src/lib.rs, rust-executor/Cargo.toml
Exposed iroh_ice public module; added quinn dependency version 0.11 for QUIC transport support.

Sequence Diagram(s)

sequenceDiagram
    participant Client as WebRTC Client
    participant GQL as GraphQL Server
    participant Query as Query Resolver
    participant HC as Holochain Service
    participant ICE as Iroh ICE Module
    participant Candidates as Candidates Generator
    
    Client->>GQL: query runtimeIceCandidates
    GQL->>Query: runtime_ice_candidates()
    Query->>Query: Check RUNTIME_HC_AGENT_INFO_READ_CAPABILITY
    Query->>HC: local_socket_addrs()
    HC-->>Query: Vec<String> (peer URLs)
    Query->>ICE: candidates_from_urls(urls)
    ICE->>Candidates: Parse each URL
    Candidates->>Candidates: Determine candidate type (host/srflx)
    Candidates->>Candidates: Compute foundation & priority
    Candidates-->>ICE: Vec<IceCandidate>
    ICE-->>Query: IceCandidate objects
    Query-->>GQL: Vec<IceCandidateInfo>
    GQL-->>Client: ICE candidates (JSON)
Loading
sequenceDiagram
    participant UDP as UDP Socket
    participant Demux as DemuxUdpSocket
    participant Classify as Classifier
    participant STUN as STUN Channel
    participant Quinn as Quinn Handler
    
    UDP->>Demux: Datagram arrives
    Demux->>Classify: classify_first_byte(byte)
    alt Byte 0-3: STUN
        Classify-->>Demux: PacketClass::Stun
        Demux->>STUN: try_send(StunPacket)
        Demux->>Quinn: poll_recv (pass through)
    else Byte 4-19: Drop
        Classify-->>Demux: PacketClass::Drop
        Demux->>Quinn: poll_recv (discarded by Quinn)
    else Byte 20-63: DTLS
        Classify-->>Demux: PacketClass::Dtls
        Demux->>Quinn: poll_recv (pass through)
    else Byte 64-255: QUIC/RTP
        Classify-->>Demux: PacketClass::QuicRtp
        Demux->>Quinn: poll_recv (pass through)
    end
Loading
sequenceDiagram
    participant Publisher as AddressPublisher
    participant Local as Local ICE Candidates
    participant Cache as Cached State
    participant Sign as Signing
    participant Broadcast as Neighbourhood Broadcast
    
    loop run() periodic
        Publisher->>Local: build_signal() - get local candidates
        Local-->>Publisher: IceAddressSignal
        Publisher->>Cache: Compare with cached addresses
        alt Addresses changed
            Publisher->>Cache: Update cache
            Publisher->>Sign: Sign Link & Perspective payloads
            Sign-->>Publisher: Signed payloads
            Publisher->>Broadcast: broadcast_to_neighbourhoods(signal)
            Broadcast-->>Broadcast: Send to all perspectives
        else Addresses unchanged
            Publisher-->>Publisher: Skip broadcast
        end
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • jhweir
  • lucksus

Poem

🐰 Hops to the code with glee,
STUN responders and candidates three,
Iroh brings peers up to par,
No TURN needed, we've come far! 🌐✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Linked Issues check ❓ Inconclusive The PR partially addresses #719 objectives: implements iroh-ice module, ICE candidate extraction, STUN/demux/address publisher; pending holochain dependency for local_socket_addrs() access and socket_factory hook for demuxer integration. Verify that holochain HcP2p surface and kitsune2 Transport PRs land to complete socket interception, local endpoint access, and full Phase 1 rollout into mesh/SFU as intended.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main addition: a sovereign NAT traversal module for WebRTC using Iroh's transport, which directly aligns with the changeset's core purpose.
Out of Scope Changes check ✅ Passed All changes are scoped to iroh-ice module implementation and supporting GraphQL/TypeScript APIs; no unrelated modifications to existing systems detected.
Docstring Coverage ✅ Passed Docstring coverage is 96.36% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/iroh-ice-v2

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Nitpick comments (1)
rust-executor/src/holochain_service/interface.rs (1)

240-249: Inconsistent error handling: use unreachable!() instead of Err.

All other methods in this file use _ => unreachable!() for the catch-all arm (see lines 81, 105, 115, 125, etc.). The mismatch case represents a programming bug—the handler in mod.rs always sends LocalSocketAddrs response for a LocalSocketAddrs request—so it should panic rather than return a recoverable error.

♻️ Proposed fix
     pub async fn local_socket_addrs(&self) -> Result<Vec<String>, AnyError> {
         let (response_tx, response_rx) = oneshot::channel();
         self.sender
             .send(HolochainServiceRequest::LocalSocketAddrs(response_tx))?;
         match response_rx.await? {
             HolochainServiceResponse::LocalSocketAddrs(result) => result,
-            _ => Err(anyhow::anyhow!("unexpected response")),
+            _ => unreachable!(),
         }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust-executor/src/holochain_service/interface.rs` around lines 240 - 249, The
catch-all match arm in the async function local_socket_addrs currently returns
Err(...) for an impossible response; change that arm to call unreachable!() to
match the file's other methods and signal a programming bug. Update the match on
response_rx.await? that matches
HolochainServiceResponse::LocalSocketAddrs(result) so the default case uses
unreachable!() instead of returning an Err, keeping the rest of
local_socket_addrs and the
HolochainServiceRequest::LocalSocketAddrs/HolochainServiceResponse::LocalSocketAddrs
pairing unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@core/src/runtime/RuntimeClient.ts`:
- Around line 531-543: The iceCandidates method has syntax and usage issues:
remove the escaped backtick and use the proper gql template literal (gql`...`),
call the existing GraphQL client used elsewhere (replace this.apolloClient with
the instance name used by this class, e.g. this.client), and wrap the query call
with this.unwrapApolloResult(...) like other query methods; specifically, call
this.unwrapApolloResult(await this.client.query({ query: gql`query
runtimeIceCandidates { runtimeIceCandidates { candidate candidateType address
port } }`, fetchPolicy: "no-cache" })) and return the .runtimeIceCandidates
field from the unwrapped result.

In `@core/src/runtime/RuntimeResolver.ts`:
- Around line 566-579: The new IceCandidateInfo type is defined but not exposed
as a query, so RuntimeClient.iceCandidates() has no resolver; add a Query in
RuntimeResolver that returns an array of IceCandidateInfo (e.g. add a method
runtimeIceCandidates or iceCandidates decorated with `@Query`(() =>
[IceCandidateInfo]) that returns Promise<IceCandidateInfo[]>). Implement the
resolver method to delegate to the existing runtime ice-candidate source (use
whatever internal property or service RuntimeResolver already uses for
runtimeIceCandidates) and ensure the method name matches the client call
(RuntimeClient.iceCandidates()) so the TypeGraphQL schema includes the field.

In `@rust-executor/src/graphql/query_resolvers.rs`:
- Around line 801-803: The code currently hides failures by calling
local_socket_addrs().await.unwrap_or_default(), making transport/setup errors
indistinguishable from a genuine empty address list; change this so you
explicitly handle the Result from
get_holochain_service().await.local_socket_addrs() (do not use
unwrap_or_default), propagate or convert the Err into a GraphQL/handler error or
log it with context, and only call crate::iroh_ice::candidates_from_urls(&addrs)
when you have Ok(addrs); update the surrounding function to return an
appropriate error type or map the error into the GraphQL response so real lookup
failures aren’t treated as an empty candidate list.

In `@rust-executor/src/iroh_ice/address_publisher.rs`:
- Around line 49-77: build_signal currently uses unwrap_or_default on
local_socket_addrs and unconditionally replaces cached_addresses with an empty
set on read errors or transient empty results; change it to detect error vs
success and only mutate cache when you have a successful non-empty read or when
explicitly generating a withdrawal signal. Concretely, replace unwrap_or_default
by matching the Result from local_socket_addrs(): on Err return None (do not
clear cache) or on Ok(addrs) continue; if Ok and candidates -> entries is empty
but cached is non-empty, create and return an IceAddressSignal that represents a
withdrawal (addresses empty) and then clear cached_addresses; otherwise proceed
as now and update cached_addresses only when entries non-empty. Reference:
build_signal, local_socket_addrs, cached_addresses, IceAddressSignal.

In `@rust-executor/src/iroh_ice/demux.rs`:
- Around line 95-111: The demux currently only tees STUN via self.stun_tx
(StunPacket) while leaving bufs/meta unchanged, so non-QUIC packets
(PacketClass::Drop, PacketClass::Dtls, PacketClass::QuicRtp) never reach WebRTC
consumers; either filter/compact the bufs and meta slices to only include the
QUIC packets before returning to Quinn (remove entries for Drop/Dtls/QuicRtp) or
add dedicated output channels/sinks (e.g., dtls_tx, webrtc_tx or similar) and
send those packets there (using the same pattern as self.stun_tx.try_send),
ensuring you update the match arms for PacketClass::Drop, PacketClass::Dtls, and
PacketClass::QuicRtp and adjust the return payload to reflect the reduced
bufs/meta when passing through to Quinn.

In `@rust-executor/src/iroh_ice/mod.rs`:
- Around line 38-70: The code currently infers candidate type from the IP
literal in candidate_type_for and uses that inside candidates_from_urls;
instead, propagate an explicit candidate provenance (e.g., an enum like
SourceKind { LocalBind, PublicAdvertised, Relayed, ... } or a boolean flag) from
the callers (address_publisher.rs) into candidates_from_urls and use that
provenance when building the candidate string rather than candidate_type_for;
update candidates_from_urls signature to accept the source kind alongside each
URL (or accept a list of (url, source_kind)), remove or de-prioritize
candidate_type_for inference for bound addresses, and ensure candidate
construction emits the correct typ (host/srflx/relay) and includes raddr/rport
when needed based on the provenance.
- Around line 25-28: The code currently uses
parsed.port_or_known_default().unwrap_or(443) which synthesizes a 443 port for
any URL; replace that logic to only accept an explicit port. Use parsed.port()
(instead of port_or_known_default()) and return None when it is None so that
URLs without an exposed port do not produce a fake host:port candidate; keep the
parsed.host_str() check as-is to locate the host extraction logic.

In `@rust-executor/src/iroh_ice/stun_responder.rs`:
- Around line 53-79: The parse_stun_header function currently accepts packets
based only on header size and cookie; update parse_stun_header to validate that
the declared body length (msg_length) is actually present by checking that
data.len() >= STUN_HEADER_SIZE + msg_length as usize and return None if the body
is truncated; keep existing checks for top-bit and MAGIC_COOKIE and populate
StunHeader only when the full message (header + body) is available. Also add a
regression test that constructs a valid header with msg_length > 0 but provides
only the 20-byte header and asserts parse_stun_header returns None (or that the
responder does not send a response) to prevent future regressions.

---

Nitpick comments:
In `@rust-executor/src/holochain_service/interface.rs`:
- Around line 240-249: The catch-all match arm in the async function
local_socket_addrs currently returns Err(...) for an impossible response; change
that arm to call unreachable!() to match the file's other methods and signal a
programming bug. Update the match on response_rx.await? that matches
HolochainServiceResponse::LocalSocketAddrs(result) so the default case uses
unreachable!() instead of returning an Err, keeping the rest of
local_socket_addrs and the
HolochainServiceRequest::LocalSocketAddrs/HolochainServiceResponse::LocalSocketAddrs
pairing unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f3105e3f-825a-40e1-ba21-f595219e54fb

📥 Commits

Reviewing files that changed from the base of the PR and between 93945d8 and 0813307.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (12)
  • core/src/runtime/RuntimeClient.ts
  • core/src/runtime/RuntimeResolver.ts
  • rust-executor/Cargo.toml
  • rust-executor/src/graphql/graphql_types.rs
  • rust-executor/src/graphql/query_resolvers.rs
  • rust-executor/src/holochain_service/interface.rs
  • rust-executor/src/holochain_service/mod.rs
  • rust-executor/src/iroh_ice/address_publisher.rs
  • rust-executor/src/iroh_ice/demux.rs
  • rust-executor/src/iroh_ice/mod.rs
  • rust-executor/src/iroh_ice/stun_responder.rs
  • rust-executor/src/lib.rs

Comment on lines +531 to +543
async iceCandidates(): Promise<any[]> {
const { data } = await this.apolloClient.query({
query: gql\`query runtimeIceCandidates {
runtimeIceCandidates {
candidate
candidateType
address
port
}
}\`,
fetchPolicy: "no-cache"
})
return data.runtimeIceCandidates
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -e
sed -n '530,544p' core/src/runtime/RuntimeClient.ts
printf '\n--- Apollo client field usages in RuntimeClient ---\n'
rg -n -C1 '#apolloClient|this\.apolloClient' core/src/runtime/RuntimeClient.ts

Repository: coasys/ad4m

Length of output: 11159


Fix the new query method syntax and field reference before this file can parse.

The method uses invalid syntax with escaped backticks (gql\``), accesses a non-existent field (this.apolloClient), and is the only query method that skips unwrapApolloResult()` wrapping.

🛠️ Suggested fix
-import { RuntimeInfo, ExceptionInfo, SentMessage, NotificationInput, Notification, TriggeredNotification, ImportResult, UserStatistics } from "./RuntimeResolver"
+import { RuntimeInfo, ExceptionInfo, SentMessage, NotificationInput, Notification, TriggeredNotification, ImportResult, UserStatistics, IceCandidateInfo } from "./RuntimeResolver"
...
-    async iceCandidates(): Promise<any[]> {
-        const { data } = await this.apolloClient.query({
-            query: gql\`query runtimeIceCandidates {
+    async iceCandidates(): Promise<IceCandidateInfo[]> {
+        const { runtimeIceCandidates } = unwrapApolloResult(await this.#apolloClient.query({
+            query: gql`query runtimeIceCandidates {
                 runtimeIceCandidates {
                     candidate
                     candidateType
                     address
                     port
                 }
-            }\`,
+            }`,
             fetchPolicy: "no-cache"
-        })
-        return data.runtimeIceCandidates
+        }))
+        return runtimeIceCandidates
     }
🧰 Tools
🪛 Biome (2.4.6)

[error] 533-533: unexpected token \

(parse)


[error] 534-534: expected , but instead found runtimeIceCandidates

(parse)


[error] 534-534: expected , but instead found {

(parse)


[error] 535-535: expected , but instead found candidate

(parse)


[error] 536-536: expected , but instead found candidateType

(parse)


[error] 537-537: expected , but instead found address

(parse)


[error] 538-538: expected , but instead found port

(parse)


[error] 540-540: expected , but instead found }

(parse)


[error] 540-540: unexpected token \

(parse)


[error] 542-542: Expected a statement but instead found ')'.

(parse)


[error] 543-543: Illegal return statement outside of a function

(parse)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/src/runtime/RuntimeClient.ts` around lines 531 - 543, The iceCandidates
method has syntax and usage issues: remove the escaped backtick and use the
proper gql template literal (gql`...`), call the existing GraphQL client used
elsewhere (replace this.apolloClient with the instance name used by this class,
e.g. this.client), and wrap the query call with this.unwrapApolloResult(...)
like other query methods; specifically, call this.unwrapApolloResult(await
this.client.query({ query: gql`query runtimeIceCandidates { runtimeIceCandidates
{ candidate candidateType address port } }`, fetchPolicy: "no-cache" })) and
return the .runtimeIceCandidates field from the unwrapped result.

Comment on lines +566 to +579
@ObjectType()
export class IceCandidateInfo {
@Field()
candidate: string

@Field()
candidateType: string

@Field()
address: string

@Field(type => Int)
port: number
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Expose runtimeIceCandidates in the TypeScript schema too.

Adding the object type alone does not add the query field to the type-graphql schema used by the client tests, so RuntimeClient.iceCandidates() has no backing resolver here.

💡 Suggested follow-up inside RuntimeResolver
 `@Resolver`()
 export default class RuntimeResolver {
+    `@Query`(returns => [IceCandidateInfo])
+    runtimeIceCandidates(): IceCandidateInfo[] {
+        return []
+    }
+
     `@Mutation`(returns => Boolean)
     runtimeQuit(): Boolean {
         return true
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@core/src/runtime/RuntimeResolver.ts` around lines 566 - 579, The new
IceCandidateInfo type is defined but not exposed as a query, so
RuntimeClient.iceCandidates() has no resolver; add a Query in RuntimeResolver
that returns an array of IceCandidateInfo (e.g. add a method
runtimeIceCandidates or iceCandidates decorated with `@Query`(() =>
[IceCandidateInfo]) that returns Promise<IceCandidateInfo[]>). Implement the
resolver method to delegate to the existing runtime ice-candidate source (use
whatever internal property or service RuntimeResolver already uses for
runtimeIceCandidates) and ensure the method name matches the client call
(RuntimeClient.iceCandidates()) so the TypeGraphQL schema includes the field.

Comment on lines +801 to +803
let interface = get_holochain_service().await;
let addrs = interface.local_socket_addrs().await.unwrap_or_default();
let candidates = crate::iroh_ice::candidates_from_urls(&addrs);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't collapse address lookup failures into an empty candidate list.

unwrap_or_default() makes transport/setup errors indistinguishable from a real [] response, so ICE discovery can fail while the API still looks healthy.

🩹 Suggested fix
         let interface = get_holochain_service().await;
-        let addrs = interface.local_socket_addrs().await.unwrap_or_default();
+        let addrs = interface.local_socket_addrs().await?;
         let candidates = crate::iroh_ice::candidates_from_urls(&addrs);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust-executor/src/graphql/query_resolvers.rs` around lines 801 - 803, The
code currently hides failures by calling
local_socket_addrs().await.unwrap_or_default(), making transport/setup errors
indistinguishable from a genuine empty address list; change this so you
explicitly handle the Result from
get_holochain_service().await.local_socket_addrs() (do not use
unwrap_or_default), propagate or convert the Err into a GraphQL/handler error or
log it with context, and only call crate::iroh_ice::candidates_from_urls(&addrs)
when you have Ok(addrs); update the surrounding function to return an
appropriate error type or map the error into the GraphQL response so real lookup
failures aren’t treated as an empty candidate list.

Comment on lines +49 to +77
pub async fn build_signal(&self) -> Option<IceAddressSignal> {
let interface = get_holochain_service().await;
let addrs = interface.local_socket_addrs().await.unwrap_or_default();
let candidates = crate::iroh_ice::candidates_from_urls(&addrs);

let entries: HashSet<IceAddressEntry> = candidates
.into_iter()
.map(|c| IceAddressEntry {
address: c.address,
port: c.port,
candidate: c.candidate,
})
.collect();

// Check if addresses changed
let mut cached = self.cached_addresses.lock().await;
if *cached == entries {
debug!("iroh-ice: addresses unchanged, skipping broadcast");
return None;
}

*cached = entries.clone();

let agent_did = agent::did();
Some(IceAddressSignal {
signal_type: "iroh-ice-addresses".to_string(),
addresses: entries.into_iter().collect(),
agent_did,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't clear the cache on empty/error reads without a withdrawal broadcast.

On lookup failure unwrap_or_default() turns the candidate set into empty, build_signal() overwrites cached_addresses, and run() then suppresses the empty update. After either a transient error or a real interface loss, remote peers keep stale candidates forever.

🩹 Suggested fix
     pub async fn build_signal(&self) -> Option<IceAddressSignal> {
         let interface = get_holochain_service().await;
-        let addrs = interface.local_socket_addrs().await.unwrap_or_default();
+        let addrs = match interface.local_socket_addrs().await {
+            Ok(addrs) => addrs,
+            Err(e) => {
+                warn!("iroh-ice: failed to load local socket addresses: {}", e);
+                return None;
+            }
+        };
         let candidates = crate::iroh_ice::candidates_from_urls(&addrs);
...
             match self.build_signal().await {
                 Some(signal) => {
-                    if signal.addresses.is_empty() {
-                        debug!("iroh-ice: no addresses to broadcast (empty list)");
-                        continue;
-                    }
                     info!(
                         "iroh-ice: broadcasting {} addresses to neighbourhoods",
                         signal.addresses.len()
                     );
                     self.broadcast_to_neighbourhoods(&signal).await;

Also applies to: 160-170

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust-executor/src/iroh_ice/address_publisher.rs` around lines 49 - 77,
build_signal currently uses unwrap_or_default on local_socket_addrs and
unconditionally replaces cached_addresses with an empty set on read errors or
transient empty results; change it to detect error vs success and only mutate
cache when you have a successful non-empty read or when explicitly generating a
withdrawal signal. Concretely, replace unwrap_or_default by matching the Result
from local_socket_addrs(): on Err return None (do not clear cache) or on
Ok(addrs) continue; if Ok and candidates -> entries is empty but cached is
non-empty, create and return an IceAddressSignal that represents a withdrawal
(addresses empty) and then clear cached_addresses; otherwise proceed as now and
update cached_addresses only when entries non-empty. Reference: build_signal,
local_socket_addrs, cached_addresses, IceAddressSignal.

Comment on lines +95 to +111
match class {
PacketClass::Stun => {
// Tee to STUN responder (best-effort, don't block Quinn)
let data = buf[..meta[i].len].to_vec();
let source = meta[i].addr;
let _ = self.stun_tx.try_send(StunPacket { data, source });
// Pass through to Quinn (don't modify bufs/meta)
}
PacketClass::Drop => {
// We'd ideally remove this from the results, but Quinn's
// poll_recv API makes that awkward. The dropped ranges (4-19)
// won't match any valid QUIC or STUN packet, so Quinn will
// discard them anyway. Pass through harmlessly.
}
PacketClass::Dtls | PacketClass::QuicRtp => {
// Pass through unchanged
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -e
sed -n '48,117p' rust-executor/src/iroh_ice/demux.rs
printf '\n--- demux outputs ---\n'
rg -n -C2 'stun_tx|PacketClass::Drop|PacketClass::Dtls|PacketClass::QuicRtp' rust-executor/src/iroh_ice/demux.rs

Repository: coasys/ad4m

Length of output: 4390


This is still a STUN tee, not a full RFC 9443 demux.

Only stun_tx is defined as an output channel. PacketClass::Drop, PacketClass::Dtls, and PacketClass::QuicRtp all pass through unchanged to Quinn, so non-QUIC traffic never reaches any WebRTC-side consumer once this wrapper is mounted on the shared socket.

Either compact bufs/meta down to the QUIC subset before returning, or add dedicated sinks for the other classes before wiring this into the shared-socket path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust-executor/src/iroh_ice/demux.rs` around lines 95 - 111, The demux
currently only tees STUN via self.stun_tx (StunPacket) while leaving bufs/meta
unchanged, so non-QUIC packets (PacketClass::Drop, PacketClass::Dtls,
PacketClass::QuicRtp) never reach WebRTC consumers; either filter/compact the
bufs and meta slices to only include the QUIC packets before returning to Quinn
(remove entries for Drop/Dtls/QuicRtp) or add dedicated output channels/sinks
(e.g., dtls_tx, webrtc_tx or similar) and send those packets there (using the
same pattern as self.stun_tx.try_send), ensuring you update the match arms for
PacketClass::Drop, PacketClass::Dtls, and PacketClass::QuicRtp and adjust the
return payload to reflect the reduced bufs/meta when passing through to Quinn.

Comment on lines +25 to +28
if let Ok(parsed) = url::Url::parse(url_str) {
if let Some(host) = parsed.host_str() {
let port = parsed.port_or_known_default().unwrap_or(443);
return Some((host.to_string(), port));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't synthesize port 443 for URLs that don't expose one.

port_or_known_default().unwrap_or(443) quietly turns any custom-scheme or portless URL into host:443. That yields an unreachable ICE candidate instead of failing closed.

Suggested fix
-            let port = parsed.port_or_known_default().unwrap_or(443);
+            let port = parsed.port_or_known_default()?;
             return Some((host.to_string(), port));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust-executor/src/iroh_ice/mod.rs` around lines 25 - 28, The code currently
uses parsed.port_or_known_default().unwrap_or(443) which synthesizes a 443 port
for any URL; replace that logic to only accept an explicit port. Use
parsed.port() (instead of port_or_known_default()) and return None when it is
None so that URLs without an exposed port do not produce a fake host:port
candidate; keep the parsed.host_str() check as-is to locate the host extraction
logic.

Comment on lines +38 to +70
fn candidate_type_for(addr: &str) -> &'static str {
if let Ok(ip) = addr.parse::<std::net::IpAddr>() {
match ip {
std::net::IpAddr::V4(v4) if v4.is_loopback() || v4.is_private() => "host",
std::net::IpAddr::V6(v6) if v6.is_loopback() => "host",
_ => "srflx",
}
} else {
"srflx"
}
}

/// Convert peer URL strings into ICE candidate representations.
pub fn candidates_from_urls(urls: &[String]) -> Vec<IceCandidate> {
urls.iter()
.enumerate()
.filter_map(|(i, url)| {
let (host, port) = parse_url_addr(url)?;
let ctype = candidate_type_for(&host);

let mut hasher = DefaultHasher::new();
url.hash(&mut hasher);
let foundation = hasher.finish() % 10000;

let priority = if ctype == "host" {
2130706431u32.saturating_sub(i as u32)
} else {
1694498815u32.saturating_sub(i as u32)
};

let candidate_str = format!(
"candidate:{} 1 udp {} {} {} typ {}",
foundation, priority, host, port, ctype
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Model candidate provenance explicitly instead of inferring host/srflx from the IP literal.

rust-executor/src/iroh_ice/address_publisher.rs:49-60 already feeds local socket addrs into this helper, so a directly bound public or non-loopback IPv6 address will be emitted here as typ srflx even though it is still a host candidate. Because this path has no related base address, those srflx lines also can't include the required raddr/rport fields, and there is no route to emit a real relay candidate. Please carry the source kind into candidate construction instead of guessing from IpAddr. Based on learnings: In coasys/ad4m, the embedded str0m-based SFU operates server-side and only needs host candidates for its bound UDP address.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust-executor/src/iroh_ice/mod.rs` around lines 38 - 70, The code currently
infers candidate type from the IP literal in candidate_type_for and uses that
inside candidates_from_urls; instead, propagate an explicit candidate provenance
(e.g., an enum like SourceKind { LocalBind, PublicAdvertised, Relayed, ... } or
a boolean flag) from the callers (address_publisher.rs) into
candidates_from_urls and use that provenance when building the candidate string
rather than candidate_type_for; update candidates_from_urls signature to accept
the source kind alongside each URL (or accept a list of (url, source_kind)),
remove or de-prioritize candidate_type_for inference for bound addresses, and
ensure candidate construction emits the correct typ (host/srflx/relay) and
includes raddr/rport when needed based on the provenance.

Comment on lines +53 to +79
pub fn parse_stun_header(data: &[u8]) -> Option<StunHeader> {
if data.len() < STUN_HEADER_SIZE {
return None;
}

// Top two bits of first byte must be 0
if data[0] & 0xC0 != 0 {
return None;
}

let msg_type = u16::from_be_bytes([data[0], data[1]]);
let msg_length = u16::from_be_bytes([data[2], data[3]]);
let cookie = u32::from_be_bytes([data[4], data[5], data[6], data[7]]);

if cookie != MAGIC_COOKIE {
return None;
}

let mut transaction_id = [0u8; 12];
transaction_id.copy_from_slice(&data[8..20]);

Some(StunHeader {
msg_type,
msg_length,
transaction_id,
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate the declared STUN body length before accepting the packet.

Right now a 20-byte datagram with the right cookie and 0x0001 type is accepted even if msg_length says attributes should follow and the bytes are missing. The responder then sends a success response to truncated input on the shared public socket. Please also add a regression for a nonzero msg_length with no body.

Suggested fix
     let msg_type = u16::from_be_bytes([data[0], data[1]]);
     let msg_length = u16::from_be_bytes([data[2], data[3]]);
+    let total_len = STUN_HEADER_SIZE.checked_add(msg_length as usize)?;
+
+    if msg_length % 4 != 0 || data.len() < total_len {
+        return None;
+    }
     let cookie = u32::from_be_bytes([data[4], data[5], data[6], data[7]]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@rust-executor/src/iroh_ice/stun_responder.rs` around lines 53 - 79, The
parse_stun_header function currently accepts packets based only on header size
and cookie; update parse_stun_header to validate that the declared body length
(msg_length) is actually present by checking that data.len() >= STUN_HEADER_SIZE
+ msg_length as usize and return None if the body is truncated; keep existing
checks for top-bit and MAGIC_COOKIE and populate StunHeader only when the full
message (header + body) is available. Also add a regression test that constructs
a valid header with msg_length > 0 but provides only the 20-byte header and
asserts parse_stun_header returns None (or that the responder does not send a
response) to prevent future regressions.

@HexaField HexaField marked this pull request as draft March 11, 2026 22:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: Iroh-based ICE transport to replace STUN/TURN for all WebRTC

1 participant