@dignifiedquire dignifiedquire commented Jul 7, 2025

Work on integrating n0-computer/quinn#28 into the iroh magic

TODOs

@n0bot n0bot bot added this to iroh Jul 7, 2025
@github-project-automation github-project-automation bot moved this to 🏗 In progress in iroh Jul 7, 2025
github-actions bot commented Jul 7, 2025

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3381/docs/iroh/

Last updated: 2025-12-05T14:47:54Z

github-actions bot commented Jul 7, 2025

Netsim report & logs for this PR have been generated and are available at: LOGS
This report will remain available for 3 days.

Last updated for commit: 66a8565

@dignifiedquire dignifiedquire changed the title [WIP] feat: use quinn multipath [WIP] feat: use quic multipath Jul 8, 2025
@dignifiedquire dignifiedquire force-pushed the feat-multipath branch 3 times, most recently from 72cb071 to db712c0 Compare July 18, 2025 14:39
@dignifiedquire dignifiedquire force-pushed the feat-multipath branch 2 times, most recently from 4827e62 to 946f71c Compare July 28, 2025 20:29
flub and others added 30 commits November 14, 2025 13:44
…3664)

## Description

This means these tests also work when nextest is run locally.

## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

I'm not sure why this wasn't done when the ci profile override was
chosen.  What am I missing?

## Change checklist
<!-- Remove any that are not relevant. -->
- [x] Self-review.
## Description

Bumps netwatch and netdev, to remove duplicate dependency on both
[email protected] and [email protected].

## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist
<!-- Remove any that are not relevant. -->
- [ ] Self-review.
- [ ] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [ ] Tests if relevant.
- [ ] All breaking changes documented.
- [ ] List all breaking changes in the above "Breaking Changes" section.
- [ ] Open an issue or PR on any number0 repos that are affected by this
breaking change. Give guidance on how the updates should be handled or
do the actual updates themselves. The major ones are:
    - [ ] [`quic-rpc`](https://github.com/n0-computer/quic-rpc)
    - [ ] [`iroh-gossip`](https://github.com/n0-computer/iroh-gossip)
    - [ ] [`iroh-blobs`](https://github.com/n0-computer/iroh-blobs)
    - [ ] [`dumbpipe`](https://github.com/n0-computer/dumbpipe)
    - [ ] [`sendme`](https://github.com/n0-computer/sendme)
This is a next step into the world of configurable transports. We now
allow disabling the IP based transports entirely.
Internally this starts to prepare for a world where the user can
configure multiple different transports, IP, relay and others in the
future.

Closes #2957
## Description

Remove the test-only `Endpoint::path_selection` API and instead use
`Endpoint::clear_ip_transports` for `PathSelection::RelayOnly`, now
that this public API was added in
#3651.

…moteState (#3673)

## Description

Renames:
* renamed `endpoint_map` -> `remote_map`, `EndpointMap` -> `RemoteMap`,
`endpoint_state` -> `remote_state`, `EndpointStateActor` ->
`RemoteStateActor`

Moved:
* moved `path_state` module under `remote_state` (previously
`endpoint_state`), because its items are used only there


---------

Co-authored-by: Floris Bruynooghe <[email protected]>
## Description

Merges main and adapts for the changes from #3619 


---------

Co-authored-by: Rüdiger Klaehn <[email protected]>
Co-authored-by: Friedel Ziegelmayer <[email protected]>
## Description

Avoid potentially busy looping in a tokio task.
I think this blocking leads to tokio not being able to close the runtime
properly.
## Description

Fixes #3642 

This moves discovery handling fully into the `EndpointStateActor`.
The pub(crate) interface to trigger discovery and get an
`EndpointMappedAddr` is now `Magicsock::resolve_remote`, which sends the
provided addresses to the `EndpointStateActor`. The actor starts discovery
if it does not have a selected path and discovery is not already running.
It either returns immediately if there are any known paths, or waits for
discovery to produce at least one result or an error. Once the actor
responds, `resolve_remote` returns either an `EndpointMappedAddr` or the
discovery error.

This means the current behavior is kept: We only start
`quinn::Endpoint::connect` once we have at least one transport address
for the remote. If not, we return the discovery error immediately from
`iroh::Endpoint::connect`.

This opens the door for us to easily tune when to run discovery in other
situations, e.g. when all available paths to a remote are closed.
However, for now this PR still only starts discovery when
`Endpoint::connect` is called and no path is currently selected.
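
To make the decision logic above concrete, here is a minimal synchronous sketch. It is not the actor's actual code: `RemoteState`, `Resolution` and `resolve` are hypothetical names, the real actor is asynchronous and message-driven, and treating the provided addresses as known paths is an assumption made for illustration.

```rust
use std::net::SocketAddr;

/// Hypothetical, simplified stand-in for the state the actor keeps per remote.
#[derive(Debug, Default)]
struct RemoteState {
    known_paths: Vec<SocketAddr>,
    selected_path: Option<SocketAddr>,
    discovery_running: bool,
}

/// What the caller of the (hypothetical) resolve step should do next.
#[derive(Debug, PartialEq, Eq)]
enum Resolution {
    /// At least one path is known, so connecting can start right away.
    Ready,
    /// Nothing is known yet: wait for the first discovery result or error.
    WaitForDiscovery,
}

impl RemoteState {
    fn resolve(&mut self, provided: &[SocketAddr]) -> Resolution {
        // Assumption: addresses passed in by the caller become known paths.
        for addr in provided {
            if !self.known_paths.contains(addr) {
                self.known_paths.push(*addr);
            }
        }
        // Start discovery only when no path is selected and it is not running yet.
        if self.selected_path.is_none() && !self.discovery_running {
            self.discovery_running = true;
        }
        // Return immediately when any path is known, otherwise wait.
        if self.known_paths.is_empty() {
            Resolution::WaitForDiscovery
        } else {
            Resolution::Ready
        }
    }
}
```

Note that in this sketch discovery is started even when the caller can proceed immediately, which matches the description: the two conditions (start discovery, wait for discovery) are independent.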


## Description

* fix idle timeout clear condition (previously it would hot loop)
* fix hot loop when local_addrs watchable becomes disconnected during
shutdown
* when sending a datagram fails in the transports sender, include the
dst address in the error message
* do not break the RemoteStateActor when sending a datagram fails


---------

Co-authored-by: Philipp Krüger <[email protected]>
## Description

This reverts a change from this PR:
#3384

I originally thought I could make this test more reliable by pausing the
tokio time across the `tokio::time::timeout` calls, but it turns out
that actually makes the test *more* flaky:
- When time is paused, the timeout will immediately fire once the tokio
runtime has no more CPU work to do.
- It's possible that there's no CPU work to do anymore, while there's
something else that is actually still doing work, e.g. networking.
- Before the `ActiveRelayActor` finishes its `run_connected` loop, it
will call `client_sink.close().await`, which will do actual I/O. When
the tokio runtime is paused at that moment, it'll immediately trigger
the test's timeout.

## Notes & open questions

I couldn't reproduce this problem even across a couple thousand runs of
the test locally. I'm not super confident that this fixes things, but
I've analyzed the logs and this seems to be the most likely thing that's
happening to me.

Closes #3613 

## Change checklist
<!-- Remove any that are not relevant. -->
- [x] Self-review.
## Description

This switches from the old DISCO to the so-new-it-doesn't-exist-yet QUIC
NAT Traversal.

## Breaking Changes

Nothing visible?  Maybe?

## Notes & open questions

The QUIC NAT Traversal API doesn't exist yet, so this won't even build
on any machine that's not mine.  I've locally patched in the dummy
methods that I use.

---------

Co-authored-by: dignifiedquire <[email protected]>
Co-authored-by: Frando <[email protected]>
## Description

This adds the concept of hooks to the iroh endpoint. `Hooks` are structs
implementing the `EndpointHooks` trait and are used to intercept the
establishment of connections. Multiple hooks can be added to the
endpoint, and they are invoked in the order in which they were added.

Currently there are two methods on the `EndpointHooks` trait:

* `before_connect` is invoked before an outgoing connection is started.
* `after_handshake` is invoked for incoming and outgoing connections
once the TLS handshake has completed.

Both methods return an `Outcome`, which can either be `Reject` or
`Accept`. If any hook returns `Reject`, the connection or connection
attempt will be rejected.

The PR also adds `ConnectionInfo`, a struct that holds information about
a connection but does not keep the connection itself alive. It allows
inspecting stats and paths, and its `closed` method returns a future
that completes once the connection closes (without keeping the
connection alive).

The PR includes two examples:
* `auth-hook` implements authentication for iroh protocols through a
middleware and a separate authentication protocol. Individual protocols
don't need to be aware of authentication at all.
* `monitor-connections` monitors incoming and outgoing connections and
prints connection stats once a connection closes.
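
As a rough illustration of the "any hook can reject" rule, here is a simplified sketch. It is not the real `EndpointHooks` trait: the `Hooks` trait below is synchronous, identifies remotes by a plain string, and its default method bodies, `Blocklist`, `AllowAll` and `run_before_connect` are all hypothetical names chosen for this example.

```rust
/// Result of a hook; mirrors the `Accept`/`Reject` outcome described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Accept,
    Reject,
}

/// Simplified, synchronous stand-in for the hooks trait (an assumption,
/// not the actual signature).
trait Hooks {
    fn before_connect(&self, _remote: &str) -> Outcome {
        Outcome::Accept
    }
    fn after_handshake(&self, _remote: &str) -> Outcome {
        Outcome::Accept
    }
}

/// Hypothetical hook that rejects outgoing connections to listed remotes.
struct Blocklist(Vec<String>);

impl Hooks for Blocklist {
    fn before_connect(&self, remote: &str) -> Outcome {
        if self.0.iter().any(|blocked| blocked == remote) {
            Outcome::Reject
        } else {
            Outcome::Accept
        }
    }
}

/// Hypothetical hook that relies on the accepting defaults.
struct AllowAll;

impl Hooks for AllowAll {}

/// Runs hooks in insertion order; the first `Reject` ends the attempt.
fn run_before_connect(hooks: &[Box<dyn Hooks>], remote: &str) -> Outcome {
    for hook in hooks {
        if hook.before_connect(remote) == Outcome::Reject {
            return Outcome::Reject;
        }
    }
    Outcome::Accept
}
```

Stopping at the first `Reject` is a design choice of this sketch; the description only states that a single rejecting hook is enough to reject the connection.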


---------

Co-authored-by: dignifiedquire <[email protected]>
Co-authored-by: ramfox <[email protected]>
## Description

Since the server was actively closing the connection it is possible
that the client would not have read the response yet by the time the
connection is closed.


<!-- Message of single commit: -->
## Description

Based on #3593 

`Endpoint::latency` was always just a placeholder, and latency can now be
collected using `EndpointHooks`.

## Breaking Changes

- remove `Endpoint::latency`

---------

Co-authored-by: varun-doshi <[email protected]>
## Description

Bumps quinn to latest `main-iroh` and netwatch/portmapper to
n0-computer/net-tools#72 (the latter is needed
because the quinn-udp version changed to 0.6 on `main-iroh`).

## Description

Whenever we insert a new path, we trigger path pruning.

We currently only prune IP paths, and pruning paths only occurs if we
have more than 30 IP paths.

We will prune any paths that did not successfully holepunch. 

If there are still over 30 IP paths left, then we order the "inactive"
paths (paths that have been closed, but at one point holepunched), and
prune the paths that were closed earliest.

## Notes and Questions

- Added constants:
    - `MAX_IP_PATHS` = 30 - maximum IP paths per endpoint
    - `MAX_INACTIVE_IP_PATHS` = 10 - maximum inactive IP paths to keep
- New `PathState` field:
    - `status` - tracks the `PathStatus` of the path
- New `PathStatus` enum:
    - `PathStatus::Open` - is an open path
    - `PathStatus::Inactive(Instant)` - was opened once, but currently inactive
    - `PathStatus::Unusable` - we attempted to use it, but it never connected
    - `PathStatus::Unknown` - we don't know the status yet
- New methods on `RemotePathState`:
    - `abandoned_path` - marks a path as abandoned with timestamp, triggered when we get the `PathEvent::Abandoned` event
    - `prune_paths` - triggers path pruning, occurs whenever we insert a path to the `RemotePathState`
    - changed `insert` to `insert_open_path`
- New `prune_ip_paths` function with all the prune logic:
    - Only prunes if IP paths exceed `MAX_IP_PATHS`
    - Never prunes active paths or paths of unknown status
    - Always prunes failed holepunch attempts (PathStatus::Unusable)
    - Keeps 10 most recently inactive paths that were previously successful
- Special case: if all paths failed, keeps `MAX_IP_PATHS` instead of pruning everything
- Added tests for edge cases and the typical case
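
The pruning rules listed above can be sketched roughly as follows. This is an interpretation, not the PR's `prune_ip_paths`: it works on a bare list of statuses rather than on `RemotePathState`, and which failed paths survive in the all-failed special case is not stated above, so keeping the first `MAX_IP_PATHS` entries is an assumption.

```rust
use std::time::Instant;

const MAX_IP_PATHS: usize = 30;
const MAX_INACTIVE_IP_PATHS: usize = 10;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PathStatus {
    Open,
    /// Holepunched at some point, closed at the given instant.
    Inactive(Instant),
    Unusable,
    Unknown,
}

/// Returns the statuses that survive pruning, preserving their order.
fn prune_ip_paths(paths: Vec<PathStatus>) -> Vec<PathStatus> {
    // Only prune once the number of IP paths exceeds the limit.
    if paths.len() <= MAX_IP_PATHS {
        return paths;
    }
    // Special case: everything failed, so keep MAX_IP_PATHS instead of nothing.
    // Assumption: the first entries are the ones kept.
    if paths.iter().all(|p| *p == PathStatus::Unusable) {
        return paths.into_iter().take(MAX_IP_PATHS).collect();
    }
    // Closing time of the 10th most recently closed inactive path, if any.
    let mut closed: Vec<Instant> = paths
        .iter()
        .filter_map(|p| match p {
            PathStatus::Inactive(t) => Some(*t),
            _ => None,
        })
        .collect();
    closed.sort_by(|a, b| b.cmp(a)); // most recent first
    let cutoff = closed.get(MAX_INACTIVE_IP_PATHS - 1).copied();

    paths
        .into_iter()
        .filter(|p| match p {
            // Never prune active paths or paths of unknown status.
            PathStatus::Open | PathStatus::Unknown => true,
            // Always prune failed holepunch attempts.
            PathStatus::Unusable => false,
            // Keep inactive paths closed at or after the cutoff
            // (ties at the cutoff instant are all kept).
            PathStatus::Inactive(t) => cutoff.map_or(true, |c| *t >= c),
        })
        .collect()
}
```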
…ain minimums when used with multipath (#3721)

## Description

This PR encapsulates the `quinn::TransportConfig` in a new struct called
`QuicTransportConfig`. It has all of the same methods as
`quinn::TransportConfig`, but the following methods will log warnings if
the user-supplied values make iroh + multipath sub-optimal:
- `default_path_keep_alive_interval()` - should be at most
`HEARTBEAT_INTERVAL`ms
- `default_path_max_idle_timeout()` - should be at most
`PATH_MAX_IDLE_TIMEOUT`ms
- `max_concurrent_multipath_paths()` - should be at least `MAX_MULTIPATH +
1`
- `set_max_remote_nat_traversal_addresses()` - should be at least
`MAX_MULTIPATH`

These values are also set properly by default when creating a
`QuicTransportConfig`.
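
A rough sketch of the "warn but still apply" pattern, under several assumptions: `TransportConfigSketch` is a hypothetical stand-in for `QuicTransportConfig`, the constant values are placeholders (the real ones are not given here), the parameter types are guesses, and warnings are collected in a `Vec` instead of being logged so the example stays dependency-free.

```rust
use std::time::Duration;

// Placeholder values, chosen only for this example.
const HEARTBEAT_INTERVAL: Duration = Duration::from_secs(5);
const MAX_MULTIPATH: u32 = 8;

/// Hypothetical wrapper that accepts any value but records a warning
/// when the value falls outside the recommended bound.
#[derive(Debug, Clone, Default)]
struct TransportConfigSketch {
    keep_alive: Option<Duration>,
    max_paths: Option<u32>,
    warnings: Vec<String>,
}

impl TransportConfigSketch {
    /// "At most" bound: longer intervals get a warning.
    fn default_path_keep_alive_interval(&mut self, interval: Duration) -> &mut Self {
        if interval > HEARTBEAT_INTERVAL {
            self.warnings.push(format!(
                "keep-alive interval {interval:?} exceeds {HEARTBEAT_INTERVAL:?}"
            ));
        }
        // The user's value is still applied; the warning is advisory only.
        self.keep_alive = Some(interval);
        self
    }

    /// "At least" bound: smaller path counts get a warning.
    fn max_concurrent_multipath_paths(&mut self, max: u32) -> &mut Self {
        if max < MAX_MULTIPATH + 1 {
            self.warnings.push(format!(
                "max paths {max} is below the recommended {}",
                MAX_MULTIPATH + 1
            ));
        }
        self.max_paths = Some(max);
        self
    }
}
```

Warning instead of rejecting keeps the setters infallible, so the wrapper can mirror the underlying builder-style methods without changing their shape.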

## quinn encapsulation/co-location

Created a new mod `quic` and co-located all the quinn exports there.

This would be the mod where we do any future encapsulation.

Note: the `StaticConfig` struct is a private struct that still uses
`quinn::TransportConfig` directly.

closes #3635 

## Breaking Changes
- `iroh`
    - changes
        - `QuinnTransportConfig` renamed to `QuicTransportConfig` & is now `Clone`
        - `ConnectOptions::transport_config: Option<Arc<quinn::TransportConfig>>` -> `ConnectOptions::transport_config: Option<QuicTransportConfig>`
        - `ConnectOptions::with_transport_config(mut self, transport_config: Arc<quinn::TransportConfig>)` -> `ConnectOptions::with_transport_config(mut self, transport_config: QuicTransportConfig)`

## Open Questions
- [x] Need verification that my assumptions about "at least" and "at
most" for the above are the correct assumptions about what we want.
**done in #3635**
## Description

Adds a missing re-export for `quinn::Side`
Labels

None yet

Projects

Status: 🏗 In progress

6 participants