fix(client): race conditions in AutoTLS initialization #80
Draft
Two race conditions could cause forge addresses to not appear:

1. hasHost() returning false when address factory is called
   - provideHost now calls hostFn() immediately, ensuring hasHost() returns true right after ProvideHost() is called
   - prevents race where address factory runs before Start() goroutine
2. hasCert not being set when cert already exists
   - cached_managed_cert event only fires when cert is NEW to cache
   - if cert was already in certmagic's in-memory cache, no event fires
   - now explicitly set hasCert=true after ManageAsync() when we know cert existed in storage beforehand
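The first fix can be illustrated with a minimal sketch. It is not the p2p-forge code: `Host`, `hostSlot`, `hostCh`, and `resolved` are hypothetical names, and the only library call assumed is `sync.OnceValue` (Go 1.21+). The idea shown is that `ProvideHost` resolves `hostFn` itself instead of leaving that to a goroutine started later, so `HasHost` is true as soon as `ProvideHost` returns.

```go
package main

import (
	"fmt"
	"sync"
)

// Host is a hypothetical stand-in for a libp2p host.
type Host struct{ ID string }

// hostSlot is an illustrative type, not the p2p-forge implementation.
type hostSlot struct {
	hostCh   chan *Host    // buffered hand-off from ProvideHost
	resolved chan struct{} // closed once hostFn has produced a host
	hostFn   func() *Host  // blocks until a host has been provided
}

func newHostSlot() *hostSlot {
	s := &hostSlot{
		hostCh:   make(chan *Host, 1),
		resolved: make(chan struct{}),
	}
	// sync.OnceValue runs the function once and caches its result,
	// so the first provided host wins.
	s.hostFn = sync.OnceValue(func() *Host {
		defer close(s.resolved)
		return <-s.hostCh
	})
	return s
}

// ProvideHost hands over the host and resolves hostFn right away,
// rather than waiting for a separate goroutine to do it.
func (s *hostSlot) ProvideHost(h *Host) {
	select {
	case s.hostCh <- h:
	default: // a host is already waiting in the buffer; keep it
	}
	s.hostFn()
}

// HasHost reports whether hostFn has already been resolved, without blocking.
func (s *hostSlot) HasHost() bool {
	select {
	case <-s.resolved:
		return true
	default:
		return false
	}
}

func main() {
	s := newHostSlot()
	fmt.Println(s.HasHost()) // prints false
	s.ProvideHost(&Host{ID: "first"})
	fmt.Println(s.HasHost()) // prints true
	s.ProvideHost(&Host{ID: "second"})
	fmt.Println(s.hostFn().ID) // prints first
}
```

Because every `ProvideHost` call either places a host in the buffer or finds one already there before calling `hostFn`, the call cannot block, and repeated calls leave the first host in place.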
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main      #80      +/-   ##
==========================================
+ Coverage   64.97%   68.28%   +3.31%
==========================================
  Files          12       21       +9
  Lines        1062     1668     +606
==========================================
+ Hits          690     1139     +449
- Misses        292      410     +118
- Partials       80      119      +39
- TestHasHostTrueImmediatelyAfterProvideHost: verify hostFn is resolved immediately after ProvideHost, no deadlock when AddressFactory called
- TestAddressFactoryOrderIndependence: verify factory works before/after ProvideHost without deadlocking
- TestAddressFactoryConcurrentAccess: verify no data races with concurrent AddressFactory calls (run with -race flag)
- TestProvideHostIdempotent: verify second ProvideHost doesn't deadlock and first host wins via sync.OnceValue
The previous fix (5be6b87) addressed race conditions but had gaps:

1. onCertLoaded callback could fire before reachability was confirmed
   - OnEvent handler called callback directly, racing with Start()
   - now OnEvent only signals via channel, Start() orchestrates timing
2. cert renewal wasn't detected as needing reachability check
   - only checked if cert exists, not if it needs renewal
   - certs in renewal window should wait for AutoNAT like new certs
   - added localCertNeedsRenewal() to detect this case
3. cached cert path blocked unnecessarily on reachability
   - cert already exists, no need to verify reachability again
   - callback now fires immediately after loading cert from cache
   - address factory's ConfirmedAddrs() dynamically filters unreachable addrs
4. simplified to AutoNAT v2 only (EvtHostReachableAddrsChanged)
   - v2 provides per-address reachability which is what p2p-forge needs
   - v2 is enabled by default in modern go-libp2p
5. cert check failures were silent
   - converted to methods to use instance logger (respects WithLogger)
   - added ERROR logs for unexpected failures to aid debugging
Force-pushed from 8b36ea4 to e382fa6.
Warning
Do not merge, not ready for review, still investigating.
Two race conditions could cause forge (/dns../ws/.libp2p.direct) addresses to not appear:

1. hasHost() returning false when address factory is called
   - provideHost now calls hostFn() immediately, ensuring hasHost() returns true right after ProvideHost() is called
   - prevents race where address factory runs before Start() goroutine
2. hasCert not being set when cert already exists
   - cached_managed_cert event only fires when cert is NEW to cache
   - now explicitly set hasCert=true after ManageAsync() when we know cert existed in storage beforehand

TODO