Running the test suite with -race surfaces a set of data races across several packages. Each is described below with the shared state that triggers the detector.
1. Federation discovery globals (config/config.go)
Shared state: fedDiscoveryOnce *sync.Once and globalFedErr error.
GetFederation reads fedDiscoveryOnce and calls .Do(); SetFederation, InitClient, ResetConfig, and UpdateConfigFromListener assign new values to the same pointer and error variable directly. Because the assignments are not synchronized with concurrent reads, the race detector fires on both the pointer write and the error write.
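One way to make this pattern race-free is to guard the Once pointer and the error with a single mutex. The following is a minimal sketch under that assumption, not the project's actual fix; fedCache, get, and reset are hypothetical names standing in for the package-level variables and their accessors.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// fedCache is a hypothetical stand-in for the package-level
// fedDiscoveryOnce and globalFedErr variables.
type fedCache struct {
	mu   sync.Mutex
	once *sync.Once
	err  error
}

func newFedCache() *fedCache {
	return &fedCache{once: &sync.Once{}}
}

// get snapshots the current Once under the lock, runs discovery through
// that snapshot, and then reads the error under the lock again.
func (c *fedCache) get(discover func() error) error {
	c.mu.Lock()
	once := c.once
	c.mu.Unlock()

	once.Do(func() {
		err := discover() // runs without holding the lock
		c.mu.Lock()
		c.err = err
		c.mu.Unlock()
	})

	c.mu.Lock()
	defer c.mu.Unlock()
	return c.err
}

// reset swaps in a fresh Once and clears the error under the same lock.
func (c *fedCache) reset() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.once = &sync.Once{}
	c.err = nil
}

func main() {
	c := newFedCache()
	errDown := errors.New("discovery failed")
	fmt.Println(c.get(func() error { return errDown }))
	fmt.Println(c.get(func() error { return nil })) // cached result, callback not run
	c.reset()
	fmt.Println(c.get(func() error { return nil }))
}
```

The sketch removes the data race only. A reset that lands while discovery is still running can still be followed by a stale error being stored, so real code may need a generation counter or similar if that ordering matters.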
2. XRootD metric accumulators (metrics/xrootd_metrics.go)
Shared state: lastOssStats OSSStatsGs and lastStats SummaryStat.
handleOSSStats and HandleSummaryPacket update package-level delta accumulators: each reads the previous value, computes a delta, writes the new value, and emits a metric. Multiple goroutines processing packets concurrently perform these read-modify-write sequences on the same variables without any lock.
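The read-modify-write sequence described above can be made atomic by holding one mutex across all of its steps. A minimal sketch, assuming a single uint64 counter; deltaTracker and observe are hypothetical names, and treating a decreasing counter as a zero delta is an assumption made only for this illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// deltaTracker is a hypothetical accumulator: it remembers the last
// observed counter value and reports how much it grew.
type deltaTracker struct {
	mu   sync.Mutex
	last uint64
}

// observe reads the previous value, computes the delta, and stores the
// new value while holding the lock, so the three steps cannot interleave
// with another goroutine's update.
func (d *deltaTracker) observe(current uint64) uint64 {
	d.mu.Lock()
	defer d.mu.Unlock()

	var delta uint64
	if current >= d.last {
		delta = current - d.last
	} // assumption: a decreasing counter (e.g. a restart) yields delta 0
	d.last = current
	return delta
}

func main() {
	var d deltaTracker
	fmt.Println(d.observe(10))
	fmt.Println(d.observe(25))
	fmt.Println(d.observe(5))
}
```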
3. Origin handler registration flag (origin_serve/handlers.go)
Shared state: handlersRegistered bool.
RegisterHandlers reads handlersRegistered as an early-return guard and writes it to true after registration completes. Tests call RegisterHandlers concurrently (or reset and re-register without synchronization), so the check-and-set is not atomic.
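A check-and-set becomes atomic when the check, the registration, and the flag write all happen under one lock. The sketch below illustrates that idea with hypothetical names (registrar, registerOnce, reset, countRegistrations); it is not taken from origin_serve.

```go
package main

import (
	"fmt"
	"sync"
)

// registrar is a hypothetical wrapper around a "registered" flag.
type registrar struct {
	mu         sync.Mutex
	registered bool
}

// registerOnce runs register at most once per reset. The flag is set only
// after register succeeds, and the whole sequence holds the lock.
func (r *registrar) registerOnce(register func() error) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.registered {
		return nil
	}
	if err := register(); err != nil {
		return err
	}
	r.registered = true
	return nil
}

// reset clears the flag under the same lock, e.g. between tests.
func (r *registrar) reset() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.registered = false
}

// countRegistrations calls registerOnce from n goroutines and reports how
// many times the callback actually ran.
func countRegistrations(n int) int {
	var r registrar
	var wg sync.WaitGroup
	calls := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = r.registerOnce(func() error {
				calls++ // safe: only runs while r.mu is held
				return nil
			})
		}()
	}
	wg.Wait()
	return calls
}

func main() {
	fmt.Println(countRegistrations(50))
}
```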
4. Partially initialized director ad cache entries (director/director_advertise.go)
Shared state: directorInfo.ad, directorInfo.cancel, and directorInfo.forwardAdChan.
updateInternalDirectorCache allocates an empty directorInfo, publishes it into the shared TTL cache with GetOrSet, and then populates its fields. Concurrent readers of the cache (e.g. sendMyAd) can observe a partially initialized entry in which ad, cancel, and forwardAdChan are still zero values.
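The usual remedy for this kind of race is to build the entry completely before publishing it. The sketch below shows that ordering using sync.Map.LoadOrStore as a stand-in for the TTL cache's GetOrSet; the string-typed ad field and the getOrCreate and allComplete helpers are assumptions made for illustration only.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// directorInfo mirrors the three fields named above; the field types are
// simplified assumptions.
type directorInfo struct {
	ad            string
	cancel        context.CancelFunc
	forwardAdChan chan string
}

// getOrCreate fully populates a candidate entry and only then publishes
// it, so readers never see zero-valued fields.
func getOrCreate(cache *sync.Map, name, ad string) *directorInfo {
	_, cancel := context.WithCancel(context.Background())
	info := &directorInfo{
		ad:            ad,
		cancel:        cancel,
		forwardAdChan: make(chan string, 1),
	}
	actual, loaded := cache.LoadOrStore(name, info)
	if loaded {
		cancel() // another goroutine published first; discard our candidate
	}
	return actual.(*directorInfo)
}

// allComplete races n goroutines on one key and reports whether every
// caller received the same, fully initialized entry.
func allComplete(n int) bool {
	var cache sync.Map
	results := make([]*directorInfo, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = getOrCreate(&cache, "director-a", "ad")
		}(i)
	}
	wg.Wait()
	for _, r := range results {
		if r != results[0] || r.cancel == nil || r.forwardAdChan == nil {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allComplete(32))
}
```

This covers only the initial publication. Later updates to a published entry's fields would still need their own synchronization.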
5. HTTP transport cache (config/transport.go)
Shared state: onceTransport sync.Once and the transport, basicTransport, client, etc. pointer variables.
GetTransport and other accessors call onceTransport.Do(setupTransport). ResetConfig writes onceTransport = sync.Once{} and zeroes the pointer variables directly, without any lock. sync.Once serializes concurrent initializations through it, but does not protect against a concurrent assignment to the Once variable itself or to the transport pointers.
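One possible way to make a Once-backed cache resettable is to bundle the Once and the values it initializes into a single struct and swap the whole struct through an atomic pointer. This is a sketch under that assumption (it needs Go 1.19 or later for atomic.Pointer); transportState, getTransport, and resetTransport are hypothetical names, and the sketch caches only one transport pointer.

```go
package main

import (
	"fmt"
	"net/http"
	"sync"
	"sync/atomic"
)

// transportState bundles the Once with the value it initializes, so a
// reset replaces both together.
type transportState struct {
	once      sync.Once
	transport *http.Transport
}

// current always points at the live state; it is never nil after init.
var current atomic.Pointer[transportState]

func init() { current.Store(&transportState{}) }

// getTransport initializes lazily through the Once of whichever state
// was current when the call started.
func getTransport() *http.Transport {
	s := current.Load()
	s.once.Do(func() { s.transport = &http.Transport{} })
	return s.transport
}

// resetTransport publishes a fresh, uninitialized state. Callers still
// holding the old state keep a consistent (but stale) transport.
func resetTransport() { current.Store(&transportState{}) }

func main() {
	first := getTransport()
	fmt.Println(first == getTransport()) // same cached transport
	resetTransport()
	fmt.Println(first == getTransport()) // a new transport after reset
}
```

The Once is never assigned to, only replaced along with its owning struct, so the detector has no unsynchronized write to report.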
6. Launcher goroutine captures outer err (launchers/launcher.go)
Shared state: the err variable declared in the LaunchModules scope.
Several goroutines spawned by LaunchModules assign to the same captured err from the enclosing function. The parent may also read err after launching those goroutines, so concurrent writes and reads of the same variable occur without synchronization.
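A common way to avoid sharing a captured err is to give each goroutine its own local error and hand results back over a channel. The following sketch illustrates that approach; launchAll and errModule are hypothetical names, and errors.Join requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errModule is a hypothetical failure used for the demonstration.
var errModule = errors.New("module failed")

// launchAll runs every task in its own goroutine. Each goroutine declares
// its own err with :=, so nothing in the enclosing scope is written
// concurrently; failures travel back through a buffered channel.
func launchAll(tasks []func() error) error {
	errCh := make(chan error, len(tasks))
	var wg sync.WaitGroup
	for _, task := range tasks {
		wg.Add(1)
		go func(run func() error) {
			defer wg.Done()
			if err := run(); err != nil { // err is local to this goroutine
				errCh <- err
			}
		}(task)
	}
	wg.Wait()
	close(errCh)

	var errs []error
	for err := range errCh {
		errs = append(errs, err)
	}
	return errors.Join(errs...) // nil when no task failed
}

func main() {
	ok := func() error { return nil }
	bad := func() error { return errModule }
	fmt.Println(launchAll([]func() error{ok, ok}))
	fmt.Println(launchAll([]func() error{ok, bad}))
}
```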
7. Director TTL cache iteration during eviction (director/sort.go)
Shared state: the internal list of the serverAds ttlcache.
getAdsForPath iterated the live cache with Range. The ttlcache eviction goroutine concurrently removes expired entries from the same internal list. The repository already avoids this pattern elsewhere by taking an Items() snapshot first.
8. ObjectStat request handler override (director/stat.go)
Shared state: stat.ReqHandler.
queryServersForObject reads stat.ReqHandler inside the goroutines it launches. Tests replace stat.ReqHandler between subtests. Goroutines from a prior subtest that have not yet finished can therefore race with the next subtest writing a new handler value.
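One way to keep straggling goroutines away from the mutable field is to copy the handler into a local variable before any goroutine starts. The sketch below illustrates this; objectStat, its simplified ReqHandler signature, and queryServers are assumptions for illustration and do not match the real types.

```go
package main

import (
	"fmt"
	"sync"
)

// objectStat is a hypothetical stand-in with an overridable handler.
type objectStat struct {
	ReqHandler func(server string) string
}

// queryServers snapshots ReqHandler once, in the caller's goroutine.
// Worker goroutines use only the local copy, so a later assignment to
// s.ReqHandler cannot race with them.
func (s *objectStat) queryServers(servers []string) []string {
	handler := s.ReqHandler
	results := make([]string, len(servers))
	var wg sync.WaitGroup
	for i, srv := range servers {
		wg.Add(1)
		go func(i int, srv string) {
			defer wg.Done()
			results[i] = handler(srv) // each goroutine writes its own slot
		}(i, srv)
	}
	wg.Wait()
	return results
}

func main() {
	s := &objectStat{ReqHandler: func(srv string) string { return "a:" + srv }}
	fmt.Println(s.queryServers([]string{"s1", "s2"}))

	// Replacing the handler between calls, as a subtest might.
	s.ReqHandler = func(srv string) string { return "b:" + srv }
	fmt.Println(s.queryServers([]string{"s1", "s2"}))
}
```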
9. Shared JWK key metadata mutation (client/acquire_token.go)
Shared state: a cached jwk.Key object.
Concurrent calls to generateToken each called key.Set("kid", keyId) on the same cached key object before signing. The JWK library mutates an internal metadata map when Set is called, so concurrent callers race on that map.
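A general remedy for this kind of race is to mutate a private copy of the shared object instead of the shared object itself. The sketch below shows the idea with a hypothetical cachedKey type that stands in for the JWK key; it does not use the JWK library, and whether that library offers a suitable clone operation is left as an assumption to verify.

```go
package main

import (
	"fmt"
	"sync"
)

// cachedKey is a hypothetical stand-in for a key object whose metadata
// lives in an internal map.
type cachedKey struct {
	fields map[string]string
}

// clone returns an independent copy of the metadata. It only reads the
// shared map, which is safe as long as nobody writes to it.
func (k *cachedKey) clone() *cachedKey {
	c := &cachedKey{fields: make(map[string]string, len(k.fields)+1)}
	for name, value := range k.fields {
		c.fields[name] = value
	}
	return c
}

// tokenHeader sets "kid" on a per-call copy, leaving the shared key
// untouched.
func tokenHeader(shared *cachedKey, keyID string) map[string]string {
	own := shared.clone()
	own.fields["kid"] = keyID
	return own.fields
}

func main() {
	shared := &cachedKey{fields: map[string]string{"alg": "ES256"}}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			_ = tokenHeader(shared, fmt.Sprintf("key-%d", i))
		}(i)
	}
	wg.Wait()
	_, mutated := shared.fields["kid"]
	fmt.Println(mutated)
}
```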
10. Advertisement IOLoad read outside lock (director/monitor.go)
Shared state: Advertisement.ServerAd.IOLoad.
LaunchPeriodicDirectorTest copies adItem.Value().ServerAd directly. That struct copy reads IOLoad without the Advertisement's RWMutex, while SetIOLoad holds the same mutex when writing the field.
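One way to keep the struct copy consistent with SetIOLoad is to take the copy through an accessor that holds the read lock. A minimal sketch, with simplified struct definitions; the Snapshot method and the field layout are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// ServerAd and Advertisement are simplified stand-ins for the real types.
type ServerAd struct {
	Name   string
	IOLoad float64
}

type Advertisement struct {
	mu       sync.RWMutex
	ServerAd ServerAd
}

// SetIOLoad writes the field under the write lock.
func (a *Advertisement) SetIOLoad(load float64) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.ServerAd.IOLoad = load
}

// Snapshot (hypothetical) copies the whole ServerAd under the read lock,
// so the copy cannot overlap with SetIOLoad.
func (a *Advertisement) Snapshot() ServerAd {
	a.mu.RLock()
	defer a.mu.RUnlock()
	return a.ServerAd
}

func main() {
	ad := &Advertisement{ServerAd: ServerAd{Name: "origin-1"}}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); ad.SetIOLoad(2.5) }()
	go func() { defer wg.Done(); _ = ad.Snapshot() }()
	wg.Wait()
	fmt.Println(ad.Snapshot().IOLoad)
}
```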
11. *gin.Context passed to storage and outbound HTTP calls (oauth2/issuer/, web_ui/)
Shared state: internal fields of a pooled *gin.Context.
*gin.Context implements context.Context, so it compiles wherever a context.Context is accepted. But gin reads and writes its own context fields (response writer, key/value store, etc.) throughout the request lifecycle. Storage calls and outbound HTTP requests that received a *gin.Context directly held a reference to that mutable object. The database/sql package spawns an awaitDone goroutine that reads the context after the handler returns; gin may have already recycled the object by then.
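The underlying idea is to hand downstream calls the request-scoped context.Context rather than the framework's own object. In gin that context is available as c.Request.Context(), since c.Request is a standard *http.Request. The sketch below shows the same pattern with net/http only, so it stays self-contained; lookupClient, handler, and serve are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// lookupClient is a hypothetical storage call. It depends only on the
// context.Context interface, never on a framework type.
func lookupClient(ctx context.Context, id string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	return "client-" + id, nil
}

// handler passes the request's own context downstream. With gin, the
// equivalent argument would be c.Request.Context().
func handler(w http.ResponseWriter, r *http.Request) {
	name, err := lookupClient(r.Context(), r.URL.Query().Get("id"))
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	fmt.Fprint(w, name)
}

// serve drives the handler in-process and returns status code and body.
func serve(id string) (int, string) {
	req := httptest.NewRequest(http.MethodGet, "/client?id="+id, nil)
	rec := httptest.NewRecorder()
	handler(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	code, body := serve("42")
	fmt.Println(code, body)
}
```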
12. Server identity ad cache reset (server_utils/sitename.go)
Shared state: baseAdOnce sync.Once, baseAd server_structs.ServerBaseAd, and baseAdErr error.
IsDirectorAdFromSelf initializes the cache lazily through baseAdOnce. ResetTestState writes new values to all three variables directly. This is the same sync.Once-versus-assignment race as in the transport cache (item 5): the Once serializes concurrent initializations but not a concurrent reset.
13. Shared rand.Rand in PelicanFS stress test (client/pelican_fs_test.go)
Shared state: a single *math/rand.Rand created before the stress-test goroutines are launched.
math/rand.Rand is not safe for concurrent use. All worker goroutines called Intn on the same instance, which races on the RNG's internal state.
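A simple way to avoid sharing the generator is to give every worker its own *rand.Rand, seeded from a base seed plus the worker index so runs stay reproducible. A minimal sketch; drawAll and the seeding scheme are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
)

// drawAll starts the given number of workers. Each worker owns a private
// generator, so no RNG state is shared between goroutines.
func drawAll(workers, perWorker int, seed int64) [][]int {
	out := make([][]int, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			rng := rand.New(rand.NewSource(seed + int64(w)))
			vals := make([]int, perWorker)
			for i := range vals {
				vals[i] = rng.Intn(100)
			}
			out[w] = vals // each worker writes only its own slot
		}(w)
	}
	wg.Wait()
	return out
}

func main() {
	for w, vals := range drawAll(3, 5, 1) {
		fmt.Println(w, vals)
	}
}
```

An alternative is to protect one shared generator with a mutex, at the cost of contention between workers.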
14. Test log hook formatting races with cleanup (test_utils/utils.go)
Shared state: logrus.StandardLogger().ReportCaller and the test hook.
SetupTestLogging registers a t.Cleanup that restores the global logrus logger state. If a goroutine is still formatting or firing a log entry through the test hook when cleanup runs, it reads entry.Logger.ReportCaller while cleanup writes it.
(Brian A: With a tip of the hat to Copilot. Inspired by the mess of failures after the nightly unit test runs started using -race.)