Tests were run inside the dev container with two non-default characteristics:
- /etc/pelican/pelican.yaml is configured for a local test federation:
    Federation:
      DiscoveryUrl: https://discovery:8444
      DirectorUrl: https://director:8444
      RegistryUrl: https://registry:8444
    Server:
      TLSKey: /certs/tls.key
      TLSCertificateChain: /certs/tls.crt
- The container runs as a non-root user (the dev container image default is root).
Tests that do not override these settings inherit ambient values and fail. Most failures are isolation problems; TestUpdateCert also contains a real test bug.
1. Server.Hostname = "dev" (container hostname)
Affects:
All tests that call config.InitServer or fed_test_utils.NewFedTest without overriding Server.Hostname.
Symptom:
x509: certificate is valid for localhost, …, not dev
Server.Hostname is not set in any config file. Pelican defaults to the system hostname, which is dev. The test TLS certificate does not cover dev.
Fix:
Set Server.Hostname = "localhost" in the shared initServerForTest helper (director/test_helpers_test.go) and in NewFedTest (fed_test_utils/fed.go).
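The hostname mismatch behind this symptom can be reproduced with the standard library alone. The sketch below is not Pelican code: the helper name coversHost and the single-entry SAN list are hypothetical stand-ins for the real test certificate, whose full name list is elided in the error above.

```go
package main

import (
	"crypto/x509"
	"fmt"
)

// coversHost returns nil if a certificate carrying the given DNS SANs is
// acceptable for host, and the x509 hostname error otherwise.
// (Hypothetical helper, used only for illustration.)
func coversHost(dnsNames []string, host string) error {
	cert := &x509.Certificate{DNSNames: dnsNames}
	return cert.VerifyHostname(host)
}

func main() {
	// Assumed SAN list; the real test certificate may cover more names.
	sans := []string{"localhost"}
	for _, host := range []string{"dev", "localhost"} {
		if err := coversHost(sans, host); err != nil {
			fmt.Printf("%s: rejected: %v\n", host, err)
		} else {
			fmt.Printf("%s: accepted\n", host)
		}
	}
}
```

This is why pinning Server.Hostname to a name the test certificate covers removes the failure regardless of what the container is called.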
2. Federation URLs from /etc/pelican/pelican.yaml
Affects:
TestCompareMetadata/disabled-when-no-discovery-url (director/metadata_comparison_test.go)
TestParseRemoteAsPUrl/test_valid_path_that_falls_back_to_configured_director_for_discovery (client/main_test.go)
TestGetCacheHostnameFromToken (broker/token_utils_test.go)
TestInitServerUrl (config/config_test.go)
Symptoms:
dial tcp: lookup discovery on …:53: no such host
Token issuer https://your-registry.com/… doesn't start with https://registry:8444/…
expected: "https://example.com" actual: "https://director:8444"
pelican.yaml sets Federation.DiscoveryUrl, DirectorUrl, and RegistryUrl to the test-federation hosts. Tests that don't clear or override these params pick them up from Viper. discovery, director, and registry are not resolvable from within the dev container.
Fix:
Clear or override all Federation.*Url params in each test (or in a shared setup function) before calling InitFederation or any function that reads them from Viper.
3. TLS paths from /etc/pelican/pelican.yaml (non-root)
Affects:
TestS3OriginConfig (xrootd/origin_test.go)
TestCopyCertificates (xrootd/xrootd_config_test.go)
Symptoms:
RSA type private key in PKCS #8 form is not allowed for /certs/tls.key.
Use an ECDSA key instead.
rename /certs/tls.crt /certs/tls.crt.orig: permission denied
pelican.yaml points Server.TLSKey and Server.TLSCertificateChain at /certs/. Running as non-root, those files are not writable. /certs/tls.key is also RSA; the code requires ECDSA.
Fix:
Generate a temporary ECDSA key/cert pair in t.TempDir() and override both Server.TLSKey and Server.TLSCertificateChain before calling InitServer.
4. TestMultiuserFileSystem_BasicOperations requires CAP_SETGID
Symptom:
failed to set supplementary groups: setgroups(0): operation not permitted
runAsUser unconditionally calls threadSetgroups, even when secondaryGIDs is nil (converted to []uint32{}), making a setgroups(0) syscall that requires CAP_SETGID. The test's os.Getuid() == 0 guard does not prevent this — when running as non-root, the non-root code path still reaches runAsUser.
Fix:
Probe for CAP_SETGID (or attempt a dry-run setgroups) at test start and call t.Skip if the capability is absent.
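One way to probe without attempting the syscall is to read the effective capability mask from /proc/self/status. This is a Linux-only sketch with hypothetical helper names; it assumes the thread that later calls setgroups has the same effective set as the process's main thread.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capSetgid is the Linux capability number of CAP_SETGID.
const capSetgid = 6

// hasCapInMask reports whether capability bit n is set in a hexadecimal
// mask as printed on the CapEff line of /proc/<pid>/status.
func hasCapInMask(hexMask string, n uint) (bool, error) {
	mask, err := strconv.ParseUint(strings.TrimSpace(hexMask), 16, 64)
	if err != nil {
		return false, err
	}
	return mask&(uint64(1)<<n) != 0, nil
}

// hasEffectiveSetgid looks up the CapEff line for the current process and
// checks the CAP_SETGID bit.
func hasEffectiveSetgid() (bool, error) {
	f, err := os.Open("/proc/self/status")
	if err != nil {
		return false, err
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "CapEff:") {
			return hasCapInMask(strings.TrimPrefix(line, "CapEff:"), capSetgid)
		}
	}
	if err := sc.Err(); err != nil {
		return false, err
	}
	return false, fmt.Errorf("CapEff not found in /proc/self/status")
}

func main() {
	ok, err := hasEffectiveSetgid()
	if err != nil {
		fmt.Println("could not probe capabilities:", err)
		return
	}
	fmt.Println("CAP_SETGID effective:", ok)
}
```

A test could call hasEffectiveSetgid at the top and t.Skip when it returns false or an error.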
5. TestUpdateCert: isolation gap + real test bug
Symptom:
The entire web_ui package times out after 10 minutes.
Two independent problems:
- Isolation gap (same as issue 3): pelican.yaml points at /certs/tls.crt, which is not writable under non-root, causing the cert-update path to fail. Fix: override both TLS params to writable temp files.
- Real test bug: after the cert-path failure, egrp.Wait() blocks indefinitely because a goroutine is stuck sending on an unbuffered doneChan that no one drains. Fix: buffer doneChan (capacity 1), or use close-based signaling.
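The blocking behaviour can be shown in isolation. The sketch below is not the TestUpdateCert code: it uses sync.WaitGroup as a stand-in for the errgroup, and the helper names and the 200 ms timeout are assumptions chosen for the demonstration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// waitWithTimeout reports whether wg.Wait returns within d.
func waitWithTimeout(wg *sync.WaitGroup, d time.Duration) bool {
	finished := make(chan struct{})
	go func() {
		wg.Wait()
		close(finished)
	}()
	select {
	case <-finished:
		return true
	case <-time.After(d):
		return false
	}
}

// workerFinishes starts one worker that signals completion on a channel
// of the given capacity. Nothing ever receives from that channel, which
// mimics the failure path where the draining code has already returned.
// It reports whether the worker returned within 200 ms.
func workerFinishes(capacity int) bool {
	doneChan := make(chan bool, capacity)
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		doneChan <- true // blocks forever when capacity is 0
	}()
	return waitWithTimeout(&wg, 200*time.Millisecond)
}

func main() {
	fmt.Println("unbuffered worker finished:", workerFinishes(0))
	fmt.Println("buffered worker finished:  ", workerFinishes(1))
}
```

With capacity 0 the send never completes, so the wait only ends through the timeout; with capacity 1 the send succeeds without a receiver and the worker returns. Close-based signaling avoids the problem in the same way, since close never blocks.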
Brian A: With a tip of the hat to Copilot.