Skip to content

test(ldap): reproduce GH #980 divergent-principal DAV bug via OpenLDAP#997

Merged
cbcoutinho merged 3 commits into
masterfrom
feat/openldap-980-reproduction
Jul 1, 2026
Merged

test(ldap): reproduce GH #980 divergent-principal DAV bug via OpenLDAP#997
cbcoutinho merged 3 commits into
masterfrom
feat/openldap-980-reproduction

Conversation

@cbcoutinho

Copy link
Copy Markdown
Owner

What

Adds an ldap docker-compose profile + integration lane that provides the live RED→GREEN reproduction of #980 (DAV paths built from the loginName instead of the canonical Nextcloud UID). This is the reproduction the Keycloak lane (#993) could not provide.

Why LDAP (and not Keycloak/email)

#980's bug needs two conditions together: loginName != UID and /remote.php/dav/files/<loginName>/ does not resolve to /files/<UID>/. On the CI Nextcloud versions:

  • login-by-email gives loginName != UID but Nextcloud resolves /files/<email>/ to the real home (email is a valid path alias) → no bug.
  • user_oidc hardcodes loginName == UID (sha256 hash) in LoginController.php → no divergence.
  • user_ldap maps the LDAP login alice to a UID derived from the LDAP entryUUID, and the LDAP login is not a files-path alias → /files/alice/ genuinely 404s. ✅

Verified live: /files/<uid>/ → 207, /files/alice/ → 404, current-user-principal returns the UID (so #980's discovery resolves it).

How

  • openldap service (vegardit/openldap:2.6.10, digest-pinned) seeds alice from ldap/bootstrap.ldif (no data volume → re-seeds fresh each up).
  • user_ldap is configured by an app-hook app-hooks/post-installation/15-setup-ldap-backend.sh, gated on the openldap service (getent hosts openldap) so it is a no-op for every other lane — mirroring 15-setup-keycloak-provider.sh.
  • tests/server/ldap/ drives the multi-user BasicAuth MCP service (port 8003) as alice (no browser). The WebDAV round-trip is xfail(strict=True): it fails on master (bug present) and xpasses once fix(client): resolve DAV paths via current-user-principal discovery #980's BaseNextcloudClient._ensure_principal_id discovery lands — at which point strict=True turns the xpass into a CI failure, signalling that the marker should be dropped.
  • CI: a new ldap matrix lane (reuses multi-user-basic + the ldap profile); docs in CLAUDE.md.

Validation (local, NC32)

  • Fresh docker compose --profile ldap --profile multi-user-basic up → the app-hook auto-configures user_ldap (verified in app logs, ldap:test-config valid).
  • pytest -m ldap1 xfailed (bug present on master).
  • Cherry-picked fix(client): resolve DAV paths via current-user-principal discovery #980's client fix + rebuilt → 1 passed (fix resolves the principal). Reverted.

Relationship to #980 / #993

This is the test/infra counterpart to the fix in #980; it's the genuine reproduction that #993 set out to build but couldn't via Keycloak. Drop the xfail marker when #980 merges.


This PR was generated with the help of AI, and reviewed by a Human

Adds an `ldap` docker-compose profile + integration lane that provides the
live RED→GREEN reproduction of #980 (DAV paths built from the loginName
instead of the canonical Nextcloud UID) that the Keycloak lane (#993) could
not: login-by-email resolves to the real home (email is a path alias) and
user_oidc hardcodes loginName == UID.

- `openldap` service (vegardit/openldap:2.6.10) seeds user `alice` from
  ldap/bootstrap.ldif. Nextcloud's user_ldap maps her to a UID derived from
  the LDAP entryUUID, so `loginName (alice) != UID` AND
  `/remote.php/dav/files/alice/` does not resolve to her real home.
- user_ldap is configured by an app-hook
  (app-hooks/post-installation/15-setup-ldap-backend.sh), gated on the
  openldap service so it is a no-op for every other lane — mirroring
  15-setup-keycloak-provider.sh.
- tests/server/ldap drives the multi-user BasicAuth MCP service (port 8003)
  as `alice`; the round-trip is xfail(strict=True): it fails on master (bug
  present) and xpasses once #980's BaseNextcloudClient._ensure_principal_id
  discovery lands, at which point the marker should be dropped.
- CI matrix gains an `ldap` lane (reuses multi-user-basic + `ldap` profile,
  no browser); docs added to CLAUDE.md.

Verified locally: fresh-install hook auto-configures user_ldap; test xfails on
master and xpasses with #980 cherry-picked.

Relates to #980. Tracked on Deck card #490.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@claude

claude Bot commented Jul 1, 2026

Copy link
Copy Markdown

Claude finished @cbcoutinho's task in 3m 43s —— View job


Review

This is a well-constructed reproduction lane — the LDAP-vs-OIDC-vs-email reasoning in the PR description is sound (I traced through LoginController.php's email-alias resolution and user_oidc's hardcoded loginName == UID claim and they check out against how DAV path resolution works), and the two prior review rounds already caught the real bugs (single-user marker leak, hardcoded s01). Nothing blocking left. A couple of small things worth a look:

🟢 Nit — stale image reference in the LDIF comment
ldap/bootstrap.ldif:2 says # Loaded by osixia/openldap into the seeded dc=example,dc=org tree., but docker-compose.yml:455 uses vegardit/openldap:2.6.10 (a maintained fork of the now-unmaintained osixia image). Functionally fine since vegardit is schema-compatible, but the comment should say vegardit/openldap to avoid confusing a future reader who greps for the image name.
Fix this →

🟡 Worth a comment — getent hosts openldap gate has no retry
app-hooks/post-installation/15-setup-ldap-backend.sh:22 does a single, unretried getent hosts openldap to decide whether the ldap profile is active, then exits 0 (no-op) if it fails. Since the app service has no depends_on: openldap (can't — Compose won't let an always-on service depend on a profile-gated one), this relies on openldap's container existing in Docker's embedded DNS by the time Nextcloud's install finishes and hooks run. In practice that race is very unlikely to lose (DB wait + NC install takes far longer than starting the openldap container), and the PR's own validation confirms it works — but if it ever does lose the race, the hook silently skips LDAP setup, alice never becomes a valid user, and the reproduction test's mkdir_result.isError would still be True — which xfail(strict=True) swallows as "expected failure," indistinguishable from the real #980 bug. The "Verify LDAP backend" CI step (test.yml:250) only echoes diagnostics on this, it doesn't fail the job. Not blocking (the harness clearly works today), just flagging as a source of a possible false-green if this lane ever gets flaky — a small retry loop around the getent check (mirroring the retry loop already used for the LDAP connectivity test below it) would close that gap.

Everything else checks out:

  • Digest-pinned openldap image, consistent ldap marker wiring in pyproject.toml and both single-user exclusion clauses in test.yml.
  • nc_mcp_ldap_alice_client fixture in tests/server/ldap/conftest.py mirrors the existing nc_mcp_basic_auth_client pattern (BasicAuth header, port 8003) — good consistency with ADR-020.
  • # NOSONAR(S2068) credential annotation matches the convention already used in tests/server/keycloak/conftest.py.
  • tests/server/ldap/__init__.py matches sibling test-dir layout (auth/, keycloak/, login_flow/).
  • Verified the bootstrap LDIF's {SSHA} hash for alice actually decodes to AlicePass123! with the documented salt — correct.
  • try/finally cleanup in the WebDAV round-trip test is correctly scoped: the mkdir assertion (the one expected to fail pre-fix) sits outside the try, so cleanup is only attempted for resources that could actually have been created.

Solid reproduction of #980 and nice writeup of why LDAP was necessary where Keycloak (#993) couldn't do it. Approving from a code-review perspective — the two notes above are optional polish, not blockers.

… in verify

Round-1 review (🔴): the single-user matrix lane runs a broad
`integration and not keycloak and not login_flow and not multi_user_basic`
filter, which also collected the new `[integration, ldap]` test. That lane does
not start mcp-multi-user-basic (port 8003), so the fixture connection fails
during SETUP — an ERROR that xfail(strict) does not swallow — breaking the
single-user job on every PR. Add `and not ldap` to both clauses, mirroring how
keycloak/login_flow/multi_user_basic are already excluded.

Also (🟢): the Verify LDAP step hardcoded `s01`; re-derive the config id the
same way the app-hook does so the diagnostic stays accurate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cbcoutinho

Copy link
Copy Markdown
Owner Author

Round-1 findings addressed in `9b429204`:

  • 🔴 ldap marker not excluded from the single-user lane: Confirmed and fixed. Verified locally that the single-user filter (integration and not keycloak and not login_flow and not multi_user_basic) collected the [integration, ldap] test (1 collected), which would ERROR in setup there (port 8003 not up; xfail doesn't catch setup errors). Added and not ldap to both clauses of the single-user markers filter — mirroring how keycloak/login_flow/multi_user_basic are excluded. Re-verified: single-user filter now collects 0 from tests/server/ldap/ (1 deselected), -m ldap still collects 1. Good catch — the ldap CI lane itself was already green (nc32), but the single-user lane would have broken on merge.
  • 🟢 hardcoded s01 in the Verify LDAP debug step: Fixed — the step now re-derives the config id the same way the app-hook does (ldap:show-config | grep Configuration), falling back to s01.

Went with the marker-exclusion fix (option 1) over a fixture skip-guard, for consistency with the sibling keycloak/multi_user_basic backends added to this same file (none of which use a skip guard; only postgres does).

YAML valid; marker collection verified both ways.

…ew nits)

Round-2 🟢 nits (non-blocking): document that the openldap healthcheck searches
for the seeded `alice` entry so "healthy" implies the bootstrap LDIF was applied,
and note that the `|| true` on the config-id pipeline guards the whole command
substitution under `set -e`. No behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@cbcoutinho

Copy link
Copy Markdown
Owner Author

Round-2 nits addressed in 2d6279e1 (both non-blocking, no behavior change): added a comment noting the openldap healthcheck searches for the seeded alice entry so "healthy" == "bootstrap applied", and a note that the || true on the config-id pipeline guards the whole command substitution under set -e. The CLAUDE.md fresh-install caveat needs no change (it was flagged as already-good). Thanks for the thorough review — with "No blocking issues found" this is merge-ready pending CI.

@sonarqubecloud

sonarqubecloud Bot commented Jul 1, 2026

Copy link
Copy Markdown

@cbcoutinho cbcoutinho merged commit e4cd494 into master Jul 1, 2026
20 checks passed
cbcoutinho added a commit to dismantl/nextcloud-mcp-server that referenced this pull request Jul 1, 2026
… path

Rebased onto master (which now carries the OpenLDAP cbcoutinho#980 reproduction lane from
cbcoutinho#997). With this PR's BaseNextcloudClient._ensure_principal_id fix present, the
divergent-principal WebDAV round-trip lands in alice's real home, so the test
passes — remove the xfail(strict=True) marker and reword the docstring from
"reproduction (xfails until the fix)" to a passing regression guard.

Verified locally: `pytest -m ldap` → 1 passed with the fix (was 1 xfailed on
master without it).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant